Graham
{| class="wikitable"
|-
| Availability: In production since June 2017
|-
| Login node: '''graham.alliancecan.ca'''
|-
| Globus endpoint: '''computecanada#graham-globus'''
|-
| Data transfer node (rsync, scp, sftp,...): '''gra-dtn1.alliancecan.ca'''
|}
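The data transfer node is intended for moving data to and from the cluster with tools such as rsync, scp, or sftp. As a minimal sketch (the username and both paths are placeholders), an rsync transfer to Graham might look like this:

<syntaxhighlight lang="bash">
# Copy a local directory to Graham through the dedicated data transfer node.
# "username" and both paths are placeholders; adjust them to your own account and layout.
rsync -avP ./my_dataset/ username@gra-dtn1.alliancecan.ca:/home/username/my_dataset/
</syntaxhighlight>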




<!--T:43-->
On or after the removal date, we will follow up with the contact to confirm whether the exception is still required.


<!--T:41-->
{| class="wikitable sortable"
|-
| '''Home space'''<br />133TB total volume ||
* Location of home directories.
* Each home directory has a small, fixed [[Storage and file management#Filesystem_quotas_and_policies|quota]].
* Has daily backup.
|-
| '''Scratch space'''<br />3.2PB total volume<br />Parallel high-performance filesystem ||
* For active or temporary (<code>/scratch</code>) storage.
* Not allocated.
* Inactive data will be purged.
|-
| '''Project space'''<br />16PB total volume<br />External persistent storage ||
* Allocated via [https://alliancecan.ca/en/services/advanced-research-computing/accessing-resources/rapid-access-service RAS] or [https://alliancecan.ca/en/services/advanced-research-computing/accessing-resources/resource-allocation-competition RAC].


<!--T:45-->
Graham has dedicated visualization nodes available at '''gra-vdi.alliancecan.ca''' that allow only VNC connections. For instructions on how to use them, see the [[VNC]] page.


=Node characteristics= <!--T:5-->


<!--T:7-->
Best practice for local on-node storage is to use the temporary directory generated by [[Running jobs|Slurm]], <code>$SLURM_TMPDIR</code>. Note that this directory and its contents will disappear upon job completion.
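As a minimal sketch of this pattern (the account name, program name, and paths are placeholders), a job script can stage data into <code>$SLURM_TMPDIR</code>, work there, and copy results back before the job ends:

<syntaxhighlight lang="bash">
#!/bin/bash
#SBATCH --account=def-someuser   # placeholder account name
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

# Stage input data onto the fast node-local disk provided by Slurm.
cp ~/scratch/input.dat "$SLURM_TMPDIR/"

# Run the computation against the node-local copy (my_program is a placeholder).
cd "$SLURM_TMPDIR"
~/bin/my_program input.dat > output.dat

# Copy results back to persistent storage; $SLURM_TMPDIR vanishes when the job completes.
cp output.dat ~/scratch/
</syntaxhighlight>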


<!--T:38-->


<!--T:50-->
'''The nodes are available to all users with a maximum job duration of seven days.'''


<!--T:51-->


<!--T:52-->
'''Important''': You should scale the number of CPUs requested, keeping the ratio of CPUs to GPUs at 3.5 or less on 28-core nodes. For example, if you want to run a job using 4 GPUs, you should request '''at most 14 CPU cores'''. For a job with 1 GPU, you should request '''at most 3 CPU cores'''. You may run a few short test jobs (shorter than one hour) that break this rule to see how your code performs.
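As an illustration of this guideline (the account name and resource values are placeholders), a 2-GPU job on a 28-core node would request at most 7 CPU cores, since 2 × 3.5 = 7:

<syntaxhighlight lang="bash">
#!/bin/bash
#SBATCH --account=def-someuser   # placeholder account name
#SBATCH --gres=gpu:2             # request 2 GPUs
#SBATCH --cpus-per-task=7        # keeps the CPU-to-GPU ratio at 3.5
#SBATCH --mem=32G                # placeholder memory request
#SBATCH --time=24:00:00

# Report the GPUs visible to the job (replace with your own application).
nvidia-smi
</syntaxhighlight>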


<!--T:65-->