Niagara

{| class="wikitable"
|-
| Expected availability: April 2018
|-
| Login node: '''niagara.computecanada.ca'''
|-
| Globus endpoint: '''TBA'''
|-
| System Status Page: '''https://wiki.scinet.utoronto.ca/wiki/index.php/System_Alerts'''
|}


<!--T:2-->
Niagara is a homogeneous cluster, owned by the [https://www.utoronto.ca/ University of Toronto] and operated by [https://www.scinethpc.ca/ SciNet], intended to enable large parallel jobs of 1040 cores and more. It was designed to optimize throughput of a range of
scientific codes running at scale, energy efficiency, and network and storage performance and capacity.
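To give a concrete sense of that scale, here is a minimal sketch of how the 1040-core figure maps onto the 40-core nodes described under Node characteristics below (whole-node scheduling is an assumption here, not stated on this page):

<source lang="python">
# Minimal sketch: how the 1040-core minimum "large parallel job" size
# maps onto Niagara's 40-core nodes. Whole-node scheduling is assumed.
CORES_PER_NODE = 40
min_job_cores = 1040

nodes_needed = min_job_cores // CORES_PER_NODE
print(f"{min_job_cores} cores = {nodes_needed} nodes x {CORES_PER_NODE} cores per node")
# -> 1040 cores = 26 nodes x 40 cores per node
</source>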


<!--T:4-->
The user experience on Niagara will be similar to that on Graham
and Cedar, but slightly different. Specific instructions on how to use the Niagara cluster will be available in April 2018.


<!--T:5-->
* 256 TB burst buffer (Excelero + IBM Spectrum Scale).
* No local disks.
* No GPUs.
* Rpeak of 4.61 PF.
* Rmax of 3.0 PF.
{| class="wikitable sortable"
|-
| '''Home space''' <br />600TB total volume<br />Parallel high-performance filesystem (IBM Spectrum Scale) ||
* Location of home directories.
* Available as the <code>$HOME</code> environment variable.
* Each home directory has a small, fixed [[Storage and file management#Filesystem_Quotas_and_Policies|quota]] of 100GB.  
* Not allocated, standard amount for each user. For larger storage requirements, use scratch or project.
* Has daily backup.
|-
| '''Scratch space'''<br />6PB total volume<br />Parallel high-performance filesystem (IBM Spectrum Scale)||
* For active or temporary (<code>/scratch</code>) storage (~ 80 GB/s).
* Available as the <code>$SCRATCH</code> environment variable.
| '''Burst buffer'''<br />256TB total volume<br />Parallel extra high-performance filesystem (Excelero+IBM Spectrum Scale)||
* For active fast storage during a job (160GB/s, and very high IOPS).
* Not fully configured yet, but data will be purged very frequently (i.e. soon after a job has ended) and space on this storage tier will not be RAC allocatable.
|-
|'''Project space'''<br />3PB total volume<br />Parallel high-performance filesystem (IBM Spectrum Scale)||
* Allocated via [https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/ RAC].
* For active but low data turnover storage and relatively fixed datasets
<!--T:11-->
The Niagara cluster has an EDR InfiniBand network in a so-called
'Dragonfly+' topology, with four wings.  Each wing of maximally 432 nodes (i.e., 17,280 cores) has
1-to-1 connections.  Network traffic between wings is done through
adaptive routing, which alleviates network congestion and yields an effective blocking of 2:1 between nodes of different wings.
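A small sketch of the wing arithmetic above (the per-wing node and core counts come from the text; the four-wing total is a fabric upper bound, not the installed node count):

<source lang="python">
# Sketch of the Dragonfly+ wing arithmetic described above.
WINGS = 4
MAX_NODES_PER_WING = 432
CORES_PER_NODE = 40

cores_per_wing = MAX_NODES_PER_WING * CORES_PER_NODE   # 17,280 cores per wing
max_fabric_nodes = WINGS * MAX_NODES_PER_WING          # at most 1,728 node ports
print(f"{cores_per_wing} cores per wing, up to {max_fabric_nodes} nodes on the fabric")
</source>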


=Node characteristics= <!--T:12-->
<!--T:13-->
* CPU: 2 sockets with 20 Intel Skylake cores (2.4GHz, AVX512), for a total of 40 cores per node
* Computational performance: 3 PFlops (LINPACK) and 4.61 PFlops theoretical peak for the entire system.
* Network connection: 100Gb/s EDR Dragonfly+
* Memory: 202 GB (188 GiB) of RAM, i.e., a bit over 4GiB per core.
* Local disk: none.
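For reference, a short sketch of how these per-node figures relate to each other and to the system-wide peak quoted above. The 32 double-precision FLOPs per cycle per core (two AVX-512 FMA units) is an assumption about this Skylake part, and the implied node count is inferred rather than taken from this page:

<source lang="python">
# Sketch relating the per-node specs above to the system-wide peak.
# Assumption: 32 double-precision FLOPs/cycle/core (two AVX-512 FMA units).
CORES_PER_NODE = 40
CLOCK_HZ = 2.4e9
FLOPS_PER_CYCLE = 32           # assumed AVX-512 FMA throughput per core
MEM_GIB_PER_NODE = 188
SYSTEM_RPEAK_PF = 4.61

node_peak_tflops = CORES_PER_NODE * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12
mem_per_core_gib = MEM_GIB_PER_NODE / CORES_PER_NODE
implied_nodes = SYSTEM_RPEAK_PF * 1e15 / (node_peak_tflops * 1e12)

print(f"Per-node theoretical peak: {node_peak_tflops:.2f} TFlops")   # ~3.07
print(f"Memory per core: {mem_per_core_gib:.1f} GiB")                # ~4.7
print(f"Implied node count for {SYSTEM_RPEAK_PF} PF peak: {implied_nodes:.0f}")  # ~1500
</source>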