Cedar: Difference between revisions

Jump to navigation Jump to search
25 bytes added ,  5 years ago
Marked this version for translation
(expansion complete, remove mention of it)
(Marked this version for translation)
Line 65: Line 65:
=Node types and characteristics= <!--T:6-->
=Node types and characteristics= <!--T:6-->


<!--T:28-->
Cedar has a total of 58,416 CPU cores for computation, and 584 GPU devices.  
Cedar has a total of 58,416 CPU cores for computation, and 584 GPU devices.  


Line 88: Line 89:
|}
|}


<!--T:29-->
Note that the amount of available memory is less than the "round number" suggested by the hardware configuration. For instance, "base" nodes do have 128 GiB of RAM, but some of it is permanently occupied by the kernel and OS. To avoid wasting time by swapping/paging, the scheduler will never allocate jobs whose memory requirements exceed the amount of "available" memory shown above.
Note that the amount of available memory is less than the "round number" suggested by the hardware configuration. For instance, "base" nodes do have 128 GiB of RAM, but some of it is permanently occupied by the kernel and OS. To avoid wasting time by swapping/paging, the scheduler will never allocate jobs whose memory requirements exceed the amount of "available" memory shown above.


Line 96: Line 98:
Most applications will run on either Broadwell or Skylake nodes, and performance differences are expected to be small compared to job waiting times. Therefore we recommend that you do not select a specific node type for your jobs. If it is necessary, use <code>--constraint=skylake</code> or <code>--constraint=broadwell</code>. See [[Running_jobs#Specifying_a_CPU_architecture|Specifying a CPU architecture]].
Most applications will run on either Broadwell or Skylake nodes, and performance differences are expected to be small compared to job waiting times. Therefore we recommend that you do not select a specific node type for your jobs. If it is necessary, use <code>--constraint=skylake</code> or <code>--constraint=broadwell</code>. See [[Running_jobs#Specifying_a_CPU_architecture|Specifying a CPU architecture]].


== Performance ==<!--T:17-->
== Performance == <!--T:17-->
Theoretical peak double precision performance of Cedar is 936 teraflops for CPUs, plus 2,744 for GPUs, yielding over 3.6 petaflops of theoretical peak double precision performance. 22 fully connected "islands" of 32 base or large nodes each have 1024 cores in a fully non-blocking topology (Omni-Path fabric), with each island designed to yield over 30 teraflops of double-precision performance (measured with high performance LINPACK). There is a 2:1 blocking factor between the 1024 core islands.  
Theoretical peak double precision performance of Cedar is 936 teraflops for CPUs, plus 2,744 for GPUs, yielding over 3.6 petaflops of theoretical peak double precision performance. 22 fully connected "islands" of 32 base or large nodes each have 1024 cores in a fully non-blocking topology (Omni-Path fabric), with each island designed to yield over 30 teraflops of double-precision performance (measured with high performance LINPACK). There is a 2:1 blocking factor between the 1024 core islands.  


rsnt_translations
53,464

edits

Navigation menu