Code and job migration from legacy systems

Introduction

The new national clusters are not identical to any of the legacy clusters, so users will have to modify their job submission scripts and possibly their code. This page outlines the differences and provides links to more detailed documentation.

Quite a lot of the documentation is still under development by the Research Support National Team, especially since we do not yet have the actual machines available for experimentation and testing. However, most of the basics are in good shape and updates are continuing. Full documentation should be ready when the new clusters are made available.

Checklist

  1. Have a quick read of the Compute Canada Getting Started guide.
  2. Take the opportunity to clean up your existing storage (see General directives for migration).
  3. Copy your source to the home directory on the new clusters.
  4. Copy input data to the project space on the new clusters.
  5. Review software suite requirements, and modify your jobs/scripts/applications as necessary.
  6. Re-compile as necessary for the new architectures.
  7. Change job scripts to reflect the new scheduler.
  8. Run your test suites.

For details see the sections below.

Login differences

The new national clusters use a central authentication system based on Compute Canada identifiers. The regional authentication systems are almost identical, but some users may have a different username or password; see User Accounts and Groups for details.
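
For example, once your Compute Canada account is active you can log in with SSH. The hostnames below are illustrative; confirm the actual addresses in the Getting Started guide.

  # Use your Compute Canada (CCDB) username, which may differ from your legacy regional one.
  ssh your_cc_username@cedar.computecanada.ca
  ssh your_cc_username@graham.computecanada.ca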

Getting started on the new clusters

The Compute Canada Getting Started guide provides a short and straightforward overview of the new clusters. All users (even experienced ones!) are encouraged to have a quick look. You will find access and login instructions there, and links to more detailed pages.

Data migration

See General directives for migration for an overview of data transfer techniques.
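
As a minimal sketch, a recursive rsync from a legacy cluster to project space on a new cluster might look like the following; the username, hostname, group name and paths are placeholders, and Globus is usually the better choice for large transfers.

  # -a preserves permissions and timestamps, -v is verbose, -P shows progress and allows resuming.
  rsync -avP ~/results/ your_cc_username@cedar.computecanada.ca:/project/your-group/results/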

Each new cluster has three storage spaces available for computational purposes, plus a backup space.

  • High performance scratch space: as with most legacy clusters, you should run your jobs from this space. Note that it is not allocated and is subject to purging, so make sure important data and results are copied to the project space (see below).
    • Many legacy clusters applied their purge policies rather informally, so you may have become lax about moving data off scratch. The new clusters will enforce purging strictly!
  • Persistent project space: this space is mounted on the compute nodes, so your jobs do have access to it; the project space is allocated and backed up to tape.
  • Small, persistent, backed-up home space.

So our usual recommendation, illustrated by the sketch after this list, is:

  • Keep source and scripts on the home space.
  • Keep input data on the project space.
  • Run jobs from the scratch space.
  • Move important results and output data back to the project space.
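
A typical workflow following this layout might look like the sketch below; the /project and /scratch paths and the file names are placeholders, so check the actual directory layout on the new clusters.

  cd /scratch/your_cc_username/my_run          # run the job from scratch space
  cp /project/your-group/inputs/config.dat .   # stage input data from project space
  sbatch my_job.sh                             # submit the job (see Job submission below)
  # after the job completes, copy important results back to project space
  cp results.dat /project/your-group/results/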

For details see the National Data Cyberinfrastructure.

Standard software

The new clusters (Cedar and Graham) will use a distributed filesystem (CVMFS) to provide the same software on both clusters; see the current list of installed open-source software.

Those of you who use standard software packages should have a look at the list. If you would like a package installed, please email support@computecanada.ca.

Generally we have installed the most up-to-date version of each package. Newer versions may behave differently from those on the legacy clusters, which can affect your job scripts and configuration, so if you relied on an older version, check the package documentation and run your test suite to confirm compatibility. If necessary, Compute Canada can install older versions (support@computecanada.ca).
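
For example, you can check which versions of a package are installed before committing to one; the package name and the versions that appear are illustrative only.

  module avail gcc      # list the installed versions of a package
  module spider gcc     # show more detail, including how to load each version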

Access with modules

Software packages and libraries are made available through an updated version of the module system, similar to the one used on most legacy clusters. The precise commands needed to load packages or link libraries may differ; see Using Modules for details.
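
A minimal example follows; the module names and versions are placeholders, so substitute the ones reported by module avail.

  module purge                          # start from a clean environment
  module load gcc/5.4.0 openmpi/2.0.2   # versions shown are examples only
  module list                           # confirm what is currently loaded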

Compilation

If you compiled your software yourself, you may need to re-compile or re-install it on the new clusters, which can be a time-consuming process. Note, however, that most dependencies can be installed as modules by our staff; please feel free to contact support@computecanada.ca if you need something installed.

You may also need to link against libraries from various software packages. As noted above, use modules for your setup and be aware that the specifics may have changed; see the Programming Guide for language and compiler details, parallelization, debugging, building, etc.
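
As a rough sketch, re-compiling a simple MPI code might look like this; the module names, versions and file names are assumptions, not the definitive toolchain.

  module load gcc/5.4.0 openmpi/2.0.2   # load a compiler and MPI stack (examples only)
  mpicc -O2 -o my_app my_app.c          # rebuild against the cluster's libraries
  # packages built with autotools are usually safest to re-configure from scratch:
  # ./configure --prefix=$HOME/my_app && make && make install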

Job submission

The new clusters use the open-source Slurm job scheduler. You will almost certainly have to modify your existing job scripts from the legacy clusters, most of which used the Torque/Moab suite; see Running jobs for details on Slurm jobs.
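
As a rough illustration of the change, a simple serial Torque/Moab script might translate into Slurm as follows; the account name, time limit, memory and module names are placeholders, so see Running jobs for the options that apply to you.

  #!/bin/bash
  #SBATCH --account=def-yourpi    # allocation to charge; placeholder name
  #SBATCH --time=0-01:00          # walltime D-HH:MM (was: #PBS -l walltime=1:00:00)
  #SBATCH --mem=4000M             # memory per node (was: #PBS -l mem=4000mb)
  #SBATCH --ntasks=1              # one serial task (was: #PBS -l nodes=1:ppn=1)

  module load gcc/5.4.0           # module names and versions are examples only
  ./my_app input.dat              # submit this script with: sbatch my_job.sh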