Code and job migration from legacy systems
Introduction
New national clusters are not identical to any of the legacy clusters, so users will have to modify their job submission scripts, and possibly their code. This page outlines the differences and provides links to more detailed documentation.
Quite a lot of the documentation is still under development by the Research Support National Team, especially as we don't yet have the actual machines available for experimentation and testing. However, most of the basics are in good shape, and updates are continuing. Full documentation should be ready when the new clusters are made available.
Useful links
- WestGrid Migration Overview: some WestGrid clusters were the first to be defunded in 2017, so this home page for migration has lots of information and links, with frequent updates;
- Recommendations for migration to new systems: overall discussion of the new clusters and recommendations for migration;
- Compute Canada Getting started guide;
- User Accounts and Groups: a few users will be affected by the new central Compute Canada authentication system;
- Cedar: tech specs;
- Graham: tech specs;
- National Data Cyberinfrastructure: project space, scratch space, home space and backup storage;
- General directives for migration: instructions for copying data to the new clusters;
- List of installed open-source software (growing rapidly);
- Programming Guide: languages, compilers, parallelization, debugging, building;
- Visualization: resources, ParaView, VisIT, seminars and events.
Checklist
- Have a quick read of the Compute Canada Getting started guide.
- Take the opportunity to clean up your existing storage (see General directives for migration).
- Copy your source to the home directory on the new clusters.
- Copy input data to the project space on the new clusters.
- Review software suite requirements, and modify your jobs/scripts/applications as necessary.
- Re-compile as necessary for the new architectures.
- Change job scripts to reflect the new scheduler.
- Run your test suites.
For details see the sections below.
Login differences
The new national clusters use a central authentication system based on Compute Canada identifiers. The regional authentication systems are almost identical, but some users may have a different username/password; see User Accounts and Groups for details.
Getting started on the new clusters
The Compute Canada Getting started guide provides a short and straightforward overview of the new clusters. All users (even experienced ones!) are encouraged to have a quick look. You will find access and login instructions there, and links to more detailed pages.
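For illustration, logging in is an ordinary SSH session using your Compute Canada credentials. The hostnames below follow the expected pattern for the new clusters, but confirm the actual login addresses in the Getting started guide:

    # Replace 'username' with your Compute Canada username; confirm the
    # login hostnames in the Getting started guide before relying on them.
    ssh -Y username@cedar.computecanada.ca     # Cedar
    ssh -Y username@graham.computecanada.ca    # Graham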
Data migration
See General directives for migration for an overview of data transfer techniques.
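For modest amounts of data, a plain rsync (or scp) from a legacy cluster or your workstation is a reasonable sketch of the transfer. The hostnames and paths below are placeholders; the actual location of your project space is described on the National Data Cyberinfrastructure page.

    # Placeholders only: adjust the hostname, username and project path.
    rsync -avP ~/my_code/    username@cedar.computecanada.ca:my_code/          # source to home
    rsync -avP ~/input_data/ username@cedar.computecanada.ca:/project/my-allocation/input_data/   # input to project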
Each new cluster has three storage spaces available for computational purposes, plus a backup space.
- High performance scratch space: as on most legacy clusters, you should run your jobs from this space. Please note that it is unallocated and purged, so you should ensure that important data and results are copied to the project space (see below).
- Many legacy clusters had rather informal purge approaches, so you may have become lax in your own processes. The new clusters will have strict procedures!
- Persistent project space: this space is mounted on the compute nodes so your jobs do have access to it; the project space is allocated and backed up to tape.
- Small, persistent, backed-up home space.
So our usual recommendation is:
- Keep source and scripts on the home space.
- Keep input data on the project space.
- Run jobs from the scratch space.
- Move important results and output data back to the project space.
For details see the National Data Cyberinfrastructure.
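To make this concrete, a minimal sketch of the recommended workflow is shown below; the directory names are placeholders, and the actual home, project and scratch paths are given on the National Data Cyberinfrastructure page.

    # Placeholders only: adjust the project and scratch paths to your allocation.
    cd /scratch/$USER/my_run                      # run jobs from scratch space
    cp /project/my-allocation/input.dat .         # stage input data from project space
    ./my_app input.dat > results.out              # output lands on scratch
    cp results.out /project/my-allocation/        # keep important results on project space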
Standard software
The new clusters (Cedar and Graham) will use a distributed filesystem, the CernVM File System (CVMFS), to provide the same software on both clusters; see the current list of installed open-source software.
Those of you who use standard software packages should have a look at the list. If you would like a package installed please email support@computecanada.ca.
Generally, we have installed the most up-to-date version of each package. There can sometimes be inconsistencies between versions that may affect your job scripts and configuration. If you use older versions, you should check the package documentation for details and run your test suite to ensure compatibility. If necessary, Compute Canada can install older versions (support@computecanada.ca).
Access with modules
Software packages and libraries use an updated version of the module configuration system similar to that used by most legacy clusters. There will be differences in the precise calls necessary to link libraries or use packages; see Using Modules for details.
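As an illustration, a typical session might look like the sketch below; the package names and versions are placeholders, so check the list of installed software for what is actually available.

    # Illustrative only: package names and versions are placeholders.
    module avail                                      # list the software visible to you
    module spider fftw                                # search for a package (if the module system is Lmod)
    module load gcc/5.4.0 openmpi/2.0.2 fftw/3.3.6    # load a compiler, MPI and a library
    module list                                       # show what is currently loaded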
Compilation
If you compiled or installed your software yourself, you may need to re-compile or re-install it. This can be a time-consuming process. Note, however, that most dependencies can be installed as modules by our staff. Please feel free to contact support@computecanada.ca if you need something to be installed.
You may need to link libraries from various software packages. As noted above you should use Modules for your setup, and be aware that the specifics may have changed; see the Programming Guide for language/compiler details, parallelization, debugging, building etc.
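As a rough sketch, rebuilding an MPI code against modules might look like the following; the module names, versions and link flags are placeholders for whatever your code actually needs.

    # Sketch only: module names, versions and flags are placeholders.
    module load gcc/5.4.0 openmpi/2.0.2 fftw/3.3.6
    mpicc -O2 -o my_app my_app.c -lfftw3          # MPI C code linking against FFTW
    # For a typical autotools package you might instead run:
    #   ./configure --prefix=$HOME/software && make && make install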
Job submission
The new clusters use the open-source Slurm job scheduler. You will almost certainly have to modify your existing job scripts from the legacy clusters, most of which used the Torque/MOAB suite; see Running jobs for details on Slurm jobs.
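For illustration, a minimal Slurm script and its submission might look like the sketch below; the account name, resource requests and module versions are placeholders, and the Running jobs page documents the real options.

    #!/bin/bash
    # Minimal Slurm sketch; account, resources and modules are placeholders.
    #SBATCH --account=def-someuser       # your allocation account
    #SBATCH --time=0-01:00               # walltime (D-HH:MM)
    #SBATCH --ntasks=4                   # number of MPI tasks
    #SBATCH --mem-per-cpu=2G             # memory per CPU
    module load gcc/5.4.0 openmpi/2.0.2
    srun ./my_app input.dat

Submit and monitor with, for example, sbatch my_job.sh (in place of qsub) and squeue -u $USER (in place of qstat or showq).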