- 1 Introduction
- 2 Licensing
- 3 Documentation
- 4 Configuring your own license file
- 5 Cluster Batch Job Submission
- 6 Site Specific Usage
- 7 Additive Manufacturing
Compute Canada is a hosting provider for ANSYS. This means that we have ANSYS software installed on our clusters, but we do not provide a generic license accessible to everyone. However, many institutions, faculties, and departments already have licenses that can be used on our clusters. Once the legal aspects of licensing are worked out, some technical aspects remain: the license server on your end must be reachable from our compute nodes, which requires our technical team to get in touch with the people managing your license software. In some cases, this has already been done; you should then be able to load the ANSYS modules, and they should find the license automatically. If this is not the case, please contact our Technical support so that we can arrange this for you.
Available modules are: fluent/16.1, ansys/16.2.3, ansys/17.2, ansys/18.1, ansys/18.2, ansys/19.1, ansys/19.2, ansys/2019R2, ansys/2019R3.
The full ANSYS documentation (for the latest version) can be accessed by following these steps:
- connect to gra-vdi.computecanada.ca with TigerVNC as described in VDI Nodes
- open a terminal window and start Workbench:
  module load CcEnv StdEnv ansys
  runwb2
- in the upper pulldown menu click the sequence:
  Help -> ANSYS Workbench Help
- once the ANSYS Help page appears click:
Configuring your own license file
Our ANSYS module looks for license information in a few places, one of which is your home folder. If you have your own license server, write the information needed to reach it to a file named ansys.lic in the following format:
setenv("ANSYSLMD_LICENSE_FILE", "<port>@<hostname>")
setenv("ANSYSLI_SERVERS", "<port>@<hostname>")
Put this file in the folder $HOME/.licenses/. Before an ANSYS license server can be reached from any Compute Canada system, firewall configuration changes will likely be needed; please contact our Technical support to arrange this. To set up your ansys.lic file for the non-free CMC or free SHARCNET license server, use the settings in the following table:
In some situations you may also need to obtain an XML file from the institution that operates the license server, to ensure that ANSYS on the Compute Canada clusters gives priority to the right kind of license. For example, to choose a research license instead of a teaching license, a file with a name like license.preferences.xml would be placed in the directory $HOME/.ansys/v195/licensing/, assuming you are using the ansys/2019R3 module.
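The two-line ansys.lic format described above can also be written from the command line. The following is a minimal sketch rather than an official tool: the function name write_ansys_lic is hypothetical, and the arguments in the usage line are placeholders that must be replaced with the values for your own license server.

```bash
# Hypothetical helper: write an ansys.lic file in the two-line format shown above.
# Arguments: <target directory> <port@host for ANSYSLMD_LICENSE_FILE> <port@host for ANSYSLI_SERVERS>
write_ansys_lic() {
    local dir="$1" lmd_server="$2" li_server="$3"
    mkdir -p "$dir"
    # First line: the license manager address
    printf 'setenv("ANSYSLMD_LICENSE_FILE", "%s")\n' "$lmd_server" > "$dir/ansys.lic"
    # Second line: the licensing interconnect address
    printf 'setenv("ANSYSLI_SERVERS", "%s")\n' "$li_server" >> "$dir/ansys.lic"
}

# Example call (placeholders, replace with the values for your license server):
# write_ansys_lic "$HOME/.licenses" "<port>@<hostname>" "<port>@<hostname>"
```

Passing the two server strings separately keeps the helper usable when the two settings do not share the same port.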
Cluster Batch Job Submission
The ANSYS software suite comes with multiple implementations of MPI to support parallel computation. Unfortunately, none of them supports our Slurm scheduler. For this reason, we need special instructions for each ANSYS package on how to start a parallel job. In the sections below, we give examples of submission scripts for some of the packages. If one is not covered and you want us to investigate and help you start it, please contact our Technical support.
Typically you would use the following procedure for running Fluent on one of the Compute Canada clusters:
- Prepare your Fluent job using Fluent from the "ANSYS Workbench" on your Desktop machine up to the point where you would run the calculation.
- Export the "case" file (File > Export > Case...) or find the folder where Fluent saves your project's files. The "case" file will often have a name like FFF-1.cas.gz.
- If you already have data from a previous calculation that you want to continue, export a "data" file as well (File > Export > Data...) or find it in the same project folder (FFF-1.dat.gz).
- Transfer the "case" file (and, if needed, the "data" file) to a directory on the project or scratch filesystem on the cluster. Consider saving the file(s) under a more descriptive name than FFF-1.* when exporting, or rename them when uploading.
- Now create a "journal" file. Its purpose is to load the case file (and optionally the data file), run the solver and finally write the results. See the examples below and remember to adjust the filenames and the desired number of iterations.
- Adapt the Fluent jobscript below to your needs.
- After running the job you can download the "data" file and import it back into Fluent with File > Import > Data....
#!/bin/bash
#SBATCH --account=def-group    # specify some account
#SBATCH --time=00-06:00        # Time limit dd-hh:mm
#SBATCH --nodes=2              # Number of compute nodes
#SBATCH --cpus-per-task=32     # Number of cores per node
#SBATCH --ntasks-per-node=1    # Do not change
#SBATCH --mem=0                # All memory on full nodes
module load ansys/2019R3
slurm_hl2hl.py --format ANSYS-FLUENT > machinefile
NCORE=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
fluent 3d -t $NCORE -cnf=machinefile -mpi=intel -affinity=0 -g -i fluent_3.jou
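The core count passed to fluent with -t in the script above is the product of two Slurm variables. As a quick sanity check, the same arithmetic can be reproduced outside a job. In this sketch the function name total_cores is hypothetical, and the fallback values 2 and 32 are assumptions that mirror the #SBATCH settings above (2 nodes with 1 task each, 32 cores per task), since the real variables only exist inside a running job.

```bash
# Hypothetical helper: total cores = number of tasks x cores per task,
# the same expression used for NCORE in the job script.
total_cores() {
    echo $(( $1 * $2 ))
}

# Inside a job Slurm sets both variables; the defaults are stand-ins
# matching --nodes=2, --ntasks-per-node=1 and --cpus-per-task=32.
NCORE=$(total_cores "${SLURM_NTASKS:-2}" "${SLURM_CPUS_PER_TASK:-32}")
echo "fluent would be started with -t $NCORE"
```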
; EXAMPLE FLUENT JOURNAL FILE
; ===========================
; lines beginning with a semicolon are comments

; Read input file (FFF-in.cas):
/file/read-case FFF-in

; Run the solver for this many iterations:
/solve/iterate 1000

; Overwrite output files by default:
/file/confirm-overwrite n

; Write final output file (FFF-out.dat):
/file/write-data FFF-out

; Write simulation report to file (optional):
/report/summary y "My_Simulation_Report.txt"

; Exit fluent:
exit
; EXAMPLE FLUENT JOURNAL FILE
; ===========================
; lines beginning with a semicolon are comments

; Read compressed input files (FFF-in.cas.gz & FFF-in.dat.gz):
/file/read-case-data FFF-in.gz

; Write a compressed data file every 100 iterations:
/file/auto-save/data-frequency 100

; Retain data files from 5 most recent iterations:
/file/auto-save/retain-most-recent-files y

; Write data files to output sub-directory (appends iteration)
/file/auto-save/root-name output/FFF-out.gz

; Run the solver for this many iterations:
/solve/iterate 1000

; Write final compressed output files (FFF-out.cas.gz & FFF-out.dat.gz):
/file/write-case-data FFF-out.gz

; Write simulation report to file (optional):
/report/summary y "My_Simulation_Report.txt"

; Exit fluent:
exit
; EXAMPLE FLUENT JOURNAL FILE FOR TRANSIENT SIMULATION
; ====================================================
; lines beginning with a semicolon are comments

; Read only the input case file:
/file/read-case "FFF-transient-inp.gz"

; For continuation (restart) read in both case and data input files:
;/file/read-case-data "FFF-transient-inp.gz"

; Write a data (and maybe case) file every 100 time steps:
/file/auto-save/data-frequency 100
/file/auto-save/case-frequency if-case-is-modified

; Retain only the most recent 5 data (and maybe case) files:
; [saves disk space if only a recent continuation file is needed]
/file/auto-save/retain-most-recent-files y

; Write to output sub-directory (appends flowtime and timestep)
/file/auto-save/root-name output/FFF-transient-out-%10.6f.gz

; ##### Settings for transient simulation: #####

; Set the magnitude of the (physical) time step (delta-t)
/solve/set/time-step 0.0001

; Set the maximum number of iterations per time step:
/solve/set/max-iterations-per-time-step 20

; Set the number of iterations for which convergence monitors are reported:
/solve/set/reporting-interval 1

; ##### End of settings for transient simulation. #####

; Initialize using the hybrid initialization method:
/solve/initialize/hyb-initialization

; Perform unsteady iterations for a specified number of time steps:
/solve/dual-time-iterate 1000

; Write final case and data output files:
/file/write-case-data "FFF-transient-out.gz"

; Write simulation report to file (optional):
/report/summary y "Report_Transient_Simulation.txt"

; Exit fluent:
exit
Fluent Journal files can include basically any command from Fluent's Text-User-Interface (TUI); commands can be used to change simulation parameters like temperature, pressure and flow speed. With this you can run a series of simulations under different conditions with a single case file, by only changing the parameters in the Journal file. Refer to the Fluent User's Guide for more information and a list of all commands that can be used.
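One way to run such a series is to generate one journal file per parameter value from a single template. The sketch below is only an illustration under stated assumptions: the function name make_journals, the @VALUE@ token and the values 280, 300 and 320 are hypothetical, and the template uses only commands that appear in the journal examples above. The TUI command that actually sets your parameter has to be taken from the Fluent User's Guide and inserted where the placeholder comment is.

```bash
# Hypothetical helper: create one journal file per parameter value.
# Every occurrence of the token @VALUE@ in the template is replaced by the value
# (values must not contain "/" because sed uses it as the delimiter).
# Arguments: <template file> <output directory> <value> [<value> ...]
make_journals() {
    local template="$1" outdir="$2" value
    shift 2
    mkdir -p "$outdir"
    for value in "$@"; do
        sed "s/@VALUE@/${value}/g" "$template" > "$outdir/run_${value}.jou"
    done
}

# Demonstration in a scratch directory with a template built from the
# commands shown in the journal examples above.
workdir=$(mktemp -d)
cat > "$workdir/template.jou" <<'EOF'
/file/read-case FFF-in
; <insert the TUI command that sets your parameter to @VALUE@ here>
/solve/iterate 1000
/file/confirm-overwrite n
/file/write-data FFF-out-@VALUE@
exit
EOF

make_journals "$workdir/template.jou" "$workdir/journals" 280 300 320
ls "$workdir/journals"
```

Each generated file can then be passed to fluent with -i in its own job, and because the output name contains the value, the runs do not overwrite each other's results.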
#!/bin/bash
#SBATCH --account=def-group    # specify some account
#SBATCH --time=00-06:00        # Time limit dd-hh:mm
#SBATCH --nodes=2              # Number of compute nodes
#SBATCH --cpus-per-task=32     # Number of cores per node
#SBATCH --ntasks-per-node=1    # Do not change
#SBATCH --mem=0                # All memory on full nodes
module load ansys/2019R3
nodes=$(slurm_hl2hl.py --format ANSYS-CFX)
cfx5solve -def YOURFILE.def -start-method "Intel MPI Distributed Parallel" -par-dist $nodes <other options>
Note that you may see the following error in your output file: /etc/tmi.conf: No such file or directory. It does not seem to affect the computation.
Site Specific Usage
The Sharcnet ANSYS license supports a total of 25 running jobs (25 aa_r tasks) consuming up to 384 hpc cores (384 aa_r_hpc) with unlimited numerical problem size. The license can be used by any Compute Canada user on any Compute Canada system for the purpose of publishable academic research. Individual users are limited to running up to 2 jobs with 64 hpc cores. The license is made available on a first-come, first-served basis. Should there be a large number of users on a given day, be aware that some jobs may fail to start because insufficient tokens are available at runtime; such jobs will need to be resubmitted at a later time. If guaranteed (dedicated) token access is required for your research, open a ticket and request a quote for the quantity of tokens needed. Prices would be at cost plus applicable taxes.
On May 31, 2020 the Sharcnet license was upgraded from a CFD-only (Research CFD) license to an MCS (Multiphysics Campus Solution) license with the following ANSYS Academic Research products: HF, EM, Electronics HPC, Mechanical and CFD. Researchers wanting to learn ANSYS or to work with small ANSYS simulations (subject to numerical problem size limits) may use the 250 aa_t_a features by special arrangement. Please note that LS-DYNA and Lumerical are not included under the current Sharcnet license. Tokens for these products may be purchased for dedicated use by submitting a problem ticket directed to Sharcnet.
License Server File
To use the Sharcnet ANSYS license, configure your ansys.lic file as follows, unless you are running on a Sharcnet system such as graham or gra-vdi:
[gra-login1:~/.licenses] cat ansys.lic
setenv("ANSYSLMD_LICENSE_FILE", "email@example.com")
setenv("ANSYSLI_SERVERS", "firstname.lastname@example.org")
Query License Server
ssh graham.computecanada.ca
module load ansys
- Check the total number of ANSYS Academic Research licenses (tasks) in use by all users (maximum 25 jobs running):
lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -a | grep "Users of aa_r"
- Check the number of ANSYS hpc licenses in use by all users (maximum 640 total - 256 reserved = 384 hpc cores running):
lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -a | grep "Users of aa_r_hpc"
- Check the number of ANSYS Academic Research licenses (tasks) in use by your username (maximum 2 jobs running):
lmutil lmstat -c email@example.com -a | grep ", s" | grep -v licenses | grep $USER | wc -l
- Check where your ANSYS Academic Research licenses are being used if any:
lmutil lmstat -c firstname.lastname@example.org -a | grep ", s" | grep -v licenses | grep $USER
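The per-user count above can be wrapped in a small function so that it is easy to reuse, for example before submitting another job. This is a sketch under assumptions, not part of the ANSYS or Slurm tooling: the name count_user_licenses is hypothetical, and it differs from the commands above only in matching the username as a whole word (grep -w) so that similar usernames are not counted together.

```bash
# Hypothetical helper: count the license checkouts held by one user.
# Reads the output of "lmutil lmstat ... -a" on standard input and applies
# the same filters as the commands above.
# Argument: <username> (defaults to $USER)
count_user_licenses() {
    local user="${1:-$USER}"
    # grep -c prints 0 but returns a non-zero status when nothing matches,
    # so "|| true" keeps the function usable in scripts run with "set -e".
    grep ", s" | grep -v licenses | grep -c -w -- "$user" || true
}

# Example use on a system where the ansys module is loaded:
# lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -a | count_user_licenses
```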
If you discover any licenses unexpectedly in use, in particular because ANSYS did not exit cleanly on gra-vdi (possibly triggered by a network outage or a file system issue), reconnect to the same node [gra-vdi3 or gra-vdi4], open a terminal window and run the following command to terminate the rogue processes, after which your licenses should be freed:
pkill -e -u $USER -f "ansys"
- connect to gra-vdi.computecanada.ca with TigerVNC
- module load SnEnv ansys
- Press y then Enter to accept the two conditions.
- Press Enter to use the Sharcnet license.
Note that running cfx provides options to start the GUI for:
1) CFX-Launcher (cfx5 -> cfx5launcher)
2) CFX-Pre (cfx5pre)
3) CFD-Post (cfdpost -> cfx5post)
4) CFX-Solver (cfx5solve)
To get started, configure your ~/.licenses/ansys.lic file to point to a license server that has a valid ANSYS Mechanical License. This must be done on all systems where you plan to run the software.
To enable ANSYS Additive Manufacturing in your project do the following 3 steps:
- connect to gra-vdi.computecanada.ca with TigerVNC
- module load CcEnv StdEnv ansys/2019R3
- cd to the directory where your test.wbpj file is located
On a cluster:
- connect to a cluster compute node with TigerVNC
- module load ansys/2019R3
- cd to the directory where your test.wbpj file is located
- click Extensions -> Install Extension
- specify the following /path/to/AdditiveWizard.wbex then click Open: /cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ansys/2019R3/v195/aisol/WBAddins/MechanicalExtensions/AdditiveWizard.wbex
- click Extensions -> Manage Extensions and tick Additive Wizard then click Close
ANSYS Additive Manufacturing can be run in GUI mode on gra-vdi with up to 8 cores for 24 hours as follows:
On gra-vdi as described above in
- click File -> Open and select test.wbpj then click Open
- click View -> reset workspace if you get a grey screen
- start Mechanical, Clear Generated Data, tick Distributed, specify Cores
- click File -> Save Project -> Solve
- open another terminal and run:
top -u $USER
- kill rogue processes from previous runs if required:
pkill -e -u $USER -f "ansys"
To submit an Additive job to a cluster queue, you must first prepare your additive simulation to run on Compute Canada clusters. To do this, open and then save your simulation (on gra-vdi or, in a salloc session, on the cluster you are working on) to initialize the project's internal path configuration, as described above in the Enable Additive section. Next, create a Slurm script in the directory where your project file is located (similar to the one below) and submit it to the queue with:
sbatch script.txt
Be sure that the value of --ntasks in the Slurm script matches the Cores value last set in Mechanical, in particular if you are moving the project to a different cluster. To change the Cores value on a cluster without opening your simulation, follow the "Open Mechanical on login node" section found near the bottom of this page.
#!/bin/bash
#SBATCH --account=def-account
#SBATCH --time=00-06:00        # Time (DD-HH:MM)
#SBATCH --ntasks=8             # Number of cores
#SBATCH --mem-per-cpu=2G       # Memory per core
unset SLURM_GTIDS
rm -f test_files/.lock
module load ansys/2019R3
export KMP_AFFINITY=balanced
export I_MPI_HYDRA_BOOTSTRAP=ssh
export PATH=/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin:$PATH
runwb2 -B -F test.wbpj -E "Update();Save(Overwrite=True)"
For parametric studies, change Update() to UpdateAllDesignPoints() in the last line of your Slurm script. For initial performance testing, you can prevent the solution from being written by specifying Overwrite=False in the Slurm script, so that further runs can be conducted without reopening the simulation in Workbench (and Mechanical) to clear the solution and recreate the design points. Another option is to create a replay script once in Workbench to perform these tasks, then run it on the cluster between runs as follows. The replay file can be used in different directories by changing its internal FilePath setting accordingly.
module load ansys/2019R3
rm -f test_files/.lock
runwb2 -R myreplay.wbjn
Once your additive job has been running for a few minutes, a snapshot of its resource utilization on the compute node(s) can be obtained with the following srun command. Sample output corresponding to the above 8-core submission script is as follows; note that two nodes were selected by the scheduler:
[demo@gra-login1:~] srun --jobid=jobnumber top -bn1 -u $USER | grep R | grep -v top
  PID USER PR NI    VIRT    RES    SHR S  %CPU %MEM   TIME+ COMMAND
22843 demo 20  0 2272124 256048  72796 R  88.0  0.2 1:06.24 ansys.e
22849 demo 20  0 2272118 256024  72822 R  99.0  0.2 1:06.37 ansys.e
22838 demo 20  0 2272362 255086  76644 R  96.0  0.2 1:06.37 ansys.e
  PID USER PR NI    VIRT    RES    SHR S  %CPU %MEM   TIME+ COMMAND
 4310 demo 20  0 2740212 271096 101892 R 101.0  0.2 1:06.26 ansys.e
 4311 demo 20  0 2740416 284552  98084 R  98.0  0.2 1:06.55 ansys.e
 4304 demo 20  0 2729516 268824 100388 R 100.0  0.2 1:06.12 ansys.e
 4305 demo 20  0 2729436 263204 100932 R 100.0  0.2 1:06.88 ansys.e
 4306 demo 20  0 2734720 431532  95180 R 100.0  0.3 1:06.57 ansys.e
After a job completes, its elapsed time can be found in the "Job Wall-clock time" line of the output of seff jobid. You can use this value to perform scaling tests. If the wall-clock time decreases by ~50% when the number of cores is doubled (for example from "#SBATCH --ntasks=8" to "#SBATCH --ntasks=16"), further core doublings can be investigated. While jobs may run faster when the number of cores is increased, the wait time in the queue will also increase significantly unless the research group has a RAC award.
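The scaling test described above comes down to two numbers: the speedup between two runs and how close it is to the ideal factor given by the ratio of core counts. The following is a rough sketch under stated assumptions: the function names are hypothetical, the wall-clock times are assumed to be in HH:MM:SS form, and the timings in the example call are made up for illustration.

```bash
# Convert an HH:MM:SS wall-clock time to seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical helper: speedup and parallel efficiency between two runs.
# Arguments: <cores run 1> <HH:MM:SS run 1> <cores run 2> <HH:MM:SS run 2>
scaling_report() {
    local n1="$1" t1 n2="$3" t2
    t1=$(hms_to_seconds "$2")
    t2=$(hms_to_seconds "$4")
    awk -v n1="$n1" -v t1="$t1" -v n2="$n2" -v t2="$t2" 'BEGIN {
        speedup = t1 / t2                        # how much faster run 2 was
        efficiency = 100 * speedup / (n2 / n1)   # percent of the ideal speedup
        printf "speedup %.2f, efficiency %.0f%%\n", speedup, efficiency
    }'
}

# Example with made-up timings: 8 cores took 02:00:00, 16 cores took 01:10:00
scaling_report 8 02:00:00 16 01:10:00
```

For the made-up timings this prints "speedup 1.71, efficiency 86%". A perfect halving of the wall-clock time on doubling the cores would give a speedup of 2.00 and an efficiency of 100%.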
Open Mechanical on login node:
This procedure explains how to initialize your Mechanical environment on a cluster by opening the simulation on a cluster login node. If the simulation requires more than 8GB, which is the typical login node memory limit, then a cluster compute node will need to be used. When a simulation is moved to a different cluster, the project will need to be opened and saved again if the path and directory location have changed.
- Log in to a cluster login node with TigerVNC
- Open a terminal window in vncviewer and run:
  [demo@beluga3:~] module load ansys/2019R3
  [demo@beluga3:~] runwb2
  - start Mechanical by clicking Component Systems -> Mechanical Model -> Model
  - under Solve for My Computer enter Cores: 8
  - under Solve for My Computer tick Distributed
  - quit Mechanical by clicking File -> Close Mechanical
  - quit Workbench by clicking File -> Exit (do not save the current project)
Message Passing Interface