AI and Machine Learning: Difference between revisions

Tips about data storage
(Marked this version for translation)
(Tips about data storage)
Line 53:
<!--T:10-->
Compute Canada provides a wide range of storage options to cover the needs of our very diverse users. These storage solutions range from high-speed temporary local storage to different kinds of long-term storage, so you can choose the storage medium that best corresponds to your needs and usage patterns. Please refer to our documentation on [[Storage and file management]].
Here are some tips:
* If your dataset is around 10 GB* or smaller, it can probably fit in memory, depending on how much memory your job has. In that case, load it into memory once and do not read from disk during your machine learning task.
* If your dataset is around 100 GB* or smaller, it can fit in the local storage of the compute node; please transfer it there at the beginning of the job. A temporary directory is available for each job at $SLURM_TMPDIR. An example is given in [[Tutoriel_Apprentissage_machine/en|our tutorial]]. A caveat of local node storage is that another job might be using all of it, leaving you no space (we are currently studying this problem).
* If your dataset is larger, you may have to leave it in shared storage. You can keep your datasets permanently in your project space. Scratch space can be faster, but it is not meant for permanent storage. Also, all shared storage (home, project, scratch) is designed for reading and writing large chunks of data at low frequency, that is, at intervals of about one second or more.
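
The tips above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the helper names <code>choose_storage</code> and <code>stage_to_local</code>, the <code>DATASET_PATH</code> variable and the default file name <code>dataset.tar</code> are hypothetical, and the 10 GB and 100 GB thresholds are only the rough figures quoted above. The real limits depend on the memory your job requests and on the free local disk of the node.

```python
import os
import shutil
from pathlib import Path

# Rough thresholds taken from the tips above (assumptions, not hard limits):
# the real values depend on the job's memory and the node's local disk.
IN_MEMORY_LIMIT_GB = 10
LOCAL_DISK_LIMIT_GB = 100


def choose_storage(size_gb: float) -> str:
    """Suggest where a dataset of the given size should live (hypothetical helper)."""
    if size_gb <= IN_MEMORY_LIMIT_GB:
        return "memory"   # load once, then keep it in RAM
    if size_gb <= LOCAL_DISK_LIMIT_GB:
        return "local"    # copy to $SLURM_TMPDIR at the start of the job
    return "shared"       # leave in project or scratch space


def stage_to_local(dataset: Path, tmpdir: Path) -> Path:
    """Copy a dataset file or directory into node-local storage; return the new path."""
    target = tmpdir / dataset.name
    if dataset.is_dir():
        shutil.copytree(dataset, target)
    else:
        shutil.copy2(dataset, target)
    return target


if __name__ == "__main__":
    # Hypothetical dataset location; replace it with your own path.
    dataset = Path(os.environ.get("DATASET_PATH", "dataset.tar"))
    tmpdir = os.environ.get("SLURM_TMPDIR")  # set by Slurm inside a job
    if tmpdir and dataset.exists():
        local_copy = stage_to_local(dataset, Path(tmpdir))
        print(f"Training should read from {local_copy}")
```

Copying a single archive rather than many small files is usually preferable, since one large sequential read matches the access pattern shared storage is designed for.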


=== Datasets containing lots of small files (e.g. image datasets) === <!--T:11-->