AI and Machine Learning: Difference between revisions

Line 3:
<translate>
<translate>
<!--T:1-->
To get the most out of our clusters for machine learning applications, special care must be taken. A cluster is a complicated beast that is very different from your local machine that you use for prototyping. Notably, a cluster uses a distributed filesystem, linking many storage devices seamlessly. Accessing a file on <code>/project</code> may <i>feel the same</i> as accessing one from the current node, but under the hood, these two IO operations have very different performance implications. In short, you need to [[#Managing_your_datasets|choose wisely where to put your data]].


<!--T:2-->
Line 19:


<!--T:4-->
Python is very popular in the field of machine learning. If you (plan to) use it on our clusters, please refer to [[Python|our documentation about Python]] to get important information about Python versions, virtual environments on login or on compute nodes, <code>multiprocessing</code>, Anaconda, Jupyter, etc.


=== Avoid Anaconda === <!--T:21-->
Line 65:
<!--T:13-->
* filesystem [[Storage and file management#Filesystem_quotas_and_policies|quotas]] on our clusters limit the number of filesystem objects;
* your software could be significantly slowed down by streaming many small files from <code>/project</code> (or <code>/scratch</code>) to a compute node.
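
One common way to address both points is to bundle a dataset made of many small files into a single archive, then extract that archive onto node-local storage at the start of a job. The following is a minimal sketch only, not a prescribed procedure: the helper names <code>pack_dataset</code> and <code>unpack_dataset</code> are hypothetical, the choice of an uncompressed tar archive is illustrative, and the use of the <code>SLURM_TMPDIR</code> environment variable as the node-local location is an assumption that may not match your cluster.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def pack_dataset(src_dir, archive_path):
    """Bundle everything under src_dir into one uncompressed tar archive.

    The archive counts as a single filesystem object, however many
    files the dataset contains.
    """
    src_dir = Path(src_dir)
    with tarfile.open(archive_path, "w") as tar:
        # arcname stores paths relative to the dataset's own directory name
        tar.add(src_dir, arcname=src_dir.name)
    return Path(archive_path)


def unpack_dataset(archive_path, dest_dir=None):
    """Extract the archive and return the path of the extracted dataset."""
    if dest_dir is None:
        # Assumption: node-local storage is exposed through SLURM_TMPDIR;
        # fall back to the system temporary directory otherwise.
        dest_dir = os.environ.get("SLURM_TMPDIR", tempfile.gettempdir())
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        top_level = tar.getnames()[0].split("/")[0]
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return dest_dir / top_level
```

In this sketch, <code>pack_dataset</code> would be run once where the data is stored, and <code>unpack_dataset</code> at the beginning of each job, so that training code reads from the returned local path instead of the distributed filesystem.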


<!--T:14-->