Transferring data: Difference between revisions

<br clear="all"/>
===Rsync=== <!--T:12-->
[https://en.wikipedia.org/wiki/Rsync Rsync] is a popular tool for ensuring that two separate datasets are the same. It can be quite slow, however, if there are many files or if there is high latency between the two sites, for example because they are geographically far apart or on different networks. By default, <code>rsync</code> checks the modification time and size of each file and transfers the file only if one or the other does not match. If you expect modification times not to match on the two systems, you can use the <code>-c</code> option, which computes checksums at the source and destination and transfers a file only if the checksums do not match. Because computing checksums requires reading every file in full on both sides, <code>-c</code> is slower than the default check.
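
The difference between the two comparison strategies can be illustrated with a small shell sketch. This is not how <code>rsync</code> is implemented; the function names <code>quick_check_differs</code> and <code>content_differs</code> are hypothetical, and <code>cmp</code> stands in for the checksum comparison that <code>-c</code> performs. The sketch builds two files of identical size and modification time but different content, which is the case the default check cannot detect.

```shell
#!/bin/sh
# Illustrative sketch only: mimics the two comparison strategies,
# it is not rsync's actual code.

# Default-style "quick check": report a difference when the destination is
# missing, or when the size or the modification time differs.
quick_check_differs() {
    src=$1; dst=$2
    [ -e "$dst" ] || return 0
    # Unquoted on purpose so any padding printed by wc is dropped.
    [ $(wc -c < "$src") -eq $(wc -c < "$dst") ] || return 0
    [ "$src" -nt "$dst" ] && return 0   # source is newer
    [ "$src" -ot "$dst" ] && return 0   # source is older
    return 1
}

# Content comparison, standing in for the checksum comparison done with -c.
content_differs() {
    [ -e "$2" ] || return 0
    ! cmp -s "$1" "$2"
}

demo_dir=$(mktemp -d)
printf 'AAAA' > "$demo_dir/src.dat"
printf 'BBBB' > "$demo_dir/dst.dat"                 # same size, different content
touch -r "$demo_dir/src.dat" "$demo_dir/dst.dat"    # copy the source mtime onto dst

if quick_check_differs "$demo_dir/src.dat" "$demo_dir/dst.dat"; then
    quick=transfer
else
    quick=skip
fi
if content_differs "$demo_dir/src.dat" "$demo_dir/dst.dat"; then
    full=transfer
else
    full=skip
fi

echo "quick check: $quick, content check: $full"
# prints: quick check: skip, content check: transfer
rm -rf "$demo_dir"
```

In practice this situation is rare, which is why the faster default check is usually sufficient and <code>-c</code> is reserved for cases where modification times cannot be trusted.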


<!--T:26-->
When transferring files into the <code>/project</code> file systems with <code>rsync</code>, do not use the <code>-p</code> and <code>-g</code> flags (equivalently <code>--perms</code> and <code>--group</code>), nor the frequently used <code>-a</code> (<code>--archive</code>) flag on its own, since it implies both. These flags may cause the group ownership of the transferred files to be set incorrectly. Because quotas in <code>/project</code> are enforced based on group ownership, this in turn may lead to the <code>Disk quota exceeded</code> error message. The best advice is to use Globus instead of <code>rsync</code> for such transfers.

If you are using <code>-a</code> when transferring files into the <code>/project</code> file systems, you can add <code>--no-g --no-p</code> to your options, like so:
 rsync -avz --no-g --no-p FOLDER graham.computecanada.ca:projects/def-professor/
Alternatively, avoid <code>-a</code> altogether; '''<code>rsync -rltv ...</code>''' is a good replacement for <code>rsync -av ...</code> when syncing files into the <code>/project</code> file systems:
 rsync -rltv FOLDER graham.computecanada.ca:projects/def-professor/
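
Since quotas in <code>/project</code> depend on group ownership, it can be useful to check after a transfer whether any files ended up with an unexpected group. The following is a minimal sketch using the standard <code>find</code> test <code>-group</code>; the function name <code>list_wrong_group</code> is hypothetical, and the path and group name in the usage comment are assumptions chosen to match the example above, so substitute your own.

```shell
#!/bin/sh
# Sketch: list every entry under a directory whose group is NOT the expected one.
# Both arguments are placeholders supplied by the caller.
list_wrong_group() {
    dir=$1; expected_group=$2
    find "$dir" ! -group "$expected_group" -print
}

# Hypothetical usage on the cluster (path and group name are assumptions):
#   list_wrong_group ~/projects/def-professor/FOLDER def-professor
```

An empty result means every file and directory under the given path belongs to the expected group.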


===Using checksums to check if files match=== <!--T:13-->