Transferring data: Difference between revisions

Line 24: Line 24:


<!--T:35-->
Note: If you want to transfer files between another of our clusters and Niagara, use the SSH agent forwarding flag <code>-A</code> when logging into the other cluster. For example, to copy files to Niagara from Cedar, use:


<!--T:36-->
Line 60: Line 60:
| File size is different || A quick test. If the file size has changed then its contents must have changed, and it will be re-transferred.
|-
| Modification time is newer || This checks the file's recorded modification time and transfers the file only if it is newer on the source than on the destination. If you want to depend on this, it is important to check the "preserve source file modification times" option when initiating a Globus transfer.
|}


Line 68: Line 68:
<br clear="all"/>
===Rsync=== <!--T:12-->
[https://en.wikipedia.org/wiki/Rsync Rsync] is a popular tool for ensuring that two separate datasets are the same, but it can be quite slow if there are a lot of files or a lot of latency between the two sites, for example when they are geographically far apart or on different networks. Running <code>rsync</code> will check the modification time and size of each file, and will only transfer the file if one or the other does not match. If you expect modification times not to match on the two systems, you can use the <code>-c</code> option, which will compute checksums at the source and destination, and transfer only if the checksums do not match.
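
The decision described above can be modelled with a minimal Python sketch. This is an illustration of the logic only, not how <code>rsync</code> is implemented: the function names <code>file_digest</code> and <code>needs_transfer</code> are hypothetical, SHA-256 is an assumption chosen for convenience (<code>rsync</code> uses its own checksum algorithms), and modification times are compared at whole-second resolution as a simplification.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(source, destination, use_checksum=False):
    """Decide whether `source` should be copied over `destination`.

    Default mode mimics the quick check: transfer if size or
    modification time differ. With use_checksum=True (loosely modelling
    the -c option), modification times are ignored and files of equal
    size are compared by checksum instead.
    """
    if not os.path.exists(destination):
        return True  # nothing at the destination yet
    src, dst = os.stat(source), os.stat(destination)
    if src.st_size != dst.st_size:
        return True  # a size mismatch always means the content differs
    if use_checksum:
        return file_digest(source) != file_digest(destination)
    # Assumption: compare modification times at whole-second resolution.
    return int(src.st_mtime) != int(dst.st_mtime)
```

In this model, two files with identical content but different modification times are re-transferred by the default check and skipped when checksums are used, at the cost of reading both files in full.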


<!--T:26-->
Line 104: Line 104:


<!--T:18-->
It is possible that the <code>find</code> command will crawl through the directories in a different order, resulting in a lot of false differences, so you may need to run <code>sort</code> on both files before running <code>diff</code>, such as:


<!--T:19-->
Line 116: Line 116:


<!--T:22-->
For example, you can connect to a remote machine at <code>ADDRESS</code> as user <code>USERNAME</code> with SFTP to transfer files like so:


<!--T:23-->
Line 157: Line 157:


<!--T:29-->
SCP supports the option <code>-r</code> to recursively transfer a set of directories and files. We '''recommend against using <code>scp -r</code>''' to transfer data into <code>/project</code> because the setgid bit is turned off in the created directories, which may lead to <code>Disk quota exceeded</code> or similar errors if files are later created there (see [[Frequently_Asked_Questions#Disk_quota_exceeded_error_on_.2Fproject_filesystems | Disk quota exceeded error on /project filesystems]]).
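
If a directory tree has already been copied this way, one way to see which directories lack the setgid bit is a short Python sketch like the one below. It is only an illustration under stated assumptions: the helper names <code>has_setgid</code> and <code>directories_missing_setgid</code> are hypothetical, and the path you pass in is whatever directory you want to inspect. It reports directories and does not change any permissions.

```python
import os
import stat


def has_setgid(mode):
    """Return True if the setgid bit is set in a st_mode value."""
    return bool(mode & stat.S_ISGID)


def directories_missing_setgid(root):
    """List every directory at or below `root` without the setgid bit.

    os.walk does not follow symbolic links to directories by default,
    so only real directories inside the tree are inspected.
    """
    missing = []
    for current, _subdirs, _files in os.walk(root):
        if not has_setgid(os.stat(current).st_mode):
            missing.append(current)
    return missing
```

An empty result means every directory in the tree still carries the setgid bit. For any directories that are listed, the linked FAQ entry describes how to address the resulting quota errors.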


<!--T:33-->