Using FTP & Aspera to access a download account
Why have I been provided a download account?
The dataset/s that you have requested are not yet available to download using the EGA Download streamer client, or you are having technical issues using the EGA Download streamer. As a result, the files have been added to a download account for you to access using FTP or Aspera. All files in your download account* are encrypted using either GnuPG (.gpg), Bcrypt (.bfe) or .cip and will need to be decrypted using an encryption key.
*PLEASE NOTE: The download account 'ega-box-233' contains public and open access files and, as a result, the files are not encrypted. You do NOT need to request encryption keys for files found within this download account.
What is the difference between FTP and Aspera?
FTP, or File Transfer Protocol, is a standard network protocol for exchanging files across the internet. Aspera is a faster alternative to FTP and provides greater user control enabling individual transfer rates and bandwidth sharing to be set.
How do I download using FTP?
All users may access their download accounts using FTP. See this comparison of FTP software for a list of FTP clients.
Please check with your systems department to ensure that your firewall is able to access the IP address: 220.127.116.11
The following examples represent a non-exhaustive list of freely downloadable clients:
UNIX users - command line ftp or GUI FileZilla
Windows/Mac - GUI FileZilla
How do I use an FTP client to access my FTP account?
Using ftp to access ega-box-xx
Using FileZilla to access ega-box-xx
Start FileZilla and click on 'Site Manager' under the 'File' menu.
Click 'New Site' in the resulting pop-up window and complete the details under the 'General' tab only, as shown below (MAC version 18.104.22.168):
Then click on ‘Connect’ to access your FTP account.
How do I download data using Aspera?
If downloading outside of your firewall, you may need to check that this activity does not conflict with the Data Access Agreement; if in doubt, please revert to using FTP.
Please note that Aspera does not currently support proxy authentication. If this proves to be an issue, please speak to your systems department to solve the problem.
Aspera is a commercial file transfer protocol that may provide faster transfer speeds than FTP, especially over longer distances.
Operating System: Windows XP / 2003 / Vista / 2008 / 7 / 8, Mac OS Intel 10.5 / 10.6 / 10.7 / 10.8 / Linux
The Aspera ascp command line client can be downloaded here. Please select *Aspera Connect*.
The ascp command line client is distributed as part of the aspera connect high-performance transfer browser plug-in and is free to use.
You don't have to register in order to download the Browser Plug-in and the download is free of charge.
The location of the 'ascp' program in the filesystem:
Mac: in a terminal, change to '/Applications/Aspera Connect.app/Contents/Resources/' (quote the path, because it contains a space). The command line utilities, including 'ascp', are in that directory.
Windows: the downloaded files are somewhat hidden. For instance, in Windows 7 ascp.exe is located in the user's home directory under AppData\Local\Programs\Aspera\Aspera Connect\bin\ascp.exe
Linux: ascp should be in your user's home directory, under /home/username/.aspera/connect/bin/, alongside the other command line utilities.
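The locations above can be checked with a short script. This is a minimal sketch, assuming the default install paths listed in this section; the function name ascp_path_for is made up for illustration, and the paths may differ between Aspera Connect versions. Windows is left out because the script assumes a Unix-style shell.

```shell
# Print the expected ascp location for an OS name (as reported by `uname -s`)
# and a home directory. The paths are the defaults listed above and are an
# assumption about your install, not a guarantee.
ascp_path_for() {
    os_name=$1
    home_dir=$2
    case "$os_name" in
        Darwin) printf '%s\n' "/Applications/Aspera Connect.app/Contents/Resources/ascp" ;;
        Linux)  printf '%s\n' "$home_dir/.aspera/connect/bin/ascp" ;;
        *)      echo "unsupported OS: $os_name" >&2; return 1 ;;
    esac
}

# Report whether ascp is present and executable at the expected location.
ascp_candidate=$(ascp_path_for "$(uname -s)" "$HOME") || ascp_candidate=""
if [ -n "$ascp_candidate" ] && [ -x "$ascp_candidate" ]; then
    echo "ascp found: $ascp_candidate"
else
    echo "ascp not found at the expected location; check your Aspera Connect install"
fi
```

If the script reports that ascp was not found, the plug-in may be installed elsewhere, so searching the filesystem for a file named ascp is a reasonable next step.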
Using ascp from the command line
Use the command line below to copy files or directories (SRC) to your favoured destination (DST):
./ascp -P33001 -O33001 -QT -L- -l 1000M [email protected]:<SRC> <DST>
e.g. To download all data from ega-box-01 to your current directory:
./ascp -P33001 -O33001 -QT -L- -l 1000M [email protected]:. .
(Ensure that the ascp.exe file, located in Aspera\Aspera Connect\bin, is in the directory from which the command line is run)
Options for ascp:
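- -P Sets the TCP port used to start the transfer session (33001 in the commands above)
- -O Sets the UDP port used for the data transfer (33001 in the commands above)
- -Q Enables the fair transfer policy, which ensures that the available bandwidth is shared amongst other traffic and transfers at a fair rate
- -T Disables encryption for maximum throughput
- -L Creates an error log
- -k2 Resumes a transfer that stopped before completion instead of starting again from scratch
- -l Maximum bandwidth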
Check the command line transfer usage for more configuration details
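The command line above can be assembled by a small helper that prints the command without running it, so it can be checked before any data moves. This is only a sketch: the function name build_ascp_download is invented for illustration, and the host aspera.example.org is a placeholder, because the real server address is not shown in this guide. Use the address supplied with your download account.

```shell
# Print (do not run) the ascp download command described above.
# remote_host is a placeholder and must be replaced with the server
# address supplied with your download account.
build_ascp_download() {
    account=$1      # download account name, e.g. ega-box-01
    remote_host=$2  # hypothetical value in this sketch
    src=$3          # remote file or directory (SRC)
    dst=$4          # local destination (DST)
    printf './ascp -P33001 -O33001 -QT -L- -l 1000M %s@%s:%s %s\n' \
        "$account" "$remote_host" "$src" "$dst"
}

# Everything in the account, into the current directory:
build_ascp_download ega-box-01 aspera.example.org . .
```

Printing first and running second makes it easy to spot a wrong account name or destination before a long transfer starts.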
How do I obtain an encryption key to decrypt files?
Keys can be obtained by contacting [email protected] and are provided to access all files that you have been permitted to access.
You are NOT required to request encryption keys for the download account 'ega-box-233' as the files within this download account are NOT encrypted.
Keys are delivered by one of three methods:
1) Post (FREE) - Please provide a postal address.
2) Courier - Please provide an address and courier account number and we will arrange a courier on your behalf.
3) Using a secondary email address.
There are several secure options for transferring files to and from Biowulf and Helix. Detailed setup & usage instructions for each method are below. No matter how you transfer data in and out of the systems, be aware that PII and PHI data cannot be stored or transferred into the NIH HPC systems.
Globus is a service that makes it easy to move, sync, and share large amounts of data. It is the recommended way to transfer data to and from the HPC systems.
Globus will manage file transfers, monitor performance, retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus uses GridFTP for more reliable and high-performance file transfer, and will queue file transfers to be performed asynchronously in the background.
Setting up a Globus account, transferring and sharing data
Interactive Data Transfers should be performed on helix.nih.gov, the designated system for interactive data transfers and large-scale file manipulation. (An interactive session on a Biowulf compute node is also appropriate). Such processes should not be run on the Biowulf login node. For example, tarring and gzipping a large directory, or rsyncing data to another server, are examples of such interactive data transfer tasks.
The HPC System Directories, which include /home, /data, and /scratch, can be mounted to your local workstation if you are on the NIH network or VPN, allowing you to easily drag and drop files between the two places. Note that this is most suitable for transferring small files. Users transferring large amounts of data to and from the HPC systems should continue to use scp/sftp/globus.
Mounting your HPC directories to your local system is particularly useful for viewing HTML reports generated in the course of your analyses on the HPC systems. For these cases, you should be able to navigate to the desired HTML file and open it in your local system's web browser.
Directions for Locally mounting HPC System Directories
Directions for transferring data between your NIH Box or NIH OneDrive and HPC systems can be found on our box/onedrive page.
Download from winscp.net and install it. Administrator privilege may be needed.
To open WinSCP, click on the search icon at the bottom left corner on your desktop. Type 'winscp', double click on it to open.
Select 'sFTP', fill the host name as helix.nih.gov, your NIH login username and password, then click 'Login'.
Click 'Yes'. This window only appears the first time you use WinSCP.
Click 'continue' in the authentication bar.
The left panel shows the directories on your desktop PC and the right panel shows your directories on Biowulf.
Click on the 'Preference' icon and browse through the tabs to get an idea of all the options available.
To locate the file source and destination, simply use the two drop down boxes. Drag and drop files or folders to start transfer.
Fugu is a graphical frontend to the commandline Secure File Transfer application (SFTP). SFTP is similar to FTP, but unlike FTP, the entire session is encrypted, meaning no passwords are sent in cleartext form, and is thus much less vulnerable to third-party interception. Fugu allows you to take advantage of SFTP's security without having to sacrifice the ease of use found in a GUI. Fugu also includes support for SCP file transfers, and the ability to create secure tunnels via SSH.
Download Fugu from the U. Mich. Fugu website. For OSX 10.5 and above, download from cnet.com.
Double-click on the downloaded Fugu_xxxx.dmg file to open it. A small window with the Fugu icon will appear.
Grab the fish and copy it to your Applications folder, your Desktop and/or your Dock.
Start Fugu by clicking on the Fugu icon. In the box for 'Connect to:', enter 'helix.nih.gov' and click 'Connect'. Enter your NIH Login password when requested. You should now see a window with one pane listing files on your local desktop machine, and the other pane listing files in your Biowulf/Helix account space.
Both psftp and pscp are run through the Windows console (Command Prompt in start menu), and require the directory to the PuTTY executables be included in the Path environment variable. This can be done transiently through the console:
or permanently through the System Control Panel (see here for more information).
Secure Copy (pscp) is a command line mechanism for copying files to and from remote systems.
From the console, type 'pscp'. This will bring up a help menu showing all the options for pscp.
To copy a file from the local Windows machine to a user's home directory on Helix, type
You will be prompted for your NIH login password, then the file will be copied.
To do the reverse, i.e. copy a remote file from helix to the local Windows machine, type
(you must include a '.' to retain the same filename, or explicitly give a name for the remotefile copy).
Secure FTP (psftp) allows for interactive file transfers between machines in the same way as good old FTP (non-secure) did.
From the console, type 'psftp'. This will start an SFTP session, but it will complain that no connection has been made. To transfer a local file to helix, at the psftp prompt type:
You will again be prompted for a password.
Once a session to helix has been established, the standard FTP commands can be used.
For even more information, see https://www.chiark.greenend.org.uk/~sgtatham/putty/
scp is a secure, encrypted way to transfer files between machines. It is available on Macs and Unix/Linux machines. Transfers should not be performed on the Biowulf login node, as they will be subject to automatic termination if they use more than a little CPU, memory or walltime. Instead, use Helix for interactive data transfers. Since Helix and Biowulf share the same /home and /data areas, any files you transfer to Helix will also be available on Biowulf in the same path.
To transfer a file from your local machine to the HPC systems (Helix/Biowulf), any of the following will work:
- On your desktop, 'push' the file to Helix.
- On Helix, 'pull' the file from your desktop.
- On a Biowulf interactive session on a compute node (e.g. cn2323), pull the file via Helix.
- On Helix, 'push' the file to your desktop.
- On a Biowulf interactive session on a compute node (e.g. cn2323), push the file via Helix.
All of the above methods will avoid use of the Biowulf login node.
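As an illustration of the desktop side of these transfers, the sketch below prints scp commands for copying a file to Helix and for fetching one back. The function names, the username alice, the file name, and the /data/alice path are all made up for the example; only the host name helix.nih.gov comes from this page. The commands are printed rather than executed.

```shell
# Print scp commands for the two directions, as run from your desktop.
# Usernames, file names, and directory paths are illustrative only.
helix_host="helix.nih.gov"

scp_push_cmd() {   # arguments: local file, username, remote directory
    printf 'scp %s %s@%s:%s\n' "$1" "$2" "$helix_host" "$3"
}

scp_pull_cmd() {   # arguments: username, remote file, local directory
    printf 'scp %s@%s:%s %s\n' "$1" "$helix_host" "$2" "$3"
}

scp_push_cmd results.tar.gz alice /data/alice/
scp_pull_cmd alice /data/alice/results.tar.gz .
```

Both commands go through Helix, which keeps the transfer off the Biowulf login node.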
If your Helix account is locked due to inactivity, you can unlock it yourself at the Dashboard.
bbcp is a high-performance version of scp which can provide significantly increased file transfer performance over scp. Biowulf staff have observed over 220 MB/s over 10G links. bbcp is a peer-to-peer application. You invoke bbcp on the source machine and in response a bbcp process is started on the target machine.
Before you can use the bbcp utility it must be installed on both the local and remote systems. bbcp is already available on Helix (but not on Biowulf). To download a pre-compiled bbcp program for your Unix, Linux or MacOS X (x86_darwin_100) go here.
NOTE for Mac users: When the bbcp file is downloaded from the x86_darwin_100 folder, the name may need to be changed from bbcp.txt to bbcp, and the permissions may need to be changed to allow execution. This can be done using the Terminal application like this:
Also, to make the bbcp command 'universal' on your Mac, you can move it to the /usr/local/bin directory on your desktop:
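A minimal sketch of the two Mac steps above, rehearsed on a stand-in file in a scratch directory so that it is safe to run anywhere. On a real Mac you would run the mv and chmod lines in the folder that holds the downloaded bbcp.txt. The permission value 755 is an assumption; any mode that grants execute permission would do.

```shell
# Rehearse the rename and permission change on a stand-in file.
workdir=$(mktemp -d)
printf 'placeholder for the downloaded binary\n' > "$workdir/bbcp.txt"

mv "$workdir/bbcp.txt" "$workdir/bbcp"   # drop the .txt suffix
chmod 755 "$workdir/bbcp"                # allow execution

if [ -x "$workdir/bbcp" ]; then
    echo "bbcp is executable"
fi

# To make the command available from any directory (needs administrator rights):
#   sudo mv bbcp /usr/local/bin/
```

The final move is left as a comment because it changes a system directory and will prompt for a password.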
The syntax is identical to scp, but there are some differences in its operation, some of which are described here:
- By default bbcp will not overwrite an existing file. Use the -f switch to force a file to be overwritten.
- By default bbcp does not report data transfer rates. Use the -v switch to see rates.
- bbcp uses a non-standard network port, so if you are initiating a copy from outside of the NIHnet firewall, you should use the -z switch (see the bbcp web site for an explanation).
Other useful switches are -h to get help and -r for recursive copies. There are many other features which are documented here.
Example: to download a file from Helix to your desktop machine, use the following command on Helix:
As with scp, bbcp will prompt you for your password before transferring the file; in this case the password of your account on your desktop system.
Example: to transfer a file from your Mac on your home network (and not on the NIH VPN) to Helix, open the Terminal app on your Mac and type
This will overwrite myfile on Helix if it already exists.
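The switches described above can be combined as in the following sketch, which prints a bbcp upload command instead of running it. The function name bbcp_upload_cmd, the username alice, and the destination path are hypothetical. Only the -v, -z and -f switches documented on this page are used.

```shell
# Print a bbcp upload command built from the switches described above.
bbcp_upload_cmd() {
    outside_firewall=$1   # "yes" adds -z (copy started outside the NIH firewall)
    overwrite=$2          # "yes" adds -f (replace an existing file)
    src=$3
    dest=$4
    flags="-v"            # always report transfer rates
    if [ "$outside_firewall" = "yes" ]; then flags="$flags -z"; fi
    if [ "$overwrite" = "yes" ]; then flags="$flags -f"; fi
    printf 'bbcp %s %s %s\n' "$flags" "$src" "$dest"
}

# From a home network, replacing any existing copy of myfile on Helix:
bbcp_upload_cmd yes yes myfile alice@helix.nih.gov:/data/alice/
```

Leaving out -f is the safer default, since bbcp then refuses to overwrite an existing file.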
You may want to automatically transfer your generated results back to your local system at the end of a Biowulf batch job.
Command-line transfer as part of a batch job
Biowulf batch jobs run on the Biowulf compute nodes which are on a private network. Therefore you cannot directly scp from a Biowulf compute node to your local system. The recommended way to automatically transfer files at the end of a batch job is a Globus command line transfer.
First you should get familiar with the Globus command-line interface.
Then add something like the following at the end of your Biowulf batch job: The output from the last line of this batch script, which will appear in the usual slurm-#####.out output file, will be a Globus task id of the form
- s3cmd to transfer to/from AWS S3.
- To/from Amazon AWS: The Amazon AWS Command Line Interface (CLI) allows you to transfer data via the command line. The utility is already installed on the HPC systems and can be accessed with 'module load aws'. Helix, the interactive data transfer system, is the best place to use this. Sample session:
- The Google Cloud SDK allows you to transfer data using the command line. The SDK is already installed on the HPC systems and can be accessed with 'module load google-cloud-sdk'. Helix, the interactive data transfer system, is the best place to use this. Sample session: Note: the -m flag is for multithreading/multi-processing. The number of threads/processes is set by the flags parallel_thread_count and parallel_process_count in your boto config file. You can find the appropriate config file by typing Recommended values are:
Some sources of biological data have specialized tools for file transfer.
NCBI makes a large amount of data available through the NCBI ftp site, and also provides most or all of the same data on their Aspera server. Aspera is a commercial package that has considerably faster download speeds than ftp. More details in the NCBI Aspera Transfer Guide.
Note that SRA or dbGaP downloads are better done via the SRAtoolkit.
You can use the Aspera command-line client (ascp) on Helix to download data from NCBI directly into your Biowulf/Helix account space. Aspera transfers can put a heavy I/O load on the Biowulf login node, and will not work from the Biowulf compute nodes, so please perform all Aspera transfers on Helix, the interactive file transfer system.
You do not need to load any modules. The 'ascp' command is available on Helix by default. If desired, you can set an alias for ascp that includes the key, e.g
Sample session (user input in bold):
If your download stops before completion, you can use the -k2 flag to resume transfers without re-downloading all the data. With this flag, the client skips over the files that had previously been transferred and downloads only the remaining files.
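A resumable NCBI download might be assembled as in the sketch below, which prints the command rather than running it. Several parts are assumptions: the function name build_ncbi_ascp, the key file location, the remote account anonftp@ftp.ncbi.nlm.nih.gov, and the directory paths are illustrative and should be checked against the NCBI Aspera Transfer Guide. The -i switch tells ascp which private key file to use.

```shell
# Print (do not run) a resumable ascp download from NCBI.
build_ncbi_ascp() {
    key_file=$1   # private SSH key obtained from NCBI (path is hypothetical)
    remote=$2     # user@host:path on the NCBI Aspera server (assumed form)
    dst=$3        # destination directory in your Helix/Biowulf space
    printf 'ascp -i %s -QT -l500M -k2 %s %s\n' "$key_file" "$remote" "$dst"
}

build_ncbi_ascp /home/alice/ncbi_aspera.key \
    anonftp@ftp.ncbi.nlm.nih.gov:/some/remote/dir/ /data/alice/
```

Running the same command again after an interruption is what lets -k2 pick up where the previous attempt stopped.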
Typical file transfer rates from the NCBI server are 400 - 500 Mb/s, so '-l500M' is the recommended value.
Data transfer with the Aspera browser plugin will be slower than using the command-line client on Helix, but may be more convenient for smaller transfers. You will need to download the free Aspera client browser plugin, install it on your desktop browser, and download the data to a Helix/Biowulf data area that is mapped onto your desktop system.
- Download the Aspera Connect browser plugin from the Aspera website and install on your Mac, Windows, or Linux system.
- Map your Helix /data or /scratch area on your desktop system as described in the section above on Mapped Network Drive.
- Start up Aspera Connect on your Mac, Windows or Linux system. Go to Preferences->Network, and set the connection speed to the maximum value. In our tests, the actual typical download speed to a desktop system is 50 - 100 Mb/s.
- Point your browser to the NCBI Aspera server and select the directory or files you want to download. Select your Helix data or scratch areas as the download target area. You can monitor the download in the Aspera transfer manager window.
By clicking on the icon in the transfer manager window, you can open the Transfer Monitor which will show a more detailed graph of the transfer rate
On Helix or Biowulf, use ftp ftp.ncbi.nlm.nih.gov to access the NCBI ftp site. Sample session (user input in bold):
You do not need to load any modules. The 'ascp' command is available on Helix by default. However you need to get the private SSH key file from NCBI. Sample session (user input in bold):
By design, the Biowulf cluster is not connected to the internet. However, files can be transferred to and from the cluster using a Squid proxy server. Click on the link below for more details on how to use the proxy server.
A proxy server has been set up so that the compute nodes can download data from hosts on the internet. The proxy server will handle a limited set of protocols: http, https, rsync, ftp. Any other program that uses one of the following environment variables will also work. This includes programs such as wget, curl, lftp, rsync, and git.
- wget example:
- curl example:
- lftp example:
- rsync example:
Note: rsync from the compute nodes with the SSH protocol is not supported. Only the rsync protocol is supported (notice the double colon in the command above). Therefore, the following will not work:
- git clone example:
Note that git requests to 'git://some/URL' will not work. Due to the protocol limitations on the proxy server, the URL has to be 'http://some/URL' or 'https://some/URL'.
- NCBI applications such as SRA-toolkit, NCBI-ngs, ngs-bam, ncbi-vdb, Entrez Direct and related applications such as hisat have been configured to automatically download data from NCBI as necessary. See the application page for details.
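The two protocol restrictions noted above (rsync only over the rsync protocol, git only over http or https) can be checked before a job is submitted. This is a sketch under stated assumptions: the function names are invented, the host names are examples, and rewriting git:// to https:// only helps when the remote also serves the repository over https.

```shell
# Succeeds when an rsync source uses the rsync protocol
# (host::module or rsync://host/module), which the proxy can carry.
uses_rsync_protocol() {
    case "$1" in
        rsync://*) return 0 ;;
        *::*)      return 0 ;;
        *)         return 1 ;;
    esac
}

# Rewrite git:// URLs to https://; other URLs pass through unchanged.
# Assumes the remote serves the same repository over https.
proxy_safe_git_url() {
    case "$1" in
        git://*) printf 'https://%s\n' "${1#git://}" ;;
        *)       printf '%s\n' "$1" ;;
    esac
}

for src in "rsync.example.org::pub/data" "alice@host.example.org:/data"; do
    if uses_rsync_protocol "$src"; then
        echo "$src: can go through the proxy"
    else
        echo "$src: uses SSH, not supported from compute nodes"
    fi
done
proxy_safe_git_url git://example.org/repo.git
```

The check is purely textual, so it catches a single-colon rsync source or a git:// URL but cannot confirm that the remote host is reachable.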
The rate of data transfer is only an issue for data amounts greater than 256MB. For amounts less than this, any application will suffice. To optimize transfer rates for large amounts of data, use less demanding encryption ciphers, such as blowfish or arcfour, and try to transfer the data when the network is less busy (before 10 am and after 6 pm). Also use the most appropriate application based on the table below.
The HPC Staff has compared the applications and our results are below. We recommend Globus for most transfers. scp is the default and best option for Linux/Unix machines.
|Platform||Application||Advantages||Disadvantages|
|All platforms||Globus||Best for very large files (> 256MB). Clients for all platforms, web-based. Notifications sent on completion.||The client must first be installed on the desktop.|
|Windows||WinSCP||Much faster transfer rates than PuTTY-pscp/psftp.||Cumbersome user interface for changing local and remote directories.|
|Windows||pscp/psftp||Direct command line control over process.||Need to run through the command prompt, slowest transfer rates seen.|
|Windows||Mapped Network Drive||Convenient.||Fairly slow transfer rates, especially for very large files.|
|Macs||bbcp, scp, sftp||Can be used for scripting & automatic file transfers, fastest transfer rates.||Non-GUI interface.|
|Macs||Fugu||Easy to configure and use.||Slower than command-line.|
|Macs||Mapped Network Drive||Convenient drag-and-drop.||Fairly slow transfer rates, especially for large files.|
|Linux/Unix||scp, sftp||Same as for Macs.||Same as for Macs.|
|Linux/Unix||bbcp||Fastest transfer rate.|| |