Globus - high speed data transfer

Globus is a software tool to transfer files across the web in a reliable, high-performance and secure way. It provides fault-tolerant, fire-and-forget data transfer using simple web (or command line) interfaces. It is appropriate for transferring very large files and datasets.

We have a Globus Connect endpoint (currently) named umnphys#data which lets you access Physics data storage.

How to use Globus

First, create a Globus account

You need a free Globus account:

  • Point your browser at http://www.globus.org and click Sign Up.
  • On the Create an Account page, fill in the information (your name, email address, username, password, etc.) and read the terms, then click Register.
  • You will receive an email with a link which you need to follow to confirm your new Globus account.

Transferring files between Physics and other endpoints

Before you can connect to our endpoint with Globus, we first need to record your “certificate subject” in our database to link it to your Physics account. You can obtain this information from the CILogon web site. CILogon is the service which Globus uses to connect with the university authentication infrastructure.

  • Use your web browser to visit https://cilogon.org
  • Choose “University of Minnesota” as your Identity Provider, and click the “Log on” button, which takes you to the standard UMN login page.
  • The page will then display information, including the certificate subject, which looks something like: /DC=org/DC=cilogon/C=US/O=University of Minnesota/CN=Your Name A12345
  • Copy this into the “Edit Physics directory information” page at MyPhys, in the “Globus CILogon certificate” field. Make sure you copy the entire subject line, starting from “/DC=org…”. Our system should then update our endpoint account mapping within about 30 minutes.

Note that this particular process is specific to our system in Physics; other sites may handle it differently. For the MSI endpoint, you should send your certificate details and MSI account name to help@msi.umn.edu.
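A common mistake is pasting only part of the subject line. The following is a minimal Python sketch for checking a copied subject before entering it in MyPhys. The function name parse_certificate_subject is our own invention for illustration, and it assumes that no component value itself contains a “/” character; it is not part of Globus, CILogon or MyPhys.

```python
def parse_certificate_subject(subject: str) -> list[tuple[str, str]]:
    """Split a slash-delimited certificate subject into (key, value) pairs.

    Hypothetical helper: assumes no value contains a '/' character.
    """
    subject = subject.strip()
    # The full subject line starts with "/DC=org"; anything else was
    # probably truncated when copying.
    if not subject.startswith("/DC=org"):
        raise ValueError("subject should start with '/DC=org'")
    pairs = []
    # Skip the empty string before the leading slash.
    for part in subject.split("/")[1:]:
        key, sep, value = part.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed component: {part!r}")
        pairs.append((key, value))
    return pairs


example = "/DC=org/DC=cilogon/C=US/O=University of Minnesota/CN=Your Name A12345"
fields = parse_certificate_subject(example)
for key, value in fields:
    print(f"{key} = {value}")
```

If the function raises ValueError, go back to the CILogon page and copy the whole line again.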

To transfer files between the Physics cluster and other endpoints such as MSI:

  • Point your browser at http://www.globus.org, click Sign In, and click Transfer Files. A web page with two Endpoint fields will display.
  • In one Endpoint field, pull down the expand menu and select our endpoint, umnphys#data.
    • Remember you have to register your certificate (as described above) before you can connect to the Physics endpoint.
    • You can also type into the field to filter the endpoints.
    • On selecting the endpoint, you may get redirected to the standard University of Minnesota web login (unless you already have a session active). Log in, after which you are returned to the Globus file transfer page.
    • By default it will select your Physics home directory, though this shouldn't be used for large bulk transfers!
    • You then need to update the “Path” field to point at the desired Physics data directory (for example, /data/gammaray). The path /data itself is not a writable location.
    • The contents of that directory will then display.
  • In the other Endpoint field, pull down the expand menu and select the appropriate site - for example, msihpc#panfs to connect to the main MSI storage.

To transfer files between endpoints, select a file or directory from each list, then click one of the large arrow buttons to tell Globus the desired direction of the transfer.

You will receive an automatic email from Globus Notification (notify@globus.org) when the file transfer has completed. To see the status and history of your file transfers, select View Transfers from the Go To pull-down menu.

Globus Connect software for personal machines

If you want to transfer files to or from your local desktop (or laptop) machine, you need to download and install the Globus Connect software. This is a one-time step, but it must be done on each computer whose files you want to transfer using Globus.

After that, the procedure is similar to that described above. If you are running Globus Connect, the first name in the list of available endpoints is your local computer, though you can select any site where you have an account.

Additional notes

Registering your certificate for the Physics endpoint subscribes you to a mailing list globus-users@physics.umn.edu, where we'll send any information on changes to the system (the list is also open for subscribers to post discussion items). As usual for automatic physics lists, if you don't want to see these messages you can set it to “nomail” in MyPhys.

The transfer speed will depend greatly on the nature of the data set. For example, the number of parallel streams initiated depends on the size of the files being transferred: 2 streams for files smaller than 50MB, 4 streams for files between 50MB and 250MB, and 8 streams for files larger than 250MB.
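The stream-count rule can be written out as a small Python sketch, for example to estimate how a mixed data set will be handled. The function name parallel_streams is hypothetical, and the sketch assumes decimal megabytes (1MB = 1,000,000 bytes); Globus applies the rule internally, so this only mirrors the thresholds quoted above.

```python
MB = 1_000_000  # assumption: decimal megabytes


def parallel_streams(file_size_bytes: int) -> int:
    """Return the number of parallel streams for a file of the given size."""
    if file_size_bytes < 50 * MB:
        return 2  # files smaller than 50MB
    if file_size_bytes <= 250 * MB:
        return 4  # files between 50MB and 250MB
    return 8      # files larger than 250MB


for size in (10 * MB, 100 * MB, 2000 * MB):
    print(size // MB, "MB ->", parallel_streams(size), "streams")
```

A data set made of many small files therefore gets fewer streams per file than one made of a few large files, which is one reason why bundling small files into larger archives before a transfer can help.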

To use UDT instead of TCP as the transfer protocol, put 'useudtplease' in the 'Label this Transfer' box. This may be useful for very distant transfers over a network path with high latency, and therefore a large bandwidth-delay product (BDP).
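To judge whether a path has a large BDP, multiply its bandwidth by its round-trip time. The Python sketch below does this arithmetic; the function name and the example figures (a 1 Gbit/s link with a 100 ms round trip) are illustrative assumptions, not measurements of our network.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Return the bandwidth-delay product in bytes.

    This is the amount of data that must be "in flight" to keep the
    path full: bandwidth (bits/s) times round-trip time (s), over 8.
    """
    return bandwidth_bits_per_s * rtt_s / 8


# Hypothetical example: 1 Gbit/s link, 100 ms round-trip time.
bdp = bandwidth_delay_product_bytes(1e9, 0.100)
print(f"BDP is roughly {bdp / 1e6:.1f} MB")
```

For these example figures the BDP is about 12.5 MB. A TCP connection whose window is much smaller than the BDP cannot fill the path, which is the situation in which trying UDT may be worthwhile.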

