High Performance & Scientific Computing

Data Transfer



Introduction

The ISAAC Secure Enclave offers two types of systems to support sensitive research and data: (i) Windows virtual machines (VMs) and (ii) the HPSC cluster. This section explains how to transfer data between the Secure Enclave and any other machine authorized by the Office of Research and OIT Security.

Data transfer on the Secure Enclave is performed with Globus, a data transfer service that uses the GridFTP protocol, an extension of FTP (File Transfer Protocol). GridFTP is defined by Global Grid Forum Recommendation GFD.020 and the following IETF standards: RFC 959 (FTP), RFC 2228 (FTP security extensions), RFC 2389 (feature negotiation), RFC 3659 (extensions to FTP), and RFC 4217 (FTP over TLS). Key features of GridFTP are:

  • Performance – GridFTP supports parallel transfer streams and multi-node transfers to achieve high performance.
  • Checkpointing – GridFTP requires the server to send restart markers, so an interrupted transfer can resume from the last checkpoint instead of starting over. This improves performance over marginal network connections.
  • Third-party transfers – The FTP protocol on which Globus is based separates the control and data channels. This enables third-party transfers, in which data moves between two endpoints under the control of a third host.
  • Security – GridFTP provides strong security on both the control and data channels. The control channel is encrypted by default. The data channel is authenticated by default, with optional integrity protection and encryption, which the Secure Enclave requires and our Globus subscription enforces.

The data transfer command and control mechanism uses the Globus cloud-based services to communicate with Globus endpoints (also called collections). These services provide authentication and access control, retrieve file listings, manage files (delete, rename, etc.), and transfer files between endpoints. Traditional file transfer tools such as SFTP and SCP are not approved or documented for transferring files and should not be used on the Secure Enclave. All data transfer operations should be performed on the Secure Enclave Data Transfer Nodes (DTNs). There are two DTNs: one for the Secure Enclave VMs and one for the HPSC cluster. The Globus endpoint names for these DTNs are given below:

  • For Windows VM: UTK Secure Enclave
  • For HPSC cluster: SIP ENCLAVE STORAGE

Network requirements for data transfer

Computers that exchange data with the Secure Enclave must be on the University of Tennessee, Knoxville (UTK) network. If your computer is outside the UTK network, either connect to the UTK network with the Pulse Secure VPN before accessing the Secure Enclave, or submit a service request and obtain approval from the relevant authorities to transfer data to or from a computer or server outside the University network.

Globus is a web-based data transfer application that moves data between the source and destination computers, also called endpoints. Users therefore do not need to connect to the VPN as long as the systems exchanging data with the Secure Enclave are inside the UTK network. For example, users do not need to connect to the Pulse Secure VPN to transfer data between the ISAAC Open Enclave HPSC cluster or a Secure Enclave VM and the Secure Enclave HPSC cluster, because all of these systems are already in the UTK network.

However, if a user who is not connected to the VPN starts a Globus transfer between a computer outside the UTK network and one of the Secure Enclaves, Globus creates the corresponding file or folder and the transfer appears to succeed. The file, however, is empty and has a size of 0 kB. This happens because the command protocol can reach the Globus cloud freely, but connections to the file transfer ports of the Globus endpoint are allowed only from approved external endpoints and from University IP addresses, including the VPN.
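One way to spot this failure after a transfer is to look for zero-byte files in the destination directory. The sketch below is a minimal illustration, not part of the documented procedure: the function name list_empty_files and the sample file names are hypothetical, and a temporary directory stands in for the real destination.

```shell
# Print every zero-byte regular file under the given path.
# list_empty_files is a hypothetical helper name, not a Globus or ISAAC command.
list_empty_files() {
  find "$1" -type f -size 0
}

# Stand-in for the directory that received the transfer (assumption).
dest=$(mktemp -d)
: > "$dest/failed_transfer.dat"               # empty file, as left by a blocked transfer
printf 'data\n' > "$dest/good_transfer.dat"   # file that arrived with content

empty_files=$(list_empty_files "$dest")
if [ -n "$empty_files" ]; then
  echo "Zero-byte files found; check the VPN connection and retry the transfer:"
  echo "$empty_files"
fi
```

Keep in mind that a file can also be empty on purpose, so treat the list as a prompt to compare against the source rather than as proof of a failed transfer.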

To learn how to set up and configure the VPN on your device, please review OIT’s VPN User Guide.

Important Note: Please restart the Globus application on your computer after connecting to the University’s network through the VPN.

Data Transfer using Globus

As discussed above, there are two separate endpoints for transferring data to and from the Secure Enclave Windows VMs and the HPSC cluster. The step-by-step process for each of these systems is given below:

Windows Virtual Machines

The process of transferring data to or from a Windows VM can be divided into two parts:

Part 1:

The first part involves authenticating on the data transfer node (sie-dtn) and granting access to the directory through which data is transferred to and from the Secure Enclave VM. The steps to grant access to this data directory are given below:

  • Open a web browser and log in to Citrix. Enter your University credentials and choose the appropriate domain. Press Enter or click the Log on button.
Figure 3.1: Initial login interface of Citrix
  • After you click the Log on button, the window below appears. Authenticate with a Duo push or request a passcode.
Figure 3.2: Two-step verification window to log in to the Secure Enclave
  • Once logged in, click the APPS menu and open Putty Secure Enclave.
Figure 3.3: The different applications in the Citrix environment
  • Through Putty, log in to the Secure Enclave DTN using the hostname sie-dtn. Click Open and enter the requested credentials.
Figure 3.4: SSH to Secure Enclave DTN
  • After successfully logging in to the Secure Enclave DTN, type the kinit command and press Enter. This command authorizes access to the directory in the Windows VM through which files are transferred. In the Windows VM, this path is D:\Globus (double-click the Data (D) drive on the VM’s Desktop). Note that this directory in the VM is mapped to the /VMname.tennessee.edu/out directory in the Globus interface.
Figure 3.5: Authorize the transfer of data from Windows VM

Part 2:

After authorizing data transfer on the DTN, the next step is to log in to the Globus web-based interface using your organizational credentials. The important steps are given below:

  • Navigate to the Globus website and click the Login button in the top right corner of the page. Select “University of Tennessee” from the drop-down menu under the organizational login and click Continue.
Figure 3.6: Organizational login window of Globus web interface
  • You will be redirected to a familiar authentication page requesting your University NetID and password. Enter your credentials and click LOGIN.
Figure 3.7: Central Authentication Service page to authenticate the credentials
  • Select one of the options in the window below for the two-step verification of your University credentials.
Figure 3.8: Two-step verification step
  • After successful authentication, you will be taken to the Globus File Manager page. There are three small panel options in the top right corner of the page. Click the middle (two-panel) option to set the File Manager to a two-panel view, because we will be exchanging files between two computers.

Figure 3.9: Globus File manager
  • Before initiating any transfer, an endpoint must be configured on your machine, or on any other machine that will exchange data with the Secure Enclave. The ISAAC Open Enclave Data Transfer page provides a step-by-step guide to configuring an endpoint on any machine. Please note that the Secure Enclave VMs already have a configured endpoint named UTK Secure Enclave.
  • In the collection box in the left pane, search for the endpoint name of your machine, which you chose when you installed Globus and created the endpoint as mentioned above. In the right pane, search for the UTK Secure Enclave endpoint. Note that for security reasons you will be asked to authenticate. You should then see both endpoints and the corresponding folders and files in each pane, as shown in Figure 3.10.
  • After successful authentication, you will see a set of folders under the UTK Secure Enclave endpoint. Each folder corresponds to a separate Windows VM, and the VMs are owned by different users, as shown in Figure 3.10. Each VM name ends with “utk.tennessee.edu”.
Figure 3.10: Globus web interface after logging into the machines.
  • Locate the folder with your VM name and double-click it to reach the out directory. The contents of this directory are mapped to those of D:\Globus in the Windows VM. You can verify this by logging in to your Windows VM and opening D:\Globus.
Figure 3.11: Selection of a file from the left pane (local endpoint) to transfer to right pane (VM endpoint)
  • To transfer data from any authorized machine (your local machine or a server) to your Windows VM, select the file or folder on your local machine or server (left endpoint) and click the blue “Start” button, which is circled in the figure below.
Figure 3.12: Process of transferring data from local machine endpoint to Secure Enclave Windows VM.
  • After you click the “Start” button, a green banner appears in the top right corner of the page. Click “View Details” to see an overview of the submitted transfer request.
Figure 3.13: An overview of the submitted request to transfer data using Globus.

Important Note

Please note that the files in the D:\Globus folder on Windows VM should not be kept there permanently. After exchanging the files, please move/remove them from D:\Globus folder.

HPSC cluster

As discussed in File Systems, data on the Secure Enclave HPSC cluster is stored in both encrypted and unencrypted storage spaces. The data transfer process for each of these storage spaces is discussed in turn.

Transferring Data in Unencrypted Space

Data in the unencrypted storage space can be transferred to any authorized local machine or server by following the steps below:

  • Log in to Globus with your University credentials as explained in Part 2 of the data transfer from Windows VMs. Follow the steps until you see Figure 3.9.
  • To transfer data between your local machine and the HPSC cluster, search for the endpoint of your local machine in the left pane of the Globus File Manager (make sure that Globus is running on your machine). In the right pane, search for the SIP ENCLAVE STORAGE endpoint. Note that you may need to authenticate with your University credentials. The resulting interface should look like Figure 3.14.

Figure 3.14: Globus File Manager panels after connecting to different endpoints
  • The right pane of Figure 3.14 shows three directories. Two of them, lustre and nics, are unencrypted storage spaces of the Secure Enclave cluster. The third, projects, is encrypted and is discussed in the next section.
  • To transfer data to or from your unencrypted Lustre storage space, change to a directory in which you have permission to modify files. Usually this is /lustre/sip/proj/<project_account>/UTnetId.
  • To transfer data from your local machine to the HPSC cluster, select the file or folder on your local machine (in the Globus endpoint) and click the circled blue “Start” button or the “Transfer or Sync to..” button, as shown in Figure 3.15.
Figure 3.15: File transfer using Globus web interface
  • After you click the “Start” button, a green banner appears in the top right corner, as shown in Figure 3.13, where you can view the details of the submitted transfer request.

Transferring Data in Encrypted Space

To transfer data stored in the encrypted storage space, a few additional steps are needed to decrypt the data before using Globus to start the transfer. These steps are outlined below:

  • Repeat the steps shown in Figure 3.1 through Figure 3.3 and open the Putty Secure Enclave application.
  • Log in to the Secure Enclave HPSC cluster by entering the hostname sip-login1.tennessee.edu or sip-login2.tennessee.edu in the Host Name field and clicking Open.
Figure 3.16: Login to Secure Enclave cluster using Putty application in Citrix
  • Enter your University NetID and password. Authenticate with the Duo two-step authentication process (push or passcode) as described in Part 1 of the data transfer for Windows Virtual Machines.
Figure 3.17: Welcome screen of Secure Enclave HPSC cluster
  • Execute the sipmount command on the Secure Enclave login node as shown below:
 $ sudo sipmount  <project_account>
  • Replace the <project_account> argument with your project identifier, such as UTK-9999. You can determine the name of the projects to which you belong in the User Portal. More information is available in the Navigating the User Portal document.
  • Enter your University credentials followed by a Duo push to authenticate the mounting of encrypted storage space.
  • Verify that the encrypted space is mounted successfully by using the “df -h” command. You should see one extra line for the encrypted space at the end of the output of “df -h”, as shown below.
 $ df -h
 Filesystem                                 Size  Used Avail Use%  Mounted on
 ----                                       ---  ---   ---  ---   /
 --------                                   ---  ---   ---  ---   /dev
 ...
 encfs                                      1.2P  124T  1.1P  11%  /projects/<project_account>
  • Return to the Globus File Manager and navigate to the /projects/<project_account> directory. Its contents should be visible. If not, wait approximately five minutes, then refresh the directory.
Figure 3.18: Globus two-pane File Manager window to transfer data between two endpoints.
  • Transfer data between the computers as described in Figure 3.15.

After you complete your data transfers, you may unmount the encrypted space on the Secure Enclave cluster. Use the sipumount command to unmount this space as shown below:

 $ sudo sipumount  <project_account>

Its syntax and usage are the same as those of the sipmount command. If you do not unmount the encrypted space, it will be unmounted automatically after 15 minutes. For more information, please refer to the File Systems document.
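Because the encrypted space is unmounted automatically, it can help to confirm from the command line that the project directory is currently a mount point before starting a transfer. The following is a minimal sketch that wraps the df check described above. The helper name is_mounted is hypothetical, and UTK-9999 is only the example project identifier used earlier, so substitute your own.

```shell
# Succeed only if the given path is itself a mount point.
# is_mounted is a hypothetical helper, not a command provided on the cluster.
is_mounted() {
  # In df -P (POSIX format) output, the last field of the data line is the mount point.
  mount_point=$(df -P "$1" 2>/dev/null | awk 'NR==2 {print $NF}')
  [ "$mount_point" = "$1" ]
}

project_dir="/projects/UTK-9999"   # example identifier; replace with your project account

if is_mounted "$project_dir"; then
  echo "$project_dir is mounted; it should be visible in Globus."
else
  echo "$project_dir is not mounted; run: sudo sipmount UTK-9999"
fi
```

Pass the path without a trailing slash, since the comparison is an exact string match against the mount point reported by df.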

Transferring Data to External Globus Endpoints

Data transfer between the Secure Enclave and external non-UT Globus endpoints is allowed only after the external endpoint has been authorized. If you have an external Globus endpoint that you would like to use for transfers to or from the Secure Enclave, please submit a service request to the OIT Help Desk for the HPSC service with the request details (hostname, IP address, external Globus administrator contact, and external organization security contact).

Preparing and Organizing Data for Transfer

Globus can automatically manage transfers of collections of files within subdirectories. To get familiar with this capability, you may want to ask Globus to transfer an entire directory as a test, and then use it to transfer large data collections containing many files on the Secure Enclave file systems.

Before you initiate data transfers to or from the Secure Enclave endpoints, consider preparing the data by aggregating multiple files with tar and compressing the result. Aggregating data places several files and directories in a single file, which reduces the number of files to transfer. Compressing data reduces its total size, and therefore the amount of data that must be sent across the network. Together, they also make it easier to organize the data you wish to transfer. At the time of this writing, the tar and zip utilities are the best methods for data archiving and compression for Secure Enclave users across Linux, MacOS, and Windows.
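To judge whether aggregation is worthwhile, you can count the files in a directory and measure its total size before transferring it. The commands below are a minimal sketch using the standard find and du utilities; the temporary directory and file names are stand-ins for your own data.

```shell
# Stand-in data directory (assumption); point data_dir at your own directory instead.
data_dir=$(mktemp -d)
mkdir -p "$data_dir/results"
printf 'sample\n' > "$data_dir/notes.txt"
printf 'sample\n' > "$data_dir/results/run1.csv"

# Number of regular files that would be transferred individually.
file_count=$(find "$data_dir" -type f | wc -l | tr -d ' ')

# Total size in kilobytes as reported by du.
total_kb=$(du -sk "$data_dir" | awk '{print $1}')

echo "$file_count files, ${total_kb} KB in $data_dir"
```

A directory that holds a very large number of small files is usually a good candidate for a single archive.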

When you prepare your data, please avoid using a login node. Instead, use the SIP’s DTN (data transfer node). Figure 1.1 in the Introduction shows how to access the DTN.

Using the tar Utility

The tar (tape archive) utility uses simple command syntax and allows large amounts of data to be aggregated into the same archive. Linux, MacOS, and updated Windows 10 systems can use tar. Older Windows systems will be limited to the zip utility.

To create a tar archive, execute tar czvf <archive-name> <dir-to-archive>. Replace the <archive-name> argument with the name of the new archive. Be sure to end the name with the .tar.gz extension, as in my_archive.tar.gz. Replace the <dir-to-archive> argument with the directory you wish to place within the archive. If the directory you intend to archive is not within your working directory, specify the relative or absolute path to it. By default, tar recursively places the directory and its contents into the new archive, as shown below:

[user@sip-dtn1 ~]$ tar czvf new_archive.tar.gz Documents
Documents/
Documents/IntroUnix.pdf
Documents/JobSubData.zip
Documents/MATLAB/
Documents/Scripts.zip
Documents/PyLists.py

After the archive is created, execute ls -l to verify that the archive exists. You can view its contents with the tar tvf <archive-name> command. You may then transfer the archive using Globus. Please refer to the Configuring Globus section to learn how to configure it for your system.

On the remote system, execute tar xvf <archive-name> to extract the contents of the archive. The files will be extracted into your working directory.
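The create, list, and extract steps above can be tried end to end on a small test directory before archiving real data. This is a minimal sketch: the directory and file names are placeholders, and the -C option (supported by GNU and BSD tar) is used here only to keep the example inside a temporary directory.

```shell
# Build a small placeholder directory tree inside a temporary working directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/Documents/MATLAB"
printf 'values = [1, 2, 3]\n' > "$workdir/Documents/PyLists.py"

# Create a gzip-compressed archive of Documents (-C switches directory first).
tar -czf "$workdir/new_archive.tar.gz" -C "$workdir" Documents

# List the archive contents without extracting them.
tar -tzf "$workdir/new_archive.tar.gz"

# Extract into a separate directory, as you would on the remote system.
mkdir "$workdir/restore"
tar -xzf "$workdir/new_archive.tar.gz" -C "$workdir/restore"

# Compare the extracted tree with the original.
diff -r "$workdir/Documents" "$workdir/restore/Documents" && echo "round trip OK"
```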

Using the zip Utility

If the destination is an older Windows system, use the zip utility to archive and compress your data on the SIP.

To create a zip archive on the SIP, execute zip -r <archive-name>.zip <dir-to-archive>. Replace the <archive-name> argument with the name of the new zip archive. Including the .zip file extension in the archive’s name is optional; if you omit it, the zip utility adds it automatically. Replace the <dir-to-archive> argument with the directory you wish to place in the zip archive. If that directory is not in your working directory, specify the relative or absolute path to it. The -r option ensures that the directory and its contents are archived and compressed, as shown below:

[user@sip-dtn1 ~]$ zip -r Documents Documents
  adding: Documents/ (stored 0%)
  adding: Documents/IntroUnix.pdf (deflated 4%)
  adding: Documents/MATLAB/ (stored 0%)
  adding: Documents/PyLists.py (deflated 61%)

After the zip archive has been created, execute ls -l in the directory from which you created it to ensure the archive exists. It will appear with the name you gave to the archive followed by the .zip extension.
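Beyond ls -l, the archive can be checked before transfer by listing its contents with unzip -l. The sketch below is an illustration under assumptions: it uses placeholder names in a temporary directory, and it first confirms that zip and unzip are installed, since that may not be the case on every system.

```shell
zip_status="skipped"   # stays "skipped" if zip or unzip is not installed

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
  # Placeholder directory tree inside a temporary working directory.
  workdir=$(mktemp -d)
  mkdir -p "$workdir/Documents/MATLAB"
  printf 'values = [1, 2, 3]\n' > "$workdir/Documents/PyLists.py"

  # zip records paths as given on the command line, so run it from inside
  # workdir (in a subshell) to store relative paths.
  (cd "$workdir" && zip -r -q Documents.zip Documents)

  # List the archive contents and confirm the expected file is present.
  if unzip -l "$workdir/Documents.zip" | grep -q 'Documents/PyLists.py'; then
    zip_status="ok"
  else
    zip_status="failed"
  fi
fi

echo "zip check: $zip_status"
```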

With the zip archive created and verified, transfer it to your system using Globus. Please refer to the Configuring Globus section to learn how to use it on your system. Once you transfer the zip archive to your system, open the File Explorer and navigate to the directory in which you placed the archive. Right-click on the archive and select the “Extract All…” option in the submenu. Figure 2.3 shows where to locate this option. Specify the directory in which the contents should be extracted, then select “Extract.” You may then open the archive and peruse its contents.