1 TRANSFERRING FILES BETWEEN THE MARVIN CLUSTER AND OTHER CLOUD SYSTEMS

 

The command rclone ("rsync for cloud storage") is a command-line tool for cloud storage management. It syncs files and directories with cloud storage systems such as Google Drive, Amazon Drive, S3, B2, etc. rclone can be invoked in one of three modes:

  • Copy mode: copies only new or changed files.

  • Sync mode: makes a destination directory identical to the source; it works in one direction only.

  • Check mode: checks files for hash equality.
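
As a rough sketch, the three modes map onto three rclone subcommands. The remote name upf_drive and the directory project below are placeholders chosen for illustration. The commands are wrapped in shell functions so that nothing is transferred until one of them is called.

```shell
# Placeholder source directory and remote path (assumptions for illustration).
SRC="$HOME/project"
DEST="upf_drive:project"

# Copy mode: transfer new or changed files; nothing is deleted on DEST.
copy_new() { rclone copy "$SRC" "$DEST"; }

# Sync mode: make DEST identical to SRC (one direction only).
# Files that exist only on DEST are deleted.
mirror() { rclone sync "$SRC" "$DEST"; }

# Check mode: compare sizes and hashes of SRC and DEST without transferring.
verify() { rclone check "$SRC" "$DEST"; }
```

Calling copy_new, mirror or verify in a session where rclone is loaded runs the corresponding mode.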

 

rclone is available on the Marvin cluster; the module is called rclone/1.36.

 

1.1 Configuration of rclone

The first step is to configure rclone to work with the transfer partner. Below we give two examples: one for Google Drive and one for Dropbox.

 

Example 1: Configuration for transferring files to/from the UPF Google Drive storage  

A few notes about this storage:

  • Through a partnership between the UPF and Google for Education, UPF faculty, staff and students have access to unlimited storage in Google Drive.

  • For more secure storage and transfer options for sensitive data and personal information, consult this link.

 

Step 1:

Login to Marvin:

$ ssh [username]@marvin.s.upf.edu

 

Step 2:

Get an interactive session:

$ interactive

 

Step 3:

Load rclone module:

$ module load rclone

 

Step 4:

Configure rclone and set up remote access to your Google Drive using the command:

$ rclone config

You can select one of the options (here we show how to set up a new remote):

 

2017/04/24 10:21:00 Config file "/homes/users/test/.rclone.conf" not found - using defaults

No remotes found - make a new one

n) New remote

s) Set configuration password

q) Quit config

n/s/q> n

 

Enter n for a new remote connection and give it a name (whatever you want).

 

name> upf_drive

 

Then choose the type of storage for which you are setting up the remote (here we show how to set up a remote for Google Drive, which is option 7).

 

Type of storage to configure.

Choose a number from below, or type in your own value

1 / Amazon Drive

  \ "amazon cloud drive"

2 / Amazon S3 (also Dreamhost, Ceph, Minio)

  \ "s3"

3 / Backblaze B2

  \ "b2"

4 / Dropbox

  \ "dropbox"

5 / Encrypt/Decrypt a remote

  \ "crypt"

6 / Google Cloud Storage (this is not Google Drive)

  \ "google cloud storage"

7 / Google Drive

  \ "drive"

8 / Hubic

  \ "hubic"

9 / Local Disk

  \ "local"

10 / Microsoft OneDrive

  \ "onedrive"

11 / Openstack Swift (Rackspace Cloud Files, Memset Memstore, OVH)

  \ "swift"

12 / Yandex Disk

  \ "yandex"

Storage> 7

 

Then you see a few messages like the ones below:

 

Google Application Client Id - leave blank normally.

client_id> (just press enter key here)

 

Google Application Client Secret - leave blank normally.

client_secret> (just press the enter key here)

 

Since you are accessing the cluster remotely, you have to decline the auto config, i.e. choose option n:

 

Remote config

Use auto config?

* Say Y if not sure

* Say N if you are working on a remote or headless machine or Y didn't work

y/n> n

 

You will see a message similar to the one below:

 

If your browser doesn't open automatically go to the following link: https://accounts.google.com/o/oauth2/auth?client_id=202264815644.apps.googleusercontent.com&redirect_uri=urn...

 

Log in and authorize rclone for access.

Open this URL in the browser on your workstation and authorize access to your Google Drive. Once that is done, you will get a screen that displays a verification code.

Type or copy this code from the browser and paste it into the terminal. Once the terminal accepts the verification code, it will display the options below; choose one:

 

y) Yes this is OK

e) Edit this remote

d) Delete this remote

y/e/d> y

 

Select y if everything looks correct for the remote; otherwise select e to edit it.

You can also view the existing remotes.
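
To confirm from the command line that the new remote was saved, one option is the listremotes subcommand. The sketch below assumes the remote was named upf_drive as in this example; the helper remote_exists is a name made up for illustration and is not part of rclone.

```shell
# Succeeds if a remote with the given name is configured.
# "rclone listremotes" prints one remote per line, each followed by a colon.
remote_exists() {
    rclone listremotes | grep -qxF "$1:"
}

# Example use (run it after "module load rclone"):
#   remote_exists upf_drive && rclone lsd upf_drive:
```

If the remote is missing, running rclone config again and repeating the steps above should recreate it.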

 

Example 2: Configuration for transferring files to/from Dropbox

 

A few notes about this storage system:

  • For security information, including storing sensitive data and personal information, consult this link.

 

Step 1:

Login to Marvin:

$ ssh [username]@marvin.s.upf.edu

 

Step 2:

Get an interactive session:

$ interactive

 

Step 3:

Load rclone module:

$ module load rclone

 

Step 4:

Configure rclone and set up remote access to your Dropbox using the command:

$ rclone config

You can select one of the options (here we show how to set up a new remote):

 

2017/04/24 10:21:00 Config file "/homes/users/test/.rclone.conf" not found - using defaults

No remotes found - make a new one

n) New remote

s) Set configuration password

q) Quit config

n/s/q> n

 

Enter n for a new remote connection and give it a name (whatever you want).

 

name> my_dropbox

 

Then choose the type of storage for which you are setting up the remote (here we show how to set up a remote for Dropbox, which is option 4).

 

Type of storage to configure.

Choose a number from below, or type in your own value

1 / Amazon Drive

  \ "amazon cloud drive"

2 / Amazon S3 (also Dreamhost, Ceph, Minio)

  \ "s3"

3 / Backblaze B2

  \ "b2"

4 / Dropbox

  \ "dropbox"

5 / Encrypt/Decrypt a remote

  \ "crypt"

6 / Google Cloud Storage (this is not Google Drive)

  \ "google cloud storage"

7 / Google Drive

  \ "drive"

8 / Hubic

  \ "hubic"

9 / Local Disk

  \ "local"

10 / Microsoft OneDrive

  \ "onedrive"

11 / Openstack Swift (Rackspace Cloud Files, Memset Memstore, OVH)

  \ "swift"

12 / Yandex Disk

  \ "yandex"

Storage> 4

 

The command line will show the following prompts:

 

Dropbox App Key - leave blank normally.

app_key> (just press enter key here)

Dropbox App Secret - leave blank normally.

app_secret> (just press enter key here)

Remote config

Please visit:

https://www.dropbox.com/1/oauth2/authorize?client_id=sdffsdf&response_type=code

 

Open this URL in the browser on your workstation and authorize access to your Dropbox. Once that is done, you will get a screen that displays a verification code.

Type or copy this code from the browser and paste it into the terminal at the "Enter the code" prompt. Once the terminal accepts the verification code, it will display the remote's settings and the options below; choose one:

 

Enter the code: fsdfsdfsdfsdfqwe45t54tgw5y5w4yhsd

--------------------

[my_dropbox]

app_key =

app_secret =

token = sdfgsdgsdfghrsethy37uy4erhytrsdhjdtyjnhdfHG·%UHSDHDFSHSDFW$%&GSDFGSDX

 

--------------------

y) Yes this is OK

e) Edit this remote

d) Delete this remote

y/e/d> y

 

Select y if everything looks correct for the remote; otherwise select e to edit it.

You can also view the existing remotes.

 

1.2 Examples to transfer files using rclone and Google Drive

The following commands are useful for transferring files using rclone with Google Drive (configured here as upf_drive), although they also work with other cloud systems.

1) List the directories on the drive:

 

rclone lsd upf_drive:[path]

 

2) Copy files from Marvin to the drive:

 

rclone copy [marvin path] upf_drive:[path drive]

 

3) Copy files from the drive to Marvin:

 

rclone copy upf_drive:[path drive] [marvin path]
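
A copy can be followed by the check mode mentioned in the introduction to confirm that both sides match. The following is a minimal sketch; copy_and_verify is a hypothetical helper name, and the paths in the usage comment are placeholders. Note that check compares the two locations as a whole, so files that exist only on the destination are also reported as differences.

```shell
# Copy SRC to DEST, then compare sizes and hashes on both sides.
# SRC and DEST can each be a local path or a remote:path.
# The check only runs if the copy finished without errors.
copy_and_verify() {
    rclone copy "$1" "$2" && rclone check "$1" "$2"
}

# Example use, with placeholder paths:
#   copy_and_verify upf_drive:results "$HOME/results"
```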

 

4) Back up the Marvin home directory to the drive:

 

DATE=$(date +'%d-%m-%Y_%H:%M:%S')

rclone copy ~ upf_drive:backups/marvin/home/$DATE
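
The backup can also be wrapped in a small function so that the timestamp is generated on every call. This is only a sketch: backup_home is a hypothetical name, and the destination root is passed as an argument rather than fixed. Any extra arguments are handed to rclone unchanged, which makes it easy to try the backup first with the --dry-run flag.

```shell
# Copy the home directory into a new, timestamped folder under DEST_ROOT.
# Extra arguments (for example --dry-run) are passed through to rclone.
backup_home() {
    dest_root=$1
    shift
    stamp=$(date +'%d-%m-%Y_%H:%M:%S')
    rclone copy "$HOME" "$dest_root/$stamp" "$@"
}

# Example use; --dry-run only reports what would be copied:
#   backup_home upf_drive:backups/marvin/home --dry-run
```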

 

5) Synchronize the home directory to a copy of the home directory on Google Drive:

 

rclone sync ~ upf_drive:syncs/marvin/home
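
Because sync makes the destination identical to the source, files that exist only on the destination are deleted. A cautious pattern is to preview the operation with --dry-run first. The sketch below assumes a hypothetical function sync_home and a made-up CONFIRM variable; neither is part of rclone.

```shell
# Preview a one-way sync of the home directory; only run it for real
# when CONFIRM=yes is set (CONFIRM is a name made up for this sketch).
sync_home() {
    dest=$1
    # --dry-run reports what would be copied or deleted without changing anything.
    rclone sync --dry-run "$HOME" "$dest" || return 1
    if [ "${CONFIRM:-no}" = "yes" ]; then
        rclone sync "$HOME" "$dest"
    fi
}

# Example use:
#   sync_home upf_drive:syncs/marvin/home               # preview only
#   CONFIRM=yes sync_home upf_drive:syncs/marvin/home   # preview, then sync
```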

 

6) Synchronize the home directory to a copy of the home directory on Google Drive, with a bandwidth limit of 10 MB/s:

 

rclone sync ~ upf_drive:syncs/marvin/home --verbose=1 --bwlimit 10M

 

7) Synchronize the home directory to a copy of the home directory on Google Drive, with a bandwidth limit that depends on the time of day: from 08:00 to 10:00 the limit is 10 MB/s, from 10:00 to 18:00 the limit is 5 MB/s, and outside those times it is unlimited:

 

rclone sync ~ upf_drive:syncs/marvin/home --verbose=1 --bwlimit "08:00,10M 10:00,5M 18:00,off"

IT IS RECOMMENDED TO ALWAYS SET THESE LIMITS.
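
The timetable string in example 7 can be read as follows, based on the behaviour described in the rclone documentation: each entry sets the limit from its start time until the next entry begins, and the last entry stays in force past midnight until the first entry starts again. The function limit_at below is hypothetical and not part of rclone; it only mirrors the schedule "08:00,10M 10:00,5M 18:00,off" to make that reading explicit.

```shell
# Print the limit that the schedule "08:00,10M 10:00,5M 18:00,off"
# applies at a given hour (0-23, written without a leading zero).
limit_at() {
    hour=$1
    if [ "$hour" -ge 8 ] && [ "$hour" -lt 10 ]; then
        echo 10M    # 08:00 up to 10:00
    elif [ "$hour" -ge 10 ] && [ "$hour" -lt 18 ]; then
        echo 5M     # 10:00 up to 18:00
    else
        echo off    # 18:00 through the night until 08:00
    fi
}

# Example use:
#   limit_at 9     # prints 10M
#   limit_at 12    # prints 5M
#   limit_at 3     # prints off
```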

 

More info:

   https://rclone.org/


 

 

 

FTP

FTP is a client-server protocol used mainly for transferring data. The transfer itself is not encrypted, but access to the data is protected by a username and password. If you wish to share data through FTP, send us an email at sit@upf.edu.