from google.colab import drive

drive.mount('/content/gdrive', force_remount=True)  # pass a boolean, not the string "True"
root_path = '/content/gdrive/My Drive/sims-data/malaria/'  # change to your project folder
Mounted at /content/gdrive
!pwd
import os
os.chdir("/content/gdrive/My Drive/sims-data/malaria/")
!pwd
shell-init: error retrieving current directory: getcwd: cannot access parent directories: Transport endpoint is not connected
pwd: error retrieving current directory: getcwd: cannot access parent directories: Transport endpoint is not connected
/content/gdrive/My Drive/sims-data/malaria
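
The `os.chdir` call above fails if the project folder does not exist yet. A minimal sketch of a helper that creates the folder first and then changes into it; the name `enter_project_dir` is hypothetical and not part of the original notebook:

```python
import os

def enter_project_dir(path):
    """Create `path` if needed, change into it, and return the new working directory."""
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    os.chdir(path)
    return os.getcwd()

# In Colab, after drive.mount(...), this could be called as:
# enter_project_dir("/content/gdrive/My Drive/sims-data/malaria/")
```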

1. Download files directly in Google Colab

!wget -P '/content/gdrive/My Drive/sims-data/malaria/'  'ftp://lhcftp.nlm.nih.gov/Open-Access-Datasets/Malaria/cell_images.zip' 
--2020-12-30 09:32:01--  ftp://lhcftp.nlm.nih.gov/Open-Access-Datasets/Malaria/cell_images.zip
           => ‘/content/gdrive/My Drive/sims-data/malaria/cell_images.zip’
Resolving lhcftp.nlm.nih.gov (lhcftp.nlm.nih.gov)... 130.14.55.35, 2607:f220:41e:7055::35
Connecting to lhcftp.nlm.nih.gov (lhcftp.nlm.nih.gov)|130.14.55.35|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /Open-Access-Datasets/Malaria ... done.
==> SIZE cell_images.zip ... 353452851
==> PASV ... done.    ==> RETR cell_images.zip ... done.
Length: 353452851 (337M) (unauthoritative)

cell_images.zip     100%[===================>] 337.08M  41.4MB/s    in 8.8s    

2020-12-30 09:32:11 (38.1 MB/s) - ‘/content/gdrive/My Drive/sims-data/malaria/cell_images.zip’ saved [353452851]
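
Re-running the `wget` cell fetches the 337M archive again. A possible Python alternative that skips the download when the file is already present, sketched with the standard library's `urllib.request.urlretrieve`; the helper name `download_if_missing` is hypothetical, and a failed transfer may leave a partial file that has to be deleted by hand:

```python
import os
import urllib.request

def download_if_missing(url, dest_path):
    """Download `url` to `dest_path` unless it already exists. Return True if a download ran."""
    if os.path.exists(dest_path):
        return False  # file already there, nothing to do
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    urllib.request.urlretrieve(url, dest_path)
    return True

# Example call with the URL and folder used in this notebook:
# download_if_missing(
#     "ftp://lhcftp.nlm.nih.gov/Open-Access-Datasets/Malaria/cell_images.zip",
#     "/content/gdrive/My Drive/sims-data/malaria/cell_images.zip",
# )
```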

!ls
cell_images.zip
!pwd
/content/gdrive/My Drive/sims-data/malaria
! rm -R "/content/gdrive/My Drive/sims-data/malaria/cell_images/"

2. Unzip folder in Google Colab

  • Unzipping in Google Colab is painfully slow. It is faster to download and unzip the data locally, then load the unzipped folder into Google Colab.

!unzip -q file[.zip] -d [exdir]

-q: suppresses printing of the names of the files being extracted

-d [exdir]: optional directory to extract the files into

!unzip -q "cell_images.zip"
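
The same extraction can be sketched in Python with the standard `zipfile` module. The helper name `extract_zip` is hypothetical. Extracting to a folder on the Colab VM's local disk (for example `/content/cell_images`, an assumed path) instead of the mounted Drive may be faster, on the untested assumption that the slowness comes from writing many small files through the Drive mount:

```python
import zipfile

def extract_zip(zip_path, extract_dir):
    """Extract every member of `zip_path` into `extract_dir` and return the member count."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)  # creates extract_dir if it is missing
        return len(archive.namelist())

# Example call (the target folder is an assumption, not from the notebook):
# extract_zip("cell_images.zip", "/content/cell_images")
```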

3. Check size of folder in Google Colab

! du -sh "/content/gdrive/"
8.0G	/content/gdrive/
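
When the size is needed inside Python rather than as shell output, a rough equivalent can be written with `os.walk`. This is a sketch with a hypothetical helper name, `dir_size_bytes`. It sums apparent file sizes, so the result can differ from `du`, which reports disk usage:

```python
import os

def dir_size_bytes(root):
    """Return the total size in bytes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):  # skip symlinks to avoid double counting
                total += os.path.getsize(file_path)
    return total

# Example call, converting to gigabytes:
# print(dir_size_bytes("/content/gdrive/") / 1024**3)
```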

4. Clone a GitHub repository into Google Drive

from google.colab import drive

drive.mount('/content/gdrive', force_remount=True)  # pass a boolean, not the string "True"
root_path = '/content/gdrive/My Drive/'  # change to your project folder
!pwd
import os
os.chdir(root_path)
!pwd
Mounted at /content/gdrive
/content
/content/gdrive/My Drive
! git clone https://github.com/simsisim/sims-data.git
Cloning into 'sims-data'...
remote: Enumerating objects: 27566, done.
remote: Total 27566 (delta 0), reused 0 (delta 0), pack-reused 27566
Receiving objects: 100% (27566/27566), 331.64 MiB | 12.95 MiB/s, done.
Checking out files: 100% (27560/27560), done.
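
Running the clone cell a second time fails because the `sims-data` folder already exists. A minimal sketch of a guard around `git clone`, using `subprocess.run`; the helper name `clone_if_missing` and its `runner` parameter are hypothetical. The `--depth 1` flag fetches only the latest commit, which is an optional choice not made in the original notebook and may shorten the download:

```python
import os
import subprocess

def clone_if_missing(repo_url, dest_dir, runner=subprocess.run):
    """Shallow-clone `repo_url` into `dest_dir` unless a git repository is already there."""
    if os.path.isdir(os.path.join(dest_dir, ".git")):
        return False  # already cloned, skip
    runner(["git", "clone", "--depth", "1", repo_url, dest_dir], check=True)
    return True

# Example call with the repository used in this notebook:
# clone_if_missing("https://github.com/simsisim/sims-data.git",
#                  "/content/gdrive/My Drive/sims-data")
```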