DataLad¶
The TemplateFlow Archive is an infrastructure reliant on DataLad. Therefore, it is possible (and recommended for those who want to leverage the power of DataLad) to access the Archive using just DataLad.
Installing the Archive¶
The archive is indexed by a superdataset, which can be installed with:
$ datalad install -r https://github.com/templateflow/templateflow.git
or just:
$ datalad install -r ///templateflow
Please note the -r
modifier, which will automatically install all the
subdatasets.
In this case, subdatasets (sub-folders) are the individual templates
(signified by the tpl-
prefix).
If the operation finished successfully, you should be able to change directories
into templateflow and see something like:
$ cd templateflow/
$ ls -lh
total 76K
-rw-rw-r-- 1 oesteban oesteban 122 Sep 8 10:42 dataset_description.json
drwxrwxr-x 4 oesteban oesteban 4.0K Sep 8 10:43 tpl-fsaverage
drwxrwxr-x 5 oesteban oesteban 4.0K Sep 8 10:43 tpl-fsLR
drwxrwxr-x 5 oesteban oesteban 4.0K Sep 8 10:42 tpl-MNI152Lin
drwxrwxr-x 5 oesteban oesteban 16K Sep 8 10:42 tpl-MNI152NLin2009cAsym
drwxrwxr-x 5 oesteban oesteban 4.0K Sep 8 10:42 tpl-MNI152NLin2009cSym
drwxrwxr-x 5 oesteban oesteban 12K Sep 8 10:42 tpl-MNI152NLin6Asym
drwxrwxr-x 5 oesteban oesteban 4.0K Sep 8 10:42 tpl-MNI152NLin6Sym
drwxrwxr-x 16 oesteban oesteban 4.0K Sep 8 10:43 tpl-MNIInfant
drwxrwxr-x 11 oesteban oesteban 4.0K Sep 8 10:43 tpl-MNIPediatricAsym
drwxrwxr-x 5 oesteban oesteban 4.0K Sep 8 10:43 tpl-NKI
drwxrwxr-x 4 oesteban oesteban 4.0K Sep 8 10:43 tpl-OASIS30ANTs
drwxrwxr-x 4 oesteban oesteban 4.0K Sep 8 10:43 tpl-PNC
drwxrwxr-x 5 oesteban oesteban 4.0K Sep 8 10:43 tpl-WHS
Important
The DataLad install operation DOES NOT download the data. Please see how to get the data below.
Accessing templates¶
Before going ahead, make sure you understand how DataLad works. Once the TemplateFlow superdataset has been installed, as well as all or some of the subdatasets, it is possible to access data. For example, pulling down all T1-weighted NIfTI images of all datasets would look like:
$ find . -name "*_T1w.nii.gz" -exec datalad get {} +
Let’s unpack what happened.
DataLad (or more precisely, the git-annex working under the hood)
replaces large files with symbolic links which point to files that
permit the location of the actual resource.
This technique (“annexing” to git) permits keeping the actual files
outside the version control system that (unless set up with some
special extension such as LFS) is not adequate to track large data
files.
Because annexed files are indeed in the file tree, it is possible to
search with tools like find
or tree
:
$ tree tpl-MNI152Lin
tpl-MNI152Lin
├── CHANGES
├── LICENSE
├── scripts
│ ├── headmask.py
│ ├── normalize.py
│ └── sanitize.py
├── template_description.json
├── tpl-MNI152Lin_res-01_desc-brain_mask.nii.gz -> .git/annex/objects/J4/J9/URL-s131839--https&c%%files.osf.io%v1%resourc-4a92beb360af57cc397642c99e4f34ee/URL-s131839--https&c%%files.osf.io%v1%resourc-4a92beb360af57cc397642c99e4f34ee
├── tpl-MNI152Lin_res-01_desc-head_mask.nii.gz -> .git/annex/objects/j3/Jw/URL-s168509--https&c%%files.osf.io%v1%resourc-2e366aff039e485ce73875dd1fc912fd/URL-s168509--https&c%%files.osf.io%v1%resourc-2e366aff039e485ce73875dd1fc912fd
├── tpl-MNI152Lin_res-01_PD.nii.gz -> .git/annex/objects/5m/4z/URL-s10250635--https&c%%files.osf.io%v1%resourc-d38cc6938c26e9389a1a9acf03f5a4b6/URL-s10250635--https&c%%files.osf.io%v1%resourc-d38cc6938c26e9389a1a9acf03f5a4b6
├── tpl-MNI152Lin_res-01_T1w.nii.gz -> .git/annex/objects/pM/Fm/URL-s10669511--https&c%%files.osf.io%v1%resourc-2e59511114a1686f937e0127af887b83/URL-s10669511--https&c%%files.osf.io%v1%resourc-2e59511114a1686f937e0127af887b83
├── tpl-MNI152Lin_res-01_T2w.nii.gz -> .git/annex/objects/63/jK/URL-s10096230--https&c%%files.osf.io%v1%resourc-7ee9c493542a55d96d28d55d57a3ee52/URL-s10096230--https&c%%files.osf.io%v1%resourc-7ee9c493542a55d96d28d55d57a3ee52
├── tpl-MNI152Lin_res-02_desc-brain_mask.nii.gz -> .git/annex/objects/vj/pW/URL-s25649--https&c%%files.osf.io%v1%resourc-ebe0f869bd33c9dd7d983a73f7704326/URL-s25649--https&c%%files.osf.io%v1%resourc-ebe0f869bd33c9dd7d983a73f7704326
├── tpl-MNI152Lin_res-02_desc-head_mask.nii.gz -> .git/annex/objects/7q/gF/URL-s32857--https&c%%files.osf.io%v1%resourc-4c79972ef82dfaa9070522b558a8411c/URL-s32857--https&c%%files.osf.io%v1%resourc-4c79972ef82dfaa9070522b558a8411c
├── tpl-MNI152Lin_res-02_PD.nii.gz -> .git/annex/objects/1m/jq/URL-s1411464--https&c%%files.osf.io%v1%resourc-95c7dabef32603e9f1d4f3f9cb92b800/URL-s1411464--https&c%%files.osf.io%v1%resourc-95c7dabef32603e9f1d4f3f9cb92b800
├── tpl-MNI152Lin_res-02_T1w.nii.gz -> .git/annex/objects/Wf/Fx/URL-s1448817--https&c%%files.osf.io%v1%resourc-2ba5a81206dff8bbf84fb319ed1d7201/URL-s1448817--https&c%%files.osf.io%v1%resourc-2ba5a81206dff8bbf84fb319ed1d7201
└── tpl-MNI152Lin_res-02_T2w.nii.gz -> .git/annex/objects/X8/Fv/URL-s1375781--https&c%%files.osf.io%v1%resourc-6f1f3ad0441ef1200307a70b32b4f303/URL-s1375781--https&c%%files.osf.io%v1%resourc-6f1f3ad0441ef1200307a70b32b4f303
1 directory, 16 files
If your terminal has advanced coloring, you will also see that only the two
links ending with _T1w.nii.gz
are not “broken” links.
This is because we did datalad get
on both of them in the previous step.
DataLad only pulls the actual file objects when they are requested.