Running fMRIPrep via Singularity containers¶
Preparing a Singularity image¶
Singularity version >= 2.5. If the version of Singularity installed on your HPC system is modern enough you can create Singularity image directly on the system. This is as simple as:
$ singularity build /my_images/fmriprep-<version>.simg docker://poldracklab/fmriprep:<version>
where <version>
should be replaced with the desired version of fMRIPrep that you
want to download.
Singularity version < 2.5. In this case, start with a machine (e.g., your personal computer) with Docker installed. Use docker2singularity to create a singularity image. You will need an active internet connection and some time.
$ docker run --privileged -t --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
-v D:\host\path\where\to\output\singularity\image:/output \
singularityware/docker2singularity \
poldracklab/fmriprep:<version>
Where <version>
should be replaced with the desired version of fMRIPrep that you want
to download.
Beware of the back slashes, expected for Windows systems. For *nix users the command translates as follows:
$ docker run --privileged -t --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /absolute/path/to/output/folder:/output \
singularityware/docker2singularity \
poldracklab/fmriprep:<version>
Transfer the resulting Singularity image to the HPC, for example, using scp
.
$ scp poldracklab_fmriprep*.img user@hcpserver.edu:/my_images
Running a Singularity Image¶
If the data to be preprocessed is also on the HPC, you are ready to run fMRIPrep.
$ singularity run --cleanenv fmriprep.simg \
path/to/data/dir path/to/output/dir \
participant \
--participant-label label
Handling environment variables¶
Singularity by default exposes all environment variables from the host inside
the container.
Because of this, your host libraries (e.g., nipype or a Python 2.7 environment)
could be accidentally used instead of the ones inside the container.
To avoid such a situation, we recommend using the --cleanenv
argument in
all scenarios. For example:
$ singularity run --cleanenv fmriprep.simg \
/work/04168/asdf/lonestar/ $WORK/lonestar/output \
participant \
--participant-label 387 --nthreads 16 -w $WORK/lonestar/work \
--omp-nthreads 16
Alternatively, conflicts might be preempted and some problems mitigated by
unsetting potentially problematic settings, such as the PYTHONPATH
variable,
before running:
$ unset PYTHONPATH; singularity run fmriprep.simg \
/work/04168/asdf/lonestar/ $WORK/lonestar/output \
participant \
--participant-label 387 --nthreads 16 -w $WORK/lonestar/work \
--omp-nthreads 16
It is possible to define environment variables scoped within the container by
using the SINGULARITYENV_*
magic, in combination with --cleanenv
.
For example, we can set the FreeSurfer license variable (see The FreeSurfer license)
as follows:
$ export SINGULARITYENV_FS_LICENSE=$HOME/.freesurfer.txt
$ singularity exec --cleanenv fmriprep.simg env | grep FS_LICENSE
FS_LICENSE=/home/users/oesteban/.freesurfer.txt
As we can see, the export in the first line tells Singularity to set a
corresponding environment variable of the same name after dropping the
prefix SINGULARITYENV_
.
Accessing the host’s filesystem¶
Depending on how Singularity is configured on your cluster it might or might not
automatically bind (mount or expose) host’s folders to the container (e.g., /scratch
,
or $HOME
).
This is particularly relevant because, if you can’t run Singularity in privileged
mode (which is almost certainly true in all the scenarios), Singularity containers
are read only.
This is to say that you won’t be able to write anything unless Singularity can
access the host’s filesystem in write mode.
By default, Singularity automatically binds (mounts) the user’s home directory and
a scratch directory.
In addition, Singularity generally allows binding the necessary folders with
the -B <host_folder>:<container_folder>[:<permissions>]
Singularity argument.
For example:
$ singularity run --cleanenv -B /work:/work fmriprep.simg \
/work/my_dataset/ /work/my_dataset/derivatives/fmriprep \
participant \
--participant-label 387 --nthreads 16 \
--omp-nthreads 16
Warning
If your Singularity installation doesn’t allow you to bind non-existent bind points,
you’ll get an error saying WARNING: Skipping user bind, non existent bind point
(directory) in container
.
In this scenario, you can either try to bind things onto some other bind point you
know it exists in the image or rebuild your singularity image with docker2singularity
as follows:
$ docker run --privileged -ti --rm -v /var/run/docker.sock:/var/run/docker.sock \
-v $PWD:/output singularityware/docker2singularity \
-m "/gpfs /scratch /work /share /lscratch /opt/templateflow"
In the example above, the following bind points are created: /gpfs
, /scratch
,
/work
, /share
, /opt/templateflow
.
Note
One great feature of containers is their confinement or isolation from the host
system.
Binding mount points breaks this principle, as the container has now access to
create changes in the host.
Therefore, it is generally recommended to use binding scarcely and granting
very limited access to the minimum necessary resources.
In other words, it is preferred to bind just one subdirectory of $HOME
than
the full $HOME
directory of the host (see #1778 (comment)).
Relevant aspects of the $HOME
directory within the container.
By default, Singularity will bind the user’s $HOME
directory in the host
into the /home/$USER
(or equivalent) in the container.
Most of the times, it will also redefine the $HOME
environment variable and
update it to point to the corresponding mount point in /home/$USER
.
However, these defaults can be overwritten in your system.
It is recommended to check your settings with your system’s administrators.
If your Singularity installation allows it, you can workaround the $HOME
specification combining the bind mounts argument (-B
) with the home overwrite
argument (--home
) as follows:
$ singularity run -B $HOME:/home/fmriprep --home /home/fmriprep \
--cleanenv fmriprep.simg <fmriprep arguments>
TemplateFlow and Singularity¶
TemplateFlow is a helper tool that allows fMRIPrep (or any other neuroimaging workflow) to programmatically access a repository of standard neuroimaging templates. In other words, TemplateFlow allows fMRIPrep to dynamically change the templates that are used, e.g., in the atlas-based brain extraction step or spatial normalization.
Default settings in the Singularity image should get along with the Singularity
installation of your system.
However, deviations from the default configurations of your installation may break
this compatibility.
A particularly problematic case arises when the home directory is mounted in the
container, but the $HOME
environment variable is not correspondingly updated.
Typically, you will experience errors like OSError: [Errno 30] Read-only file system
or FileNotFoundError: [Errno 2] No such file or directory: '/home/fmriprep/.cache'
.
If it is not explicitly forbidden in your installation, the first attempt to overcome this
issue is manually setting the $HOME
directory as follows:
$ singularity run --home $HOME --cleanenv fmriprep.simg <fmriprep arguments>
If the user’s home directory is not automatically bound, then the second step would include manually binding it as in the section above:
$ singularity run -B $HOME:/home/fmriprep --home /home/fmriprep \
--cleanenv fmriprep.simg <fmriprep arguments>
Finally, if the --home
argument cannot be used, you’ll need to provide the container with
writable filesystems where TemplateFlow’s files can be downloaded.
In addition, you will need to indicate fMRIPrep to update the default paths with the new mount
points setting the SINGULARITYENV_TEMPLATEFLOW_HOME
variable.
$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/opt/templateflow # Tell fMRIPrep the mount point
$ singularity run -B <writable-path-on-host>:/opt/templateflow \
--cleanenv fmriprep.simg <fmriprep arguments>
Internet access problems¶
We have identified several conditions in which running fMRIPrep might fail because of spotty or impossible access to Internet.
If your compute node cannot have access to Internet, then you’ll need to make sure
you run fMRIPrep with the --notrack
argument and pull down from TemplateFlow
all the resources that will be necessary.
If that is not the case (i.e., you should be able to hit HTTP/s endpoints), then you can try the following:
VerifiedHTTPSConnection ... Failed to establish a new connection: [Errno 110] Connection timed out
.
If you encounter an error like this, probably you’ll need to set up an http proxy exporting
SINGULARITYENV_http_proxy
(see #1778 (comment)).
For example:
$ export SINGULARITYENV_https_proxy=http://<ip or proxy name>:<port>
requests.exceptions.SSLError: HTTPSConnectionPool ...
.
In this case, your container seems to be able to reach the Internet, but unable to use SSL
encription.
There are two potential solutions to the issue.
The recommended one
is setting REQUESTS_CA_BUNDLE
to the appropriate path, and/or binding
the appropriate filesystem:
$ export SINGULARITYENV_REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
$ singularity run -B <path-to-certs-folder>:/etc/ssl/certs \
--cleanenv fmriprep.simg <fmriprep arguments>
$ export TEMPLATEFLOW_HOME=/path/to/keep/templateflow
$ python -m pip install -U templateflow # Install the client
$ python
>>> import templateflow.api
>>> templateflow.api.TF_S3_ROOT = 'http://templateflow.s3.amazonaws.com'
>>> api.get(‘MNI152NLin6Asym’)
Finally, run the singularity image binding the appropriate folder:
$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/templateflow
$ singularity run -B ${TEMPLATEFLOW_HOME:-$HOME/.cache/templateflow}:/templateflow \
--cleanenv fmriprep.simg <fmriprep arguments>
Troubleshooting¶
Setting up a functional execution framework with Singularity might be tricky in some
HPC systems.
Please make sure you have read the relevant documentation of Singularity, and checked all the defaults and configuration in your
system.
The next step is checking the environment and access to fMRIPrep resources, using
singularity shell
.
Check access to input data folder, and BIDS validity:
$ singularity shell -B path/to/data:/data fmriprep.simg Singularity fmriprep.simg:~> ls /data CHANGES README dataset_description.json participants.tsv sub-01 sub-02 sub-03 sub-04 sub-05 sub-06 sub-07 sub-08 sub-09 sub-10 sub-11 sub-12 sub-13 sub-14 sub-15 sub-16 task-balloonanalogrisktask_bold.json Singularity fmriprep.simg:~> bids-validator /data 1: [WARN] You should define 'SliceTiming' for this file. If you don't provide this information slice time correction will not be possible. (code: 13 - SLICE_TIMING_NOT_DEFINED) ./sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz ./sub-01/func/sub-01_task-balloonanalogrisktask_run-02_bold.nii.gz ./sub-01/func/sub-01_task-balloonanalogrisktask_run-03_bold.nii.gz ./sub-02/func/sub-02_task-balloonanalogrisktask_run-01_bold.nii.gz ./sub-02/func/sub-02_task-balloonanalogrisktask_run-02_bold.nii.gz ./sub-02/func/sub-02_task-balloonanalogrisktask_run-03_bold.nii.gz ./sub-03/func/sub-03_task-balloonanalogrisktask_run-01_bold.nii.gz ./sub-03/func/sub-03_task-balloonanalogrisktask_run-02_bold.nii.gz ./sub-03/func/sub-03_task-balloonanalogrisktask_run-03_bold.nii.gz ./sub-04/func/sub-04_task-balloonanalogrisktask_run-01_bold.nii.gz ... and 38 more files having this issue (Use --verbose to see them all). Please visit https://neurostars.org/search?q=SLICE_TIMING_NOT_DEFINED for existing conversations about this issue.Check access to output data folder, and whether you have write permissions.
$ singularity shell -B path/to/data/derivatives/fmriprep-1.5.0:/out fmriprep.simg Singularity fmriprep.simg:~> ls /out Singularity fmriprep.simg:~> touch /out/test Singularity fmriprep.simg:~> rm /out/testCheck access and permissions to
$HOME
:$ singularity shell fmriprep.simg Singularity fmriprep.simg:~> mkdir -p $HOME/.cache/testfolder Singularity fmriprep.simg:~> rmdir $HOME/.cache/testfolderCheck TemplateFlow operation:
$ singularity shell -B path/to/templateflow:/templateflow fmriprep.simg Singularity fmriprep.simg:~> echo ${TEMPLATEFLOW_HOME:-$HOME/.cache/templateflow} /home/users/oesteban/.cache/templateflow Singularity fmriprep.simg:~> python -c "from templateflow.api import get; get(['MNI152NLin2009cAsym', 'MNI152NLin6Asym', 'OASIS30ANTs', 'MNIPediatricAsym', 'MNIInfant'])" Downloading https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_atlas-HOCPA_desc-th0_dseg.nii.gz 304B [00:00, 1.28kB/s] Downloading https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_atlas-HOCPA_desc-th25_dseg.nii.gz 261B [00:00, 1.04kB/s] Downloading https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_atlas-HOCPA_desc-th50_dseg.nii.gz 219B [00:00, 867B/s] ...
Running Singularity on a SLURM system¶
An example of sbatch
script to run fMRIPrep on a SLURM system 2 is given below.
The submission script will generate one task per subject using a job array.
Submission is as easy as:
$ export STUDY=/path/to/some/folder
$ sbatch --array=1-$(( $( wc -l $STUDY/data/participants.tsv | cut -f1 -d' ' ) - 1 )) sbatch.slurm
#!/bin/bash
#
#SBATCH -J fmriprep
#SBATCH --time=48:00:00
#SBATCH -n 1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=4G
#SBATCH -p normal,mygroup # Queue names you can submit to
# Outputs ----------------------------------
#SBATCH -o log/%x-%A-%a.out
#SBATCH -e log/%x-%A-%a.err
#SBATCH --mail-user=%u@domain.tld
#SBATCH --mail-type=ALL
# ------------------------------------------
BIDS_DIR="$STUDY/data"
DERIVS_DIR="derivatives/fmriprep-1.5.0"
# Prepare some writeable bind-mount points.
TEMPLATEFLOW_HOST_HOME=$HOME/.cache/templateflow
FMRIPREP_HOST_CACHE=$HOME/.cache/fmriprep
mkdir -p ${TEMPLATEFLOW_HOST_HOME}
mkdir -p ${FMRIPREP_HOST_CACHE}
# Prepare derivatives folder
mkdir -p ${BIDS_DIR}/${DERIVS_DIR}
# This trick will help you reuse freesurfer results across pipelines and fMRIPrep versions
mkdir -p ${BIDS_DIR}/derivatives/freesurfer-6.0.1
if [ ! -d ${BIDS_DIR}/${DERIVS_DIR}/freesurfer ]; then
ln -s ${BIDS_DIR}/derivatives/freesurfer-6.0.1 ${BIDS_DIR}/${DERIVS_DIR}/freesurfer
fi
# Make sure FS_LICENSE is defined in the container.
export SINGULARITYENV_FS_LICENSE=$HOME/.freesurfer.txt
# Designate a templateflow bind-mount point
export SINGULARITYENV_TEMPLATEFLOW_HOME="/templateflow"
SINGULARITY_CMD="singularity run --cleanenv -B $BIDS_DIR:/data -B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} -B $L_SCRATCH:/work $STUDY/images/poldracklab_fmriprep_1.5.0.simg"
# Parse the participants.tsv file and extract one subject ID from the line corresponding to this SLURM task.
subject=$( sed -n -E "$((${SLURM_ARRAY_TASK_ID} + 1))s/sub-(\S*)\>.*/\1/gp" ${BIDS_DIR}/participants.tsv )
# Remove IsRunning files from FreeSurfer
find ${BIDS_DIR}/derivatives/freesurfer-6.0.1/sub-$subject/ -name "*IsRunning*" -type f -delete
# Compose the command line
cmd="${SINGULARITY_CMD} /data /data/${DERIVS_DIR} participant --participant-label $subject -w /work/ -vv --omp-nthreads 8 --nthreads 12 --mem_mb 30000 --output-spaces MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5 --use-aroma"
# Setup done, run the command
echo Running task ${SLURM_ARRAY_TASK_ID}
echo Commandline: $cmd
eval $cmd
exitcode=$?
# Output results to a table
echo "sub-$subject ${SLURM_ARRAY_TASK_ID} $exitcode" \
>> ${SLURM_JOB_NAME}.${SLURM_ARRAY_JOB_ID}.tsv
echo Finished tasks ${SLURM_ARRAY_TASK_ID} with exit code $exitcode
exit $exitcode
- 2
assuming that job arrays and Singularity are available