Datasets

Pypes already includes sets of workflows to run against specific datasets (listed in the sub-sections below).

To prepare a dataset you have to:

  • download the raw dataset, or
  • organize your data in the same way as explained below, and
  • set up the pipeline parameters (see the sanity-check sketch after this list).
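For example, a minimal sanity check before building any workflow could look like the sketch below; the paths are hypothetical, so adapt them to where you keep your data and your pypes_config.yml:

import os.path as path

# hypothetical locations, matching the COBRE example further down
base_dir = '/home/pyper/data/cobre/raw'
config_file = path.join(path.dirname(base_dir), 'pypes_config.yml')

# the raw data must be organized on disk and the parameters file must exist
assert path.isdir(base_dir), 'download or organize the raw dataset first'
assert path.isfile(config_file), 'write a pypes_config.yml with the pipeline parameters'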

COBRE

The COBRE dataset consists of raw anatomical and functional MR data from 72 patients with schizophrenia and 75 healthy controls.

Once you download and extract it, you will find a file tree with this structure: {base_dir}/cobre/{subject_id}/session_1/{modality}/{image}.
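As a quick check that the tree matches what the workflow expects, you can expand it with hansel and list the subjects it finds. This is a minimal sketch, assuming the archive was extracted to /home/pyper/data/cobre/raw (as in the example below) and that hansel's Crumb.ls method is available in your version:

import os.path as path

from hansel import Crumb

# hypothetical extraction directory; adapt to where you unpacked COBRE
base_dir = '/home/pyper/data/cobre/raw'
cobre = Crumb(path.join(base_dir, '{subject_id}', 'session_1', '{modality}', '{image}'))

# list the subject folders that match the tree (Crumb.ls is assumed here)
subjects = cobre.ls('subject_id')
print('found {} subjects'.format(len(subjects)))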

You can run T1-weighted and resting-state fMRI pre-processing on this database with neuro_pypes.datasets.cobre_crumb_workflow.

How to use it

import os.path as path

from hansel import Crumb
from neuro_pypes.datasets import cobre_crumb_workflow
from neuro_pypes.run import run_debug

# we downloaded the database in:
base_dir = '/home/pyper/data/cobre/raw'
cobre_tree = path.join('{subject_id}', 'session_1', '{modality}', '{image}')

# we define the database tree
cobre_crumb = Crumb(path.join(base_dir, cobre_tree), ignore_list=['.*'])

# output and working dir
output_dir = path.join(path.dirname(base_dir), 'out')
cache_dir  = path.join(path.dirname(base_dir), 'wd')

# we have a configuration file in:
config_file = path.join(path.dirname(base_dir), 'pypes_config.yml')

# we choose what pipeline set we want to run.
# the choices are: 'spm_anat_preproc', 'spm_rest_preproc'
wf_name = 'spm_rest_preproc' # for MPRAGE and rs-fMRI preprocessing

# instantiate the workflow
wf = cobre_crumb_workflow(wf_name     = wf_name,
                          data_crumb  = cobre_crumb,
                          cache_dir   = cache_dir,
                          output_dir  = output_dir,
                          config_file = config_file,
                         )

# run it
run_debug(wf, plugin='MultiProc', n_cpus=4)
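Once the run finishes, the preprocessed images end up under output_dir ('out', next to the raw data folder in this example), while cache_dir ('wd') holds Nipype's working directory, which acts as a cache and can usually be removed once you are satisfied with the results.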

My clinical dataset

Sadly, this dataset is not publicly available.

The dataset we are working with in our department has a folder structure very similar to COBRE's: {base_dir}/{subject_id}/{session_id}/{image}. If you organize your data in the same way, you can directly use the function neuro_pypes.datasets.clinical_crumb_workflow. Have a look at the _clinical_wf_setup function to see all the sets of pipelines you can pick from.

import os.path as path

from hansel import Crumb
from neuro_pypes.datasets import clinical_crumb_workflow
from neuro_pypes.run import run_debug

# we downloaded the database in:
base_dir = '/home/pyper/data/nuk/raw'
data_tree = path.join('{subject_id}', '{session_id}', '{image}')

# we define the database tree
data_crumb = Crumb(path.join(base_dir, data_tree), ignore_list=['.*'])

# output and working dir
output_dir = path.join(path.dirname(base_dir), 'out')
cache_dir  = path.join(path.dirname(base_dir), 'wd')

# we have a configuration file in:
config_file = path.join(path.dirname(base_dir), 'pypes_config.yml')

# we choose what pipeline set we want to run.
# there are many choices
wf_name = 'spm_anat_pet_tpm_pvc' # MPRAGE preprocessing, PET MNI group template, PET PVC, and PET normalization to group template
# another could be 'anat_dti_camino' for MPRAGE and DTI/tractography

# instantiate the workflow
wf = clinical_crumb_workflow(wf_name     = wf_name,
                             data_crumb  = data_crumb,
                             cache_dir   = cache_dir,
                             output_dir  = output_dir,
                             config_file = config_file,
                             )

# run it
run_debug(wf, plugin='MultiProc', n_cpus=4)
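If you want the MPRAGE and DTI/tractography set mentioned above instead, the only change is the wf_name. A sketch reusing the objects defined in the example:

# same crumb, cache, output and config as before; only the pipeline set changes
wf = clinical_crumb_workflow(wf_name     = 'anat_dti_camino',
                             data_crumb  = data_crumb,
                             cache_dir   = cache_dir,
                             output_dir  = output_dir,
                             config_file = config_file,
                             )
run_debug(wf, plugin='MultiProc', n_cpus=4)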