fMRIprep Pipeline To Machine Learning (Demo)

All configuration is defined in config.py.

Prerequisites (lilab)

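  • Docker is installed on every node, together with the fmriprep image
  • The conda base environment can be used (the required third-party packages will be listed in a later update)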

1. fmriprep script on a single machine (docker)

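Set the relevant variables in the fMRI_Prep_Job class in config.py. When modifying cmd, do not change the keywords inside {}: they are placeholders that are filled in at run time. Running this step automatically creates a processed folder at the same level as the BIDS directory to hold the processed data. The fmriprep outputs are stored in processed/fmriprep and processed/freesurfer.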

import os


class fMRI_Prep_Job:
    # input data path
    bids_data_path  = "/share/data2/dataset/ds002748/depression"
    # number of subjects processed per container
    step = 8
    # number of OpenMP threads for fmriprep (--omp-nthreads)
    thread = 9
    # maximum number of containers running at once
    max_work_nums = 10

    # create the processed folder at the same level as the BIDS directory
    bids_output_path = os.path.join("/".join(bids_data_path.split('/')[:-1]),'processed')
    os.makedirs(bids_output_path, exist_ok=True)
    # fmri work path 
    fmri_work="/share/fmri_work"
    # freesurfer_license
    freesurfer_license = "/share/user_data/public/fanq_ocd/license.txt"
    # ID of the fmriprep Docker image
    contianer_id = "d7235efbbd3c"
    # fmriprep cmd 
    cmd ="docker run -it --rm -v {bids_data_path}:/data -v {freesurfer_license}:/opt/freesurfer/license.txt -v {bids_output_path}:/out -v {fmri_work}:/work {contianer_id} /data /out --skip_bids_validation --ignore slicetiming fieldmaps  -w /work --omp-nthreads {thread} --fs-no-reconall --resource-monitor participant --participant-label {subject_ids}"
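The way step and the {} keywords interact can be illustrated with a minimal sketch. It assumes the job runner splits the subject list into groups of step subjects and fills the template with str.format; the actual fmriprep_job implementation is not shown here, and the shortened template, the subject IDs, and the chunk helper below are hypothetical.

```python
def chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Shortened stand-in for fMRI_Prep_Job.cmd (hypothetical, not the full command).
cmd_template = (
    "docker run --rm -v {bids_data_path}:/data {contianer_id} /data /out "
    "participant --participant-label {subject_ids}"
)

subjects = [f"sub-{i:02d}" for i in range(1, 20)]  # 19 hypothetical subjects
batches = chunk(subjects, 8)                       # step = 8

# One command per batch; each command would run in its own container.
commands = [
    cmd_template.format(
        bids_data_path="/path/to/bids",            # hypothetical path
        contianer_id="d7235efbbd3c",
        subject_ids=" ".join(batch),
    )
    for batch in batches
]

for command in commands:
    print(command)
```

With a template filled this way, renaming a keyword inside {} without changing the caller makes str.format raise a KeyError, which is why the keywords must stay as they are.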

2. fmriprep post-process

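This step relies mainly on fmribrant. It regresses out the white-matter signal, the cerebrospinal fluid signal, the global signal, and the head-motion parameters, and optionally applies temporal filtering. The processed files are stored in processed/post-process/filter/clean_imgs when filtering is enabled, and in processed/post-process/unfilter/clean_imgs otherwise. Modifying dataset_path and store_path in this configuration is not recommended.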

class PostProcess:
    """
    Post-processing of the fmriprep outputs
    """
    # task name
    task_type = "rest"

    dataset_path = os.path.join(fMRI_Prep_Job.bids_output_path,'fmriprep')

    store_path = os.path.join(fMRI_Prep_Job.bids_output_path,'post-process')

    t_r = 2.5

    low_pass = 0.08

    high_pass = 0.01

    n_process = 40

    if t_r is not None:
        store_path = os.path.join(store_path,'filter','clean_imgs')
    else:
        store_path = os.path.join(store_path,'unfilter','clean_imgs')

    os.makedirs(store_path,exist_ok=True)
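The confound-regression part of this step can be sketched with ordinary least squares. This is only an illustration of the general technique on synthetic data, assuming NumPy is available; it is not fmribrant's implementation, the regress_out helper is hypothetical, and the optional band-pass filter is left out.

```python
import numpy as np


def regress_out(signals, confounds):
    """Remove the least-squares fit of `confounds` from each column of `signals`.

    signals:   (n_timepoints, n_voxels) array
    confounds: (n_timepoints, n_confounds) array, e.g. white matter, CSF,
               global signal, and head-motion parameters
    """
    # Add an intercept column so the mean of each signal is modelled too.
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, signals, rcond=None)
    return signals - design @ beta


rng = np.random.default_rng(0)
confounds = rng.standard_normal((100, 3))                    # synthetic confounds
neural = rng.standard_normal((100, 5))                       # synthetic signal of interest
signals = neural + confounds @ rng.standard_normal((3, 5))   # contaminated data

cleaned = regress_out(signals, confounds)
print(cleaned.shape)
```

After the regression the residuals are orthogonal to every confound column, so no linear trace of the confounds remains in the cleaned signals.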

3. Extract ROI-level time series

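The atlas consists of 271 ROIs: Schaefer_200 (cortical), Tianye_54 (subcortical), and Buckner_17 (cerebellar). Time-series extraction is already implemented in fmribrant, so this step is only a thin wrapper around it.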

class RoiTs:
    """
    ROI-level time series
    Processes the 271 whole-brain ROIs
    """
    n_process = 40

    # if the data were already filtered in step 2 (fmriprep post-process), filtering again here is not recommended
    t_r = None
    
    low_pass = None

    high_pass = None
    
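    flag_gs = False  # True to use the data with the global (whole-brain mean) signal regressed out, otherwise False
    # modifying the settings below is not recommended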

    if flag_gs:
        file_name = "*with_gs.nii.gz"
        ts_file = "GS"
    else:
        file_name = "*without_gs.nii.gz"
        ts_file = "NO_GS"
    
    reg_path = os.path.join(PostProcess.store_path,"*",PostProcess.task_type,file_name)
    
    subject_id_index = -3

    save_path = os.path.join("/".join(PostProcess.store_path.split('/')[:-1]),'timeseries',ts_file)

    os.makedirs(save_path,exist_ok=True)
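How reg_path, subject_id_index, and save_path fit together can be checked with a short sketch. The store path and the file name below are hypothetical; the only assumption is that the cleaned images are laid out as store_path/<subject>/<task>/<file>, which is what the glob pattern in reg_path implies.

```python
import fnmatch
import os

# Hypothetical value of PostProcess.store_path.
store_path = "/data/processed/post-process/filter/clean_imgs"

# Same construction as RoiTs.reg_path with flag_gs = False.
reg_path = os.path.join(store_path, "*", "rest", "*without_gs.nii.gz")

# Hypothetical file that the pattern would match.
example = os.path.join(store_path, "sub-01", "rest", "sub-01_without_gs.nii.gz")
matches = fnmatch.fnmatch(example, reg_path)

# subject_id_index = -3: the third path component from the end is the subject folder.
subject_id = example.split("/")[-3]

# Same construction as RoiTs.save_path: a sibling of clean_imgs.
save_path = os.path.join("/".join(store_path.split("/")[:-1]), "timeseries", "NO_GS")

print(matches, subject_id, save_path)
```

If the directory layout changes (for example, an extra session folder), subject_id_index has to change with it, which is one reason these settings are best left untouched.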

4. Machine Learning (Baseline)

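This step is optional. It is generally used as a first check of how well functional connectivity (FC) performs on gender classification and age regression. Only coarse results are kept; for detailed results, use the baseline package.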

import pandas as pd
from sklearn.svm import SVC, SVR


class ML:
    # selected subject IDs; all subjects by default
    sub_ids = [i.split('.')[0] for i in os.listdir(RoiTs.save_path)]
    # path to the participants table (phenotype data)
    csv = pd.read_csv('/share/data2/dataset/ds002748/depression/participants.tsv',sep='\t')
    # keep only the subjects present in both (inner join on participant_id)
    csv = pd.DataFrame({"participant_id":sub_ids}).merge(csv)
    # classification targets
    classifies = ["gender"]
    # regression targets
    regressions = ["age"]
    # classification models
    classify_models = [SVC(),SVC(C=100),SVC(kernel='linear'),SVC(kernel='linear',C=100)]
    # regression models
    regress_models = [SVR(),SVR(C=100),SVR(kernel='linear'),SVR(kernel='linear',C=100)]
    kfold = 3
    # number of ROIs
    rois = 200
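The FC features fed to these models can be sketched as follows. The sketch assumes FC is the Pearson correlation between ROI time series and that only the upper triangle of the matrix is used as the feature vector; the fast_fc_ml implementation is not shown here, so the fc_features helper and the synthetic data are hypothetical.

```python
import numpy as np


def fc_features(timeseries):
    """Vectorize the upper triangle of the ROI-by-ROI Pearson correlation matrix.

    timeseries: (n_timepoints, n_rois) array
    """
    fc = np.corrcoef(timeseries, rowvar=False)        # (n_rois, n_rois)
    rows, cols = np.triu_indices(fc.shape[0], k=1)    # skip the diagonal
    return fc[rows, cols]


rng = np.random.default_rng(0)
timeseries = rng.standard_normal((120, 200))          # synthetic data, rois = 200

features = fc_features(timeseries)
print(features.shape)
```

With 200 ROIs each subject yields 200 * 199 / 2 = 19900 features. Stacking one such vector per subject gives a feature matrix that could be passed, for example, to scikit-learn's cross_val_score with cv=kfold and one of the models listed above.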

5. run

Edit script/run.py:

from fmriprep_job import run_fmri_prep
from fmriprep_pprocess import run as pp_run
from roi2ts import run as roi_ts_run
from fast_fc_ml import run as ml_run


if __name__ =='__main__':
    run_fmri_prep() # fmriprep
    pp_run() # fmriprep post process
    roi_ts_run() # get roi time series
    ml_run() # machine learning

Then run:

python run.py

6. To Do

  • Quality control