A modified version of DeepMind's AlphaFold2 that separates the CPU part (MSA and template searching) from the GPU part (model prediction)

Overview

ParallelFold

Author: Bozitao Zhong

This is a modified version of DeepMind's AlphaFold2 that separates the CPU part (MSA and template searching) from the GPU part (model prediction) of a local AlphaFold2 installation.

How to install

First, install AlphaFold2. You can choose one of the following methods to install AlphaFold locally:

  • Use the official version from DeepMind with Docker.
  • Use one of the other versions that install AlphaFold without Docker.
  • Use my guide, which is based on the non_docker version and can be adjusted to different CUDA versions (CUDA driver >= 10.1).

Then put these 4 files in your AlphaFold folder. This folder should contain the original run_alphafold.py file. I use a run_alphafold.sh file to run AlphaFold easily (an idea learned from the non_docker version).

4 files:

  • run_alphafold.py: modified version of the original run_alphafold.py; it skips the featuring steps when feature.pkl already exists in the output folder
  • run_alphafold.sh: bash script to run run_alphafold.py
  • run_feature.py: modified version of the original run_alphafold.py; it exits the Python process after it finishes writing feature.pkl
  • run_feature.sh: bash script to run run_feature.py
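
The skip-if-present behaviour described for run_alphafold.py can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function name load_or_compute_features and the compute_features callback are hypothetical, and the file name feature.pkl is simply taken from the description above.

```python
import os
import pickle


def load_or_compute_features(output_dir, compute_features, filename="feature.pkl"):
    """Return cached features if present; otherwise compute and cache them.

    `compute_features` is a hypothetical zero-argument callable standing in
    for the expensive MSA and template search.
    """
    path = os.path.join(output_dir, filename)
    if os.path.exists(path):
        # Featuring was already done (for example by run_feature.py): skip it.
        with open(path, "rb") as f:
            return pickle.load(f)
    # No cached file yet: run the search once and store the result.
    features = compute_features()
    os.makedirs(output_dir, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(features, f, protocol=4)
    return features
```

With this pattern the CPU-only job pays for the search once, and a later GPU job that points at the same output folder only has to read the pickle file.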

How to run

First, run run_feature.sh on CPUs:

./run_feature.sh -d data -o output -m model_1 -f input/test3.fasta -t 2021-07-27

8 CPUs are enough. In my tests, more CPUs did not improve speed.

A GPU can accelerate the hhblits step (but I assume you chose this repo because GPUs are expensive).

The featuring step writes feature.pkl and an MSA folder to your output folder: ./output/JOBNAME/

PS: I put my input files in an input folder to keep them organized; you can drop this folder from the path.
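
As a small illustration of where to look for the results, the sketch below derives the job folder from the FASTA path and reports whether feature.pkl is present. It assumes that JOBNAME is the FASTA file name without its extension (for example test3 for input/test3.fasta); both helper names are hypothetical.

```python
from pathlib import Path


def job_dir(output_dir, fasta_path):
    # Assumption: JOBNAME is the FASTA file name without its extension.
    return Path(output_dir) / Path(fasta_path).stem


def features_ready(output_dir, fasta_path, filename="feature.pkl"):
    # True when the featuring step has already written its pickle file.
    return (job_dir(output_dir, fasta_path) / filename).is_file()
```

For example, features_ready("output", "input/test3.fasta") checks for ./output/test3/feature.pkl, which can be a useful guard before submitting the GPU job.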

Second, run run_alphafold.sh on a GPU:

./run_alphafold.sh -d data -o output -m model_1,model_2,model_3,model_4,model_5 -f input/test.fasta -t 2021-07-27

If feature.pkl was written successfully, the featuring step finishes very quickly.

I have also uploaded the scripts I use on the SJTU HPC (with Slurm): sub_alphafold.slurm and sub_feature.slurm

Other Files

In the ./Alphafold folder, I modified some Python files (hhblits.py, hmmsearch.py, jackhmmer.py) to give these steps more CPUs for acceleration. However, tests showed that providing more CPUs does not accelerate these processes, probably because DeepMind uses a wrapped process. I'm trying to improve it (work in progress).

If you have any questions, please open an issue.
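
The CPU count reaches each search tool as a command-line flag. The sketch below shows roughly how such a command line could be assembled. The function names are hypothetical and are not taken from the modified files, only a subset of the flags is shown, and the default values are illustrative. The flags themselves (--cpu and -N for jackhmmer, -cpu for hhblits) are options those tools accept.

```python
def jackhmmer_cmd(binary, fasta_path, database_path, output_sto, n_cpu=8, n_iter=1):
    # jackhmmer takes its thread count via --cpu and its iteration count via -N.
    return [
        binary,
        "-o", "/dev/null",
        "-A", output_sto,
        "--cpu", str(n_cpu),
        "-N", str(n_iter),
        fasta_path,
        database_path,
    ]


def hhblits_cmd(binary, fasta_path, databases, output_a3m, n_cpu=4):
    # hhblits takes its thread count via -cpu and one -d flag per database.
    cmd = [binary, "-i", fasta_path, "-cpu", str(n_cpu), "-oa3m", output_a3m]
    for db in databases:
        cmd += ["-d", db]
    return cmd
```

The resulting list can be handed to subprocess.run. Raising n_cpu only changes the flag value; whether the tool actually runs faster depends on the tool and on I/O, which matches the observation above.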

Comments
  • Still having problems after running the script

    Hello, Doctor! After running the script, the CPU part now runs normally, but the GPU part fails for both short sequences (200+ aa) and long sequences (1800+ aa). My script is as follows:

    #!/bin/bash
    module load anaconda/2020.11
    source activate /data/home/zhoujy/run/alphafold2
    ./run_feature.sh -d /data/public/alphafold2 -o /data/home/zhoujy/run/output -m model_1 -f /data/home/zhoujy/run/input/Q9NYP9.fasta -t 2021-07-27
    ./run_alphafold.sh -d /data/public/alphafold2 -o /data/home/zhoujy/run/output -m model_1,model_2,model_3,model_4,model_5 -f /data/home/zhoujy/run/input/Q9NYP9.fasta -t 2021-07-27

    I submitted the job with 1 GPU card.

    The error output is as follows:

    87 I0927 17:05:14.162350 139818804778816 xla_bridge.py:226] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
    88 I0927 17:05:23.883118 139818804778816 run_alphafold.py:272] Have 5 models: ['model_1', 'model_2', 'model_3', 'model_4', 'model_5']
    89 I0927 17:05:23.883379 139818804778816 run_alphafold.py:285] Using random seed 491376288278862761 for the data pipeline
    90 I0927 17:05:23.892619 139818804778816 run_alphafold.py:151] Running model model_1
    91 I0927 17:05:34.480318 139818804778816 model.py:131] Running predict with shape(feat) = {'aatype': (4, 233), 'residue_index': (4, 233), 'seq_length': (4,), 'template_aatype': (4, 4, 233), 'template_all_atom_masks': (4, 4, 233, 37), 'template_all_atom_positions': (4, 4, 233, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 233), 'msa_mask': (4, 508, 233), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 233, 3), 'template_pseudo_beta_mask': (4, 4, 233), 'atom14_atom_exists': (4, 233, 14), 'residx_atom14_to_atom37': (4, 233, 14), 'residx_atom37_to_atom14': (4, 233, 37), 'atom37_atom_exists': (4, 233, 37), 'extra_msa': (4, 5120, 233), 'extra_msa_mask': (4, 5120, 233), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 233), 'true_msa': (4, 508, 233), 'extra_has_deletion': (4, 5120, 233), 'extra_deletion_value': (4, 5120, 233), 'msa_feat': (4, 508, 233, 49), 'target_feat': (4, 233, 22)}
    92 2021-09-27 17:05:35.143686: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:81] Couldn't get ptxas version string: Internal: Running ptxas --version returned 32512
    93 2021-09-27 17:05:35.324896: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:479] ptxas returned an error during compilation of ptx to sass: 'Internal: ptxas exited with non-zero error code 32512, output: ' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
    94 Fatal Python error: Aborted
    95
    96 Thread 0x00007f2a1a311740 (most recent call first):
    97 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/interpreters/xla.py", line 387 in backend_compile
    98 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/interpreters/xla.py", line 324 in xla_primitive_callable
    99 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/_src/util.py", line 188 in cached
    100 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/_src/util.py", line 195 in wrapper
    101 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/interpreters/xla.py", line 275 in apply_primitive
    102 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/core.py", line 612 in process_primitive
    103 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/core.py", line 267 in bind
    104 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 388 in shift_right_logical
    105 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/_src/prng.py", line 229 in threefry_seed
    106 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/_src/prng.py", line 191 in seed_with_impl
    107 File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/_src/random.py", line 105 in PRNGKey
    108 File "/data/run01/zhoujy/ParallelFold-main/alphafold/model/model.py", line 133 in predict
    109 File "/data/run01/zhoujy/ParallelFold-main/run_alphafold.py", line 158 in predict_structure
    110 File "/data/run01/zhoujy/ParallelFold-main/run_alphafold.py", line 289 in main
    111 File "/data/home/zhoujy/.local/lib/python3.8/site-packages/absl/app.py", line 258 in _run_main
    112 File "/data/home/zhoujy/.local/lib/python3.8/site-packages/absl/app.py", line 312 in run
    113 File "/data/run01/zhoujy/ParallelFold-main/run_alphafold.py", line 316 in

    The errors on lines 92-113 appear regardless of sequence length. What causes this? @Zuricho

    opened by zhoujingyu13687306871 16
  • Where Can I find The Protein sequence?

    After reading the article "AlphaFold Deployment and Optimization on HPC Platform", I want to run some experiments based on it, but I cannot find the protein sequences online. Could you tell me how to download the FASTA files used in the article?

    opened by yanchenmochen 4
  • How to run GPU part?

    How do I run model inference (the GPU part of the process) after the featurization step? Does the model inference step automatically find feature.pkl in some folder?

    opened by hrzolix 4
  • How to accelerate the HHBLITS step with GPU

    Hello! Thanks for your good work! I have some questions:

    Q1: Do you know how to accelerate the HHblits step with a GPU?

    Q2: I use --cpu 8 to run jackhmmer, but it always uses just 2 CPUs and I don't know why.

    opened by Licko0909 4
  • 2022-01-11 09:19:03.536275: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:479] ptxas returned an error during compilation of ptx to sass: 'Internal: ptxas exited with non-zero error code 32512, output: '  If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided. Fatal Python error: Aborted

    2022-01-11 09:19:02.638037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 28422 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:41:00.0, compute capability: 7.0)
    I0111 09:19:03.171788 47078973446272 model.py:165] Running predict with shape(feat) = {'aatype': (4, 45), 'residue_index': (4, 45), 'seq_length': (4,), 'template_aatype': (4, 4, 45), 'template_all_atom_masks': (4, 4, 45, 37), 'template_all_atom_positions': (4, 4, 45, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 45), 'msa_mask': (4, 508, 45), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 45, 3), 'template_pseudo_beta_mask': (4, 4, 45), 'atom14_atom_exists': (4, 45, 14), 'residx_atom14_to_atom37': (4, 45, 14), 'residx_atom37_to_atom14': (4, 45, 37), 'atom37_atom_exists': (4, 45, 37), 'extra_msa': (4, 5120, 45), 'extra_msa_mask': (4, 5120, 45), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 45), 'true_msa': (4, 508, 45), 'extra_has_deletion': (4, 5120, 45), 'extra_deletion_value': (4, 5120, 45), 'msa_feat': (4, 508, 45, 49), 'target_feat': (4, 45, 22)}
    2022-01-11 09:19:03.503247: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:81] Couldn't get ptxas version string: Internal: Running ptxas --version returned 32512
    2022-01-11 09:19:03.536275: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:479] ptxas returned an error during compilation of ptx to sass: 'Internal: ptxas exited with non-zero error code 32512, output: ' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
    Fatal Python error: Aborted

    Thread 0x00002ad16d7d1880 (most recent call first):
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 360 in backend_compile
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 297 in xla_primitive_callable
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/_src/util.py", line 179 in cached
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/_src/util.py", line 186 in wrapper
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 248 in apply_primitive
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 603 in process_primitive
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 264 in bind
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 382 in shift_right_logical
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/jax/_src/random.py", line 75 in PRNGKey
      File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/alphafold/model/model.py", line 167 in predict
      File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/run_alphafold.py", line 210 in predict_structure
      File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/run_alphafold.py", line 429 in main
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258 in _run_main
      File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312 in run
      File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/run_alphafold.py", line 455 in
    ./run_alphafold.sh: line 233: 7015 Aborted python $alphafold_script --fasta_paths=$fasta_path --model_names=$model_selection --data_dir=$data_dir --output_dir=$output_dir --jackhmmer_binary_path=$jackhmmer_binary_path --hhblits_binary_path=$hhblits_binary_path --hhsearch_binary_path=$hhsearch_binary_path --hmmsearch_binary_path=$hmmsearch_binary_path --hmmbuild_binary_path=$hmmbuild_binary_path --kalign_binary_path=$kalign_binary_path --uniref90_database_path=$uniref90_database_path --mgnify_database_path=$mgnify_database_path --bfd_database_path=$bfd_database_path --small_bfd_database_path=$small_bfd_database_path --uniclust30_database_path=$uniclust30_database_path --uniprot_database_path=$uniprot_database_path --pdb70_database_path=$pdb70_database_path --pdb_seqres_database_path=$pdb_seqres_database_path --template_mmcif_dir=$template_mmcif_dir --max_template_date=$max_template_date --obsolete_pdbs_path=$obsolete_pdbs_path --db_preset=$db_preset --model_preset=$model_preset --benchmark=$benchmark --amber_relaxation=$amber_relaxation --recycling=$recycling --run_feature=$run_feature --logtostderr

    opened by chenshixinnb 3
  • ValueError: jaxlib is version 0.1.69, but this version of jax requires version 0.1.74.

    I installed the conda environment following your steps. Running import jax; print(jax.devices()) inside the conda environment fails with: ValueError: jaxlib is version 0.1.69, but this version of jax requires version 0.1.74. How can I solve this? Thank you!

    opened by chenshixinnb 3
  • Something went wrong when I ran the job

    Hi, dear author. I installed the required modules according to the linked requirements, but the following error occurred when I ran the script. Can you help me find out what is causing it? My installation steps are as follows:

    1. conda create --prefix=/data/home/zhoujy/run/alphafold2 python=3.8
    2. conda activate /data/home/zhoujy/run/alphafold2
    3. conda install cudatoolkit=10.1 cudnn
    4. pip install tensorflow==2.3.0
    5. pip install biopython==1.79 chex==0.0.7 dm-haiku==0.0.4 dm-tree==0.1.6 immutabledict==2.0.0 jax==0.2.14 ml-collections==0.1.0
    6. pip install --upgrade jax jaxlib==0.1.69+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html

    Then I ran the script:

    #!/bin/bash
    module load anaconda/2020.11
    source activate /data/home/zhoujy/run/alphafold2
    ./run_feature.sh -d /data/public/alphafold2 -o /data/home/zhoujy/run/output -m model_1 -f /data/home/zhoujy/run/input/Tb927.10.2950.fasta -t 2021-07-27

    The result is as follows:

    Traceback (most recent call last):
      File "/data/run01/zhoujy/ParallelFold-main/run_feature.py", line 33, in
        from alphafold.model import data
      File "/data/run01/zhoujy/ParallelFold-main/alphafold/model/data.py", line 20, in
        from alphafold.model import utils
      File "/data/run01/zhoujy/ParallelFold-main/alphafold/model/utils.py", line 21, in
        import haiku as hk
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/haiku/__init__.py", line 17, in
        from haiku import data_structures
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/haiku/data_structures.py", line 17, in
        from haiku._src.data_structures import to_immutable_dict
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/haiku/_src/data_structures.py", line 30, in
        from haiku._src import utils
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/haiku/_src/utils.py", line 24, in
        import jax
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/__init__.py", line 16, in
        from .api import (
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/api.py", line 38, in
        from . import core
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/core.py", line 31, in
        from . import dtypes
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/dtypes.py", line 31, in
        from .lib import xla_client
      File "/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jax/lib/__init__.py", line 51, in
        from jaxlib import pytree
    ImportError: cannot import name 'pytree' from 'jaxlib' (/data/home/zhoujy/run/alphafold2/lib/python3.8/site-packages/jaxlib/__init__.py)

    Why does this happen? I need your help.

    opened by zhoujingyu13687306871 2
  • Limit RAM usage

    I'm trying to run a FASTA file with a sequence of length 3643. The MSA part finished, but the inference part tried to allocate 80 GB of VRAM on the GPU, which I don't have access to; the graphics cards are NVIDIA Tesla V100 16 GB. Now I'm trying to run inference on the CPU, which is very slow, and the job uses a lot of RAM and keeps using more as time passes. Can I limit RAM usage somehow? Or can I run inference on multiple graphics cards, perhaps as a parallel process?

    opened by hrzolix 1
  • GPU utilization problem

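    Hello, Doctor! After many attempts yesterday, it now runs. However, when running the run_alphafold.sh script, the part that involves GPU computation stays in a CPU-running state for quite a long time, with GPU utilization at 0 for long periods. I tried predicting a protein with a sequence length of 2000 using 4 V100 cards, and it took 9 days. Are this speed and behavior normal? Also, during the earlier TensorFlow installation step, is it necessary to install the GPU version of TensorFlow?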

    @Zuricho

    opened by zhoujingyu13687306871 1
  • Error after GPU part

    Hi, after installation the "CPU part" (jackhmmer and hhblits) works well. But when I start the GPU part, I get this error message: TypeError: take requires ndarray or scalar arguments, got <class 'list'> at position 0.

    1st part: ./run_feature.sh -d data -o ./tmp -m model_1,model_2,model_3,model_4,model_5 -f ./query/1crn.fasta -t 2021-07-27
    2nd part: ./run_alphafold.sh -d data -o ./tmp -m model_1,model_2,model_3,model_4,model_5 -f ./query/1crn.fasta -t 2021-07-27

    Full error message:
      File "/softwares/alphafold/run_alphafold.py", line 316, in
        app.run(main)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/softwares/alphafold/run_alphafold.py", line 289, in main
        predict_structure(
      File "/softwares/alphafold/run_alphafold.py", line 188, in predict_structure
        relaxed_pdb_str, _, _ = amber_relaxer.process(prot=unrelaxed_protein)
      File "/softwares/alphafold/alphafold/relax/relax.py", line 58, in process
        out = amber_minimize.run_pipeline(
      File "/softwares/alphafold/alphafold/relax/amber_minimize.py", line 482, in run_pipeline
        ret.update(get_violation_metrics(prot))
      File "/softwares/alphafold/alphafold/relax/amber_minimize.py", line 356, in get_violation_metrics
        structural_violations, struct_metrics = find_violations(prot)
      File "/softwares/alphafold/alphafold/relax/amber_minimize.py", line 338, in find_violations
        violations = folding.find_structural_violations(
      File "/softwares/alphafold/alphafold/model/folding.py", line 757, in find_structural_violations
        atom14_atom_radius = batch['atom14_atom_exists'] * utils.batched_gather(
      File "/softwares/alphafold/alphafold/model/utils.py", line 39, in batched_gather
        return take_fn(params, indices)
      File "/softwares/alphafold/alphafold/model/utils.py", line 36, in
        take_fn = lambda p, i: jnp.take(p, i, axis=axis)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 5383, in take
        return _take(a, indices, None if axis is None else operator.index(axis), out,
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
        return fun(*args, **kwargs)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/api.py", line 411, in cache_miss
        out_flat = xla.xla_call(
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1618, in bind
        return call_bind(self, fun, *args, **params)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1609, in call_bind
        outs = primitive.process(top_trace, fun, tracers, params)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1621, in process
        return trace.process_call(self, fun, tracers, params)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 615, in process_call
        return primitive.impl(f, *tracers, **params)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 622, in _xla_call_impl
        compiled_fun = _xla_callable(fun, device, backend, name, donated_invars,
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/linear_util.py", line 262, in memoized_fun
        ans = call(fun, *args)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 694, in _xla_callable
        return lower_xla_callable(fun, device, backend, name, donated_invars, *arg_specs).compile().unsafe_call
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 702, in lower_xla_callable
        jaxpr, out_avals, consts = pe.trace_to_jaxpr_final(
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1522, in trace_to_jaxpr_final
        jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(fun, main, in_avals)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1500, in trace_to_subjaxpr_dynamic
        ans = fun.call_wrapped(*in_tracers)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/linear_util.py", line 166, in call_wrapped
        ans = self.f(*args, **dict(self.params, **kwargs))
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 5390, in _take
        _check_arraylike("take", a)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 559, in _check_arraylike
        raise TypeError(msg.format(fun_name, type(arg), pos))
    jax._src.traceback_util.UnfilteredStackTrace: TypeError: take requires ndarray or scalar arguments, got <class 'list'> at position 0.

    The stack trace below excludes JAX-internal frames. The preceding is the original exception that occurred, unmodified.


    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/softwares/alphafold/run_alphafold.py", line 316, in
        app.run(main)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/softwares/alphafold/run_alphafold.py", line 289, in main
        predict_structure(
      File "/softwares/alphafold/run_alphafold.py", line 188, in predict_structure
        relaxed_pdb_str, _, _ = amber_relaxer.process(prot=unrelaxed_protein)
      File "/softwares/alphafold/alphafold/relax/relax.py", line 58, in process
        out = amber_minimize.run_pipeline(
      File "/softwares/alphafold/alphafold/relax/amber_minimize.py", line 482, in run_pipeline
        ret.update(get_violation_metrics(prot))
      File "/softwares/alphafold/alphafold/relax/amber_minimize.py", line 356, in get_violation_metrics
        structural_violations, struct_metrics = find_violations(prot)
      File "/softwares/alphafold/alphafold/relax/amber_minimize.py", line 338, in find_violations
        violations = folding.find_structural_violations(
      File "/softwares/alphafold/alphafold/model/folding.py", line 757, in find_structural_violations
        atom14_atom_radius = batch['atom14_atom_exists'] * utils.batched_gather(
      File "/softwares/alphafold/alphafold/model/utils.py", line 39, in batched_gather
        return take_fn(params, indices)
      File "/softwares/alphafold/alphafold/model/utils.py", line 36, in
        take_fn = lambda p, i: jnp.take(p, i, axis=axis)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 5383, in take
        return _take(a, indices, None if axis is None else operator.index(axis), out,
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 5390, in _take
        _check_arraylike("take", a)
      File "/softwares/alphafold/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 559, in _check_arraylike
        raise TypeError(msg.format(fun_name, type(arg), pos))
    TypeError: take requires ndarray or scalar arguments, got <class 'list'> at position 0.

    opened by ebettler 1
  • Running ParallelFold on reduced database?

    Is it possible to run ParallelFold on reduced_dbs, or is it not yet supported? I tried to use -c reduced_dbs but it did not work. Then I tried modifying the bfd_path set in run_alphafold.sh, but it threw a directory/file not found error. (I'm pretty sure the path exists because I'm able to run AlphaFold with it.) Thank you in advance for your help!

    opened by xinyu-g 0
  • Did CPU acceleration fail?

    Yesterday I ran some experiments on a server with ./run_alphafold.sh -d /dataset/ -o result -p monomer -m model_2 -i input/T1061.fasta. Reading the log left me confused. T1061 is 949 aa long.

    I0822 07:33:00.806264 140553952322112 jackhmmer.py:133] Launching subprocess "/opt/conda/bin/jackhmmer -o /dev/null -A /tmp/tmpxbrk9wt6/output.sE 0.0001 -E 0.0001 --cpu 8 -N 1 input/T1061.fasta /dataset//uniref90/uniref90.fasta"
    I0822 07:33:01.157015 140553952322112 utils.py:36] Started Jackhmmer (uniref90.fasta) query
    I0822 07:37:27.058227 140553952322112 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 265.901 seconds
    I0822 07:37:27.072012 140553952322112 jackhmmer.py:133] Launching subprocess "/opt/conda/bin/jackhmmer -o /dev/null -A /tmp/tmpnn6am537/output.sE 0.0001 -E 0.0001 --cpu 8 -N 1 input/T1061.fasta /dataset//mgnify/mgy_clusters_2018_12.fa"
    I0822 07:37:27.439405 140553952322112 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
    I0822 07:42:42.192071 140553952322112 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 314.752 seconds
    I0822 07:42:42.364272 140553952322112 hhsearch.py:85] Launching subprocess "/opt/conda/bin/hhsearch -i /tmp/tmpog4q4684/query.a3m -o /tmp/tmpog40/pdb70"
    I0822 07:42:42.712445 140553952322112 utils.py:36] Started HHsearch query
    I0822 07:44:18.199999 140553952322112 utils.py:40] Finished HHsearch query in 95.487 seconds
    I0822 07:44:18.555797 140553952322112 hhblits.py:128] Launching subprocess "/opt/conda/bin/hhblits -i input/T1061.fasta -cpu 4 -oa3m /tmp/tmpz9oq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /dataset//bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_optst30_2018_08"
    I0822 07:44:19.050110 140553952322112 utils.py:36] Started HHblits query

    I0822 09:01:02.278290 140553952322112 utils.py:40] Finished HHblits query in 4603.228 seconds
    feature extraction spend time: 5305.185729026794
    feature extraction Completed succesfully

    I printed the feature extraction time and found that 5305 s is almost equal to the sum of the individual database search times. According to the article, however, I expected the feature extraction time to be almost equal to the HHblits search time alone. Can you explain this discrepancy?

    opened by yanchenmochen 3
  • failed to alloc 2147483648 bytes on host: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS

    When I use the code to predict T1050.fasta, which consists of 700 residues, the command line prints this error. The environment is an A100 GPU on Ubuntu, but I use higher versions of jax and jaxlib. Could that be causing the problem?

    (parafold) [email protected]:~# pip list | grep jax
    jax 0.3.15
    jaxlib 0.3.15+cuda11.cudnn82

    opened by yanchenmochen 3
  • Too many command-line arguments

    Hi,

    First of all, thanks for developing this tool, I'm looking forward to playing with it!

    I installed ParallelFold on an Ubuntu 18 machine, and the full AlphaFold database on an external drive.

    When running the command: $ ./run_alphafold.sh -d /media/qhr/"My Passport"/alphafold/AlphaFold_DB -o output -p monomer_ptm -i input/GA98.fasta -m model_1 -f

    I get the Error: Too many command-line arguments.

    Also get the same error by calling directly to run_alphafold.py: $ python3 run_alphafold.py --fasta_paths=input/GA98.fasta --model_preset=monomer --data_dir=/media/qhr/"My Passport"/alphafold/AlphaFold_DB --output_dir=output --uniref90_database_path=/media/qhr/"My Passport"/alphafold/AlphaFold_DB/uniref90 --mgnify_database_path=/media/qhr/"My Passport"/alphafold/AlphaFold_DB/mgnify --template_mmcif_dir=/media/qhr/"My Passport"/alphafold/AlphaFold_DB/pdb_mmcif --obsolete_pdbs_path=/media/qhr/"My Passport"/alphafold/AlphaFold_DB/pdb_mmcif/obsolete.dat --use_gpu_relax=True bfd_database_path=/media/qhr/"My Passport"/alphafold/AlphaFold_DB/bfd --max_template_date=2020-05-14

    Is it possible that the space in the name of the external drive "My Passport" is causing this error?

    Thanks! Ana

    opened by AnaValero 1
  • Alphafold2 v/s Parafold timings

    I have a fundamental question about the difference between how Alphafold2 and Parafold run: how can I determine whether Parafold runs the first-step searches (Jackhmmer, Jackhmmer and HHblits) in parallel, unlike Alphafold2, which performs them sequentially?

    Snippets of log files obtained from running Alphafold2 and Parafold

    Alphafold2 log:

    I0409 14:04:28.020900 139865793787712 run_alphafold.py:376] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
    I0409 14:04:28.021180 139865793787712 run_alphafold.py:393] Using random seed 1420247507508611084 for the data pipeline
    I0409 14:04:28.021463 139865793787712 run_alphafold.py:161] Predicting seq1
    I0409 14:04:28.037414 139865793787712 jackhmmer.py:133] Launching subprocess "/.conda/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpm1u84thu/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /fasta_files/seq1.fasta /alphafold_data//uniref90/uniref90.fasta"
    I0409 14:04:28.111756 139865793787712 utils.py:36] Started Jackhmmer (uniref90.fasta) query
    I0409 14:10:17.276236 139865793787712 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 349.164 seconds
    I0409 14:10:17.462168 139865793787712 jackhmmer.py:133] Launching subprocess "/.conda/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpub1qi595/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /fasta_files/seq1.fasta /alphafold_data//mgnify/mgy_clusters_2018_12.fa"
    I0409 14:10:17.513182 139865793787712 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
    I0409 14:16:32.112656 139865793787712 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 374.599 seconds
    I0409 14:16:33.369129 139865793787712 hhsearch.py:85] Launching subprocess "/.conda/envs/alphafold/bin/hhsearch -i /tmp/tmpyot74k7r/query.a3m -o /tmp/tmpyot74k7r/output.hhr -maxseq 1000000 -d /alphafold_data//pdb70/pdb70"
    I0409 14:16:33.466009 139865793787712 utils.py:36] Started HHsearch query
    I0409 14:22:32.148045 139865793787712 utils.py:40] Finished HHsearch query in 358.682 seconds
    I0409 14:22:32.838686 139865793787712 hhblits.py:128] Launching subprocess "/.conda/envs/alphafold/bin/hhblits -i /fasta_files/seq1.fasta -cpu 4 -oa3m /tmp/tmpedyoxta1/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /alphafold_data//bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /alphafold_data//uniclust30/uniclust30_2018_08/uniclust30_2018_08"
    I0409 14:22:32.926801 139865793787712 utils.py:36] Started HHblits query
    I0409 18:56:30.223437 139865793787712 utils.py:40] Finished HHblits query in 16437.296 seconds
    

    Parafold log:

    I0427 21:17:27.915049 140305630689088 run_alphafold.py:397] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
    I0427 21:17:27.915312 140305630689088 run_alphafold.py:414] Using random seed 1534697036303804749 for the data pipeline
    I0427 21:17:27.915629 140305630689088 run_alphafold.py:165] Predicting seq2
    I0427 21:17:27.925500 140305630689088 jackhmmer.py:133] Launching subprocess "/.conda/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmp5fo28348/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /fasta_files/seq2.fasta /alphafold_data//uniref90/uniref90.fasta"
    I0427 21:17:27.996705 140305630689088 utils.py:36] Started Jackhmmer (uniref90.fasta) query
    I0427 21:23:54.643056 140305630689088 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 386.646 seconds
    I0427 21:23:54.829476 140305630689088 jackhmmer.py:133] Launching subprocess "/.conda/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmprs3za6w_/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /fasta_files/seq2.fasta /alphafold_data//mgnify/mgy_clusters_2018_12.fa"
    I0427 21:23:54.875119 140305630689088 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
    I0427 21:31:38.409492 140305630689088 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 463.534 seconds
    I0427 21:31:39.768360 140305630689088 hhsearch.py:85] Launching subprocess "/.conda/envs/alphafold/bin/hhsearch -i /tmp/tmpjgr58ebb/query.a3m -o /tmp/tmpjgr58ebb/output.hhr -maxseq 1000000 -d /alphafold_data//pdb70/pdb70"
    I0427 21:31:39.850885 140305630689088 utils.py:36] Started HHsearch query
    I0427 21:39:23.420352 140305630689088 utils.py:40] Finished HHsearch query in 463.569 seconds
    I0427 21:39:24.173583 140305630689088 hhblits.py:128] Launching subprocess "/.conda/envs/alphafold/bin/hhblits -i /fasta_files/seq2.fasta -cpu 4 -oa3m /tmp/tmpmzl5arhr/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /alphafold_data//bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /alphafold_data//uniclust30/uniclust30_2018_08/uniclust30_2018_08"
    I0427 21:39:24.259592 140305630689088 utils.py:36] Started HHblits query
    I0428 01:34:31.302148 140305630689088 utils.py:40] Finished HHblits query in 14107.042 seconds
    

    The two logs look similar to me: both runs use 8, 8, and 4 CPUs for the three searches, respectively. Could you clarify this for me?

    Thank you, Aditi

    opened by adi1bioinfo 0
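One way to approach the question above is to compare the Started/Finished timestamps in the log: if the searches ran in parallel, their time intervals would overlap. The sketch below is a minimal, hypothetical helper (the names `parse_intervals` and `overlapping_pairs` are not part of Alphafold2 or ParallelFold). It assumes the absl-style log format shown in the excerpts and, because those logs carry no year, an arbitrary placeholder year; a log that crosses a year boundary would need extra handling.

```python
import re
from datetime import datetime
from itertools import combinations

# Started/Finished lines copied from the Parafold log excerpt.
LOG = """
I0427 21:17:27.996705 140305630689088 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0427 21:23:54.643056 140305630689088 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 386.646 seconds
I0427 21:23:54.875119 140305630689088 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0427 21:31:38.409492 140305630689088 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 463.534 seconds
I0427 21:31:39.850885 140305630689088 utils.py:36] Started HHsearch query
I0427 21:39:23.420352 140305630689088 utils.py:40] Finished HHsearch query in 463.569 seconds
I0427 21:39:24.259592 140305630689088 utils.py:36] Started HHblits query
I0428 01:34:31.302148 140305630689088 utils.py:40] Finished HHblits query in 14107.042 seconds
"""

# Captures: month+day, clock time, event type, tool name.
LINE_RE = re.compile(
    r"I(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) \d+ utils\.py:\d+\] "
    r"(Started|Finished) (.+?) query"
)


def parse_intervals(log_text, year=2022):
    """Return [(tool, start, end)] for each Started/Finished pair.

    The year is an assumption: absl log prefixes only contain month and day.
    """
    started, intervals = {}, []
    for mmdd, clock, event, tool in LINE_RE.findall(log_text):
        stamp = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
        if event == "Started":
            started[tool] = stamp
        elif tool in started:
            intervals.append((tool, started.pop(tool), stamp))
    return intervals


def overlapping_pairs(intervals):
    """Return the names of every pair of searches whose time spans overlap."""
    return [
        (a[0], b[0])
        for a, b in combinations(intervals, 2)
        if a[1] < b[2] and b[1] < a[2]
    ]


intervals = parse_intervals(LOG)
for tool, start, end in intervals:
    print(f"{tool}: {(end - start).total_seconds():.1f} s")
print("overlapping searches:", overlapping_pairs(intervals) or "none")
```

For the Parafold excerpt, each search starts only after the previous one has finished, so within this run the searches appear to execute one after another, just as in the Alphafold2 log. The README overview describes ParallelFold as separating the CPU stage (MSA and template searching) from the GPU stage (prediction model); nothing in that description says the individual searches run concurrently.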
  • An error in feature generation

    Hi, when I used your new version to generate feature.pkl, the following error occurred. Could you give any advice on how to solve it?

    FATAL Flags parsing error: Unknown command line flag 'model_names'. Did you mean: model_preset ? Pass --helpshort or --helpfull to see help on flags.

    opened by YiningWang2 1
Releases(v1.1)
Owner
Bozitao Zhong
Protein Design