Running AlphaFold3
AlphaFold3 is a Nobel Prize-winning machine learning tool that predicts the 3D structure of biomolecules, such as proteins and nucleic acids (DNA and RNA), from their sequences alone. AlphaFold3 is currently available on Hopper as a Singularity container.
Interactively
To start an AlphaFold3 session, first set up an interactive GPU session with salloc:
salloc -p gpuq -q gpu --nodes=1 --ntasks-per-node=4 --gres=gpu:2g.20gb:1 --mem=15GB --time=0-02:00:00
Then load the compiler and Singularity modules and set up the AlphaFold3 Singularity environment:
module load gnu10
module load singularity
DB_DIR=/datasets/alphafold3/databases
MODEL_PARAMETERS_DIR=/datasets/alphafold3/model_parameters
AF3_SCRIPTS=/containers/dgx/Containers/alphafold3/v3.0.1
cp $AF3_SCRIPTS/*.py .
# Set the variables used to run the container image
AF3_IMAGE=/containers/dgx/Containers/alphafold3/v3.0.1/singularity/alphafold3.sif
SINGULARITY_RUN="singularity exec --nv --bind $MODEL_PARAMETERS_DIR:/root/models --bind $DB_DIR:/root/public_databases --bind $AF3_SCRIPTS:/root/scripts"
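Before launching a long prediction, it can help to confirm that the paths set above actually exist on the node you were allocated. The following is a minimal sketch; check_af3_paths is a hypothetical helper name, not part of AlphaFold3 or Hopper.

```shell
#!/bin/bash
# Hypothetical helper: report every argument that does not exist on disk.
# Returns 0 if all paths exist, 1 if at least one is missing.
check_af3_paths() {
    local missing=0 path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "Missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage with the variables defined above:
# check_af3_paths "$DB_DIR" "$MODEL_PARAMETERS_DIR" "$AF3_SCRIPTS" "$AF3_IMAGE" \
#     && echo "All AlphaFold3 paths found"
```

If any path is reported as missing, the bind mounts in SINGULARITY_RUN would point at nothing, so it is worth resolving that before running the prediction.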
Next, create an input JSON file (for example, fold_input.json) that describes what to fold. The example below folds two copies (chains A and B) of the same protein sequence:
{
  "name": "2PV7",
  "sequences": [
    {
      "protein": {
        "id": ["A", "B"],
        "sequence": "GMRESYANENQFGFKTINSDIHKIVIVGGYGKLGGLFARYLRASGYPISILDREDWAVAESILANADVVIVSVPINLTLETIERLKPYLTENMLLADLTSVKREPLAKMLEVHTGAVLGLHPMFGADIASMAKQVVVRCDGRFPERYEWLLEQIQIWGAKIYQTNATEHDHNMTYIQALRHFSTFANGLHLSKQPINLANLLALSSPIYRLELAMIGRLFAQDAELYADIIMDKSENLAVIETLKQTYDEALTFFENNDRQGFIDAFHKVRDWFGDYSEQFLKESRQLLQQANDLKQG"
      }
    }
  ],
  "modelSeeds": [1],
  "dialect": "alphafold3",
  "version": 1
}
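A stray comma or missing brace in the input file will only surface once the job starts, so a quick syntax check beforehand can save a queue wait. The sketch below assumes python3 is available on the host; validate_json is a hypothetical helper name. It checks only that the file is well-formed JSON, not that it follows the AlphaFold3 input schema.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the file parses as JSON, non-zero otherwise.
# Uses Python's standard json.tool module; parse errors are printed to stderr.
validate_json() {
    python3 -m json.tool "$1" > /dev/null
}

# Example usage (fold_input.json is whatever you named your input file):
# validate_json fold_input.json && echo "fold_input.json is well-formed"
```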
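Run AlphaFold3, replacing name_of_the_file.json and your_output_directory with your own input file and output directory. There must be no space after the = sign in each option. The --db_dir and --model_dir paths refer to locations inside the container, as set by the --bind options above:
${SINGULARITY_RUN} ${AF3_IMAGE} python3 run_alphafold.py --json_path=name_of_the_file.json --output_dir=your_output_directory --db_dir=/root/public_databases --model_dir=/root/models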
The prediction may take a fair amount of time to complete. The predicted fold can be found in the output folder you specified.
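Once the run finishes, the structure files can be located by searching the output folder for .cif files. This is a minimal sketch that makes no assumption about the exact subfolder layout; list_predicted_structures is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: list every .cif file under the given output directory,
# one path per line, in sorted order.
list_predicted_structures() {
    local output_dir="$1"
    find "$output_dir" -type f -name '*.cif' | sort
}

# Example usage:
# list_predicted_structures your_output_directory
```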
Through batch submission
The steps above can also be submitted as a Slurm batch script:
#!/bin/bash
#SBATCH --partition=gpuq
#SBATCH --qos=gpu
#SBATCH --job-name=af3_example
#SBATCH --output=af3.%j.out
#SBATCH --error=af3.%j.err
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:2g.20gb:1
#SBATCH --mem=25GB
#SBATCH --time=0-01:00:00
module load gnu10
module load singularity
DB_DIR=/datasets/alphafold3/databases
MODEL_PARAMETERS_DIR=/datasets/alphafold3/model_parameters
AF3_SCRIPTS=/containers/dgx/Containers/alphafold3/v3.0.1
cp $AF3_SCRIPTS/*.py .
AF3_IMAGE=/containers/dgx/Containers/alphafold3/v3.0.1/singularity/alphafold3.sif
SINGULARITY_RUN="singularity exec --nv --bind $MODEL_PARAMETERS_DIR:/root/models --bind $DB_DIR:/root/public_databases --bind $AF3_SCRIPTS:/root/scripts"
${SINGULARITY_RUN} ${AF3_IMAGE} python3 run_alphafold.py --json_path=fold_input.json --output_dir=your_output_directory --db_dir=/root/public_databases --model_dir=/root/models
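The script above still has to be saved and handed to the scheduler. The sketch below assumes it was saved as af3_job.slurm (a hypothetical file name) and wraps submission in a hypothetical helper, submit_af3_job, that prints the job ID and where the logs will appear. It relies only on the standard Slurm commands sbatch and squeue.

```shell
#!/bin/bash
# Hypothetical helper: submit a Slurm script and show its queue status.
submit_af3_job() {
    local script="$1" job_id
    if ! command -v sbatch > /dev/null 2>&1; then
        echo "sbatch not found; run this on a cluster login node" >&2
        return 127
    fi
    # --parsable makes sbatch print only the job ID (optionally "id;cluster").
    job_id=$(sbatch --parsable "$script") || return 1
    job_id=${job_id%%;*}
    echo "Submitted job $job_id; logs: af3.${job_id}.out and af3.${job_id}.err"
    squeue -j "$job_id"
}

# Example usage:
# submit_af3_job af3_job.slurm
```

The log file names follow the %j pattern in the #SBATCH lines, where %j expands to the job ID.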
Visualization
The predicted fold must be visualized with third-party software such as ChimeraX. The output of a fold is a folder containing a .cif file, which holds the structure data used for visualization.
To open ChimeraX, set up a Virtual Desktop Session through Open OnDemand. Open the terminal application, then type the following:
module load gnu9
module load chimeraX
ChimeraX
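ChimeraX can also open a structure file given on its command line, which saves a trip through the File menu. The sketch below assumes the ChimeraX modules above are loaded; open_in_chimerax is a hypothetical helper name, and the path in the usage line is a placeholder for your own .cif file.

```shell
#!/bin/bash
# Hypothetical helper: open a .cif file in ChimeraX after basic checks.
open_in_chimerax() {
    local cif="$1"
    if [ ! -f "$cif" ]; then
        echo "No such file: $cif" >&2
        return 1
    fi
    if ! command -v ChimeraX > /dev/null 2>&1; then
        echo "ChimeraX not on PATH; load the modules first" >&2
        return 127
    fi
    # Launch in the background so the terminal stays usable.
    ChimeraX "$cif" &
}

# Example usage (placeholder path):
# open_in_chimerax your_output_directory/path/to/model.cif
```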