
AlphaFill

AlphaFill is an algorithm based on sequence and structure similarity that “transplants” missing compounds to the AlphaFold models. By adding the molecular context to the protein structures, the models can be more easily appreciated in terms of function and structure integrity.

Building

In order to build alphafill, you need a modern C++ compiler (C++17), a recent version of CMake and the following libraries installed:

The default assumes you only want to process predicted models locally. If you want to build the web application environment you will have to install the following as well:

  • libpq, the PostgreSQL library
  • libpqxx version 7.2 or higher
  • yarn to package the data for the web interface.
  • mrc to package all the runtime data into resources in the final executable. This is optional and will not work on macOS.

Once all the requirements are met, building is as simple as:

git clone https://github.com/PDB-REDO/alphafill
cd alphafill
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build
ctest --test-dir build
cmake --install build

Configuration

After installing alphafill as described in the previous section, a default configuration file has been written to your /etc directory. The install command should have told you which file to edit.

In this configuration file you should specify at least the first three of the following paths; the fourth is only needed for the web server:

  • pdb-dir

    This is the directory containing your copy of the entire PDB or PDB-REDO's XXXX_final.cif files.

  • pdb-fasta

    The path to the file containing the sequences from the PDB files in FastA format. This file is generated by the alphafill create-index command.

  • ligands

    The path to the file containing the ligands that need to be included in processing. A default file is provided.

  • db-dir

    The directory containing the eventual alphafill files. This directory only needs to be specified if you build a webserver.
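Before processing models it can help to confirm that the three required settings point at something real. The following is a minimal sketch and not part of alphafill: the function name check_alphafill_paths is hypothetical, and its three arguments are assumed to be the values you entered for pdb-dir, pdb-fasta and ligands. Note that the pdb-fasta file only exists after the create-index step described in the next section.

```shell
# Hypothetical pre-flight check (not shipped with alphafill).
# $1 = value of pdb-dir, $2 = value of pdb-fasta, $3 = value of ligands
check_alphafill_paths() {
    pdb_dir=$1
    pdb_fasta=$2
    ligands=$3
    status=0

    # pdb-dir must be an existing directory
    if [ ! -d "$pdb_dir" ]; then
        echo "pdb-dir is not a directory: $pdb_dir" >&2
        status=1
    fi

    # pdb-fasta and ligands must be readable regular files
    for f in "$pdb_fasta" "$ligands"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            echo "missing or unreadable file: $f" >&2
            status=1
        fi
    done

    return "$status"
}
```

The function returns a non-zero status and prints one message per problem, so it can be used as a guard in front of a processing run.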

alphafill

Before you can process your models, you will have to build the PDB FastA file using the create-index command:

alphafill create-index

Processing a model is then as easy as:

alphafill process /srv/data/afdb/cif/AF-XXX.cif.gz /srv/data/af-filled/AF-XXX.cif.gz
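To process a whole directory of models, the same command can be wrapped in a loop. This is a sketch under stated assumptions: the helper names filled_path and fill_all are hypothetical, the output file is assumed to keep the input file name, and the ALPHAFILL variable is only an illustration device that lets you substitute another program (such as echo) for a dry run.

```shell
# Hypothetical helper: build the output path for one model by keeping the
# file name and swapping the directory.
# $1 = input model, $2 = output directory
filled_path() {
    printf '%s/%s\n' "${2%/}" "$(basename "$1")"
}

# Hypothetical helper: run "alphafill process" on every *.cif.gz file in a
# directory, one model at a time.
# $1 = input directory, $2 = output directory
fill_all() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for model in "$in_dir"/*.cif.gz; do
        [ -e "$model" ] || continue   # skip when the glob matches nothing
        "${ALPHAFILL:-alphafill}" process "$model" "$(filled_path "$model" "$out_dir")"
    done
}
```

With the directories from the example above, a run would look like fill_all /srv/data/afdb/cif /srv/data/af-filled, and prefixing it with ALPHAFILL=echo prints the commands instead of executing them.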

Typical running time is less than 2 minutes but varies depending on the number of transplants.

web interface

To enable the web interface, the BUILD_WEB_APPLICATION option must have been set to ON when configuring the build, for example by passing -DBUILD_WEB_APPLICATION=ON to cmake.

Before running the web application, you need to create a PostgreSQL database. This database will contain the statistics for the processed files in db-dir. The connection details for this database should be recorded in the alphafill.conf file. The database can then be filled with the alphafill rebuild-db command.

After this setup, you can start a web server using alphafill server start. Use alphafill server status to check the status of the server and alphafill server stop to stop it again. In this case the alphafill server runs as a daemon and log files are written to /var/log/alphafill/. You can also start it with the extra --no-daemon option, in which case the server runs in the foreground.
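The three server subcommands can be combined into a small restart helper. This is only a sketch: the function name restart_alphafill_server and the ALPHAFILL override variable are hypothetical, and whether alphafill server stop reports an error when no server is running is an assumption, which is why its exit status is ignored here.

```shell
# Hypothetical helper (not part of alphafill): stop, start and report status.
# Set ALPHAFILL=echo to preview the commands without running them.
restart_alphafill_server() {
    af=${ALPHAFILL:-alphafill}
    "$af" server stop || true   # assumption: may fail if nothing is running
    "$af" server start
    "$af" server status
}
```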
