This module downloads or reads PDB files and then parses them.
This project is intended only for my personal use and may change significantly over time. You are free to use this code at your own risk.
The class PDBFile is initialized with pdb_id (str, an identifier for the file) and pdb_content (str, the content of a PDB file). By default, check_lines (bool) is set to True, which raises an AssertionError if the reconstructed line for an atom differs from the original line (the alignment of the atom label is allowed to differ). After reviewing the differences, you can suppress the error by setting check_lines to False.
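To illustrate what such a round-trip check involves, here is a minimal sketch that parses one ATOM record by its fixed columns, rebuilds it, and compares the result while ignoring the alignment of the atom label. It is not this module's implementation: the names Atom, parse_atom_line, format_atom_line and lines_match are hypothetical, and the column layout is the standard PDB ATOM layout.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    # Hypothetical container, not the module's PDBAtom class.
    record: str
    serial: int
    name: str
    alt_loc: str
    res_name: str
    chain: str
    res_seq: int
    i_code: str
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str
    charge: str


def parse_atom_line(line: str) -> Atom:
    """Slice one ATOM/HETATM record using the fixed PDB column layout."""
    line = line.ljust(80)
    return Atom(
        record=line[0:6].strip(),
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        alt_loc=line[16],
        res_name=line[17:20],
        chain=line[21],
        res_seq=int(line[22:26]),
        i_code=line[26],
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        temp_factor=float(line[60:66]),
        element=line[76:78].strip(),
        charge=line[78:80],
    )


def format_atom_line(a: Atom) -> str:
    """Rebuild an 80-column record; the atom name is simply left-aligned."""
    return (
        f"{a.record:<6s}{a.serial:5d} {a.name:<4s}{a.alt_loc}{a.res_name:>3s} "
        f"{a.chain}{a.res_seq:4d}{a.i_code}   "
        f"{a.x:8.3f}{a.y:8.3f}{a.z:8.3f}{a.occupancy:6.2f}{a.temp_factor:6.2f}"
        f"          {a.element:>2s}{a.charge:<2s}"
    )


def lines_match(original: str, rebuilt: str) -> bool:
    """Compare two records, ignoring alignment inside the atom name field."""
    def normalize(line: str) -> str:
        line = line.ljust(80)
        return line[:12] + line[12:16].strip().ljust(4) + line[16:]
    return normalize(original) == normalize(rebuilt)


# Sample record assembled field by field so the column widths are explicit.
SAMPLE_LINE = "".join([
    "ATOM  ", "    1", " ", " N  ", " ", "MET", " ", "A", "   1", " ", "   ",
    "  27.340", "  24.430", "   2.614", "  1.00", "  9.67",
    " " * 10, " N", "  ",
])

rebuilt_line = format_atom_line(parse_atom_line(SAMPLE_LINE))
assert lines_match(SAMPLE_LINE, rebuilt_line)
```

In this sketch the rebuilt line differs from the original only in where the atom label sits inside its four-column field, which is the kind of difference the check tolerates.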
The CONECT records in PDB files often reference atoms that are not present in the file. When allow_connect_errors is set to True, these errors are ignored.
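The following sketch shows one way dangling connectivity references could be detected and optionally skipped. It is an illustration only, not the module's code: the function parse_conect and its skip_missing flag are hypothetical, and it assumes the standard layout in which serial numbers occupy 5-character fields starting at column 7.

```python
def parse_conect(lines, atom_serials, skip_missing=False):
    """Collect bonded serial pairs from CONECT records.

    atom_serials is the set of atom serial numbers present in the file.
    With skip_missing=True, bonds to unknown atoms are dropped silently;
    otherwise a ValueError is raised for the first dangling reference.
    """
    bonds = set()
    for line in lines:
        if not line.startswith("CONECT"):
            continue
        body = line.rstrip()
        # Serial numbers sit in consecutive 5-character fields after column 6.
        fields = [body[i:i + 5].strip() for i in range(6, len(body), 5)]
        serials = [int(field) for field in fields if field]
        if not serials:
            continue
        source, partners = serials[0], serials[1:]
        for partner in partners:
            missing = [s for s in (source, partner) if s not in atom_serials]
            if missing:
                if skip_missing:
                    continue
                raise ValueError(f"CONECT references unknown atoms: {missing}")
            bonds.add(tuple(sorted((source, partner))))
    return bonds


# Two records: atom 99 is referenced but does not exist in the file.
conect_lines = [
    f"CONECT{1:5d}{2:5d}{99:5d}",
    f"CONECT{2:5d}{1:5d}",
]
known_bonds = parse_conect(conect_lines, atom_serials={1, 2}, skip_missing=True)
```

Storing each bond as a sorted pair means the two records for atoms 1 and 2 collapse into a single entry.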
The class PDBFile can also be constructed from a file:
pdbfile_obj = PDBFile.from_file("./path/to/file.pdb", "PDB_ID_of_File")
Or directly from PDBe:
pdbfile_obj = PDBFile.from_online("PDB_ID")
Most PDB files contain only one protein model; however, some PDB files contain multiple protein conformations (mainly NMR-resolved structures). Models are herein called PDBStructure and can be accessed via the PDBFile object:
pdbfile_obj = PDBFile.from_online("PDB_ID")
model_0 = pdbfile_obj.model[0]
On this note: my naming conventions for model and structure may not be very consistent. This is a #TODO for a future refactoring.
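For readers unfamiliar with multi-model files, the sketch below shows how the atom records of such a file could be split into one list per model using the MODEL and ENDMDL records. It is a simplified illustration with a hypothetical function name (split_models), not the way PDBFile builds its PDBStructure objects.

```python
def split_models(pdb_content: str) -> list:
    """Return one list of ATOM/HETATM lines per model.

    Files without MODEL/ENDMDL records yield a single model.
    """
    models = []
    current = []
    for line in pdb_content.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []            # start collecting a new model
        elif record == "ENDMDL":
            models.append(current)  # close the current model
            current = []
        elif record in ("ATOM", "HETATM"):
            current.append(line)
    if current:
        # Atoms that were never closed by ENDMDL (the single-model case).
        models.append(current)
    return models


# Tiny synthetic inputs; the atom lines are truncated for brevity.
multi_model = "\n".join([
    "MODEL        1",
    "ATOM      1  N   MET A   1",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "ENDMDL",
    "END",
])
single_model = "ATOM      1  N   MET A   1\nEND"

nmr_like = split_models(multi_model)
plain = split_models(single_model)
```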
Each PDBStructure consists of residues represented by PDBResidue objects, which themselves consist of PDBAtom objects.
Residues can be accessed via:
pdbfile_obj = PDBFile.from_online("PDB_ID")
model_0 = pdbfile_obj.model[0]
residue = model_0.residues
Residues specified by chain and residue-ID can also be accessed via a dictionary:
chain = "A"
res_id = 1
residue_A_1 = model_0.residue_dict[(chain, res_id)]
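As an illustration of the idea behind such a lookup, the sketch below groups atom records into a dictionary keyed by (chain, residue ID). The helper names make_atom_line and group_by_residue are hypothetical, the records are truncated after the residue number, and insertion codes are ignored for simplicity; the module's own residue_dict may be built differently.

```python
from collections import defaultdict


def make_atom_line(serial, name, res_name, chain, res_seq):
    """Build the leading columns of an ATOM record (truncated after resSeq)."""
    return f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}"


def group_by_residue(atom_lines):
    """Map (chain, residue ID) to the atom names found in that residue."""
    residues = defaultdict(list)
    for line in atom_lines:
        chain = line[21]               # column 22: chain identifier
        res_id = int(line[22:26])      # columns 23-26: residue sequence number
        atom_name = line[12:16].strip()
        residues[(chain, res_id)].append(atom_name)
    return dict(residues)


atom_lines = [
    make_atom_line(1, "N", "MET", "A", 1),
    make_atom_line(2, "CA", "MET", "A", 1),
    make_atom_line(3, "N", "GLY", "A", 2),
    make_atom_line(4, "N", "MET", "B", 1),
]
example_dict = group_by_residue(atom_lines)
atoms_A_1 = example_dict[("A", 1)]
```

Keying on a tuple keeps residues with the same number in different chains apart, as with ("A", 1) and ("B", 1) above.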
For some atoms in a residue, alternative positions are given. When a single position per atom is required, the other locations can be removed via:
residue_A_1.remove_alternate_positions(keep="A")
keep specifies the alternative position to keep. The default value is "A".
To apply this to all residues of a PDBStructure, call:
model_0.remove_alternate_positions(keep="A")
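Conceptually, removing alternate positions amounts to filtering atoms by their alternate location indicator. The sketch below shows that idea on a small stand-in data structure; AtomPosition and filter_alt_locs are hypothetical names, and the module's own remove_alternate_positions may behave differently in detail (for example, in whether it resets the indicator afterwards).

```python
from dataclasses import dataclass


@dataclass
class AtomPosition:
    # Hypothetical stand-in for an atom with an alternate location indicator.
    name: str
    alt_loc: str  # " " when the atom has only one position
    x: float
    y: float
    z: float


def filter_alt_locs(atoms, keep="A"):
    """Keep atoms without an alternate location and those labelled `keep`."""
    return [atom for atom in atoms if atom.alt_loc.strip() in ("", keep)]


atoms = [
    AtomPosition("N", " ", 0.0, 0.0, 0.0),
    AtomPosition("CA", "A", 1.0, 0.0, 0.0),
    AtomPosition("CA", "B", 1.1, 0.1, 0.0),
    AtomPosition("CB", "A", 2.0, 0.0, 0.0),
    AtomPosition("CB", "B", 2.1, 0.2, 0.0),
]
kept_a = filter_alt_locs(atoms)            # default keeps location "A"
kept_b = filter_alt_locs(atoms, keep="B")
```

Atoms with a blank indicator are always kept, since they have no alternatives to choose between.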