
Added multi-chain permutation steps, multimer datamodule, and training code for multimer #336

Merged
merged 111 commits into aqlaboratory:multimer on Aug 3, 2023

Conversation

dingquanyu (Contributor)

CUDA error: device-side assert triggered still persists when slicing a tensor

christinaflo (Collaborator) left a comment

Initial review

README.md Outdated
Collaborator:

Revert this to the original OF readme

Contributor Author:

Sure thing. Done in commit b8d0069

@@ -163,6 +163,9 @@ def model_config(
for k,v in multimer_model_config_update.items():
Collaborator:

Good catch. Can you change this so that the model components are in a 'model' dict, so that it matches the loss dict: multimer_model_config_update['model']?

Contributor Author:

Sure. I have added a key called 'model' inside multimer_model_config_update.
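For illustration, a minimal sketch of the nesting the reviewer asked for; the component names and values below are hypothetical, and only the 'model'/'loss' split comes from the discussion:

# Hypothetical layout: model components nested under a 'model' key so the
# update mirrors the loss dict.
multimer_model_config_update = {
    "model": {
        "input_embedder": {"tf_dim": 21},   # illustrative entry only
    },
    "loss": {
        "fape": {"weight": 1.0},            # illustrative entry only
    },
}

# The update loop in model_config() would then walk only the 'model' sub-dict:
for k, v in multimer_model_config_update["model"].items():
    print(k, v)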

@@ -830,6 +831,15 @@ def read_template(start, size):
with open(path, "r") as fp:
hits = parsers.parse_hhr(fp.read())
all_hits[f] = hits
fp.close()
Collaborator:

fp is automatically closed when using a with statement; you can remove this.

Contributor Author:

Sure, it's removed.
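As a stand-alone sketch of the point above (assuming the parsers module lives at openfold.data.parsers, as in the diff):

from openfold.data import parsers

def load_hhr_hits(path):
    # The with statement closes the file when the block exits, so no
    # explicit fp.close() is needed.
    with open(path, "r") as fp:
        hits = parsers.parse_hhr(fp.read())
    return hits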

@@ -830,6 +831,15 @@ def read_template(start, size):
with open(path, "r") as fp:
hits = parsers.parse_hhr(fp.read())
all_hits[f] = hits
fp.close()

elif (ext =='.sto') and (f.startswith("pdb")):
Collaborator:

The template file would be named hmm_output.sto, right? Is the startswith("pdb") in reference to something else?

Contributor Author:

Sorry, my mistake. I saw a pdb70_hits.hhr file in the test_data and assumed the hmm results would be named the same way. I've changed the second condition so that it checks whether the file name starts with 'hmm'.
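A minimal sketch of the corrected dispatch, assuming the naming convention described above (pdb70_hits.hhr for HHsearch hits, hmm_output.sto for hmmsearch template hits); the function name and print calls are illustrative:

import os

def classify_alignment_files(alignment_dir):
    for f in os.listdir(alignment_dir):
        _, ext = os.path.splitext(f)
        if ext == ".hhr" and f.startswith("pdb"):
            print(f, "-> HHsearch hits (e.g. pdb70_hits.hhr)")
        elif ext == ".sto" and f.startswith("hmm"):
            print(f, "-> hmmsearch template hits (e.g. hmm_output.sto)")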

@@ -535,7 +535,10 @@ def forward(self, batch):

# Enable grad iff we're training and it's the final recycling layer
is_final_iter = cycle_no == (num_iters - 1) or early_stop
with torch.set_grad_enabled(is_grad_enabled and is_final_iter):
enable_grad= is_grad_enabled and is_final_iter
if (type(enable_grad)!=bool) and (type(enable_grad)==torch.Tensor):
Collaborator:

Both is_grad_enabled and is_final_iter should be of type bool. I'm going to check what is causing one of them to be a tensor, but this should be removed once that is fixed.

Contributor Author:

Sure, I have reverted this part back to the original. I couldn't figure out how this happened: it was a boolean for the first couple of iterations, then at some point it became a tensor wrapping the boolean value and raised "TypeError: enabled must be a bool (got Tensor)".
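If a tensor ever does show up here again, a minimal defensive sketch (not the change that was merged) would be to coerce to a plain bool before entering the context manager:

import torch

def as_bool(flag):
    # torch.set_grad_enabled() requires a Python bool, not a Tensor,
    # so unwrap a 0-dim tensor if one sneaks in.
    return bool(flag.item()) if isinstance(flag, torch.Tensor) else bool(flag)

is_grad_enabled = torch.tensor(True)   # illustrative: the unexpected tensor case
is_final_iter = True
with torch.set_grad_enabled(as_bool(is_grad_enabled) and as_bool(is_final_iter)):
    pass  # forward pass for the final recycling iteration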

Collaborator:

For the test data alignments, can we compress them and include fewer examples in general?

@@ -298,10 +310,10 @@ def fape_loss(
interface_bb_loss = backbone_loss(
traj=traj,
pair_mask=1. - intra_chain_mask,
**{**batch, **config.interface_backbone},
**{**batch, **config.intra_chain_backbone},
Collaborator:

This is the interface backbone loss; why is it using the intra_chain_backbone config?

Contributor Author:

I see, you're right, but in config.py there is no key called "interface_backbone" in the loss config dict. I suppose it should be "interface", as at line 850 of config.py? I have changed this part of loss.py to config.interface.

openfold/openfold/config.py, lines 850 to 853 (commit bc35ef1):

    "interface": {
        "clamp_distance": 30.0,
        "loss_unit_distance": 20.0,
        "weight": 0.5,

chains = asym_id.unique()
one_hot = torch.nn.functional.one_hot(asym_id, num_classes=chains.shape[0]).to(dtype=all_atom_mask.dtype)
chains, _ = asym_id.unique(return_counts=True)
one_hot = torch.nn.functional.one_hot(asym_id.to(torch.int64)-1, # have to reduce asym_id by one because class values must be smaller than num_classes
Contributor Author:

Sorry, I kept my modifications when I resolved this conflict: since asym_id starts from 1, we need to subtract 1 so that the class values are always smaller than the number of classes; otherwise PyTorch throws an error.
@christinaflo

Collaborator:

Oh yeah, I was going to merge your changes in with this PR; I only wanted to remove the return_counts=True because I wasn't using the returned counts.
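A minimal stand-alone sketch of the off-by-one described above, with illustrative asym_id values:

import torch

# asym_id labels chains starting from 1, e.g. two chains -> values 1 and 2.
asym_id = torch.tensor([1, 1, 2, 2])
chains = asym_id.unique()
num_classes = chains.shape[0]   # 2

# one_hot() requires int64 class values in [0, num_classes), so shift down by
# one; passing asym_id unshifted would fail because the value 2 is not < 2.
one_hot = torch.nn.functional.one_hot(asym_id.to(torch.int64) - 1,
                                      num_classes=num_classes)
print(one_hot)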

@@ -529,9 +541,9 @@ def lddt_loss(
cutoff=cutoff,
eps=eps
)

score = torch.nan_to_num(score,nan=torch.nanmean(score))
Contributor Author:

Here I added this check for NaN in the predicted lDDT scores because the ground-truth structure I used for the initial unit test was completely unrelated to the fake features I generated; as a result, I got NaN or negative values here. Perhaps it would be better to remove this part in the real training code.
@christinaflo

Collaborator:

Yeah, we should remove this. I was going to reformat some things after merging this PR, so I can just remove it then.
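For reference, a minimal stand-alone sketch of the NaN replacement being discussed (using .item() to pass the nanmean as a plain float):

import torch

# Illustrative lDDT scores containing a NaN, as in the unit-test scenario above.
score = torch.tensor([0.8, float("nan"), 0.6])

# Replace NaNs with the mean of the non-NaN values.
score = torch.nan_to_num(score, nan=torch.nanmean(score).item())
print(score)   # tensor([0.8000, 0.7000, 0.6000])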

christinaflo merged commit 31051cf into aqlaboratory:multimer on Aug 3, 2023
1 check passed