support for multithreading in python? #91

Open
lpravda opened this issue Mar 4, 2021 · 3 comments

lpravda commented Mar 4, 2021

Hi @wojdyr,

I wonder whether there are any plans to support multithreading in Python. Currently, when I try to use Python's multiprocessing module, it crashes with a message similar to:

TypeError: can't pickle gemmi.ResidueSpan objects

I tried the approach described here to add custom pickling to the class shown here: , but only managed to get one step further before it crashed with

TypeError: gemmi.ResidueSpan: No constructor defined!

So I'm not sure whether this is something that needs to be supported by the library, or whether I'm doing something wrong. Any help would be greatly appreciated!

edit: I've just noticed #13, but have not seen any comments there...
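
For context on the error: multiprocessing moves arguments and return values between processes by pickling them, so any object whose type has no pickle support fails at that step. A minimal sketch of that check follows. It does not use gemmi; `threading.Lock` is only a stand-in for a type without pickle support, and the helper name `can_cross_process_boundary` is made up for illustration.

```python
import pickle
import threading

def can_cross_process_boundary(obj):
    """Return True if obj can be pickled, which is what multiprocessing
    needs in order to send it to or from a worker process."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True

# Plain Python data (here a tuple of str and float) pickles fine.
print(can_cross_process_boundary(("some-entry", 12345.6)))  # True

# A lock has no pickle support; it stands in for any such type.
print(can_cross_process_boundary(threading.Lock()))  # False
```

Running the same check on a value before handing it to a pool is a quick way to see whether it can be passed between processes at all.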

Member

wojdyr commented Mar 4, 2021

yes, there was also #85

In general, it's not planned. Supporting pickling and unpickling for everything would be significant work, followed by a lot of extra code to maintain. But if it's needed only for a small class, as in #85, then yes, it can be added.

I suppose that you tried multithreading because something was too slow? What was it?

Author

lpravda commented Mar 5, 2021

It wasn't slow in the sense that it would take forever. It's just that when processing hundreds of entries the same way, my natural go-to is multiprocessing, and that's where I ran into this issue. Like I said, this is not limiting ATM. Perhaps it would be good to clarify somewhere in the documentation that this is not possible, so that people don't keep asking for this feature :).

Thank you!

Member

wojdyr commented Mar 5, 2021

It's possible if you don't pass gemmi objects between processes.
Here is an example:

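import multiprocessing as mp
import sys
import gemmi

def f(path):
    # Runs in a worker process: the gemmi objects never leave it.
    st = gemmi.read_structure(path)
    weight = st[0].calculate_mass()
    # Only plain Python values are sent back to the parent.
    return (st.name, weight)

def main():
    top_dir = sys.argv[1]
    with mp.Pool(processes=4) as pool:
        # CoorFileWalk yields coordinate files found under top_dir.
        it = pool.imap_unordered(f, gemmi.CoorFileWalk(top_dir))
        for (name, weight) in it:
            print(name, weight)

# The guard keeps worker processes from re-running main() on import
# (needed with the "spawn" start method, e.g. on Windows and macOS).
if __name__ == "__main__":
    main()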
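
The same idea can be stretched to per-residue data: instead of returning something like a ResidueSpan, the worker can copy the fields it needs into plain tuples, which pickle without trouble. The following is only a sketch under a few assumptions: the names `residue_rows` and `count_residue_names` are made up for illustration, file paths are taken from the command line rather than from CoorFileWalk, and only the first model of each file is read.

```python
import multiprocessing as mp
import sys
from collections import Counter

def residue_rows(path):
    """Worker: read one file and return plain tuples, not gemmi objects."""
    import gemmi  # imported in the worker; the parent only sees plain data
    st = gemmi.read_structure(path)
    rows = []
    for chain in st[0]:  # first model only (an assumption of this sketch)
        for res in chain:
            rows.append((chain.name, res.name, res.seqid.num))
    return st.name, rows

def count_residue_names(rows):
    """Parent-side helper: count residues by name from the plain tuples."""
    return Counter(name for _chain, name, _num in rows)

def main():
    paths = sys.argv[1:]
    if not paths:
        print("usage: script.py FILE [FILE ...]", file=sys.stderr)
        return
    with mp.Pool(processes=4) as pool:
        for name, rows in pool.imap_unordered(residue_rows, paths):
            print(name, count_residue_names(rows).most_common(3))

if __name__ == "__main__":
    main()
```

Anything that needs the gemmi objects themselves (selections, neighbor searches and so on) would have to happen inside `residue_rows`, with only the results returned.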
