Best way to Read in/Search Over Many Many HMMs? #49

Closed
gbouras13 opened this issue Aug 11, 2023 · 3 comments
Labels: duplicate, question

Comments

@gbouras13

Hi Martin,

Pyhmmer is awesome - just trying to play around with PHROGs and build it into some tooling.

Using v0.9.0.

One question - what do you think the best way is to read in lots of HMMs? Like 38000? I've made a bunch with pyhmmer really easily.

In the example (https://pyhmmer.readthedocs.io/en/stable/examples/recipes.html#Loading-multiple-HMMs) the HMM files were hardcoded. I've tried a few approaches to get around this but am running into a strange error.

For example, after tweaking the class to take a list:

import contextlib
import itertools
import os
import typing

from pyhmmer.plan7 import HMM, HMMFile

class HMMFiles(typing.ContextManager[typing.Iterable[HMM]]):
    def __init__(self, files: typing.Iterable['os.PathLike[str]']) -> None:
        self.stack = contextlib.ExitStack()
        # Every file is opened here and stays open until __exit__ is called
        self.hmmfiles = [self.stack.enter_context(HMMFile(f)) for f in files]

    def __enter__(self) -> typing.Iterable[HMM]:
        return itertools.chain.from_iterable(self.hmmfiles)

    def __exit__(self, exc_type: object, exc_value: object, traceback: object) -> None:
        self.stack.close()

Then specifying the files and reading them in:

from pathlib import Path

import pyhmmer

# MSA_Phrogs_M50_HMM is the directory in the working dir containing all the .hmm files
HMM_dir = Path("MSA_Phrogs_M50_HMM")
pattern = "*.hmm"  # Replace with your desired file pattern
files = HMM_dir.glob(pattern)

# `targets` holds the target sequences, loaded beforehand
with HMMFiles(files) as hmm_files:
    all_hits = list(pyhmmer.hmmsearch(hmm_files, targets))

But this throws a very weird error:

FileNotFoundError: [Errno 2] no such file or directory: PosixPath('MSA_Phrogs_M50_HMM/phrog_29267.hmm')

when this file does definitely exist.

George

@althonos
Owner

Hi George, I think this may be related to #48, you could have a look at the solution there as well!

@gbouras13
Author

Hi Martin, thanks for that, it is indeed. Closing this now.

@althonos
Owner

I'll update the documentation to use the suggested solution there instead, to avoid any more issues because of file descriptors.
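
For anyone landing here without reading #48: the solution from that thread is not reproduced in this issue, so the following is only a rough sketch of one way to avoid holding thousands of file descriptors at once. It assumes the problem comes from the class above opening every file up front. The helper name `iter_lazily` and its `opener` parameter are hypothetical, not part of pyhmmer; the idea is simply to open one file, yield its contents, close it, then move on to the next.

```python
from pathlib import Path
from typing import Callable, ContextManager, Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_lazily(
    paths: Iterable[Path],
    opener: Callable[[Path], ContextManager[Iterable[T]]],
) -> Iterator[T]:
    """Yield the items of each file in turn, keeping at most one file open.

    `opener` is any callable that returns a context manager which is also
    iterable (hypothetical parameter, introduced for illustration only).
    """
    for path in paths:
        # The file is opened only when the consumer reaches it, and it is
        # closed again before the next one is opened.
        with opener(path) as handle:
            yield from handle
```

With pyhmmer, this would presumably be used as `pyhmmer.hmmsearch(iter_lazily(files, HMMFile), targets)`, given that `HMMFile` is both a context manager and an iterable of HMMs, as the class above already relies on. The trade-off is that the result is a one-shot iterator, so it would need to be rebuilt to run a second search.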

@althonos added the question and duplicate labels on Aug 11, 2023