SamReader slower than expected on network drive #1660

Open
BeatWolf opened this issue Mar 11, 2023 · 1 comment

@BeatWolf
Contributor

Description of the issue:

Converting a SAM file to a BAM file was slower than expected. When looking at the profiler, I noticed the following:

[image: profiler screenshot]

The SamReader spends about 20% of its time in the File.length() method. While this call is basically "free" on a local filesystem, it is not on a network drive, where it can involve a request to the remote server.

This could be fixed by caching the size of the file in the constructor of htsjdk.samtools.seekablestream.SeekableFileStream.
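
A minimal sketch of the caching idea, for illustration only. `CachedLengthFileStream` and its methods are hypothetical names and do not reproduce the actual htsjdk `SeekableFileStream` code; the sketch also assumes the file does not change size while it is being read.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration; not the real htsjdk SeekableFileStream.
final class CachedLengthFileStream implements Closeable {
    private final RandomAccessFile raf;
    // File size, looked up once at construction instead of on every call.
    private final long length;

    CachedLengthFileStream(File file) throws IOException {
        this.raf = new RandomAccessFile(file, "r");
        // Single metadata lookup; on a network drive this is the expensive part.
        this.length = file.length();
    }

    // Returns the cached size without touching the file system again.
    long length() {
        return length;
    }

    long position() throws IOException {
        return raf.getFilePointer();
    }

    // End-of-file check that relies on the cached size.
    boolean eof() throws IOException {
        return position() >= length;
    }

    int read(byte[] buffer, int offset, int len) throws IOException {
        return raf.read(buffer, offset, len);
    }

    void seek(long pos) throws IOException {
        raf.seek(pos);
    }

    @Override
    public void close() throws IOException {
        raf.close();
    }
}
```

The trade-off is that a cached length goes stale if the file grows or is truncated after the stream is opened, which is usually acceptable for read-only SAM/BAM input.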

Your environment:

  • version of htsjdk: 3.0.2
  • version of java: 19
  • which OS: Windows 10

Steps to reproduce

Put a SAM file on a network drive (in my case a Synology NAS with an SMB connection).
Read the file and profile it.
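
For a quick check without a full profiler, something like the following rough sketch can show how expensive `File.length()` is on a given drive. `FileLengthTimer` is a hypothetical helper, not part of htsjdk, and the file path is passed on the command line; the numbers are only indicative since this is not a proper benchmark.

```java
import java.io.File;

// Hypothetical helper for a rough timing check; not part of htsjdk.
final class FileLengthTimer {

    // Returns the average wall-clock time, in nanoseconds, of one File.length() call.
    static double averageNanosPerCall(File file, int calls) {
        long sink = 0; // accumulate results so the calls are not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += file.length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unexpected negative total length");
        }
        return (double) elapsed / calls;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: FileLengthTimer <path-to-file>");
            return;
        }
        File file = new File(args[0]);
        double nanos = averageNanosPerCall(file, 10_000);
        System.out.printf("File.length() average: %.0f ns per call%n", nanos);
    }
}
```

Running it once against a local copy and once against the copy on the network share should make the difference visible.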

Expected behaviour

The code should not spend 20% of the time getting the length of the file.

Actual behaviour

The code repeatedly asks the remote file system how big the file is.

@cmnbroad
Collaborator

Yeah, it looks like we should cache the length - some of SeekableFileStream's sibling classes already do this.
