Loading PDF gets stuck at READ XREF AND TABLE #18

Closed
hahnrobert opened this issue May 26, 2019 · 5 comments

@hahnrobert

When I load this PDF with the following code, the program gets stuck:
let _file = pdf::file::File::<Vec<u8>>::open("test1.pdf")?;

@Ploppz
Member

Ploppz commented Jun 1, 2019

If you let it run a little longer, it eventually outputs this error:

 === 
Error: Key Root: cannot convert from primitive to type Catalog
  caused by: Key StructTreeRoot: cannot convert from primitive to type Option < StructTreeRoot >
  caused by: Tried to dereference free object nr 133.
 === 

Edit: btw, if anyone wonders how to get output like this, you can use the print_err function or the provided examples/read.rs.

Looking at the Catalog type, StructTreeRoot is optional. Inspecting the PDF reveals that it does contain a reference to a non-existent/free object. So I think the best fix is to treat such a reference as the absence of the object rather than as an error. Working on it.
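The idea can be sketched as follows. This is only an illustration and not the crate's actual code: XRef, XRefTable, ResolveError, resolve, resolve_optional and sample_table are all hypothetical names, and a String stands in for a parsed object. A strict lookup reports a free object as an error, while the lookup used for optional fields maps that one error to None and still propagates anything else.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified cross-reference entry (not the pdf crate's real type).
#[derive(Debug, Clone, PartialEq)]
enum XRef {
    /// The slot is marked free: no object lives here.
    Free,
    /// The slot holds an object; a String stands in for a parsed primitive.
    InUse(String),
    /// The slot exists but its data cannot be read (assumed extra case).
    Damaged,
}

#[derive(Debug, PartialEq)]
enum ResolveError {
    /// The referenced object is free or does not exist.
    FreeObject(u32),
    /// The referenced object exists but is unreadable.
    Damaged(u32),
}

struct XRefTable {
    entries: HashMap<u32, XRef>,
}

impl XRefTable {
    /// Strict lookup: a free or missing object is an error.
    fn resolve(&self, nr: u32) -> Result<&str, ResolveError> {
        match self.entries.get(&nr) {
            Some(XRef::InUse(obj)) => Ok(obj.as_str()),
            Some(XRef::Damaged) => Err(ResolveError::Damaged(nr)),
            Some(XRef::Free) | None => Err(ResolveError::FreeObject(nr)),
        }
    }

    /// Lenient lookup for optional fields: a free or missing object
    /// becomes `None`; every other error is passed through unchanged.
    fn resolve_optional(&self, nr: u32) -> Result<Option<&str>, ResolveError> {
        match self.resolve(nr) {
            Ok(obj) => Ok(Some(obj)),
            Err(ResolveError::FreeObject(_)) => Ok(None),
            Err(other) => Err(other),
        }
    }
}

/// Small table mirroring the situation in this issue: object 133 is free.
fn sample_table() -> XRefTable {
    let mut entries = HashMap::new();
    entries.insert(1, XRef::InUse("<< /Type /Catalog >>".to_string()));
    entries.insert(7, XRef::Damaged);
    entries.insert(133, XRef::Free);
    XRefTable { entries }
}

fn main() {
    let table = sample_table();
    println!("strict:  {:?}", table.resolve(133));
    println!("lenient: {:?}", table.resolve_optional(133));
}
```

With this split, a required key such as Root would still fail loudly on a dangling reference, while an optional key such as StructTreeRoot would simply come back empty.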

@Ploppz
Member

Ploppz commented Jun 1, 2019

Done and pushed to master.
It is interesting to note how slowly it reads the object. This should definitely be benchmarked.
When running with --release it takes 6 seconds, and without it takes 154 seconds...
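For a quick first measurement before setting up a proper benchmark, a wall-clock timer around the call is enough. The helper below is a minimal sketch using only the standard library; timed is a hypothetical name, and the summing closure is a stand-in workload that would be swapped for the file-opening call.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

fn main() {
    // Stand-in workload; replace the closure body with the code to measure.
    let (sum, elapsed) = timed(|| (0..1_000_000u64).sum::<u64>());
    println!("sum = {}, took {:?}", sum, elapsed);
}
```

A single run like this is noisy, so it only gives a rough figure; it is still enough to see a gap as large as the one between debug and release builds.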

@Ploppz
Member

Ploppz commented Jun 3, 2019

When I benchmarked it, it seemed that most of that time is spent in error_chain. I should definitely look into that. But I should also switch to using failure. Maybe that is somehow better.

Ploppz closed this as completed Jul 3, 2019
@Ploppz
Member

Ploppz commented Jul 29, 2019

@hahnrobert I don't know if you are still interested, but it seems that with the latest rewrite, reading is much faster. On current master, run cargo run -p read -- files/ep.pdf. (That file is the one you linked in this issue.) It finishes in less than a second on my computer.

@s3bk
Contributor

s3bk commented Jul 29, 2019

There is also --release, which should make it at least an order of magnitude faster.
