The extracted table box coordinates do not correspond to the images converted from the PDF #486
Comments
Curious to know how you get this exact value of |
Answer to question 1: camelot obtains the box coordinates through the pdfminer package, whose default resolution is 72 (I forgot where I saw this), whereas the page image that camelot produces in the read_pdf function is rendered at a default resolution of 300. Line 93 in cd8ac79
Answer to question 2: You can try others.
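To make the 72 vs. 300 relationship concrete, here is a minimal sketch of the mapping from PDF coordinates to image pixels. It assumes the box is given as (x1, y1, x2, y2) in PDF points with the origin at the bottom-left of the page, and that the image origin is at the top-left. The function name and the `page_height_pt` parameter are hypothetical, chosen for illustration; this is not camelot's own code.

```python
PDF_DPI = 72     # PDF user space: 1 point = 1/72 inch
IMAGE_DPI = 300  # resolution cited in this thread for the rendered page image


def pdf_bbox_to_image_bbox(bbox, page_height_pt, image_dpi=IMAGE_DPI):
    """Map a PDF-space box to pixel coordinates on the rendered page image.

    bbox:           (x1, y1, x2, y2) in PDF points, origin at bottom-left.
    page_height_pt: page height in PDF points (assumed known, e.g. 792
                    for a US Letter page).
    Returns (left, top, right, bottom) in pixels, origin at top-left.
    """
    x1, y1, x2, y2 = bbox
    # Scale points to pixels.
    left = min(x1, x2) * image_dpi / PDF_DPI
    right = max(x1, x2) * image_dpi / PDF_DPI
    # Flip the y axis: PDF y grows upward, image y grows downward.
    top = (page_height_pt - max(y1, y2)) * image_dpi / PDF_DPI
    bottom = (page_height_pt - min(y1, y2)) * image_dpi / PDF_DPI
    return left, top, right, bottom


if __name__ == "__main__":
    # Hypothetical box on a 792-point-high page.
    print(pdf_bbox_to_image_bbox((72, 144, 288, 360), page_height_pt=792))
```

If the image was rendered at a different resolution, pass that value as `image_dpi`; the scale factor is simply the ratio of the two resolutions.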
@SWHL This really helped me understand the conversion. However, I have a similar problem: I have the coordinates of an object taken from a page image (the PDF page was converted into a page image). Now I want to convert these coordinates into camelot PDF-level coordinates. I tried to follow the above logic in reverse order, but it was not successful. |
@baleris You can try it by this: |
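For reference, the reverse direction (image pixels back to PDF points) can be sketched as below. This is an illustration under stated assumptions, not necessarily the snippet referred to above: it assumes the image was rendered at 300 DPI, that pixel boxes are (left, top, right, bottom) with the origin at the top-left, and that the PDF origin is at the bottom-left. The function and parameter names are hypothetical.

```python
def image_bbox_to_pdf_bbox(box, image_height_px, image_dpi=300, pdf_dpi=72):
    """Map a pixel-space box on the page image back to PDF points.

    box:             (left, top, right, bottom) in pixels, origin top-left.
    image_height_px: height of the full page image in pixels.
    Returns (x1, y1, x2, y2) in PDF points, origin bottom-left, with
    y1 the lower edge and y2 the upper edge.
    """
    left, top, right, bottom = box
    # Scale pixels to points.
    x1 = left * pdf_dpi / image_dpi
    x2 = right * pdf_dpi / image_dpi
    # Flip the y axis: measure from the bottom of the image instead of the top.
    y1 = (image_height_px - bottom) * pdf_dpi / image_dpi
    y2 = (image_height_px - top) * pdf_dpi / image_dpi
    return x1, y1, x2, y2


if __name__ == "__main__":
    # Hypothetical box on a 3300-pixel-high image (a 792-point page at 300 DPI).
    print(image_bbox_to_pdf_bbox((300, 1800, 1200, 2700), image_height_px=3300))
```

A common reason the reverse mapping appears not to work is omitting the y-axis flip, or using the crop height rather than the full page image height for `image_height_px`.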
@SWHL, this has not worked. When I checked the table coordinates detected by camelot, they are totally different. For example, for the above-mentioned coordinates, camelot's corresponding coordinates are (72.0, 295.2, 563.04, 648.72). |
@SWHL I see that in your solution above you are getting a page image from Any suggestions for getting an image with the "stream" parameter (borderless tables)? |
You can refer to this: Lines 35 to 40 in cd8ac79
This question is beyond the scope of this issue. I suggest opening a new issue to discuss it. |
Checklist
Describe the bug
Environment
OS: CentOS 7
Python: 3.7.11
camelot-py: 0.10.1
Reproduction
Bug fix