Replies: 1 comment
-
I have been having the same issue. The problem isn't specific to PdfPig: every library I have tried behaves the same way. For me it seems to occur with text that contains non-ASCII characters. The problem is discussed here: https://stackoverflow.com/questions/8039423/pdf-data-extraction-gives-symbols-gibberish My workaround is to render the PDF to images and then run OCR on them, but it is not 100% reliable.
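Since OCR is not fully reliable, it may help to use it only on text that actually came out garbled. Below is a minimal sketch in Python of such a check. It is not part of PdfPig or any other library: the names `suspicious_ratio` and `needs_ocr` and the 0.2 threshold are hypothetical, and the heuristic assumes the gibberish shows up as control, private-use, unassigned, or replacement characters. Garbled text made of ordinary printable symbols would not be caught by it.

```python
import unicodedata

# Unicode general categories treated as suspicious (an assumption of this sketch):
# control (Cc), private use (Co), unassigned (Cn), surrogate (Cs).
SUSPICIOUS_CATEGORIES = {"Cc", "Co", "Cn", "Cs"}

# Ordinary whitespace controls that legitimately appear in extracted text.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def suspicious_ratio(text: str) -> float:
    """Return the fraction of characters that look like extraction debris."""
    if not text:
        return 0.0
    bad = 0
    for ch in text:
        if ch in ALLOWED_CONTROLS:
            continue
        # U+FFFD is the replacement character emitted for undecodable input.
        if ch == "\ufffd" or unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
            bad += 1
    return bad / len(text)


def needs_ocr(text: str, threshold: float = 0.2) -> bool:
    """Hypothetical rule: fall back to OCR when too much of the text is suspicious."""
    return suspicious_ratio(text) > threshold
```

The idea would be to call `needs_ocr` on the text extracted from each page and only render and OCR the pages for which it returns `True`. Legitimate non-ASCII text such as accented letters is not flagged, because those characters belong to letter categories.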
-
I have a PDF that contains text, but after extracting it, some parts come out as readable text and the rest as symbols (see screenshot below).
How do I parse these symbols into text?