How do you Parse Symbols #419

olakusibe · 2022-02-01T10:07:08Z

I have a PDF that contains text, but after extracting it, some parts come out as readable text and the rest come out as symbols (see screenshot below).
How do I parse these symbols into text?

adscott1982 · 2024-07-03T13:28:26Z

I have been having the same issue. The problem isn't specific to PdfPig: every library I have tried behaves the same way. For me it seems to occur with text that contains non-ASCII characters.

The problem is referenced here:

https://stackoverflow.com/questions/8039423/pdf-data-extraction-gives-symbols-gibberish

My workaround is to render the PDF to images and then run OCR on them; however, it is not 100% reliable.
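
Since OCR is not fully reliable, one option is to use it only for text that actually looks garbled. Below is a minimal Python sketch of such a check, not part of PdfPig or any other library. The function names `suspicious_ratio` and `needs_ocr` and the `0.3` threshold are hypothetical and chosen only for illustration. It assumes the debris shows up as private-use, unassigned, control, or replacement characters; it will not catch text that was mapped to valid but wrong characters.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look like extraction debris."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    bad = 0
    for c in chars:
        category = unicodedata.category(c)
        # Co = private use, Cn = unassigned, Cc = control;
        # U+FFFD is the Unicode replacement character.
        if category in ("Co", "Cn", "Cc") or c == "\ufffd":
            bad += 1
    return bad / len(chars)


def needs_ocr(text: str, threshold: float = 0.3) -> bool:
    """Hypothetical rule: fall back to OCR when too much of the text is debris."""
    return suspicious_ratio(text) > threshold
```

The extracted text of each page could be passed to `needs_ocr`, with only the flagged pages being rendered and sent to OCR. The threshold would need tuning against real documents.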
