This repository was archived by the owner on Jul 11, 2024. It is now read-only.

pyPDF Unable to resolve IndirectObject getting pdf with empty pages #17

Open
@ambigus9

Description

I am trying to write a PDF file using the following code:

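import logging
import tempfile

import boto3
from PyPDF3 import PdfFileWriter, PdfFileReader

logger = logging.getLogger(__name__)

s3 = boto3.resource("s3")
bucket = s3.Bucket(my_s3Bucket_on_AWS)
obj = bucket.Object(my_s3_file_on_AWS)  # renamed to avoid shadowing the built-in `object`
tmp = tempfile.NamedTemporaryFile()   # holds the downloaded input PDF
tmp2 = tempfile.NamedTemporaryFile()  # receives the rewritten PDF

# Download step (not shown in the original snippet; assumed here)
obj.download_fileobj(tmp)
tmp.flush()

inputpdf = PdfFileReader(open(tmp.name, "rb"), strict=False)
num_pages = inputpdf.getNumPages()
output = PdfFileWriter()
for i in range(num_pages):
    logger.info(f"Adding page --> {i}")
    output.addPage(inputpdf.getPage(i))

logger.info("Here getting UserWarning")
# The with block closes the stream, so no explicit close() is needed
with open(tmp2.name, "wb") as output_stream:
    output.write(output_stream)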

This works perfectly for at least 10K PDFs, but one PDF produces the following warnings and the output PDF has empty pages:

UserWarning: Unable to resolve [IndirectObject: IndirectObject(7, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(9, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(10, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(13, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(16, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(20, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(24, 0)], returning NullObject instead [pdf.py:644]

UserWarning: Unable to resolve [IndirectObject: IndirectObject(29, 0)], returning NullObject instead [pdf.py:644]

Any suggestion about how to fix this?

Note: The PDF I am trying to read is not empty; it contains data.
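
Since only one file out of roughly 10K triggers this, it may help to flag the affected PDFs automatically instead of spotting the warnings in the logs. The following is only a minimal sketch using the standard `warnings` module. The helper names `call_and_collect_warnings` and `unresolved_references` are hypothetical, and it assumes the library emits these messages through `warnings.warn`. If the reader installs its own warning hook when it is constructed, the recording may be bypassed, so this should be checked against the installed version.

```python
import warnings


def call_and_collect_warnings(func, *args, **kwargs):
    """Run func and return (result, list of warning message strings).

    Hypothetical helper: func would be whatever wraps the
    read/copy/write steps for a single PDF.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including repeats of the same message
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    return result, [str(w.message) for w in caught]


def unresolved_references(messages):
    """Keep only the messages that report an unresolvable indirect object."""
    return [m for m in messages if "Unable to resolve" in m]
```

A file for which `unresolved_references(...)` returns a non-empty list could then be set aside for separate handling rather than written out with empty pages.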
