Clarify base64.a85(en,de)code documentation for Adobe mode #134837

Open
@dhdaines

Bug report

Bug description:

It seems that whitespace is allowed everywhere by base64.a85decode, except after the end-of-data delimiter b'~>' in adobe mode:

>>> base64.a85decode(b"6#q'\\F`JTK<-N74;eT`QF!;`!@:O(oDf,~>", adobe=True)
b'Arthur "Two-Sheds" Jackson'
>>> base64.a85decode(b"  6  # q' \\     F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf,~>", adobe=True)
b'Arthur "Two-Sheds" Jackson'
>>> base64.a85decode(b"  6  # q' \\     F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf,  ")
b'Arthur "Two-Sheds" Jackson'
>>> base64.a85decode(b"  6  # q' \\     F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf,~>  ", adobe=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.11/base64.py", line 388, in a85decode
    raise ValueError(
ValueError: Ascii85 encoded byte sequences must end with b'~>'

While this behaviour is actually compliant with the very latest PDF standard, including errata, in practice it's quite surprising. It also causes problems, because decades of ambiguous PDF standards have left a legacy of implementations that emit and accept extra whitespace due to these ambiguities.

A separate but related issue is that some very broken PDF implementations have even been known to insert whitespace between the ~ and > bytes. It may be useful for "Adobe" mode to be tolerant of this as well.

Obviously, also, PostScript doesn't care about extra whitespace after ~> in ASCII85 literal strings. (Note that the leading <~ is only accepted in PostScript and not in PDF).

Because > is a valid ASCII85 digit, an improved rule would be to only accept the regular expression ~\s*>\s* at the end of input in Adobe mode.
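As a rough illustration of that rule, here is a minimal sketch of a caller-side workaround. The helper name `a85decode_lenient` and the `_EOD` pattern are hypothetical and not part of the `base64` module; the sketch assumes the input is a `bytes` object and simply rewrites a trailing `~\s*>\s*` to `~>` before delegating to the existing `base64.a85decode(..., adobe=True)`.

```python
import base64
import re

# Hypothetical pattern: end-of-data marker with optional whitespace
# between '~' and '>' and after '>', anchored at the end of the input.
_EOD = re.compile(rb"~\s*>\s*\Z")


def a85decode_lenient(data: bytes) -> bytes:
    """Decode Adobe-style Ascii85, tolerating whitespace in and after b'~>'.

    Hypothetical helper for illustration only; not a base64 API.
    """
    # Normalise the tolerant end-of-data form to the strict b'~>'.
    normalized, count = _EOD.subn(b"~>", data)
    if count == 0:
        # No marker found at the end of input, even allowing whitespace.
        raise ValueError("Ascii85 data must end with b'~>' "
                         "(optionally surrounded by whitespace)")
    # Whitespace elsewhere is already ignored by a85decode itself.
    return base64.a85decode(normalized, adobe=True)
```

With this sketch, the failing input from the example above (trailing spaces after `~>`) should decode to the same bytes as the strict form, and an input ending in `~ >` is accepted too. Anchoring the pattern at the end of input matters: since `>` is a valid digit, only a `>` that follows `~` at the very end is treated as part of the delimiter.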

CPython versions tested on:

3.11

Operating systems tested on:

Linux
