Pickle INT opcode boolean conversion discrepancy #135241

Open
@Legoclones

Bug report

Bug description:

To preserve backwards compatibility with older versions of pickle where there were no opcodes to push bools onto the stack, the INT opcode has hardcoded logic to turn the payload I00\n into False and I01\n into True:

cpython/Lib/pickle.py

Lines 1383 to 1389 in 8fdbbf8

data = self.readline()
if data == FALSE[1:]:
    val = False
elif data == TRUE[1:]:
    val = True
else:
    val = int(data)
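
For context, the FALSE and TRUE names in the snippet above are module-level constants in pickle that hold the full legacy opcodes. The short sketch below prints them and shows that protocol 0 still emits the INT form for bools, while protocol 2 uses the dedicated one-byte opcodes. The variable names are illustrative only.

```python
import pickle

# Legacy bool encodings: the INT opcode "I" followed by "01\n" or "00\n".
print(pickle.TRUE, pickle.FALSE)

# Slicing off the opcode byte gives the values the unpickler compares against.
true_line = pickle.TRUE[1:]
false_line = pickle.FALSE[1:]
print(true_line, false_line)

# Protocol 0 has no dedicated bool opcode, so True is written as I01\n
# followed by the STOP opcode ".".
legacy_true = pickle.dumps(True, protocol=0)
print(legacy_true)

# Protocol 2 and later use the one-byte NEWTRUE/NEWFALSE opcodes instead.
modern_true = pickle.dumps(True, protocol=2)
print(modern_true)

# Both encodings round-trip to the bool singleton.
print(pickle.loads(legacy_true) is True, pickle.loads(modern_true) is True)
```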

cpython/Modules/_pickle.c

Lines 5258 to 5261 in 8fdbbf8

if (len == 3 && (x == 0 || x == 1)) {
    if ((value = PyBool_FromLong(x)) == NULL)
        return -1;
}

However, the two implementations differ slightly. The pure-Python pickle module compares the line against the hardcoded values 00\n and 01\n. _pickle (the accelerated version) only checks that the line is 3 bytes long and that strtol() parsed it to 0 or 1. Therefore, payloads like I 0\n and I+0\n load as False in _pickle and as 0 in pickle.
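
A minimal way to observe this is to load the same payload through both code paths. The sketch below assumes that pickle._loads (a private, undocumented name in Lib/pickle.py) is the pure-Python loader and that pickle.loads is backed by _pickle when the accelerator is available. The helper compare_int_payload is hypothetical. What the accelerated column prints depends on whether the interpreter build includes the proposed change, so no expected output is given for it.

```python
import pickle


def compare_int_payload(payload: bytes):
    """Load one pickle payload with both implementations.

    Returns (pure_python_result, default_result). The default loader is
    the C accelerator when _pickle is available, otherwise it is the
    same pure-Python code.
    """
    pure = pickle._loads(payload)   # private name: pure-Python unpickler
    default = pickle.loads(payload)
    return pure, default


# INT opcode "I", a two-character body, a newline, then the STOP opcode ".".
for body in (b"00", b"01", b" 0", b"+0", b" 1", b"+1"):
    payload = b"I" + body + b"\n."
    pure, default = compare_int_payload(payload)
    print(f"{payload!r}: pure-Python={pure!r} default={default!r}")
```

On a build without the fix, the rows for the space-prefixed and plus-prefixed bodies are where the two columns would be expected to disagree, per the description above.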

While 1/True and 0/False behave identically in most scenarios in Python, they are still different values, and in some situations (such as calling type() on them, or evaluating 1 is True) they give different results. I know the situations in which this difference would matter are rare (probably why no one has reported it yet), but it can't hurt to align the two implementations.
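
As a small illustration of where the distinction becomes visible, the sketch below compares a bool and an int that are equal by value. The json.dumps line is one example chosen for illustration and is not taken from the report.

```python
import json

flag = True
number = 1

# Equal by value, and usable interchangeably as dict keys or in arithmetic.
print(flag == number)

# Different types, and different objects under an identity check.
same_type = type(flag) is type(number)
same_object = flag is number
print(same_type, same_object)

# Different textual forms, which matters once the value is serialized.
print(repr(flag), repr(number))
print(json.dumps(flag), json.dumps(number))
```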

I have attached a PR here that changes the logic in _pickle.c to check for the specific values 00 and 01.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)