Removed the new LONG2 opcode: it's extravagant. If LONG1 isn't enough, then the embedded argument consumes at least 256 bytes. The difference between a 3-byte prefix (LONG2 + 2 bytes) and a 5-byte prefix (LONG4 + 4 bytes) is at worst less than 1%. Note that binary strings and binary Unicode strings also have only "size is 1 byte, or size is 4 bytes?" flavors, and I expect for the same reason. The only place a 2-byte thingie was used was in BININT2, where the 2 bytes make up the *entire* embedded argument (and now EXT2 also does this); that's a large savings over 4 bytes, because the total opcode+argument size is so small in the BININT2/EXT2 case.

Removed the TAKEN_FROM_ARGUMENT "number of bytes" code, and bifurcated it into TAKEN_FROM_ARGUMENT1 and TAKEN_FROM_ARGUMENT4. Now there's enough info in ArgumentDescriptor objects to deduce the # of bytes consumed by each opcode.

Rearranged the order in which proto2 opcodes are listed in pickle.py.
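
The "less than 1%" figure can be checked with a little arithmetic. The sketch below is illustrative only and is not part of this commit: `total_size` and `long4_penalty` are hypothetical helper names, and it assumes a pickled long costs one opcode byte, plus the length field, plus the payload. It is written for current Python, where `/` is true division.

```python
def total_size(payload_bytes, length_field_bytes):
    """Bytes used by one opcode byte, its length field, and the payload.

    Hypothetical helper for illustration; not taken from pickle.py.
    """
    return 1 + length_field_bytes + payload_bytes


def long4_penalty(payload_bytes):
    """Relative growth from using a 4-byte length field instead of 2 bytes."""
    with_2_byte_field = total_size(payload_bytes, 2)  # the removed LONG2
    with_4_byte_field = total_size(payload_bytes, 4)  # LONG4
    return (with_4_byte_field - with_2_byte_field) / with_2_byte_field


# Worst case for longs: 256 bytes is the smallest payload LONG1 cannot hold.
worst_long_case = long4_penalty(256)

# Contrast: in BININT2/EXT2 the 2 bytes are the whole argument, so there is
# no further payload and the same 2 extra bytes matter far more.
no_payload_case = long4_penalty(0)
```

With a 256-byte payload the two encodings are 259 and 261 bytes, so the penalty is 2/259, about 0.77%, and it only shrinks as the payload grows. With no further payload the totals are 3 and 5 bytes, which is why a 2-byte form still pays off for BININT2 and EXT2.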
2025-10-09 16:34:44 +00:00 · 2003-01-28 00:13:19 +00:00 · 2003-01-28 00:13:19 +00:00 · fdb8cfab08
commit fdb8cfab08
parent bdbe74183c
2 changed files with 19 additions and 58 deletions
--- a/Lib/pickle.py
+++ b/Lib/pickle.py
@@ -135,19 +135,18 @@ FALSE           = 'I00\n'  # not an opcode; see INT docs in pickletools.py

 # Protocol 2 (not yet implemented) (XXX comments will be added later)

-NEWOBJ          = '\x81'
 PROTO           = '\x80'
-EXT2            = '\x83'
+NEWOBJ          = '\x81'
 EXT1            = '\x82'
-TUPLE1          = '\x85'
+EXT2            = '\x83'
 EXT4            = '\x84'
-TUPLE3          = '\x87'
+TUPLE1          = '\x85'
 TUPLE2          = '\x86'
-NEWFALSE        = '\x89'
+TUPLE3          = '\x87'
 NEWTRUE         = '\x88'
-LONG2           = '\x8b'
+NEWFALSE        = '\x89'
 LONG1           = '\x8a'
-LONG4           = '\x8c'
+LONG4           = '\x8b'


 __all__.extend([x for x in dir() if re.match("[A-Z][A-Z0-9_]+$",x)])