Fixed #24242 -- Improved efficiency of utils.text.compress_sequence()

The function no longer flushes zfile after each write, as doing so can
make the gzipped streamed content larger than the original content:
each flush adds a 5- or 6-byte type 0 (stored) block. Without the flush,
buf.read() may return nothing, so only yield when it returns data.
Testing shows that without flush() the buffer is still flushed every 17k
or so, and the output compresses as well as if the content had been
compressed as a single string.
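
The overhead described above can be reproduced with a minimal sketch that uses only the standard library. The helper name gzipped_size and the sample data are hypothetical and exist only for this illustration; they are not part of the commit.

```python
import gzip
import io


def gzipped_size(chunks, flush_each_write):
    """Gzip the chunks into memory and return the compressed size in bytes."""
    buf = io.BytesIO()
    zfile = gzip.GzipFile(mode='wb', compresslevel=6, fileobj=buf)
    for chunk in chunks:
        zfile.write(chunk)
        if flush_each_write:
            # A sync flush ends the current deflate block and appends an
            # empty stored (type 0) block, which costs a few bytes each time.
            zfile.flush()
    # close() writes the gzip trailer; it does not close buf because the
    # file object was supplied by the caller.
    zfile.close()
    return len(buf.getvalue())


# Hypothetical input: many tiny chunks, the worst case for per-write flushing.
chunks = [b'ab'] * 500
original = sum(len(c) for c in chunks)
flushed = gzipped_size(chunks, flush_each_write=True)
unflushed = gzipped_size(chunks, flush_each_write=False)
print(original, flushed, unflushed)
```

With small chunks like these, the flushed output should come out larger than the uncompressed input, while the unflushed output should be much smaller. Exact sizes depend on the zlib build.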
Matthew Somerville 2015-01-29 07:59:41 +00:00 committed by Tim Graham
parent 2730dad0d7
commit caa3562d5b
2 changed files with 16 additions and 2 deletions

@@ -302,6 +302,8 @@ class StreamingBuffer(object):
         self.vals.append(val)
 
     def read(self):
+        if not self.vals:
+            return b''
         ret = b''.join(self.vals)
         self.vals = []
         return ret
@@ -321,8 +323,9 @@ def compress_sequence(sequence):
     yield buf.read()
     for item in sequence:
         zfile.write(item)
-        zfile.flush()
-        yield buf.read()
+        data = buf.read()
+        if data:
+            yield data
     zfile.close()
     yield buf.read()
 
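
To try the patched logic outside Django, the following is a self-contained sketch of the code after this change. Only the lines visible in the diff come from the commit. The __init__, write, flush and close methods of StreamingBuffer, and the GzipFile arguments, are assumptions filled in so that the example runs. The sample chunks are hypothetical.

```python
import gzip
from gzip import GzipFile


class StreamingBuffer(object):
    """Minimal write-only buffer that hands back whatever was written so far."""

    def __init__(self):  # assumed, not shown in the diff
        self.vals = []

    def write(self, val):
        self.vals.append(val)

    def read(self):
        # Added by the commit: return early when nothing has been written.
        if not self.vals:
            return b''
        ret = b''.join(self.vals)
        self.vals = []
        return ret

    def flush(self):  # assumed no-op so GzipFile can call it
        return

    def close(self):  # assumed no-op
        return


def compress_sequence(sequence):
    buf = StreamingBuffer()
    # Constructor arguments are an assumption for this sketch.
    zfile = GzipFile(mode='wb', compresslevel=6, fileobj=buf)
    yield buf.read()  # gzip header written by the constructor
    for item in sequence:
        zfile.write(item)
        data = buf.read()
        if data:  # skip writes that produced no compressed output yet
            yield data
    zfile.close()
    yield buf.read()  # remaining compressed data plus the gzip trailer


# Hypothetical input used only to exercise the generator.
chunks = [b'hello world ' * 10] * 50
pieces = list(compress_sequence(chunks))
roundtrip = gzip.decompress(b''.join(pieces))
print(len(pieces), roundtrip == b''.join(chunks))
```

How many pieces are yielded depends on how much data zlib and GzipFile buffer internally before emitting output, so the count can vary between Python versions.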