-
This thread provides the exact details: #520

Based on that discussion, I did some verification with a CSV file of equivalent size (~1 GB); it took approximately 5 minutes. A .gz file of ~750 MB took around 21 minutes.

Why would smart_open treat a .gz file differently when the request is to read/write it as binary? Is smart_open trying to gunzip the file and re-stream it as gzip to the destination?

Any response to help move forward is greatly appreciated. Thank you!
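If smart_open is indeed transparently (de)compressing based on the .gz extension, that would account for the gap, and it can be turned off so the bytes pass through untouched. A minimal sketch, assuming smart_open >= 5.x (where the `compression` keyword exists; older releases used `ignore_ext=True`) and a placeholder URI:

```python
from smart_open import open as sopen

# By default smart_open infers compression from the extension, so a ".gz"
# URI is gunzipped on read and re-gzipped on write. Passing
# compression="disable" keeps the stream as raw bytes.
# (Keyword per smart_open >= 5.x; the URI below is a placeholder.)
with sopen("s3://my-bucket/data/file.gz", "rb", compression="disable") as fin:
    header = fin.read(2)  # b"\x1f\x8b" -- still compressed gzip bytes
```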
-
Prior to smart_open, copying a large .gz file (~850 MB) from a us-east bucket to a us-west bucket meant downloading to a local EC2 instance from us-east, then uploading the disk file to the us-west bucket: approximately 3 minutes.

Implementing smart_open with the following chunk size:
```python
chunk_size = 64 * 1024 * 1024  # 64 MiB
```
the stream copy from us-east to us-west takes upwards of 16 minutes. Is there any way to improve the stream-copy performance? (A sketch of the copy in question follows below.)
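For concreteness, a minimal sketch of the stream copy being described, assuming smart_open >= 5.x; bucket names and keys are placeholders. `compression="disable"` avoids a needless gunzip/re-gzip of the .gz payload, and `buffer_size` / `min_part_size` are smart_open's S3 transport parameters for read buffering and multipart-upload part size, the usual knobs for throughput:

```python
from smart_open import open as sopen

chunk_size = 64 * 1024 * 1024  # 64 MiB per read, as above

# Placeholder URIs -- substitute real buckets/keys.
src = "s3://my-bucket-us-east/path/file.gz"
dst = "s3://my-bucket-us-west/path/file.gz"

# Larger buffers mean fewer round trips per GB transferred.
params = {"buffer_size": chunk_size, "min_part_size": chunk_size}

with sopen(src, "rb", compression="disable", transport_params=params) as fin, \
     sopen(dst, "wb", compression="disable", transport_params=params) as fout:
    while True:
        chunk = fin.read(chunk_size)
        if not chunk:
            break
        fout.write(chunk)
```

Note that a single-threaded byte loop like this routes every byte through the client, so it is unlikely to match the download-then-upload baseline, let alone a server-side copy; boto3's managed `copy()` (which uses multipart UploadPartCopy) keeps the data inside AWS and may be the faster option for a plain bucket-to-bucket copy.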
Thank you!