@gourneau
Created December 4, 2011 18:33
Download large files with Python urllib2 to a temp directory
import os
import urllib2
import math


def downloadChunks(url):
    """Helper to download large files.

    The only argument is a URL. The file is saved to a temp directory
    and downloaded in chunks, printing the percentage downloaded so far.
    Returns the local file path, or False on an HTTP or URL error.
    """
    baseFile = os.path.basename(url)

    # the file is saved under /tmp using the URL's base name;
    # umask 0002 leaves newly created files group-writable
    os.umask(0002)
    temp_path = "/tmp/"
    try:
        file = os.path.join(temp_path, baseFile)
        req = urllib2.urlopen(url)
        total_size = int(req.info().getheader('Content-Length').strip())
        downloaded = 0
        CHUNK = 256 * 10240  # 2,621,440 bytes per read
        with open(file, 'wb') as fp:
            while True:
                chunk = req.read(CHUNK)
                if not chunk:
                    break
                fp.write(chunk)
                downloaded += len(chunk)
                # float() avoids Python 2 integer division, which would print 0
                print math.floor((float(downloaded) / total_size) * 100)
    except urllib2.HTTPError, e:
        print "HTTP Error:", e.code, url
        return False
    except urllib2.URLError, e:
        print "URL Error:", e.reason, url
        return False
    return file

# use it like this
# downloadChunks("http://localhost/a.zip")
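
The gist targets Python 2, and urllib2 does not exist in Python 3. For readers on Python 3, the same chunked approach might look roughly like the sketch below. This is not the author's code: the names copy_in_chunks, download_chunks and dest_dir are hypothetical, tempfile.mkdtemp() is swapped in for the hard-coded /tmp/ path, the "download" fallback file name is an assumption, and errors are left to propagate instead of returning False.

```python
import os
import tempfile
import urllib.request


def copy_in_chunks(src, dst, total_size=None, chunk_size=256 * 10240):
    """Copy src to dst one chunk at a time; return the number of bytes copied."""
    downloaded = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        downloaded += len(chunk)
        if total_size:
            # Multiply before floor-dividing so the percentage is not truncated to 0
            print(downloaded * 100 // total_size)
    return downloaded


def download_chunks(url, dest_dir=None):
    """Download url into dest_dir (a fresh temp directory by default)."""
    dest_dir = dest_dir or tempfile.mkdtemp()
    # Assumption: the URL ends in a usable file name; otherwise fall back to "download"
    path = os.path.join(dest_dir, os.path.basename(url) or "download")
    with urllib.request.urlopen(url) as resp, open(path, "wb") as fp:
        # Content-Length may be absent, in which case no progress is printed
        length = resp.headers.get("Content-Length")
        copy_in_chunks(resp, fp, int(length) if length else None)
    return path
```

Splitting the copy loop from the network call keeps the chunking logic usable with any file-like object, which also makes it easy to exercise without a server.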
@MrMitch
MrMitch commented Feb 9, 2013

Is there a particular reason to have chosen CHUNK = 256 * 10240 (2560 KB?) over any other chunk size?
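
The arithmetic in the question can be checked with a short snippet. It only confirms the size of the constant (written here in Python 3 syntax); the gist itself gives no reason for choosing this value.

```python
CHUNK = 256 * 10240        # value used in the gist
print(CHUNK)               # size in bytes
print(CHUNK / 1024)        # size in KiB
print(CHUNK / 1024 ** 2)   # size in MiB
```

So each read requests 2,621,440 bytes, which is 2560 KiB or 2.5 MiB.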

ghost commented Oct 7, 2015

thank you!

@closedLoop

downloaded should be converted to a float on either line 23 or 29 so that the print statement doesn't do the division with integers and return 0s
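
The integer-division problem described in this comment can be reproduced with a small sketch. It is written in Python 3 syntax, where `//` stands in for what Python 2's `/` does with two integers; the sample values 512 and 2048 are made up for illustration.

```python
import math

downloaded, total_size = 512, 2048  # hypothetical byte counts

# Python 2's int / int behaves like floor division: the ratio collapses to 0
truncated = math.floor((downloaded // total_size) * 100)

# Converting one operand to float first keeps the fractional part
fixed = math.floor((float(downloaded) / total_size) * 100)

print(truncated, fixed)
```

With integer operands the ratio stays at 0 until the download completes, so the progress output is useless until the very last chunk.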
