Python Public Photobucket Gallery Downloader

Transcribe
by Transcribe · 8 posts
11 years ago in Python
Posted 11 years ago · Author
This will download all public images in a Photobucket gallery.

Usage:
Just enter a directory for the images and a link to a Photobucket gallery.

[code=python file=Untitled.txt]
from sgmllib import SGMLParser
import urllib, re, os
dirs = []
pages = []
check = True
total = 0
count = 0
length = 40
def mkdir():
    # Ask for the download folder, create it if needed, and chdir into it.
    global DLlocation
    DLlocation = raw_input('Enter download folder:\n')
    if DLlocation:
        root, tail = os.path.splitdrive(DLlocation)
        os.chdir(root+os.sep)
        try:
            os.makedirs(DLlocation)
            os.chdir(DLlocation)
            DLlocation = os.getcwd()
        except:
            # makedirs typically fails because the folder already exists; try to use it as-is.
            try:
                os.path.exists(DLlocation)
                os.chdir(DLlocation)
                DLlocation = os.getcwd()
            except:
                print "Entered bad pathname."
                mkdir()
    else:
        print "Did not enter pathname."
        mkdir()
def getimgs():
    # Ask for the gallery URL, find all of its pages, then download each page.
    site = raw_input('Enter photobucket gallery:\n')
    if site:
        # Strip a trailing file name or query string so only the gallery path remains.
        site = re.split('[a-zA-Z0-9_.?=-]*$',site)[0]
        try:
            scanpages(site)
        except:
            print "Entered bad URL."
            getimgs()
        try:
            for page in pages:
                print "Downloading pictures from page "+str(pages.index(page)+1)+"..."+" "*15+"\n"
                getpics(page)
            print "\nCompleted downloading all "+str(count)+" images in gallery."+" "*10+"\n"
        except:
            print "Error downloading images."
            getimgs()
    else:
        print "Did not enter URL."
        getimgs()
def dlbar():
    # Build a text progress bar of `length` characters from count/total.
    barnum = count*length/total
    blanknum = length-barnum
    bars = '='*barnum
    blanks = ' '*blanknum
    return "["+bars+blanks+"]"
def DLpic(url,filename):
    # Fetch one image and write it into the download folder.
    sock = urllib.urlopen(url)
    img = sock.read()
    sock.close()
    location = open(DLlocation+os.sep+filename, "wb")
    location.write(img)
    location.close()
def getpics(site):
    # Download every gallery image linked from one gallery page.
    sock = urllib.urlopen(site)
    parser = Parser()
    parser.feed(sock.read())
    parser.close()
    sock.close()
    images = []
    # Thumbnails are named th_<name>.<ext>; the full image drops the th_ prefix.
    imgName = re.compile(r'th_[a-zA-Z0-9_-]+\.[a-z]+$')
    for url in parser.imgs:
        idx = parser.imgs.index(url)
        try:
            img = imgName.search(url)
            file = img.group()
            fullimg = file.replace('th_', '')
            url = url.replace(file, fullimg)
            file = file.replace('th_', '')
            images.append(url)
        except:
            # Not a thumbnail URL (search returned None); skip it.
            continue
    if len(images) == 0:
        print "No images on page."
        getimgs()
    # The first five thumbnail matches on a page are not treated as gallery images.
    images = images[5:]
    for url in images:
        global count
        count = count + 1
        idx = images.index(url)
        regx = re.compile(r'[a-zA-Z0-9_-]+\.[a-z]+$')
        fullimg = regx.search(url)
        done = str(idx+1)
        file = fullimg.group()
        # '\033[A' moves the cursor up one line so the status text is redrawn in place.
        if int(done) < len(images):
            print('\033[AImage '+str(done)+'/'+str(len(images))+' on current page. '+str(count)+'/'+str(total)+' total.'+' '*5)
        else:
            pagenum = count/20
            if count%20:
                pagenum=pagenum+1
            print('\033[A\033[AAll images on page '+str(pagenum)+' downloaded.'+' '*20)
        print(dlbar()+'\033[A')
        DLpic(url,file)
def scanpages(site):
    # Walk the gallery 20 images at a time (?start=0, 20, 40, ...) until the last page.
    num = 0
    print "Scanning for end page..."
    while True:
        imgs = []
        link = site+'?start='
        usock = urllib.urlopen(link+str(num))
        parser = Parser()
        parser.feed(usock.read())
        parser.close()
        usock.close()
        for url in parser.imgs:
            imgName = re.compile(r'th_[a-zA-Z0-9_-]+\.[a-z]+$')
            try:
                img = imgName.search(url)
                file = img.group()
                imgs.append(file)
            except:
                continue
        # A full page yields 25 thumbnail matches (20 gallery images plus 5 others);
        # anything else is taken to be the last page.
        if not len(imgs) == 25:
            global total
            total = len(imgs)-5+num
            pages.append(link+str(num))
            print "Found end page: "+str(num/20+1)
            break
        pages.append(link+str(num))
        num = num + 20
    return
class Parser(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.imgs = []
    def start_img(self, attrs):
        src = [v for k, v in attrs if k=='src']
        if src:
            self.imgs.extend(src)
mkdir()
getimgs()[/code]
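The core trick in the script is turning a thumbnail URL (file name starting with th_) into the full-size image URL. Here is a minimal sketch of just that step on its own, so it can be tried without hitting the site. The function name `thumb_to_full` and the example.com URLs are made up for illustration and are not part of the script above; the pattern is the script's thumbnail pattern with the dot escaped.

```python
import re

# Thumbnail file name at the end of a URL, e.g. "th_cat.jpg".
# Hypothetical helper; mirrors the script's pattern with the dot escaped.
THUMB_RE = re.compile(r'th_([a-zA-Z0-9_-]+\.[a-z]+)$')

def thumb_to_full(url):
    """Return the full-size URL for a thumbnail URL, or None if it is not one."""
    match = THUMB_RE.search(url)
    if match is None:
        return None
    # Keep everything before "th_" and append the file name without the prefix.
    return url[:match.start()] + match.group(1)

# Made-up URL, only to show the rewrite.
print(thumb_to_full("http://example.com/albums/a1/user/th_cat-01.jpg"))
```

Returning None for non-thumbnail URLs replaces the bare try/except used in the script, which is one possible way to make the skip explicit.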
Posted 5 years ago
Interesting. I'll give it a shot eventually.
Posted 5 years ago
This code is outdated; it seems to be written in Python 2, and some of its libraries are deprecated.
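For anyone porting it: `sgmllib` was removed in Python 3, `urllib.urlopen` moved to `urllib.request.urlopen`, `raw_input` became `input`, and `print` is a function. Below is a rough sketch of how the script's `Parser` class might look on Python 3 using the standard `html.parser` module. The class name `ImgSrcParser` is my own and this is only one possible port, not a tested replacement for the whole script.

```python
from html.parser import HTMLParser

class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.imgs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        # Self-closing <img ... /> tags also end up here by default.
        if tag == 'img':
            self.imgs.extend(v for k, v in attrs if k == 'src' and v)

parser = ImgSrcParser()
parser.feed('<p><img src="a/th_one.jpg"><img alt="x"><img src="b/th_two.png"/></p>')
parser.close()
print(parser.imgs)
```

Like the original class, it keeps the URLs in an `imgs` list, so the rest of the script could read `parser.imgs` the same way.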
Posted 5 years ago
@DataMine


Yup. It would probably still work if it had been written in Python 3; since it is Python 2, some of the features it uses are no longer supported. Now that Python 3 has been out for a while, more Python 2 features are being deprecated, as you suspected. Many people still use Python 2 for specific situations, but moving to Python 3 is the better choice.

When time frees up, I'm considering adjusting this code for Python 3. I'm not sure it's worth it though; I have some other things I want to work on and add to the Python section. To be honest, this script would have been cool to tinker with. To me it's also still good for learning how the script itself works.
Posted 5 years ago
@Thirteen100


I didn't write this code but I could maybe update it some time.

The problem, though, is having Python 3 installed in the first place. My work here requires that I have Python 2 installed, and if I install both 2 and 3 it causes conflicts with files. I wish I knew how to have them both installed and choose the version I want to run when opening/writing scripts or working in PyCharm without it being a chore.
Posted 5 years ago
@DataMine


When installing Python 3, make sure to add it to your PATH. Sorry, I don't really use PyCharm, though I may start using it more often in the future. In PyCharm's settings you will need to tell it which Python version you want to use by navigating to the folder where Python 3 is installed. c:\
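A quick way to confirm which interpreter a script is actually running under (handy when two versions are installed, or when checking what PyCharm picked) is to print it from the script itself. This is just a small sketch; the function name `describe_interpreter` is made up, and it runs on both Python 2 and Python 3.

```python
import sys

def describe_interpreter():
    """Return a short string naming the running Python version and its path."""
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    return "Python %s at %s" % (version, sys.executable)

print(describe_interpreter())
```

On Windows, assuming the `py` launcher that ships with the Python 3 installer is present, `py -2 script.py` and `py -3 script.py` select the version per run without changing PATH.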
Posted 5 years ago
@Thirteen100


I hear ya about PyCharm, I only recently (as in last night) started to like using it. Up until last night I couldn't get a simple script to run in it. Had all sorts of issues, from import errors (even when copy/pasted straight from tutorials) to missing environments. It was a headache.

I'll try installing 3 again soon and see if I can figure out a comfortable solution to switching between versions in different environments.
