image - Scrape a JPG file from a webpage, then save it using Python


OK, I'm trying to scrape a JPG image from the Gucci website. Take this one as an example:

http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_f4cyg_4080_001_web_full_new_theme.jpg

I tried urllib.urlretrieve, but it doesn't work because Gucci has blocked that function. So I wanted to use requests to scrape the source code of the image, and then write it to a .jpg file.

image = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_f4cyg_4080_001_web_full_new_theme.jpg").text.encode('utf-8') 

I encoded it because if I don't, it keeps telling me that gbk cannot encode the string.

Then:

with open('1.jpg', 'wb') as f:
    f.write(image)

Looks right? But the result is that the JPG file cannot be opened. There's no image! Windows tells me the JPG file is damaged.

What could be the problem?

  1. I'm thinking that maybe when I scraped the image, I lost some information, or some characters were wrongly scraped. But how can I find out which?

  2. I'm thinking that maybe some information is lost via encoding. But if I don't encode, I cannot even print it, not to mention writing it to a file.

What could have gone wrong?

I'm not sure about the purpose of your use of encode. You're not working with text, you're working with an image. You need to access the response as binary data, not as text, and use image manipulation functions rather than text ones. Try this:

from io import BytesIO

import requests
from PIL import Image

response = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_f4cyg_4080_001_web_full_new_theme.jpg")
# response.content holds the raw bytes of the reply; wrap them in a file-like object
image_bytes = BytesIO(response.content)
image = Image.open(image_bytes)
image.save("1.jpg")

Note the use of response.content instead of response.text. You will need to have PIL or Pillow installed to use the Image module. BytesIO is included in Python 3.
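To see why going through text damages the file, here is a minimal sketch that needs no network access. The byte string `raw` is a hypothetical stand-in for the first bytes of response.content, and UTF-8 is only an assumed charset (requests guesses one when the server does not declare it). The point is just that decoding binary data as text and encoding it again does not give back the original bytes.

```python
# Hypothetical stand-in for response.content: the opening bytes of a JPEG file.
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF"

# Roughly what response.text does: decode the bytes as text.
# The charset here (UTF-8) is an assumption for illustration.
# errors="replace" swaps every invalid byte sequence for U+FFFD.
as_text = raw.decode("utf-8", errors="replace")

# What .encode('utf-8') then hands to f.write().
round_tripped = as_text.encode("utf-8")

print("identical after round trip:", raw == round_tripped)
print("original starts with:", raw[:2])
print("round trip starts with:", round_tripped[:3])
```

The original bytes start with the JPEG marker FF D8, while the round-tripped bytes start with the UTF-8 encoding of the replacement character, so an image viewer no longer recognises the file.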

Or you can just save the data straight to disk without looking at what's inside:

import requests

response = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_f4cyg_4080_001_web_full_new_theme.jpg")
# Open in binary mode and write the raw bytes unchanged
with open('1.jpg', 'wb') as f:
    f.write(response.content)
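If you save the bytes without opening them, a cheap sanity check can still catch the case where the server sent something other than an image (an HTML error page, for example). The sketch below is only one possible approach: `looks_like_jpeg` and `save_image_bytes` are hypothetical helper names, and the check relies only on the fact that JPEG data begins with the bytes FF D8. The `sample` bytes stand in for response.content so the example runs offline.

```python
import os
import tempfile

JPEG_START = b"\xff\xd8"  # Start Of Image marker that opens every JPEG file


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the bytes begin with the JPEG Start Of Image marker."""
    return data[:2] == JPEG_START


def save_image_bytes(data: bytes, path: str) -> None:
    """Write raw bytes to disk, refusing data that is clearly not a JPEG."""
    if not looks_like_jpeg(data):
        raise ValueError("data does not start with the JPEG marker FF D8")
    with open(path, "wb") as f:  # binary mode, no text encoding involved
        f.write(data)


# Stand-in bytes for illustration; with requests this would be response.content.
sample = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + b"\xff\xd9"
target = os.path.join(tempfile.mkdtemp(), "1.jpg")
save_image_bytes(sample, target)
```

This does not prove the image is valid, only that it is not obviously the wrong kind of data. Opening it with Pillow, as in the first snippet, is a more thorough check.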
