Web Scraper for prnt.sc (Lightshot)

JohnTop

Bronze User
22 January 2020
Hi everyone

Lightshot is a program for PC and Mac used to take screenshots. These screenshots are uploaded to the site prnt.sc, where they are publicly accessible online. A screenshot can be reached at a link of the form prnt.sc/XXXXXX, where each X is a digit or a letter.
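
As a rough illustration of that link format, here is a minimal sketch that checks whether a string looks like a screenshot ID and builds the corresponding link. The names `ID_PATTERN`, `is_screenshot_id` and `screenshot_url` are hypothetical, and the sketch assumes IDs are exactly six characters drawn from digits and lowercase letters; the site may well accept other lengths or characters.

```python
import re

BASE_URL = "https://prnt.sc/"          # site that hosts the screenshots
# Assumption: an ID is six characters, each a digit or a lowercase letter.
ID_PATTERN = re.compile(r"[0-9a-z]{6}")


def is_screenshot_id(candidate: str) -> bool:
    """Return True if candidate matches the assumed six-character ID format."""
    return ID_PATTERN.fullmatch(candidate) is not None


def screenshot_url(screenshot_id: str) -> str:
    """Build the page link for an ID, rejecting strings that do not fit the format."""
    if not is_screenshot_id(screenshot_id):
        raise ValueError(f"not a valid screenshot ID: {screenshot_id!r}")
    return BASE_URL + screenshot_id


if __name__ == "__main__":
    print(screenshot_url("qnz0a7"))
```

Validating the ID before building the link keeps malformed strings from ever reaching the site.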

I wrote a small program in Python 3.8 that finds screenshots on the site and saves them to your own PC.


Python:
import requests
import os
from bs4 import BeautifulSoup
import shutil
import time

URL_base = "https://prnt.sc/"
const = "qnz"
# create ./images/<const> (including ./images) if it does not exist yet
dirName = os.path.join(".", "images", const)
os.makedirs(dirName, exist_ok=True)
chars = "0123456789abcdefghijklmnopqrstuvwxyz"      #characters tried in each position
for var0 in chars:
    for var1 in chars:
        for var2 in chars:
            var = var0 + var1 + var2                #last three characters of the ID
            URL = URL_base + const + var
            print("Connecting to...")
            print(URL)                              #page URL to scrape
            #getting the HTML of the page
            page = requests.get(URL, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36'})
            soup = BeautifulSoup(page.content, 'html.parser')
            print("Response status code: ", page.status_code)
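            #find() returns None when the page has no such <img> tag
            img_tag = soup.find('img', id='screenshot-image')
            image_url = img_tag.get('src') if img_tag else None
            print("Image URL: ", image_url)
            #skip missing src values and anything that is not an absolute http(s) URL
            if image_url and image_url.startswith('http'):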
                #getting the image itself
                resp_img = requests.get(image_url, stream=True)
                filename = os.path.join(dirName, const + var + '.png')
                resp_img.raw.decode_content = True
                #the with-block closes the file even if the copy fails
                with open(filename, 'wb') as local_file:
                    shutil.copyfileobj(resp_img.raw, local_file)
                print("Saved: ", filename)
            else:
                print("Image not available")
            time.sleep(0.02)
            print()
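
The three nested loops can also be written with `itertools.product`, which yields the same suffixes in the same order. This is only a minimal sketch: `ALPHABET`, `suffixes` and `page_urls` are hypothetical names, it assumes six-character IDs, and it makes no network request.

```python
from itertools import product

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"   # same characters as the nested loops
URL_BASE = "https://prnt.sc/"
ID_LENGTH = 6                                       # assumption: IDs are six characters long


def suffixes(length=3):
    """Yield every string of the given length over ALPHABET, last character varying fastest."""
    for combo in product(ALPHABET, repeat=length):
        yield "".join(combo)


def page_urls(prefix):
    """Yield (screenshot_id, page_url) for every ID that starts with prefix."""
    for suffix in suffixes(ID_LENGTH - len(prefix)):
        screenshot_id = prefix + suffix
        yield screenshot_id, URL_BASE + screenshot_id


if __name__ == "__main__":
    ids = list(page_urls("qnz"))
    print(len(ids))      # 36**3 = 46656 pages for one three-character prefix
    print(ids[0])        # ('qnz000', 'https://prnt.sc/qnz000')
```

With a 0.02 s pause per page, the sleep time alone adds up to roughly 15 and a half minutes per prefix (46656 × 0.02 s ≈ 933 s), not counting the time taken by the requests themselves.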