Faster scraping

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://news.ycombinator.com/"
STORY_LINKS = []

for i in range(10):
    resp = requests.get(f"{BASE_URL}news?p={i}")
    soup = BeautifulSoup(resp.content, "html.parser")
    stories = soup.find_all("a", attrs={"class":"storylink"})
    links = [x["href"] for x in stories if "http" in x["href"]]
    STORY_LINKS += links
    time.sleep(0.25)

print(len(STORY_LINKS))

for url in STORY_LINKS[:3]:
    print(url)

Posted by: Guest on June-08-2021

Source

Code answers related to "Faster scraping"

Code answers related to "Whatever"

Browse Popular Code Answers by Language

Answers for "Faster scraping"

Code answers related to "Faster scraping"

Code answers related to "Whatever"

Browse Popular Code Answers by Language

Popular Programming Languages

Advertisements

Company

Compilers

Help

Connect with us