Python Web Scraping Tutorial

Photo by Ilya Pavlov on Unsplash

Introduction:

Prerequisites:

Tutorial:

Installing the Dependencies:

pip install requests beautifulsoup4

from bs4 import BeautifulSoup
import requests
URL = "https://web.archive.org/web/20200518073855/https://www.empireonline.com/movies/features/best-movies-2/"

response = requests.get(URL)
website_html = response.text
Printing response at this point shows <Response [200]>, meaning the request succeeded.
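The request above assumes the archive always answers quickly and successfully. A minimal sketch of a more defensive version follows; the helper name fetch_html and the 10-second default timeout are assumptions for illustration, not part of the original tutorial.

```python
import requests


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text, raising on HTTP errors."""
    # timeout (in seconds) keeps the call from hanging on an unresponsive server
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses instead of
    # silently returning the HTML of an error page
    response.raise_for_status()
    return response.text
```

With this helper, website_html = fetch_html(URL) would stand in for the requests.get(URL) and response.text lines.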

Extracting data:

soup = BeautifulSoup(website_html, "html.parser")
print(soup.prettify())
Output: the page's HTML, now stored in the soup object and printed with indentation.
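To see what the soup object offers without downloading anything, here is a small sketch that parses an inline HTML string. SAMPLE_HTML and its placeholder titles are made up for illustration; only the h3 tag and the title class mirror what the tutorial looks for.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page (placeholder content)
SAMPLE_HTML = """<html><head><title>Sample page</title></head>
<body>
<h3 class="title">2) Placeholder Movie B</h3>
<h3 class="title">1) Placeholder Movie A</h3>
</body></html>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Tags can be reached as attributes of the soup object
page_title = soup.title.string

# find() returns the first matching tag (or None if nothing matches)
first_heading = soup.find(name="h3", class_="title")

print(page_title)                # Sample page
print(first_heading.get_text())  # 2) Placeholder Movie B
print(first_heading["class"])    # ['title']
```

The class attribute comes back as a list because an HTML element can carry several classes at once.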

Getting all the Titles of the Movies:

all_movies_title = soup.find_all(name="h3", class_="title")
print(all_movies_title)
# Getting the text of each h3 and building a list of all titles
movie_titles = [movie.get_text() for movie in all_movies_title]
# Reversing the list in place using reverse()
movie_titles.reverse()
print(movie_titles)
# Writing the top 100 movies to a file called movies.txt
with open("movies.txt", mode="w", encoding="utf-8") as file:
    for movie in movie_titles:
        file.write(f"{movie}\n")
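The reverse-and-write steps can be tried in isolation. The sketch below uses placeholder titles and assumes the scraped list arrives in countdown order, which is presumably why the tutorial reverses it. The helper save_titles is a hypothetical name, and the file goes into a temporary directory so nothing is left behind.

```python
from pathlib import Path
import tempfile


def save_titles(titles, path):
    """Write one title per line and return the number of titles written."""
    with open(path, mode="w", encoding="utf-8") as file:
        for title in titles:
            file.write(f"{title}\n")
    return len(titles)


# Placeholder titles in countdown order (an assumption about the page layout)
countdown = ["3) Placeholder Movie C", "2) Placeholder Movie B", "1) Placeholder Movie A"]

# Slicing returns a reversed copy; list.reverse() would change the list in place
ranked = countdown[::-1]

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "movies.txt"
    written = save_titles(ranked, out_path)
    # Read the file back to confirm what was written
    saved_lines = out_path.read_text(encoding="utf-8").splitlines()

print(written)      # 3
print(saved_lines)  # ['1) Placeholder Movie A', '2) Placeholder Movie B', '3) Placeholder Movie C']
```

Using a reversed copy keeps the original order available, which helps if both orderings are needed later.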
