Steps for scraping movie information with Python:
1. Import the required libraries:
python
import requests
from bs4 import BeautifulSoup
2. Build the request headers and set the User-Agent:
python
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
}
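3. Request the movie page and get its source:
python
url = 'https://movie.douban.com/'
# A timeout keeps the script from hanging if the server never responds
response = requests.get(url, headers=headers, timeout=10)
html = response.text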
4. Parse the page source with BeautifulSoup (the 'lxml' parser requires the lxml package to be installed):
python
soup = BeautifulSoup(html, 'lxml')
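5. Locate the tags that contain the movie information (the class names below must match the markup of the page fetched in step 3):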
python
movies = soup.find_all('div', class_='item')
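6. Loop over each movie and extract its information:
python
for movie in movies:
    title_tag = movie.find('span', class_='title')
    score_tag = movie.find('span', class_='rating_num')
    # find() returns None when a tag is missing, so skip incomplete entries
    if title_tag is None or score_tag is None:
        continue
    print(title_tag.text, score_tag.text)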
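The parsing steps above can be tried offline before sending any request. The sketch below is only an illustration: `SAMPLE_HTML` is made-up markup that mimics the structure the selectors expect (it is not copied from the real site), `parse_movies` is a hypothetical helper name, and the built-in `'html.parser'` is used instead of `'lxml'` so that no extra package is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the structure the selectors expect;
# it is not copied from the real site.
SAMPLE_HTML = """
<div class="item">
  <span class="title">Movie A</span>
  <span class="rating_num">9.1</span>
</div>
<div class="item">
  <span class="title">Movie B</span>
</div>
"""


def parse_movies(html):
    """Return (title, score) pairs for every 'item' block that has both tags."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for movie in soup.find_all('div', class_='item'):
        title_tag = movie.find('span', class_='title')
        score_tag = movie.find('span', class_='rating_num')
        # Skip blocks where either tag is missing instead of raising AttributeError
        if title_tag is None or score_tag is None:
            continue
        results.append((title_tag.text.strip(), score_tag.text.strip()))
    return results


print(parse_movies(SAMPLE_HTML))  # [('Movie A', '9.1')]
```

The second sample block has no rating, so it is skipped rather than crashing the loop. Returning a list instead of printing inside the loop also makes the result easier to reuse.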
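The complete code is as follows:
python
import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
}

# Fetch the page and read its source
url = 'https://movie.douban.com/'
response = requests.get(url, headers=headers, timeout=10)
html = response.text

# Parse the source and collect every movie block
soup = BeautifulSoup(html, 'lxml')
movies = soup.find_all('div', class_='item')

for movie in movies:
    title_tag = movie.find('span', class_='title')
    score_tag = movie.find('span', class_='rating_num')
    # find() returns None when a tag is missing, so skip incomplete entries
    if title_tag is None or score_tag is None:
        continue
    print(title_tag.text, score_tag.text)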
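The script above assumes the request always succeeds. As a minimal sketch of one way to handle failures, the hypothetical helper `fetch_html` below (the name and the default timeout of 10 seconds are assumptions, not part of the original code) wraps `requests.get`, treats HTTP error statuses as failures via `raise_for_status()`, and returns `None` instead of raising.

```python
import requests


def fetch_html(url, headers=None, timeout=10):
    """Return the page source as text, or None if the request fails."""
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
        # Turn 4xx/5xx status codes into an exception
        response.raise_for_status()
    except requests.RequestException as exc:
        # RequestException covers connection errors, timeouts and HTTP errors
        print(f"Request failed: {exc}")
        return None
    return response.text
```

With this helper, the fetch step could become `html = fetch_html(url, headers=headers)`, followed by a check for `None` before the page is passed to BeautifulSoup.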