Computer Hope
Software => Computer programming => Topic started by: Organ on July 22, 2021, 02:03:11 AM
-
import requests
from bs4 import BeautifulSoup
url = "https://[Moderator edit: host removed]/bizhitupian/meinvbizhi/yangyanmeinv.htm"
dicc = {"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"}
a = requests.get(url, headers = dicc)
a.encoding = 'utf-8'
b = BeautifulSoup(a.text,'html.parser')
c = b.find("div", class_="Typelist").find_all("a")
print(c)
AttributeError: 'NoneType' object has no attribute 'find_all'
____
Moderator edit: removed unknown domain from code example
-
The code looks OK. My guess is that your HTML does not contain a div with class="Typelist". In that case b.find() returns None, so the chained call becomes None.find_all("a"), which raises that error.
The code itself should be fine; this works:
import requests
from bs4 import BeautifulSoup
url = "https://www.computerhope.com"
dicc = {"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"}
a = requests.get(url, headers = dicc)
a.encoding = 'utf-8'
b = BeautifulSoup(a.text,'html.parser')
c = b.find("div", class_="skip").find_all("a")
print(c)
output:
[<a href="#main-content">Skip to Main Content</a>]
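To see the failure without any network access, here is a minimal sketch that parses an inline HTML string. The markup is made up for illustration and is only assumed to resemble a page that lacks the Typelist div; it is not the content of the original URL.

```python
from bs4 import BeautifulSoup

# Hypothetical page content with no <div class="Typelist"> in it
html = '<html><body><div class="other"><a href="/x">x</a></div></body></html>'
soup = BeautifulSoup(html, "html.parser")

# find() returns None when nothing matches
missing = soup.find("div", class_="Typelist")
print(missing)  # None

# Calling a method on None is what raises the AttributeError
error_message = ""
try:
    missing.find_all("a")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Printing a.text (or b.prettify()) for the real page is a quick way to check whether the div is actually in the HTML that requests received.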
If you're processing multiple files, and some of them might not have a <div class="Typelist">, then you should capture the b.find() return value separately and branch on whether anything was found, e.g.:
b = BeautifulSoup(a.text,'html.parser')
bf = b.find("div", class_="Typelist")
if bf:  # if the result of b.find() was truthy, then
    c = bf.find_all("a")
    print(c)
else:  # otherwise, bf is falsy (None, False, zero, empty string, etc.), so run this code instead
    print("Typelist not found")
https://docs.python.org/3/library/stdtypes.html#truth-value-testing
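If you need the same check in several places, it can be wrapped in a small function. This is only a sketch: links_in_div and links_via_select are hypothetical helper names, and the two HTML strings are made-up test input. The second helper uses BeautifulSoup's select() with a CSS selector, which returns an empty list instead of None when nothing matches. Note that it collects links from every matching div, not just the first one.

```python
from bs4 import BeautifulSoup


def links_in_div(html, css_class):
    """Return the <a> tags inside the first <div> with css_class, or [] if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=css_class)
    if container is None:  # no matching div, so there is nothing to search
        return []
    return container.find_all("a")


def links_via_select(html, css_class):
    """Same idea with a CSS selector; select() returns an empty list when nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select(f"div.{css_class} a")


# Made-up sample inputs: one with the div, one without
with_div = '<div class="Typelist"><a href="/1">one</a><a href="/2">two</a></div>'
without_div = '<div class="other"><a href="/3">three</a></div>'

print([a["href"] for a in links_in_div(with_div, "Typelist")])
print(links_in_div(without_div, "Typelist"))
print(links_via_select(without_div, "Typelist"))
```

Returning an empty list in the "not found" case means the caller can loop over the result without a separate None check.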