Computer Hope

Software => Computer programming => Topic started by: Organ on July 22, 2021, 02:03:11 AM

Title: python programming error
Post by: Organ on July 22, 2021, 02:03:11 AM
Code:

import requests
from bs4 import BeautifulSoup
url = "https://[Moderator edit: host removed]/bizhitupian/meinvbizhi/yangyanmeinv.htm"
dicc = {"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"}
a = requests.get(url, headers = dicc)
a.encoding = 'utf-8'
b = BeautifulSoup(a.text,'html.parser')
c = b.find("div", class_="Typelist").find_all("a")
print(c)

AttributeError: 'NoneType' object has no attribute 'find_all'

____
Moderator edit: removed unknown domain from code example
Title: Re: python programming error
Post by: nil on July 22, 2021, 05:49:32 AM
The code looks OK. My guess is that your HTML does not contain a div with class="Typelist". In that case b.find() returns None, Python then tries to call None.find_all("a"), and you get that error.
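
To see the failure in isolation, here is a minimal sketch that needs no network access. The HTML string is made up for illustration (it is not the original page), and the variable names are mine; the point is only that find() returns None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with no <div class="Typelist"> in it
html = '<div class="Other"><a href="/x">x</a></div>'
soup = BeautifulSoup(html, "html.parser")

# find() returns None when no element matches
result = soup.find("div", class_="Typelist")
print(result)

# Calling a method on None raises the same AttributeError as in the question
try:
    result.find_all("a")
except AttributeError as exc:
    message = str(exc)
    print(message)
```

If the page you fetch prints None for the find() call like this, the div is simply not in the HTML that requests received.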

The code itself is fine, though. This works:

Code:
import requests
from bs4 import BeautifulSoup
url = "https://www.computerhope.com"
dicc = {"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"}
a = requests.get(url, headers = dicc)
a.encoding = 'utf-8'
b = BeautifulSoup(a.text,'html.parser')
c = b.find("div", class_="skip").find_all("a")
print(c)

Output:

Code:
[<a href="#main-content">Skip to Main Content</a>]

If you're processing multiple pages and some of them might not have a <div class="Typelist">, capture the b.find() return value separately and branch if it doesn't have a value, e.g.

Code:
b = BeautifulSoup(a.text,'html.parser')
bf = b.find("div", class_="Typelist")
if bf:      # b.find() returned a Tag, which is truthy
    c = bf.find_all("a")
    print(c)
else:       # b.find() returned None, which is falsy, so run this code instead
    print("Typelist not found")

https://docs.python.org/3/library/stdtypes.html#truth-value-testing
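
Another option, if you would rather not branch at all, is a CSS selector: select() returns an empty list when nothing matches, so there is no None to trip over. This is only a sketch; typelist_links and the two HTML strings are hypothetical names and data I made up for the example. Note that it is not an exact equivalent of the code above: it collects links from every <div class="Typelist"> on the page, not just the first one.

```python
from bs4 import BeautifulSoup


def typelist_links(html):
    """Return all <a> tags inside any <div class="Typelist">, or an empty list."""
    soup = BeautifulSoup(html, "html.parser")
    # select() takes a CSS selector and returns a list (empty if no match)
    return soup.select("div.Typelist a")


# Made-up sample documents: one with the div, one without
with_div = '<div class="Typelist"><a href="/one">one</a><a href="/two">two</a></div>'
without_div = '<div class="Other"><a href="/x">x</a></div>'

print([a["href"] for a in typelist_links(with_div)])
print(typelist_links(without_div))
```

An empty list is falsy as well, so you can still write "if links:" afterwards when you want to report that nothing was found.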