Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


html - Python 3 - Get text from a tag in BeautifulSoup

I am using BeautifulSoup to extract data from a website. The text on that page changes every time it is reloaded, so I want to locate the element by its class name, which stays constant, rather than by its text, which is dynamic.

import requests
from bs4 import BeautifulSoup
url = 'xxxxxxxxxxx'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
class2 = soup.find_all(True, class_="template_title")
print(class2)

which prints out
<td align="left" class="template_title" height="50" valign="bottom" width="535"><div style="padding-bottom:9px;">4</div></td>
After the page reloads, the same element is still matched, but I do not know how to print only its text (in this case, 4).

Once this is figured out, I have a second question: if several tags share the class, can I match on more of the static attributes so that only the text I am looking for is printed? (I already filter on class, but could I also use height="50" valign="bottom" width="535"?)


1 Reply

  1. You can use the `.text` or `.string` attribute of each element.

    elems = soup.find_all(True, class_='template_title')
    print([elem.string for elem in elems])
    # prints `['4']` for the given html snippet
    
  2. Pass as many additional attributes as you need as keyword arguments to narrow the match:

    elems = soup.find_all(True, class_='template_title',
                          height='50', valign='bottom', width='535')
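
To try both points without hitting the live page, here is a minimal sketch that parses the snippet from the question directly. The second `<td>` is a hypothetical cell added only for illustration: it shares the class but has different attributes and two child tags, which shows where `.string` and `get_text()` behave differently. The CSS selector at the end is an alternative way to write the same filter, not something the original answer used.

```python
from bs4 import BeautifulSoup

# First <td> is the snippet from the question. The second <td> is a
# hypothetical cell (an assumption, for illustration only) that shares the
# class but has different attributes and more than one child tag.
html = (
    '<table><tr>'
    '<td align="left" class="template_title" height="50" valign="bottom" '
    'width="535"><div style="padding-bottom:9px;">4</div></td>'
    '<td align="left" class="template_title" height="20">'
    '<div>other</div><span>text</span></td>'
    '</tr></table>'
)

soup = BeautifulSoup(html, "html.parser")

# Point 2: extra keyword arguments filter on those attributes as well.
cells = soup.find_all("td", class_="template_title",
                      height="50", valign="bottom", width="535")
texts = [cell.get_text(strip=True) for cell in cells]
print(texts)    # ['4']

# Point 1: .string is None when a tag has more than one child,
# whereas get_text() concatenates all nested text.
all_cells = soup.find_all(class_="template_title")
strings = [cell.string for cell in all_cells]
print(strings)  # ['4', None]

# The same attribute filter written as a CSS selector.
selected = soup.select(
    'td.template_title[height="50"][valign="bottom"][width="535"]'
)
print([cell.get_text(strip=True) for cell in selected])  # ['4']
```

If the matched tag might contain several nested tags, `get_text(strip=True)` is usually the safer choice, since `.string` returns `None` in that case.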
    


...