Search the frequency of words in the sub pages of a webpage using Python


Question

I seek help as I am stuck on how to crawl each and every link (pages or sub pages) in a webpage and find the frequency of any word. I used Beautiful Soup for scraping, but I don't think I am doing it right. For example: I need to go to the ServiceNow official page > Solutions > View all Solutions, and find the frequency of "Intelligent" in all the links/sub pages under "View all Solutions". Any help would be very much appreciated. Thank you :)

My code

import requests
from bs4 import BeautifulSoup

url = "https://www.servicenow.com/solutions-by-category.html"
serviceNow_r = requests.get(url)
sNow_soup = BeautifulSoup(serviceNow_r.text, 'html.parser')

print(sNow_soup.find_all('href',{'class':'cta-list component'}))


for name in sNow_soup.find_all('href',{'class':'cta-list component'}):
    print(name.text)

Answer

This is what you need to access the href attribute for every link in the page.

import requests
from bs4 import BeautifulSoup

url = "https://www.servicenow.com/solutions-by-category.html"
serviceNow_r = requests.get(url)
sNow_soup = BeautifulSoup(serviceNow_r.text, 'html.parser')

for anchor in sNow_soup.find_all('a', href=True):
    print(anchor['href'])
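The loop above only lists the links; to answer the original question you still need to fetch each linked page and count the occurrences of the word. Here is a minimal sketch building on that answer. It assumes the word count should be a case-insensitive substring count over each page's visible text; the helper names (`count_word`, `word_frequency`) are illustrative, not from the original answer, and relative hrefs are resolved with `urljoin`.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def count_word(html, word):
    """Count case-insensitive occurrences of `word` in the visible text of `html`."""
    text = BeautifulSoup(html, 'html.parser').get_text()
    return text.lower().count(word.lower())

def word_frequency(start_url, word):
    """Fetch every link found on `start_url` and total the occurrences of `word`."""
    soup = BeautifulSoup(requests.get(start_url).text, 'html.parser')
    total = 0
    seen = set()
    for anchor in soup.find_all('a', href=True):
        link = urljoin(start_url, anchor['href'])  # resolve relative hrefs
        if not link.startswith('http') or link in seen:
            continue  # skip mailto:/javascript: links and duplicates
        seen.add(link)
        try:
            page = requests.get(link, timeout=10)
        except requests.RequestException:
            continue  # skip unreachable links
        total += count_word(page.text, word)
    return total

if __name__ == "__main__":
    url = "https://www.servicenow.com/solutions-by-category.html"
    print(word_frequency(url, "intelligent"))
```

Note this follows every link on the start page, including navigation links outside "View all Solutions"; to restrict the crawl, filter `anchor['href']` by URL prefix before fetching.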
