How to parse a website using BeautifulSoup

Views: 20 | Published: 2021/12/23 20:46:51 | Tags: python, parsing, web-scraping, beautifulsoup, linkedin

Question

I am new to web scraping and I want to get the HTML of a page. But when I run the program, the HTML I get back has no page content, and the console shows only JavaScript.

from bs4 import BeautifulSoup
import requests

url = "https://linkedin.com/company/1005"

# Plain GET with the default requests headers (no custom User-Agent)
r = requests.get(url)
html_content = r.text
soup = BeautifulSoup(html_content, 'html.parser')
print(soup.prettify())

Recommended answer

The problem is not BeautifulSoup but the server, which needs more information in the request before it gives you access to this page. For now it sends back JavaScript code that redirects you to the login page.
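
One way to notice this situation in a script is to check whether the response contains a script but no visible text. The following is only a rough heuristic sketch: the helper name `looks_like_js_redirect` and both sample HTML strings are made up for illustration and are not real LinkedIn output.

```python
from bs4 import BeautifulSoup


def looks_like_js_redirect(html: str) -> bool:
    """Heuristic: True when the page has a script but no visible text."""
    soup = BeautifulSoup(html, "html.parser")
    has_script = soup.find("script") is not None
    # Drop script/style/noscript elements so only reader-visible text remains.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    visible_text = soup.get_text(strip=True)
    return has_script and not visible_text


# Hypothetical responses, invented for this example.
stub = "<html><head><script>window.location.href = '/login';</script></head><body></body></html>"
page = "<html><body><h1>Company</h1><script>var a = 1;</script></body></html>"

print(looks_like_js_redirect(stub))  # True
print(looks_like_js_redirect(page))  # False
```

A check like this can be run on `r.text` before parsing, so the script can report that it was redirected instead of printing an apparently empty page.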

You need to send a User-Agent header to get this page.

You can open http://httpbin.org/get in your browser to see the User-Agent it sends.

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent instead of the requests default
headers = {'User-Agent': 'Mozilla/5.0'}

url = "https://linkedin.com/company/1005"

r = requests.get(url, headers=headers)
print(r.text)

soup = BeautifulSoup(r.text, 'html.parser')
print(soup.prettify())
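
To confirm which User-Agent a script would send, one option is to prepare the request without sending it and read the final headers. This is a small sketch that runs without network access; the names `session`, `default_req`, and `custom_req` are illustrative and not part of the original answer.

```python
import requests

url = "https://linkedin.com/company/1005"
headers = {"User-Agent": "Mozilla/5.0"}

session = requests.Session()

# Build the requests without sending them, then inspect the merged headers.
default_req = session.prepare_request(requests.Request("GET", url))
custom_req = session.prepare_request(requests.Request("GET", url, headers=headers))

# Without custom headers, requests identifies itself as python-requests/<version>.
print(default_req.headers["User-Agent"])
# With the headers dict, the browser-like value replaces the default.
print(custom_req.headers["User-Agent"])
```

The first line printed shows why the server can tell the original request came from a script rather than a browser.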

