Web Scraping after login


Question

I run the following code to log in at the URL assigned to loginUrl. After authenticating, I want to open another page whose URL is stored in portfolioUrl. However, print(portfolioPage.content) prints the page shown directly after login rather than the portfolio page I want. What's wrong with my code?

from bs4 import BeautifulSoup
import requests
# create session
session = requests.Session()

loginUrl='https://www.investopedia.com/auth/realms/investopedia/protocol/openid-connect/auth?client_id=inv-simulator&redirect_uri=https%3A%2F%2Fwww.investopedia.com%2Fauth%2Frealms%2Finvestopedia%2Fshopify-auth%2Finv-simulator%2Flogin%3F%26redirectUrl%3Dhttps%253A%252F%252Fwww.investopedia.com%252Fauth%252Frealms%252Finvestopedia%252Fprotocol%252Fopenid-connect%252Fauth%253Fresponse_type%253Dcode%2526approval_prompt%253Dauto%2526redirect_uri%253Dhttps%25253A%25252F%25252Fwww.investopedia.com%25252Fsimulator%25252Fhome.aspx%2526client_id%253Dinv-simulator-conf&state=7edda3b2-eb6a-441f-8589-b42b8b78accf&response_mode=fragment&response_type=code&scope=openid&nonce=cd558670-7ae3-4c14-8281-bc149d4987b3'
portfolioUrl = 'https://www.investopedia.com/simulator/trade/tradestock.aspx'

payload = {
    'username': 'my email',
    'password': 'my password'
}
authPage = session.get(loginUrl)
soup = BeautifulSoup(authPage.content, 'html.parser')
form = soup.find('form')
postUrl = form['action']
auth = session.post(postUrl, data=payload)

portfolioPage = session.get(portfolioUrl)
soup = BeautifulSoup(portfolioPage.content, 'html.parser')
print(portfolioPage.content)

Edit: t4kq's answer works perfectly fine; however, when I print(page.text) it doesn't output the HTML code of the page as expected, but outputs this code instead:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" version="XHTML+RDFa 1.0" dir="ltr">
<head profile="http://www.w3.org/1999/xhtml/vocab">
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="application-name" content="Investopedia"/>
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <meta http-equiv="X-UA-Compatible" content="IE=9">

    <!-- Page Taxonomy -->
<script type="text/javascript">
//<![CDATA[
  var _pageTaxonomy = {
    "Hashkey": "$simulator$trade$tradestock",
    "Channel": "Simulator",
    "SubChannel": "",
    "Advertising": "Investing",
    "SubAdvertising": "Simulator",
    "AdTarget": "investopedia.com/simulator",
    "DfpTarget": "Investing/Investing",
    "Tags": null,
    "Type": "Simulator",
    "Lucrativeness": null,
    "Timelessness": "Timeless",
    "Feature": "",
    "Design": "",
    "InterestLevel": null,
    "Path" : "/simulator/trade/tradestock.aspx",
  };
//]]>
</script>
<!-- End Page Taxonomy -->
    <script language="javascript" type="text/javascript">var idc_slots = {};
        idc_slots.slots = ["AdSlot_AF-Top-Leaderboard","AdSlot_AF-Left-Multi","AdSlot_BF-Right-Button1","AdSlot_BF-Right-Button2","AdSlot_BF-Right-Button3","AdSlot_BF-Right-Button4"];
        idc_slots.build = function(slot) {
            return "/479/INV-NA/Investing/Investing/position/Simulator".replace("position", slot.position);
        };</script><script type="text/javascript">
            idc_slots.slots.push({
                        "AdSlot_AF-Top-Leaderboard" : {
                            sizeMappings: [
                                {
                                    viewportSize: [1000, 1],
                                    slotSizes: [[728, 90], [970, 90], [950, 90], [960, 90], [970, 66], [980, 90],"fluid"]
                                },
                                {
                                    viewportSize: [700, 1],
                                    slotSizes: [[728, 90], [468, 60]]
                                },
                                {
                                    viewportSize: [400, 1],
                                    slotSizes: [468, 60]
                                },
                                {
                                    viewportSize: [0, 0],
                                    slotSizes: [[320, 50], [320, 100]]
                                }
                            ],
                            amzSizes : {
                                desktop: [[728 ,90]],
                                tablet: [[728 ,90]],
                                phone: [[728 ,90], [320,50]]
                            }
                        }});
                </script>

    <title>Investopedia Stock Simulator - Investopedia Stock Simulator - Trade a Stock</title><meta name="Description" content="Fantasy&#x20;stock&#x20;market&#x20;game&#x20;that&#x20;simulates&#x20;trading&#x20;stocks&#x20;and&#x20;options.">
<meta name="viewport" content="width&#x3D;device-width,&#x20;initial-scale&#x3D;1">
<meta name="Description" content="Fantasy&#x20;stock&#x20;market&#x20;game&#x20;that&#x20;simulates&#x20;trading&#x20;stocks&#x20;and&#x20;options.">
<meta name="viewport" content="width&#x3D;device-width,&#x20;initial-scale&#x3D;1">    <link rel="canonical" href="https://www.investopedia.com/simulator/trade/tradestock.aspx" />
    <link href="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;img&#x2F;favicon.ico" rel="shortcut&#x20;icon" type="image&#x2F;vnd.microsoft.icon">
<link href="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;dest&#x2F;css&#x2F;simulator.css&#x3F;v&#x3D;202102030915" media="screen" rel="stylesheet" type="text&#x2F;css">
<link href="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;img&#x2F;favicon.ico" rel="shortcut&#x20;icon" type="image&#x2F;vnd.microsoft.icon"><script language="javascript" type="text/javascript">
    var googletag = googletag || {};
    googletag.cmd = googletag.cmd || [];
</script><script language="javascript" type="text/javascript">
    var sem_pageview = false;
    var sem_ocode = '9999';
    var sem_ldid = '';
    var sem_sh = '';
    function updateSemVariable(query) {
        if (query[1] === undefined) {
            return;
        }
        switch(query[0]) {
            case 'o':
                sem_ocode = query[1];
                break;
            case 'ldid':
                sem_ldid = query[1];
                break;
            case 'sh':
                sem_sh = query[1];
                break;
        }
    }
    function getCookie(cname) {
        var name = cname + "=";
        var ca = document.cookie.split(';');
        for (var i = 0; i < ca.length; i++) {
            var c = ca[i];
            while (c.charAt(0) == ' ') c = c.substring(1);
            if (c.indexOf(name) == 0) return c.substring(name.length,c.length);
        }
        return "";
    }
    function getSemCookie() {
        var queryStr = getCookie('semuser');
        if (queryStr == "") {
            return;
        }
        sem_pageview = true;
        var queries = queryStr.split("&");
        for (var i = 0, l = queries.length; i < l; i++) {
            var query = queries[i].split('=');
            updateSemVariable(query);
        }
    }
    getSemCookie();
    var updateAup = function(aUp) {
        aUp = aUp.replace("INV-NA", "invsem-serp-ds");
        var utms = null;
        if (typeof getUrlParam === "function") {
            try {
                utms = getUrlParam("utm_source");
            } catch (e) {}
        }
        var aUp_arr = aUp.split("/");
        var last = aUp_arr.pop();
        aUp_arr.push((utms !== null ? utms : "dir") +
            "_" + (typeof sem_ocode !== "undefined" ?
                sem_ocode : 0));
        if (aUp_arr.length > 3) {
            aUp_arr[3] = last;
        }
        return aUp_arr.join("/");
    };
    if (typeof googletag !== "undefined") {
        googletag.cmd.push( function() {
            if ((typeof sem_pageview !== 'undefined') && (sem_pageview == true)) {
                var processArgs = function(arguments) {
                    if (typeof arguments === "object") {
                        for (var i = 0; i < arguments.length; i++) {
                            if (arguments[i].indexOf("479") > -1) {
                                arguments[i] = updateAup(arguments[i]);
                                break;
                            }
                        }
                    }
                    return arguments;
                };
                googletag.defineSlot = (function() {
                    var orig_func = googletag.defineSlot;
                    return function() {
                        return orig_func.apply(this, processArgs(arguments));
                    };
                })();
                googletag.defineOutOfPageSlot = (function() {
                    var orig_func = googletag.defineOutOfPageSlot;
                    return function() {
                        return orig_func.apply(this, processArgs(arguments));
                    };
                })();
            }
        });
    }
</script><script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;simulator&#x2F;js&#x2F;jquery.min.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;js&#x2F;jquery.mcs.min.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;simulator&#x2F;js&#x2F;cookie.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;simulator&#x2F;js&#x2F;cookiemix.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;simulator&#x2F;js&#x2F;g.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;simulator&#x2F;js&#x2F;microsoftAjax.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;public&#x2F;simulator&#x2F;js&#x2F;microsoftAjaxWebForms.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;simulator_ui&#x2F;js&#x2F;ScrollingTicker.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;cdn.jsdelivr.net&#x2F;npm&#x2F;promise-polyfill&#x40;7&#x2F;dist&#x2F;polyfill.min.js"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;dest&#x2F;js&#x2F;inv.min.js&#x3F;v&#x3D;202102030915"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;dist&#x2F;simulator.min.js"></script>
<script type="text&#x2F;javascript" src="https&#x3A;&#x2F;&#x2F;i.investopedia.com&#x2F;dist&#x2F;gdpr.min.js&#x3F;v&#x3D;202102030915"></script>   

<script type="text/javascript">
eval(function(p,a,c,k,e,d){e=function(c){return c.toString(36)};if(!''.replace(/^/,String)){while(c--){d[c.toString(a)]=k[c]||c.toString(a)}k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1};while(c--){if(k[c]){p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c])}}return p}('7 2(9){o d(9)}a 0={4:\'\',3:\'e\',6:\'\',5:\'\'};a 8=f.c({h:2(\'i=\'),1:{g:2(\'j=\'),k:2(\'m\')}});8.n(7(1){0.4=1[\'4\']||0.4;0.3=1[\'3\']||0.3;0.6=1[\'b\']||0.6;0.5=1[\'l\']||0.5});',25,25,'geoData|data|decode|country_code|city|FIN_zip|FIN_state|function|jqXHR|encoded|var|region_code|ajax|atob|FR|jQuery|access_key|url|aHR0cHM6Ly9hcGkuaXBzdGFjay5jb20vY2hlY2s|MTBlZjJlYjI2NzFhNjQ5MTQ5NDk1ODZjMzExMDdiYWQ|fields|zip|Y2l0eSxjb3VudHJ5X2NvZGUscmVnaW9uX2NvZGUsemlw|done|return'.split('|'),0,{}))
</script>

    <script type="text/javascript">
        (function(d) {
            var e = d.createElement('script');
            e.src = d.location.protocol + '//tag.bounceexchange.com/2320/i.js';
            e.async = true;
            d.getElementsByTagName("head")[0].appendChild(e);
        }(document));
    </script>
</head>

<!--shift_source: 4824cfbe9ef0-->
<body class="simulator-page" onunload="SaveTickerPos();">
<div style="display: none;">
    <!-- Start of DoubleClick Spotlight Tag: Please do not remove -->
    <!-- Activity Name for this tag is:IP Simulator -->
    <!-- Web site URL where tag should be placed: http://www.investopedia.com/simulator -->
    <!-- This tag must be placed within the opening <body> tag, as close to the beginning of it as possible -->
    <!-- Creation Date: Thu Jul 02 17:02:35 EDT 2009 -->
    <script language="JavaScript">
        function SaveTickerPos()
        {
            try
            {
                for (var obj in allTickers){
                    allTickers[obj].paused = true;
                    jQuery.cookie(allTickers[obj].cookieName, allTickers[obj].x, {path: '/'});
                }
            }
            catch(e){}
        }

        var axel = Math.random() + "";
        var a = axel * 10000000000000;
        document.write('<img src="https://ad.doubleclick.net/activity;src=2359949;type=ips;cat=ips;ord=1;num=' + a + '?" width=1 height=1 border=0>');
    </script>
    <noscript>
        <img src="https://ad.doubleclick.net/activity;src=2359949;type=ips;cat=ips;ord=1;num=1?" width=1 height=1 border=0>
    </noscript>
    <!-- End of DoubleClick Spotlight Tag: Please do not remove -->

    <!-- Begin comScore Tag -->
    <script type="text/javascript" language="javascript">
        var _comscore = _comscore || [];
        _comscore.push({ c1: "2", c2: "18280457", c4: "https://www.investopedia.com/simulator/trade/tradestock.aspx" });
        (function() {
            var s = document.createElement("script"), el = document.getElementsByTagName("script")[0]; s.async = true;
            s.src = (document.location.protocol == "https:" ? "https://sb" : "http://b") + ".scorecardresearch.com/beacon.js";
            el.parentNode.insertBefore(s, el);
        })();
    </script>
    <noscript>
        <img src="https://sb.scorecardresearch.com/p?c1=2&c2=18280457&c4=https://www.investopedia.com/simulator/trade/tradestock.aspx&cv=2.0&cj=1" />
    </noscript>
    <!-- End comScore Tag -->
</div>
<script type='text/javascript' language="JavaScript">
    //<![CDATA[
    if (getCookie('freenewsletterreg') == null) {
        setCookie("freenewsletterreg", "ad", 30);
    }
    var user_info = $.parseJSON(decodeURIComponent(getCookie('user_info')).replace(/\+/g, ' '));
    //]]>
</script>


<!--<script type='text/javascript' src="https://www.investopedia.com/simulator/Common/VcidScript.ashx?u=e3bfd87f21d741578241089c9aa5f4c8"></script>-->
<!-- Google Tag Manager -->
<noscript>
  <iframe src="//www.googletagmanager.com/ns.html?id=GTM-5V3WHJ"
        height="0" width="0"
        style="display:none;visibility:hidden"></iframe>
</noscript>
<script>(function (w, d, s, l, i) {
    w[l] = w[l] || [];
    w[l].push({'gtm.start': new Date().getTime(), event: 'gtm.js'});
    var f = d.getElementsByTagName(s)[0],
        j = d.createElement(s), dl = l != 'dataLayer' ? '&l=' + l : '';
    j.async = true;
    j.src =
        '//www.googletagmanager.com/gtm.js?id=' + i + dl;
    f.parentNode.insertBefore(j, f);
})(window, document, 'script', 'dataLayer', 'GTM-5V3WHJ');</script>
<!-- End Google Tag Manager -->

<script type="text/javascript">
    dataLayer.push(_pageTaxonomy);
    var pageviewID = genPageviewId();
    dataLayer.push({'pageviewID' : pageviewID});
</script>

<!-- ================================= Header ================================= -->
<div id="Header">
    <div class="mid">
        <div class="brand clear layout-size">
            <a href="//index.investopedia.com/"><div class="m-search-icon"><i></i></div></a>
            <div class="logo-container">
                <a href="/" class="logo"></a>
                <div class="button-container">
                    <a class="button view-markets-btn inv-ga-link-tracking" href="/markets/" target="_blank" data-ga-label="blue-markets-cta">      
                        View Markets
                    </a>
                </div>
            </div>
            <div id="ctl00_AdLeaderBoard1_cgiAdTopLeaderboard" class="leader">
                                    <div id='AdSlot_AF-Top-Leaderboard' adonis-marker></div>
                            </div>
        </div>
    </div>
</div>
<!-- ================================= Header //End ================================= -->


<!-- ================================= Content ================================= -->
<div id="Content" class="full">
    <!-- ================================= Left Navigation ================================= -->

    <div class="left-nav">
                    <div class="label">
                Trade            </div>
            <ul>
                                    <li class="">
                                                    <span></span>
                                                <a href="https://www.i

Solution

You can try this:

import requests
from bs4 import BeautifulSoup

# create session
session = requests.Session()

url = 'https://investopedia.com/simulator/portfolio/'

payload = {
    'username': 'your_email',
    'password': 'your_password'
}

# get log in page
auth_page = session.get(url)
soup = BeautifulSoup(auth_page.content, 'html.parser')

# get form
form = soup.find('form')

# get post url
post_url = form['action']

# auth
session.post(post_url, data=payload)

# parse content
content_url = 'https://investopedia.com/simulator/trade/tradestock.aspx'
page = session.get(content_url)
page_soup = BeautifulSoup(page.content, 'html.parser')

# locate the simulator page container and its second summary table
sim_page = page_soup.find('div', {'class': 'sim-page'})
table = sim_page.find_all('table', {'class': 'table2'})[1]
rows = table.find_all('tr')

for row in rows:
    print(row.find('th').text)
    print(row.find('td').text)
    print('----')

Output:

Value (USD)
$10,000.00
----
Buying Power
$10,000.00
----
Cash
$10,000.00
----
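
If the POST to the form's action URL does not log you in, one possible cause (an assumption, not something confirmed in this thread) is that the login form carries hidden inputs, or a relative action URL, that the payload above does not account for. The sketch below shows one way to collect every input of the first form on a page, merge in the credentials, and resolve the action URL against the page URL. It uses only the standard library so it runs without network access; LoginFormParser, build_login_request, the sample HTML, and the field name csrf_token are all hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Collect the action URL and named inputs of the first <form> on a page."""

    def __init__(self):
        super().__init__()
        self.action = None      # stays None until a <form> is seen
        self.fields = {}        # input name -> default value
        self._in_form = False
        self._done = False      # True once the first form has been closed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done and not self._in_form:
            self._in_form = True
            self.action = attrs.get("action") or ""
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Inputs without a value attribute default to an empty string.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_login_request(html, page_url, credentials):
    """Return (post_url, data) for submitting the first form found in html."""
    parser = LoginFormParser()
    parser.feed(html)
    if parser.action is None:
        raise ValueError("no <form> found; this may not be a login page")
    data = dict(parser.fields)   # hidden fields and other defaults first
    data.update(credentials)     # then overwrite with the real credentials
    # An empty or relative action is resolved against the page URL.
    return urljoin(page_url, parser.action), data


# Hypothetical login page used only to exercise the helper.
sample_html = """
<form action="login?step=1&amp;x=2" method="post">
  <input type="hidden" name="csrf_token" value="abc">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

post_url, data = build_login_request(
    sample_html,
    "https://example.com/auth/start",
    {"username": "your_email", "password": "your_password"},
)
print(post_url)
print(data)
```

With a requests session, the result would be used as response = session.post(post_url, data=data), where page_url is auth_page.url (the URL after any redirects). Checking response.url and response.history afterwards is a simple way to see whether the server sent you on to the simulator or back to the login page.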
