Web Scraping after Login
Problem description
I run the following code to log in at the URL assigned to loginUrl. After authenticating, I want to go to another page whose URL is stored in portfolioUrl. However, when I print(portfolioPage.content), it prints the page shown directly after login rather than the portfolioPage I want. What's wrong with my code?
from bs4 import BeautifulSoup
import requests
# create session
session = requests.Session()
loginUrl='https://www.investopedia.com/auth/realms/investopedia/protocol/openid-connect/auth?client_id=inv-simulator&redirect_uri=https%3A%2F%2Fwww.investopedia.com%2Fauth%2Frealms%2Finvestopedia%2Fshopify-auth%2Finv-simulator%2Flogin%3F%26redirectUrl%3Dhttps%253A%252F%252Fwww.investopedia.com%252Fauth%252Frealms%252Finvestopedia%252Fprotocol%252Fopenid-connect%252Fauth%253Fresponse_type%253Dcode%2526approval_prompt%253Dauto%2526redirect_uri%253Dhttps%25253A%25252F%25252Fwww.investopedia.com%25252Fsimulator%25252Fhome.aspx%2526client_id%253Dinv-simulator-conf&state=7edda3b2-eb6a-441f-8589-b42b8b78accf&response_mode=fragment&response_type=code&scope=openid&nonce=cd558670-7ae3-4c14-8281-bc149d4987b3'
portfolioUrl = 'https://www.investopedia.com/simulator/trade/tradestock.aspx'
payload = {
    'username': 'my email',
    'password': 'my password'
}
authPage = session.get(loginUrl)
soup = BeautifulSoup(authPage.content, 'html.parser')
form = soup.find('form')
postUrl = form['action']
auth = session.post(postUrl, data=payload)
portfolioPage = session.get(portfolioUrl)
soup = BeautifulSoup(portfolioPage.content, 'html.parser')
print(portfolioPage.content)
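One way to check where each request actually ended up is to look at the redirect chain that requests records on a response (response.history, response.url, response.status_code). The sketch below is only a diagnostic aid, not a fix. describe_redirects and landed_on are hypothetical helper names, and the URL comparison (ignoring the query string and a leading www.) is an assumption about what counts as "the same page".

```python
from urllib.parse import urlsplit


def describe_redirects(response):
    """Return (status_code, url) for every redirect hop, ending with the final response.

    Works with a requests.Response or any object exposing
    .history, .status_code and .url.
    """
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def _host_and_path(url):
    # Assumption: a leading "www." and a trailing slash do not matter.
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith('www.'):
        host = host[4:]
    return host, parts.path.rstrip('/').lower()


def landed_on(response, expected_url):
    """True if the final URL has the same host and path as expected_url."""
    return _host_and_path(response.url) == _host_and_path(expected_url)
```

For example, printing describe_redirects(auth) and landed_on(portfolioPage, portfolioUrl) after the calls above would show whether the POST or the later GET was redirected somewhere other than the requested page.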
Edit: t4kq's answer works perfectly fine; however, when I print(page.text), it doesn't output the HTML code of the page as expected, but outputs this instead:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" version="XHTML+RDFa 1.0" dir="ltr">
<head profile="http://www.w3.org/1999/xhtml/vocab">
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="application-name" content="Investopedia"/>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="X-UA-Compatible" content="IE=9">
<!-- Page Taxonomy -->
<script type="text/javascript">
//<![CDATA[
var _pageTaxonomy = {
"Hashkey": "$simulator$trade$tradestock",
"Channel": "Simulator",
"SubChannel": "",
"Advertising": "Investing",
"SubAdvertising": "Simulator",
"AdTarget": "investopedia.com/simulator",
"DfpTarget": "Investing/Investing",
"Tags": null,
"Type": "Simulator",
"Lucrativeness": null,
"Timelessness": "Timeless",
"Feature": "",
"Design": "",
"InterestLevel": null,
"Path" : "/simulator/trade/tradestock.aspx",
};
//]]>
</script>
<!-- End Page Taxonomy -->
<script language="javascript" type="text/javascript">var idc_slots = {};
idc_slots.slots = ["AdSlot_AF-Top-Leaderboard","AdSlot_AF-Left-Multi","AdSlot_BF-Right-Button1","AdSlot_BF-Right-Button2","AdSlot_BF-Right-Button3","AdSlot_BF-Right-Button4"];
idc_slots.build = function(slot) {
return "/479/INV-NA/Investing/Investing/position/Simulator".replace("position", slot.position);
};</script><script type="text/javascript">
idc_slots.slots.push({
"AdSlot_AF-Top-Leaderboard" : {
sizeMappings: [
{
viewportSize: [1000, 1],
slotSizes: [[728, 90], [970, 90], [950, 90], [960, 90], [970, 66], [980, 90],"fluid"]
},
{
viewportSize: [700, 1],
slotSizes: [[728, 90], [468, 60]]
},
{
viewportSize: [400, 1],
slotSizes: [468, 60]
},
{
viewportSize: [0, 0],
slotSizes: [[320, 50], [320, 100]]
}
],
amzSizes : {
desktop: [[728 ,90]],
tablet: [[728 ,90]],
phone: [[728 ,90], [320,50]]
}
}});
</script>
<title>Investopedia Stock Simulator - Investopedia Stock Simulator - Trade a Stock</title><meta name="Description" content="Fantasy stock market game that simulates trading stocks and options.">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="Description" content="Fantasy stock market game that simulates trading stocks and options.">
<meta name="viewport" content="width=device-width, initial-scale=1"> <link rel="canonical" href="https://www.investopedia.com/simulator/trade/tradestock.aspx" />
<link href="https://i.investopedia.com/public/img/favicon.ico" rel="shortcut icon" type="image/vnd.microsoft.icon">
<link href="https://i.investopedia.com/dest/css/simulator.css?v=202102030915" media="screen" rel="stylesheet" type="text/css">
<link href="https://i.investopedia.com/public/img/favicon.ico" rel="shortcut icon" type="image/vnd.microsoft.icon"><script language="javascript" type="text/javascript">
var googletag = googletag || {};
googletag.cmd = googletag.cmd || [];
</script><script language="javascript" type="text/javascript">
var sem_pageview = false;
var sem_ocode = '9999';
var sem_ldid = '';
var sem_sh = '';
function updateSemVariable(query) {
if (query[1] === undefined) {
return;
}
switch(query[0]) {
case 'o':
sem_ocode = query[1];
break;
case 'ldid':
sem_ldid = query[1];
break;
case 'sh':
sem_sh = query[1];
break;
}
}
function getCookie(cname) {
var name = cname + "=";
var ca = document.cookie.split(';');
for (var i = 0; i < ca.length; i++) {
var c = ca[i];
while (c.charAt(0) == ' ') c = c.substring(1);
if (c.indexOf(name) == 0) return c.substring(name.length,c.length);
}
return "";
}
function getSemCookie() {
var queryStr = getCookie('semuser');
if (queryStr == "") {
return;
}
sem_pageview = true;
var queries = queryStr.split("&");
for (var i = 0, l = queries.length; i < l; i++) {
var query = queries[i].split('=');
updateSemVariable(query);
}
}
getSemCookie();
var updateAup = function(aUp) {
aUp = aUp.replace("INV-NA", "invsem-serp-ds");
var utms = null;
if (typeof getUrlParam === "function") {
try {
utms = getUrlParam("utm_source");
} catch (e) {}
}
var aUp_arr = aUp.split("/");
var last = aUp_arr.pop();
aUp_arr.push((utms !== null ? utms : "dir") +
"_" + (typeof sem_ocode !== "undefined" ?
sem_ocode : 0));
if (aUp_arr.length > 3) {
aUp_arr[3] = last;
}
return aUp_arr.join("/");
};
if (typeof googletag !== "undefined") {
googletag.cmd.push( function() {
if ((typeof sem_pageview !== 'undefined') && (sem_pageview == true)) {
var processArgs = function(arguments) {
if (typeof arguments === "object") {
for (var i = 0; i < arguments.length; i++) {
if (arguments[i].indexOf("479") > -1) {
arguments[i] = updateAup(arguments[i]);
break;
}
}
}
return arguments;
};
googletag.defineSlot = (function() {
var orig_func = googletag.defineSlot;
return function() {
return orig_func.apply(this, processArgs(arguments));
};
})();
googletag.defineOutOfPageSlot = (function() {
var orig_func = googletag.defineOutOfPageSlot;
return function() {
return orig_func.apply(this, processArgs(arguments));
};
})();
}
});
}
</script><script type="text/javascript" src="https://i.investopedia.com/public/simulator/js/jquery.min.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/js/jquery.mcs.min.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/public/simulator/js/cookie.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/public/simulator/js/cookiemix.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/public/simulator/js/g.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/public/simulator/js/microsoftAjax.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/public/simulator/js/microsoftAjaxWebForms.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/simulator_ui/js/ScrollingTicker.js?v=202102030915"></script>
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/promise-polyfill@7/dist/polyfill.min.js"></script>
<script type="text/javascript" src="https://i.investopedia.com/dest/js/inv.min.js?v=202102030915"></script>
<script type="text/javascript" src="https://i.investopedia.com/dist/simulator.min.js"></script>
<script type="text/javascript" src="https://i.investopedia.com/dist/gdpr.min.js?v=202102030915"></script>
<script type="text/javascript">
eval(function(p,a,c,k,e,d){e=function(c){return c.toString(36)};if(!''.replace(/^/,String)){while(c--){d[c.toString(a)]=k[c]||c.toString(a)}k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1};while(c--){if(k[c]){p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c])}}return p}('7 2(9){o d(9)}a 0={4:\'\',3:\'e\',6:\'\',5:\'\'};a 8=f.c({h:2(\'i=\'),1:{g:2(\'j=\'),k:2(\'m\')}});8.n(7(1){0.4=1[\'4\']||0.4;0.3=1[\'3\']||0.3;0.6=1[\'b\']||0.6;0.5=1[\'l\']||0.5});',25,25,'geoData|data|decode|country_code|city|FIN_zip|FIN_state|function|jqXHR|encoded|var|region_code|ajax|atob|FR|jQuery|access_key|url|aHR0cHM6Ly9hcGkuaXBzdGFjay5jb20vY2hlY2s|MTBlZjJlYjI2NzFhNjQ5MTQ5NDk1ODZjMzExMDdiYWQ|fields|zip|Y2l0eSxjb3VudHJ5X2NvZGUscmVnaW9uX2NvZGUsemlw|done|return'.split('|'),0,{}))
</script>
<script type="text/javascript">
(function(d) {
var e = d.createElement('script');
e.src = d.location.protocol + '//tag.bounceexchange.com/2320/i.js';
e.async = true;
d.getElementsByTagName("head")[0].appendChild(e);
}(document));
</script>
</head>
<!--shift_source: 4824cfbe9ef0-->
<body class="simulator-page" onunload="SaveTickerPos();">
<div style="display: none;">
<!-- Start of DoubleClick Spotlight Tag: Please do not remove -->
<!-- Activity Name for this tag is:IP Simulator -->
<!-- Web site URL where tag should be placed: http://www.investopedia.com/simulator -->
<!-- This tag must be placed within the opening <body> tag, as close to the beginning of it as possible -->
<!-- Creation Date: Thu Jul 02 17:02:35 EDT 2009 -->
<script language="JavaScript">
function SaveTickerPos()
{
try
{
for (var obj in allTickers){
allTickers[obj].paused = true;
jQuery.cookie(allTickers[obj].cookieName, allTickers[obj].x, {path: '/'});
}
}
catch(e){}
}
var axel = Math.random() + "";
var a = axel * 10000000000000;
document.write('<img src="https://ad.doubleclick.net/activity;src=2359949;type=ips;cat=ips;ord=1;num=' + a + '?" width=1 height=1 border=0>');
</script>
<noscript>
<img src="https://ad.doubleclick.net/activity;src=2359949;type=ips;cat=ips;ord=1;num=1?" width=1 height=1 border=0>
</noscript>
<!-- End of DoubleClick Spotlight Tag: Please do not remove -->
<!-- Begin comScore Tag -->
<script type="text/javascript" language="javascript">
var _comscore = _comscore || [];
_comscore.push({ c1: "2", c2: "18280457", c4: "https://www.investopedia.com/simulator/trade/tradestock.aspx" });
(function() {
var s = document.createElement("script"), el = document.getElementsByTagName("script")[0]; s.async = true;
s.src = (document.location.protocol == "https:" ? "https://sb" : "http://b") + ".scorecardresearch.com/beacon.js";
el.parentNode.insertBefore(s, el);
})();
</script>
<noscript>
<img src="https://sb.scorecardresearch.com/p?c1=2&c2=18280457&c4=https://www.investopedia.com/simulator/trade/tradestock.aspx&cv=2.0&cj=1" />
</noscript>
<!-- End comScore Tag -->
</div>
<script type='text/javascript' language="JavaScript">
//<![CDATA[
if (getCookie('freenewsletterreg') == null) {
setCookie("freenewsletterreg", "ad", 30);
}
var user_info = $.parseJSON(decodeURIComponent(getCookie('user_info')).replace(/\+/g, ' '));
//]]>
</script>
<!--<script type='text/javascript' src="https://www.investopedia.com/simulator/Common/VcidScript.ashx?u=e3bfd87f21d741578241089c9aa5f4c8"></script>-->
<!-- Google Tag Manager -->
<noscript>
<iframe src="//www.googletagmanager.com/ns.html?id=GTM-5V3WHJ"
height="0" width="0"
style="display:none;visibility:hidden"></iframe>
</noscript>
<script>(function (w, d, s, l, i) {
w[l] = w[l] || [];
w[l].push({'gtm.start': new Date().getTime(), event: 'gtm.js'});
var f = d.getElementsByTagName(s)[0],
j = d.createElement(s), dl = l != 'dataLayer' ? '&l=' + l : '';
j.async = true;
j.src =
'//www.googletagmanager.com/gtm.js?id=' + i + dl;
f.parentNode.insertBefore(j, f);
})(window, document, 'script', 'dataLayer', 'GTM-5V3WHJ');</script>
<!-- End Google Tag Manager -->
<script type="text/javascript">
dataLayer.push(_pageTaxonomy);
var pageviewID = genPageviewId();
dataLayer.push({'pageviewID' : pageviewID});
</script>
<!-- ================================= Header ================================= -->
<div id="Header">
<div class="mid">
<div class="brand clear layout-size">
<a href="//index.investopedia.com/"><div class="m-search-icon"><i></i></div></a>
<div class="logo-container">
<a href="/" class="logo"></a>
<div class="button-container">
<a class="button view-markets-btn inv-ga-link-tracking" href="/markets/" target="_blank" data-ga-label="blue-markets-cta">
View Markets
</a>
</div>
</div>
<div id="ctl00_AdLeaderBoard1_cgiAdTopLeaderboard" class="leader">
<div id='AdSlot_AF-Top-Leaderboard' adonis-marker></div>
</div>
</div>
</div>
</div>
<!-- ================================= Header //End ================================= -->
<!-- ================================= Content ================================= -->
<div id="Content" class="full">
<!-- ================================= Left Navigation ================================= -->
<div class="left-nav">
<div class="label">
Trade </div>
<ul>
<li class="">
<span></span>
<a href="https://www.i
Solution
You can try this:
import requests
from bs4 import BeautifulSoup
# create session
session = requests.Session()
url = 'https://investopedia.com/simulator/portfolio/'
payload = {
    'username': 'your_email',
    'password': 'your_password'
}
# get log in page
auth_page = session.get(url)
soup = BeautifulSoup(auth_page.content, 'html.parser')
# get form
form = soup.find('form')
# get post url
post_url = form['action']
# auth
session.post(post_url, data=payload)
# parse content
content_url = 'https://investopedia.com/simulator/trade/tradestock.aspx'
page = session.get(content_url)
page_soup = BeautifulSoup(page.content, 'html.parser')
# simulate page
sim_page = page_soup.find('div', {'class': 'sim-page'})
table = sim_page.find_all('table', {'class': 'table2'})[1]
rows = table.find_all('tr')
for row in rows:
    print(row.find('th').text)
    print(row.find('td').text)
    print('----')
Output:
Value (USD)
$10,000.00
----
Buying Power
$10,000.00
----
Cash
$10,000.00
----
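The code above posts only the username and password to the form's action. Login forms often carry extra pre-filled inputs (for example hidden tokens) and a relative action URL, so a slightly more defensive variant is sketched below. It is an illustration under assumptions, not a confirmed requirement of this site: whether the Investopedia form needs any hidden fields is not stated in the question. FirstFormParser and build_login_request are hypothetical names, and the sketch uses the standard library's html.parser so that it runs without extra packages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FirstFormParser(HTMLParser):
    """Collect the action URL and input fields of the first <form> on a page."""

    def __init__(self):
        super().__init__()
        self.action = None   # value of the form's action attribute
        self.fields = {}     # input name -> pre-filled value
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        attrs = dict(attrs)
        if tag == 'form' and not self._in_form:
            self._in_form = True
            self.action = attrs.get('action') or ''
        elif tag == 'input' and self._in_form and attrs.get('name'):
            self.fields[attrs['name']] = attrs.get('value') or ''

    def handle_endtag(self, tag):
        if tag == 'form' and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms


def build_login_request(page_url, html, credentials):
    """Return (absolute_post_url, payload) for the first form in html.

    Pre-filled form inputs are kept; credentials override inputs
    that share the same name.
    """
    parser = FirstFormParser()
    parser.feed(html)
    if parser.action is None:
        raise ValueError('no <form> found; this may not be the login page')
    payload = dict(parser.fields)
    payload.update(credentials)
    # urljoin resolves a relative or empty action against the page URL.
    return urljoin(page_url, parser.action), payload
```

With the session from the answer, this could be used as post_url, data = build_login_request(auth_page.url, auth_page.text, payload) followed by session.post(post_url, data=data).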