Simulate infinite scrolling in C# to get the full HTML of a page

Views: 127  Posted: 2020/11/24 19:21:13  Tags: c#, html-agility-pack, infinite-scroll

Question

Lots of sites use this (IMO) annoying "infinite scrolling" style, for example Tumblr, Twitter, and 9gag.

I recently tried to scrape some pictures from these sites programmatically with HtmlAgilityPack, like this:

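using System.Linq;
using HtmlAgilityPack;

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(url);
// SelectNodes returns null (not an empty collection) when nothing matches, hence the ?. below.
var primary = doc.DocumentNode.SelectNodes("//img[@class='badge-item-img']");
var picstring = primary?.Select(r => r.GetAttributeValue("src", null)).FirstOrDefault();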

This works fine, but when I tried to load the HTML from certain sites, I noticed that I only got back a small amount of content (let's say the first 10 "posts" or "pictures", or whatever). This made me wonder whether it would be possible to simulate "scrolling down to the bottom" of the page in C#.

This isn't just the case when I load the HTML programmatically. When I simply go to a site like Tumblr and check Firebug or just "view source", I expect all the content to be in there somewhere, but a lot of it seems to be hidden or inserted with JavaScript. Only the content that is actually visible on my screen is present in the HTML source.

So my question is: is it possible to simulate scrolling down a page indefinitely and load the resulting HTML, preferably with C#?

(I know that I can use the APIs for Tumblr and Twitter, but I'm just trying to have some fun hacking stuff together with HtmlAgilityPack.)
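
For illustration only, here is a minimal sketch of one way the "scroll to the bottom" idea could be tried. It is not part of the original question and it swaps in a different tool: HtmlAgilityPack does not execute JavaScript, so the sketch assumes Selenium WebDriver (the Selenium.WebDriver NuGet package) with a locally available Chrome and ChromeDriver to drive a real browser, and only hands the final markup to HtmlAgilityPack. The names LoadFullPage, maxScrolls, and pauseMs are hypothetical, and the fixed pause is a crude guess at how long a site needs to load more items.

```csharp
using System;
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

static class InfiniteScrollSketch
{
    // Hypothetical helper: scrolls until the page height stops growing
    // or maxScrolls is reached, then parses the rendered DOM.
    public static HtmlDocument LoadFullPage(string url, int maxScrolls = 20, int pauseMs = 1500)
    {
        // Assumption: Chrome and a matching ChromeDriver are available on this machine.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl(url);
            var js = (IJavaScriptExecutor)driver;

            long lastHeight = Convert.ToInt64(
                js.ExecuteScript("return document.body.scrollHeight;"));

            for (int i = 0; i < maxScrolls; i++)
            {
                // Jump to the bottom so the page's own script requests more content.
                js.ExecuteScript("window.scrollTo(0, document.body.scrollHeight);");

                // Crude wait for the new items to arrive; real sites may need longer.
                Thread.Sleep(pauseMs);

                long newHeight = Convert.ToInt64(
                    js.ExecuteScript("return document.body.scrollHeight;"));

                // No growth suggests nothing more was loaded, so stop early.
                if (newHeight == lastHeight)
                {
                    break;
                }
                lastHeight = newHeight;
            }

            // PageSource reflects the DOM after the scripts have run.
            var doc = new HtmlDocument();
            doc.LoadHtml(driver.PageSource);
            return doc;
        }
    }
}
```

With a helper like this, the XPath query from the snippet above could be run against the returned document instead of the one from web.Load(url). The cap on scrolls matters because a truly endless feed would otherwise never stop growing.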
