VB.Net HttpWebRequest is slow compared with urlopen in Python


Question

I am writing a web crawler that crawls websites and selectively parses different sections of each site.

I am a .NET developer, so .NET was the obvious choice, but the crawler was very slow at both downloading and parsing HTML pages.

I then tried only downloading the content, first with .NET and then with Python against the same domains, and Python was impressively fast at downloading. I now have the download step working in Python, but the later stages are not as easy to write in Python, and I would rather not do that.

The same batch of domains that took 100 seconds in Python took 20 minutes in the .NET-based crawler.

For example, downloading http://www.regexhacks.com/ took 10 seconds in Python and 2 minutes in the .NET crawler.

Does anyone have any idea why this is slow in .NET but fast in Python?
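For anyone trying to reproduce the comparison, the Python side of a timing like the one above could look roughly like the sketch below. It is not the code from the question, which was not posted. It assumes Python 3's `urllib.request.urlopen`, and the helper name `timed_download` and its `fetch` parameter are made up for illustration.

```python
import time
from urllib.request import urlopen


def timed_download(url, fetch=None, timeout=30):
    """Download one URL and return (body_bytes, elapsed_seconds).

    `fetch` is a hypothetical hook: any callable that takes a URL and
    returns the response body. It defaults to a plain urlopen() read.
    """
    if fetch is None:
        def fetch(target):
            # urlopen() returns a response object; read() yields the raw bytes.
            with urlopen(target, timeout=timeout) as response:
                return response.read()

    start = time.perf_counter()
    body = fetch(url)
    elapsed = time.perf_counter() - start
    return body, elapsed


# Example call (performs a real network request, so it is left commented out):
# body, elapsed = timed_download("http://www.regexhacks.com/")
# print(f"{len(body)} bytes in {elapsed:.2f} s")
```

Timing only the fetch call keeps parsing cost out of the measurement, so the number is comparable to a download-only test on the .NET side.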

Recommended answer

Have you tried using IronPython to embed the Python code in your .NET application? That should allow it to download pages at the speed you see in Python. In my opinion, Python is faster because it downloads pages as tuples of byte strings, whereas .NET may be downloading the HTML code as it is.

