Can anyone clarify some options for Python Web automation

Question

I'm trying to make a simple script in Python that will scan a tweet for a link and then visit that link. I'm having trouble determining which direction to go from here. From what I've researched, it seems that I can use Selenium or Mechanize, which can be used for browser automation. Would using these be considered web scraping?

Alternatively, I can learn one of the Twitter APIs, the Requests library, and Pyjamas (which converts Python code to JavaScript) so I can make a simple script and load it as a Google Chrome/Firefox extension.

Which would be the better option?

Answer

There are many different ways to go when doing web automation. Since you're doing stuff with Twitter, you could try the Twitter API. If you're doing any other task, there are more options.
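
For the Twitter side, a minimal sketch of the API route might look like the following. It uses requests against the v2 tweet-lookup endpoint; the bearer token and tweet ID are placeholders, and the exact endpoint and fields available depend on your Twitter API access level.

```python
# Minimal sketch: look up a tweet via the Twitter v2 API, pull out any URL
# entities, and visit the first one. BEARER_TOKEN and TWEET_ID are placeholders.
import requests

BEARER_TOKEN = "YOUR_BEARER_TOKEN"   # placeholder credential
TWEET_ID = "1234567890"              # placeholder tweet id

resp = requests.get(
    f"https://api.twitter.com/2/tweets/{TWEET_ID}",
    headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
    params={"tweet.fields": "entities"},   # ask for URL entities in the payload
    timeout=10,
)
resp.raise_for_status()

tweet = resp.json()["data"]
urls = [u["expanded_url"] for u in tweet.get("entities", {}).get("urls", [])]

if urls:
    page = requests.get(urls[0], timeout=10)   # visit the first link in the tweet
    print(page.status_code, urls[0])
```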

  • Selenium is very useful when you need to click buttons or enter values in forms. The only drawback is that it opens a separate browser window. (Short sketches of these options follow the list.)

  • Mechanize, unlike Selenium, does not open a browser window and is also good for manipulating buttons and forms. It might need a few more lines to get the job done.

  • Urllib/Urllib2 is what I use. Some people find it a bit hard at first, but once you know what you're doing, it is very quick and gets the job done. Plus you can do things with cookies and proxies. It is a built-in library, so there is no need to download anything.

  • Requests is just as good as urllib, but I don't have a lot of experience with it. You can do things like add headers. It's a very good library.
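
A minimal Selenium sketch of the button/form point above, assuming Selenium 4+ and a local Chrome install; the URL, field name, and button selector are placeholders.

```python
# Open a browser, fill a form field, and click a button with Selenium.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # opens a real browser window
try:
    driver.get("https://example.com/login")                              # placeholder page
    driver.find_element(By.NAME, "username").send_keys("my_user")        # fill a form field
    driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()  # click a button
    print(driver.title)
finally:
    driver.quit()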
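```

A comparable mechanize sketch; it drives the same kind of form without opening a browser window. The URL, form index, and field name are placeholders.

```python
# Fill and submit a form with mechanize, entirely over HTTP.
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)          # mechanize honours robots.txt by default
br.open("https://example.com/login") # placeholder page

br.select_form(nr=0)                 # pick the first form on the page
br["username"] = "my_user"           # fill a field by its name attribute
response = br.submit()               # "click" the submit button

print(response.geturl())
```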
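
The Urllib/Urllib2 suggestion is Python 2 vocabulary; a rough Python 3 equivalent using only the built-in urllib.request, with the cookie handling mentioned above (the proxy line is commented out and purely illustrative).

```python
# Fetch a page with the standard library, keeping cookies in a CookieJar.
import http.cookiejar
import urllib.request

cookies = http.cookiejar.CookieJar()
handlers = [urllib.request.HTTPCookieProcessor(cookies)]
# handlers.append(urllib.request.ProxyHandler({"http": "http://proxy.example:8080"}))

opener = urllib.request.build_opener(*handlers)
with opener.open("https://example.com", timeout=10) as resp:
    html = resp.read().decode("utf-8", errors="replace")

print(f"fetched {len(html)} characters, received {len(cookies)} cookies")
```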

Once you get the page you want, I recommend you use BeautifulSoup to parse out the data you want.
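
Tying the last two suggestions together, here is a small sketch that fetches a page with requests (adding a custom header, as noted in the list) and uses BeautifulSoup to pull out the links; the URL and User-Agent string are placeholders.

```python
# Fetch a page with requests, then extract every link with BeautifulSoup.
import requests
from bs4 import BeautifulSoup

resp = requests.get(
    "https://example.com",
    headers={"User-Agent": "my-tweet-bot/0.1"},   # custom header
    timeout=10,
)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
links = [a["href"] for a in soup.find_all("a", href=True)]
print(links)
```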

I hope this leads you in the right direction for web automation.
