Execute JavaScript using the Tor network without human interaction

Question

I want to load HTML content through the Tor network and execute JavaScript that loads additional content through this network via AJAX. This must be automated by a script that runs on a Linux server without any human interaction. I can't find a combination of tools that enables automated execution of JavaScript received through the Tor network.

I want to write an application with these characteristics:

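Environment

  • runs autonomously (without any human interaction)
  • runs on a non-GUI ("headless") Linux server (Ubuntu 12.04)

Features

  • loads web content (HTML documents, images, etc.) anonymously using the Tor network
  • executes JavaScript embedded in or attached to the HTML documents (to load additional content via AJAX or similar techniques)
  • once all content is loaded: converts the HTML document into a DOM tree and extracts specific items from that tree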

The environment constraints forbid the use of a web browser. Everything must be done by programs or scripts. The feature constraints require that the JavaScript be executed so that it connects to the internet through the Tor network rather than directly.

Tor

To use the Tor network I can run a Tor client that provides a socket on my machine. Then I write a Perl script that connects to this socket. The Perl script sends HTTP and HTTPS requests through this socket to the Tor client, which then routes them through the Tor network. All responses come back the same way.

I've tested this and it works fine. But in a Perl script it is really hard to execute the JavaScript that comes with the received HTML documents. I would have to write a JavaScript emulator in Perl to make this possible, which is far beyond my available time and my skills.
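For readers who want to see what "connecting to this socket" involves, here is a minimal sketch of the SOCKS5 exchange a script performs with a local Tor client. It is written in Python rather than Perl and is not the asker's code. The address 127.0.0.1:9050 is an assumption (9050 is Tor's default SocksPort, but your torrc may differ), and the function names are made up for illustration.

```python
import socket
import struct

TOR_SOCKS_HOST = "127.0.0.1"  # assumption: the Tor client runs on this machine
TOR_SOCKS_PORT = 9050         # assumption: Tor's default SocksPort


def socks5_greeting() -> bytes:
    """Client greeting: version 5, one method offered, 'no authentication'."""
    return bytes([0x05, 0x01, 0x00])


def socks5_connect_request(hostname: str, port: int) -> bytes:
    """CONNECT request carrying the hostname itself (address type 3),
    so the name is resolved by the proxy instead of by local DNS."""
    name = hostname.encode("idna")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must encode to 1..255 bytes")
    return bytes([0x05, 0x01, 0x00, 0x03, len(name)]) + name + struct.pack(">H", port)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the proxy closes the connection."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("proxy closed the connection")
        buf += chunk
    return buf


def open_via_tor(hostname: str, port: int, timeout: float = 30.0) -> socket.socket:
    """Return a socket tunnelled through the local SOCKS5 proxy to hostname:port."""
    sock = socket.create_connection((TOR_SOCKS_HOST, TOR_SOCKS_PORT), timeout=timeout)
    try:
        sock.sendall(socks5_greeting())
        if _recv_exact(sock, 2) != b"\x05\x00":
            raise ConnectionError("proxy did not accept the no-auth method")
        sock.sendall(socks5_connect_request(hostname, port))
        ver, rep, _rsv, atyp = _recv_exact(sock, 4)
        if ver != 0x05 or rep != 0x00:
            raise ConnectionError(f"SOCKS5 connect failed (reply code {rep})")
        # Consume the bound address and port so the stream starts at payload data.
        if atyp == 0x01:
            addr_len = 4
        elif atyp == 0x04:
            addr_len = 16
        elif atyp == 0x03:
            addr_len = _recv_exact(sock, 1)[0]
        else:
            raise ConnectionError("unknown address type in SOCKS5 reply")
        _recv_exact(sock, addr_len + 2)
        return sock
    except Exception:
        sock.close()
        raise
```

After `open_via_tor("example.com", 80)` returns, plain HTTP can be written to the socket as usual. For HTTPS the socket would additionally need to be wrapped with TLS, which this sketch leaves out.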

JavaScript

To execute embedded or attached JavaScript I can use a tool like phantomJS or slimerJS (phantomJS does not work properly on Ubuntu 12.04, so I use slimerJS, which offers almost the same features). With these tools I can load HTML documents and have all the JavaScript that comes with them executed automatically, so I also receive all content that is not part of the initial HTML document but is loaded later by AJAX or similar techniques. In addition, I can easily analyze the document's DOM tree to extract the items I am interested in.

I've tested this too and it also works fine, but the tools I know (phantomJS and slimerJS) use their own procedures to connect to the internet. There seems to be no way to tell them to connect to a socket and communicate with the internet through it.

Is there a way to automatically execute AJAX calls through the Tor network?

There seem to be two possible ways:

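  1. Get the JavaScript code executed inside the Perl script. This could be done by a module, but I can't find any CPAN module that emulates a JavaScript interpreter. Instead of connecting to the internet directly, the interpreter would have to call Perl functions that I write.
  2. Force slimerJS (or phantomJS, or any other tool) to connect to a socket on localhost and send all requests through that socket. Perhaps slimerJS could be started in an environment that pretends to provide direct internet access but actually redirects all traffic to the Tor client's socket?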

Recommended answer

If you have a Tor client running, you can use the address it is listening on for the proxy settings. Check the docs for the proxy options you need to pass.

The proxy type will be SOCKS. Remember that you need the address the socket is bound to locally.
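As a rough illustration of the advice above, the following Python sketch builds a command line that starts a headless browser with a SOCKS proxy. It assumes the `--proxy` and `--proxy-type` options as documented for the PhantomJS command line. SlimerJS documents options with the same names, but check this against the version you have installed. The address 127.0.0.1:9050 is Tor's default SocksPort and may differ on your machine, and `fetch_page.js` is a hypothetical script name.

```python
import subprocess


def headless_browser_command(tool, script, proxy_host="127.0.0.1", proxy_port=9050):
    """Build the argv list; the proxy options go before the script name."""
    return [
        tool,
        f"--proxy={proxy_host}:{proxy_port}",  # where the local Tor client listens (assumed)
        "--proxy-type=socks5",                 # Tor exposes a SOCKS proxy
        script,
    ]


def run_through_tor(tool, script, timeout=120):
    """Run the headless browser and return whatever the script printed to stdout."""
    result = subprocess.run(
        headless_browser_command(tool, script),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

A call such as `run_through_tor("slimerjs", "fetch_page.js")` would then fit into a cron job or another unattended script. One way to confirm that the page's own AJAX requests also go through the proxy is to watch the Tor client's log while the script runs.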
