Perl - HTTP::Proxy capture XHR/JSON communication

Problem description

The site http://openbook.etoro.com/#/main/ has a live feed that is generated by JavaScript via XHR keep-alive requests; the answers come back from the server as gzip-compressed JSON strings.

I want to capture the feed into a file.

The usual way (WWW::Mech..) is probably not viable, because reverse engineering all the JavaScript in the page and simulating the browser is a really hard task, so I'm looking for an alternative solution.

My idea is to use man-in-the-middle tactics: the browser does its work as usual, and I capture the communication via a Perl proxy dedicated only to this task.

I'm able to catch the initial communication, but not the feed itself. The proxy is working OK, because the feed keeps running in the browser; it's only my filters that don't work.

use HTTP::Proxy;
use HTTP::Proxy::HeaderFilter::simple;
use HTTP::Proxy::BodyFilter::simple;
use Data::Dumper;
use strict;
use warnings;

my $proxy = HTTP::Proxy->new(
     port => 3128, max_clients => 100, max_keep_alive_requests => 100
);

my $hfilter = HTTP::Proxy::HeaderFilter::simple->new(
    sub {
        my ( $self, $headers, $message ) = @_;
        print STDERR "headers", Dumper($headers);
    }
);

my $bfilter = HTTP::Proxy::BodyFilter::simple->new(
    filter => sub {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        print STDERR "dataref", Dumper($dataref);
    }
);

$proxy->push_filter( response => $hfilter); #header dumper
$proxy->push_filter( response => $bfilter); #body dumper
$proxy->start;

Firefox is configured to use the above proxy for all communication.
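
To check that the proxy itself is reachable without going through the browser, a plain LWP::UserAgent can be pointed at it. This is only a rough sketch, assuming the proxy is running locally on port 3128; the feed itself still needs the browser to drive the XHR requests:

use strict;
use warnings;
use LWP::UserAgent;

# Send a plain request through the local proxy instead of the browser.
my $ua = LWP::UserAgent->new;
$ua->proxy( 'http', 'http://localhost:3128/' );

# Any plain HTTP page is enough for a smoke test of the proxy.
my $res = $ua->get('http://openbook.etoro.com/');
print $res->status_line, "\n";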

The feed is running in the browser, so the proxy is feeding it with data (when I stop the proxy, the feed stops too). Randomly (I can't figure out when) I get the following error:

[Tue Jul 10 17:13:58 2012] (42289) ERROR: Getting request failed: Client closed

Can anyone show me how to build the correct HTTP::Proxy filters to Dumper all of the keep-alive XHR communication between the browser and the server?

Recommended answer

Here's something that I think does what you're after:

#!/usr/bin/perl

use 5.010;
use strict;
use warnings;

use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::complete;
use HTTP::Proxy::BodyFilter::simple;
use JSON::XS     qw( decode_json );
use Data::Dumper qw( Dumper );

my $proxy = HTTP::Proxy->new(
    port                     => 3128,
    max_clients              => 100,
    max_keep_alive_requests  => 100,
);

my $filter = HTTP::Proxy::BodyFilter::simple->new(
    sub {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        return unless $$dataref;
        my $content_type = $message->headers->content_type or return;
        say "\nContent-type: $content_type";
        my $data = decode_json( $$dataref );
        say Dumper( $data );
    }
);

$proxy->push_filter(
    method   => 'GET',
    mime     => 'application/json',
    response => HTTP::Proxy::BodyFilter::complete->new,
    response => $filter
);

$proxy->start;

I don't think you need a separate header filter because you can access any headers you want to look at using $message->headers in the body filter.
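
For example, a body filter along these lines could report whichever headers it cares about. This is a rough fragment rather than part of the script above, and the Content-Encoding check is just an illustration:

my $peek = HTTP::Proxy::BodyFilter::simple->new(
    sub {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        # $message is the HTTP::Response, so every response header is
        # available here without a separate header filter.
        printf STDERR "type=%s encoding=%s\n",
            $message->headers->content_type               || 'unknown',
            $message->headers->header('Content-Encoding') || 'none';
    }
);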

You'll note that I pushed two filters onto the pipeline. The first one is of type HTTP::Proxy::BodyFilter::complete and its job is to collect up the chunks of the response and ensure that the real filter that follows always gets a complete message in $dataref. However, for each chunk that's received and buffered, the following filter will still be called and passed an empty $dataref. My filter ignores these by returning early.

I also set up the filter pipeline to ignore everything except GET requests that result in JSON responses, since these seem to be the most interesting.
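
And since the original goal was to capture the feed into a file rather than dump it to STDERR, the filter body could just as well append each decoded message to a log. A minimal sketch, assuming a writable feed.log in the current directory (it reuses decode_json and Dumper from the script above):

my $filter = HTTP::Proxy::BodyFilter::simple->new(
    sub {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        return unless $$dataref;    # skip the empty per-chunk calls
        my $data = decode_json( $$dataref );

        # Append one timestamped record per complete JSON response.
        open my $fh, '>>', 'feed.log' or die "feed.log: $!";
        print {$fh} scalar localtime, "\n", Dumper( $data ), "\n";
        close $fh;
    }
);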

Thanks for asking this question - it was an interesting little problem and you seemed to have done most of the hard work already.
