How may I bypass LWP's URL encoding for a GET request?

Question

I'm talking to what seems to be a broken HTTP daemon and I need to make a GET request that includes a pipe | character in the URL.

LWP::UserAgent escapes the pipe character before the request is sent.

For example, a URL passed in as

https://hostname/url/doSomethingScript?ss=1234&activities=Lec1|01

is passed to the HTTP daemon as

https://hostname/url/doSomethingScript?ss=1234&activities=Lec1%7C01

This is correct, but doesn't work with this broken server.
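
The escaping can be reproduced without any network traffic, because it happens when the URL string is turned into a URI object. The following is a minimal sketch using only the URI module; the URL is the placeholder one from the question.

```perl
use strict;
use warnings;
use URI;

# URI->new percent-encodes characters that it does not treat as legal
# in a URI. The pipe character is one of them.
my $raw = 'https://hostname/url/doSomethingScript?ss=1234&activities=Lec1|01';
my $uri = URI->new($raw);

# Prints the URL with the pipe encoded as %7C
print $uri->as_string, "\n";
```

LWP builds a URI object like this for every request, which is why passing the raw string to get() makes no difference.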

How can I override or bypass the encoding that LWP and its friends are doing?

Note

I've seen and tried other answers here on StackOverflow addressing similar problems. The difference here seems to be that those answers are dealing with POST requests where the formfield parts of the URL can be passed as an array of key/value pairs or as a 'Content' => $content parameter. Those approaches aren't working for me with an LWP request.

I've also tried constructing an HTTP::Request object and passing that to LWP, and passing the full URL direct to LWP->get(). No dice with either approach.

In response to Borodin's request, this is a sanitised version of the code I'm using

#!/usr/local/bin/perl -w
use HTTP::Cookies;
use LWP;

my $debug = 1;

# make a 'browser' object
my $browser = LWP::UserAgent->new();

# cookie handling...
$browser->cookie_jar(HTTP::Cookies->new(
             'file' => '.cookie_jar.txt',
             'autosave' => 1,
             'ignore_discard' => 1,
             ));

# proxy, so we can watch...
if ($debug == 1) {
    $browser->proxy(['http', 'ftp', 'https'], 'http://localhost:8080/');
}

# user agent string (pretend to be Firefox)
$agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.7.12) Gecko/20050919 Firefox/1.0.7';

# set the user agent
$browser->agent($agent);

# do some things here to log in to the web site, accept session cookies, etc. 
# These are basic POSTs of filled forms. Works fine.
# [...]

my $baseURL = 'https://hostname/url/doSomethingScript?ss=1234&activities=VALUEA|VALUEB';

# a flat list, not an array reference, so the pairs can be shifted off below
my @values = ('Lec1', '01', 'Lec1', '02');

while (1) {
    if (scalar(@values) < 2) { last; }

    my $vala = shift(@values);
    my $valb = shift(@values);

    my $url = $baseURL;
    $url =~ s/VALUEA/$vala/g;
    $url =~ s/VALUEB/$valb/g;

    # simplified. Would usually check request for '200' response, etc...
    $content = $browser->get($url)->content();

    # do something here with the content

    # [...]

    # fails because the '|' character in the url is escaped after it's handed 
    # to LWP

}

# end


Recommended answer

As @bchgys mentions in his comment, this is (almost) answered in the linked thread. Here are two solutions:

The first and arguably cleanest one is to locally override the escape map in URI::Escape to not modify the pipe character:

use URI;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();
my $res;
{
    # Violate RFC 2396 by forcing broken query string
    # local makes the override take effect only in the current code block
    local $URI::Escape::escapes{'|'} = '|';
    $res = $ua->get('http://server/script?q=a|b');
}
print $res->request->as_string, "\n";
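
To check the effect of the override without contacting a server, the same trick can be applied to a bare URI object. This is only a sketch: `uri_with_raw_pipe` is a hypothetical helper name, not part of URI or LWP, and it assumes the installed URI version still consults `%URI::Escape::escapes` when it escapes a new URI.

```perl
use strict;
use warnings;
use URI;
use URI::Escape ();

# Hypothetical helper: build a URI object while '|' is mapped to itself.
# 'local' restores the original mapping when the sub returns.
sub uri_with_raw_pipe {
    my ($string) = @_;
    local $URI::Escape::escapes{'|'} = '|';
    return URI->new($string);
}

my $inside  = uri_with_raw_pipe('http://server/script?q=a|b');
my $outside = URI->new('http://server/script?q=a|b');

print $inside->as_string,  "\n";   # built under the override
print $outside->as_string, "\n";   # built with the normal escape map
```

If the assumption holds, the first URL keeps its literal pipe and the second shows %7C, which confirms that the override does not leak outside the block.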

Alternatively, you can simply undo the escaping by modifying the URI directly in the request after the request has been created:

use HTTP::Request;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();
my $req = HTTP::Request->new(GET => 'http://server/script?q=a|b');

# Violate RFC 2396 by forcing broken query string
# (the /g flag restores every escaped pipe, not just the first one)
${$req->uri} =~ s/%7C/|/g;

my $res = $ua->request($req);
print $res->request->as_string, "\n";
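
The in-place rewrite can also be checked offline by inspecting the request object before it is sent. A small sketch, assuming a made-up URL with two pipes to show why a global substitution is useful:

```perl
use strict;
use warnings;
use HTTP::Request;

# HTTP::Request turns the string into a URI object, which escapes both pipes.
my $req = HTTP::Request->new(GET => 'http://server/script?q=a|b&r=c|d');

# A URI object is a blessed scalar reference holding the URL string,
# so dereferencing it lets us edit that string directly.
${ $req->uri } =~ s/%7C/|/g;

print $req->uri->as_string, "\n";
```

This depends on the internal representation of URI objects rather than on a documented interface, so it may break if that representation changes.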

The first solution is almost certainly preferable because it at least relies on the %URI::Escape::escapes package variable which is exported and documented, so that's probably as close as you're gonna get to doing this with a supported API.

Note that in either case you are in violation of RFC 2396 but as mentioned you may have no choice when talking to a broken server that you have no control over.
