Trying to input variable into url and having encoding issues

Problem description

I am new to Perl and am trying to write a script that takes input from the user, builds a URL from that input, fetches XML data from a website at that URL, and then relays the result back to the user.

However, I am having trouble building a usable link from the user's input.

This is my code in full:

use strict;
use warnings;

my $row = 0;

use XML::LibXML;

print "\n\n\nOn what place do you need a weather report for? -> ";

chomp( my $ort = <> );

my $url = join('', "http://www.yr.no/place/Sweden/Västra_Götaland/",$ort,"/forecast_hour_by_hour.xml");

my $dom = XML::LibXML->load_xml(location => $url);

print "\n\nSee below the weather for ", $ort, ":\n\n";

foreach my $weatherdata ($dom->findnodes('//time')) {

    if($row != 10){ 

        my $temp = $weatherdata->findvalue('./temperature/@value');
        my $value = $weatherdata->findvalue('./@from');

        my $valuesub = substr $value, 11, 5;

        print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";

        $row++;
    }
}

print "\n\n";

If I enter a place I want weather info for, for example:

Mellerud

then the script accepts it and I get a response from the link with proper data. However, if I enter

Åmål

the script cannot make sense of it. I now get:

Could not create file parser context for file "http://www.yr.no/place/Sweden/V├ñstra_G├Âtaland/Åmål/forecast_hour_by_hour.xml": No error at test4.pl line 14

If I replace ",$ort," with a hard-coded Åmål, I get the proper result. I have searched for different kinds of encoding for this, but I have not found a solution that works.

Once again, I would like to point out that I am really new to this. I might be missing something really simple. My apologies for that.

::EDIT 1::

Following a suggestion from @zdim, I added use open ':std', ':encoding(UTF-8)';

This changed the results, but it only generates more errors, as follows:

Also, I am running this in Windows CMD with administrator privileges. According to @zdim, it runs fine on Linux with xterm for input, v5.16. Is there a way to make it work on Windows?

Solution

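The problem is that CMD.exe is limited to 8-bit code pages. On Swedish Windows, "Å" and "å" are encoded as single bytes in the upper half of code page 850 (0x8F and 0x86). Those byte values do not match the Unicode code points of these letters (U+00C5 and U+00E5) and are not valid UTF-8 on their own, so the raw input cannot be used in the URL as it stands. The "V├ñstra_G├Âtaland" in the error message shows the reverse mismatch: UTF-8 bytes from the script source being displayed through code page 850.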
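
To make the byte mismatch concrete, here is a minimal sketch of decoding console input and turning it into a URL path segment. It assumes the console delivers code page 850 bytes, and the helper name place_to_path_segment and the hand-rolled percent-encoding are illustrative, not part of the original script. Whether the weather site accepts percent-encoded UTF-8 paths is also an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode raw console bytes (encoding given by the
# caller, assumed here to be cp850) into Perl characters, then
# percent-encode the UTF-8 form for use as a single URL path segment.
sub place_to_path_segment {
    my ($raw_bytes, $console_encoding) = @_;
    my $chars = decode($console_encoding, $raw_bytes);
    my $utf8  = encode('UTF-8', $chars);
    # Escape every byte that is not an RFC 3986 "unreserved" character.
    $utf8 =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $utf8;
}

# "Åmål" as it would arrive under code page 850: 0x8F 'm' 0x86 'l'.
my $cp850_input = "\x8Fm\x86l";
print place_to_path_segment($cp850_input, 'cp850'), "\n";   # prints %C3%85m%C3%A5l
```

A plain ASCII name such as Mellerud passes through unchanged, which is consistent with that input working in the original script.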

If you need to output non-7-bit-ASCII characters, consider running PowerShell ISE. If you set it up correctly, it can cope with any character (in output) that the font you're using supports. The big downside is that PowerShell ISE is not a console, and therefore doesn't allow input from the console/keyboard via STDIN. You can work around this by supplying your input as arguments, from a pipe, in a settings file, or through graphical UI query elements.
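
The argument and pipe workarounds can be sketched as follows. The helper name read_place is hypothetical, and the input encoding is passed in by the caller because it depends on the environment. UTF-8 is used in the demonstration purely as an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: take the place name from the argument list if one
# was given, otherwise from the first line of the supplied filehandle
# (for example a pipe on STDIN), and decode it into Perl characters.
sub read_place {
    my ($args, $fh, $encoding) = @_;
    my $raw = @$args ? $args->[0] : <$fh>;
    defined $raw or die "No place name given\n";
    $raw =~ s/\r?\n\z//;                 # strip a trailing newline, if any
    return decode($encoding, $raw);
}

# Demonstration with an in-memory handle standing in for a pipe; the
# bytes are "Åmål" in UTF-8 (an assumed encoding, not a Windows default).
open(my $fake_pipe, '<', \"\xC3\x85m\xC3\xA5l\n") or die "open failed: $!";
my $place = read_place([], $fake_pipe, 'UTF-8');
print length($place), "\n";              # prints 4 (four characters)
```

In a real script the call would look like read_place(\@ARGV, \*STDIN, $encoding), with $encoding chosen to match how the shell passes arguments.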

To set up Windows PowerShell ISE to work with UTF-8:

  1. Set PowerShell to allow running local unsigned user scripts by running (in administrator elevated PowerShell):

    Set-ExecutionPolicy RemoteSigned
    

  2. Create or edit the file "<Documents>\WindowsPowerShell\Microsoft.PowerShellISE_profile.ps1" and add something like:

    perl -w -e 'print qq!Initializing with Perl...\n!;'
    [System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8;
    

    (You need the Perl bit (or something equivalent) there to allow for the modification of the encoding.)

  3. In PowerShell ISE's options, set the font to Consolas.

  4. In your Perl scripts, always do:

    binmode(STDOUT, ':encoding(UTF-8)');
    binmode(STDERR, ':encoding(UTF-8)');
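
What the encoding layer in step 4 does can be seen without a console by writing to an in-memory handle instead of STDOUT. This is only an illustrative sketch, and the helper name bytes_written is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: write a character string through an
# :encoding(UTF-8) layer into an in-memory buffer and return the raw
# bytes that the layer produced.
sub bytes_written {
    my ($text) = @_;
    my $buffer = '';
    open(my $out, '>', \$buffer) or die "Cannot open in-memory handle: $!";
    binmode($out, ':encoding(UTF-8)');
    print {$out} $text;
    close($out);
    return $buffer;
}

# "Åmål" given as characters U+00C5, m, U+00E5, l.
my $bytes = bytes_written("\x{C5}m\x{E5}l");
print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";
# prints C3 85 6D C3 A5 6C
```

Each of the two non-ASCII characters becomes a two-byte UTF-8 sequence, which is what a terminal configured for UTF-8 output expects to receive.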
    

My solution to the OP's problem:

use strict;
use warnings;

my $row = 0;

use XML::LibXML;

binmode(STDOUT, ':encoding(UTF-8)');
binmode(STDERR, ':encoding(UTF-8)');

@ARGV  or  die "No arguments!\n";

my $ort = shift @ARGV;

print "\n\n\nGetting weather report for \"$ort\"\n";

my $url = join('', "http://www.yr.no/place/Sweden/Västra_Götaland/",$ort,"/forecast_hour_by_hour.xml");

my $dom = XML::LibXML->load_xml(location => $url);

print "\n\nSee below the weather for ", $ort, ":\n\n";

foreach my $weatherdata ($dom->findnodes('//time')) {

    if($row != 10){ 

        my $temp = $weatherdata->findvalue('./temperature/@value');
        my $value = $weatherdata->findvalue('./@from');

        my $valuesub = substr $value, 11, 5;

        print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";

        $row++;
    }
}

print "\n\n";

Output:

(run at around 2018-06-09T14:05 UTC; 16:05 CEST (which is Sweden's time zone)):

PS (censored)> perl -w $env:perl5lib\Tests\Amal-Test.pl "Åmål"



Getting weather report for "Åmål"


See below the weather for Åmål:

At 17:00 the temperature will be: 27C
At 18:00 the temperature will be: 26C
At 19:00 the temperature will be: 25C
At 20:00 the temperature will be: 23C
At 21:00 the temperature will be: 22C
At 22:00 the temperature will be: 21C
At 23:00 the temperature will be: 20C
At 00:00 the temperature will be: 19C
At 01:00 the temperature will be: 18C
At 02:00 the temperature will be: 17C

Another note:

Relying on data to always be in an exact position in a string might not be the best idea.

Instead of:

my $valuesub = substr $value, 11, 5;

consider matching it with a regular expression:

if ($value =~ /T((?:[01]\d|2[0-3]):[0-5]\d):/) {
    my $valuesub = $1;
    print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";
}
else {
    warn "Malformed value: $value\n";
}
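
As a quick check of how this pattern behaves, the sketch below wraps it in a hypothetical helper, extract_hhmm, and runs it on a few made-up timestamps. The sample values are illustrative and were not taken from the weather feed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the "HH:MM" part that follows the "T"
# separator in a timestamp, or undef if no valid time is found there.
sub extract_hhmm {
    my ($value) = @_;
    return $value =~ /T((?:[01]\d|2[0-3]):[0-5]\d):/ ? $1 : undef;
}

# One well-formed value, one without the "T", one with an invalid hour.
for my $value ('2018-06-09T17:00:00', '2018-06-09 17:00:00', '2018-06-09T25:00:00') {
    my $hhmm = extract_hhmm($value);
    print defined $hhmm ? "$value -> $hhmm\n" : "$value -> malformed\n";
}
```

Only the first value yields a time (17:00). The other two are reported as malformed instead of silently producing a wrong substring, which is the advantage over a fixed-offset substr.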
