How to parse a URI like this in Java

Question

I'm trying to parse the following URI: http://translate.google.com/#zh-CN|en|你

But I get the following error message:

java.net.URISyntaxException: Illegal character in fragment at index 34: http://translate.google.com/#zh-CN|en|你
        at java.net.URI$Parser.fail(URI.java:2809)
        at java.net.URI$Parser.checkChars(URI.java:2982)
        at java.net.URI$Parser.parse(URI.java:3028)

It has a problem with the "|" character. If I remove the "|", the final Chinese character does not cause any problem. What is the right way to handle this?

My method looks like this:

  public static void displayFileOrUrlInBrowser(String File_Or_Url)
  {
    try { Desktop.getDesktop().browse(new URI(File_Or_Url.replace(" ","%20").replace("^","%5E"))); }
    catch (Exception e) { e.printStackTrace(); }
  }

Thanks for the answers, but BalusC's solution seems to work only for that one URL. My method needs to work with any URL I pass to it. How would it know where to cut the URL into two parts so that only the second part is encoded?

Recommended answer

The pipe character is "considered unsafe" for use in URLs. You can fix it by replacing each | with its percent-encoded hex equivalent, "%7C".
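As a minimal sketch of that replacement: the class name `PipeEscapeDemo` and the helper `escapePipes` below are hypothetical names chosen for illustration, not part of the original question or answer. The example only shows that `java.net.URI` accepts the string once every pipe has been percent-encoded.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PipeEscapeDemo {

    // Hypothetical helper: percent-encodes only the pipe character.
    // Every other character is passed through untouched.
    static String escapePipes(String url) {
        return url.replace("|", "%7C");
    }

    public static void main(String[] args) throws URISyntaxException {
        // "\u4F60" is the Chinese character from the question's URL.
        String raw = "http://translate.google.com/#zh-CN|en|\u4F60";

        // new URI(raw) would throw URISyntaxException because of the '|'.
        URI uri = new URI(escapePipes(raw));

        System.out.println(uri.getRawFragment()); // fragment as stored, with %7C
        System.out.println(uri.getFragment());    // fragment decoded back to '|'
    }
}
```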

However, replacing individual characters in a URL is a brittle solution, because any given URL could contain quite a number of different characters that need to be replaced. You are already replacing spaces, carets, and pipes, but what about brackets, accent marks, and quotation marks? Or question marks and ampersands, which may or may not be valid parts of a URL depending on how they are used?

Thus, a better solution is to use the language's own facility for encoding URLs rather than doing it by hand. In Java, use URLEncoder, as in the example in BalusC's answer to this question.
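BalusC's answer is not reproduced on this page, so the following is only a sketch of one way to apply `URLEncoder` to an arbitrary URL, under the assumption that the problematic characters sit in the fragment. It cuts the string at the first `#` and encodes only what follows. The class name `FragmentEncoder` and the method `toUri` are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

class FragmentEncoder {

    // Hypothetical helper: splits the URL at the first '#' and runs only the
    // fragment through URLEncoder, leaving scheme, host, and path as they are.
    static URI toUri(String url) {
        int hash = url.indexOf('#');
        if (hash < 0) {
            return URI.create(url); // no fragment, so parse the string unchanged
        }
        String base = url.substring(0, hash + 1);  // up to and including '#'
        String fragment = url.substring(hash + 1); // everything after '#'
        try {
            return URI.create(base + URLEncoder.encode(fragment, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u4F60" is the Chinese character from the question's URL.
        System.out.println(toUri("http://translate.google.com/#zh-CN|en|\u4F60"));
    }
}
```

Two caveats apply. `URLEncoder` produces form encoding, so a space in the fragment becomes `+` rather than `%20`, and a fragment that is already percent-encoded would be encoded a second time. The sketch also leaves the path and query alone, so illegal characters there would still make `URI.create` throw.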
