ESAPI XSS prevention for user supplied url property

Views: 198 Published: 2017/8/16 21:35:27 Tags: java encoding xss owasp esapi


Problem description

One of my REST APIs expects a property "url" that takes a URL as input from the user. I am using ESAPI to prevent XSS attacks. The problem is that the user-supplied URL is something like

http://example.com/alpha?abc=def&phil=key%3dbdj

The canonicalize method of the ESAPI encoder throws an intrusion exception here, claiming that the input has mixed encoding: it is URL-encoded, and the piece '&phi' is treated as HTML-encoded, hence the exception.

I had a similar problem when sanitizing one of my application URLs, where the second query parameter started with 'pa' or 'pi' and was converted to delta or pi characters by HTML decoding. Please refer to my previous Stack Overflow question.

Since the entire URL comes as input from the user, I cannot simply parse out the query parameters and sanitize them individually: malicious input can be constructed by combining two query parameters, and sanitizing them individually won't work in that case.

Example: &ltscr comes as the last part of the first query param value, and ipt&gtalert(0); or something similar comes as the first part of the next query param, in the control context.

Has anyone faced a similar problem? I would really like to know what solutions you guys implemented. Thanks for any pointers.

EDIT: The answer below from 'avgvstvs' does not throw the intrusion exception (Thanks!!). However, the canonicalize method now changes the original input string. ESAPI treats the &phi of the query param as an HTML-encoded char and replaces it with a '?' char. This is similar to my previous question. The difference is that the earlier case was a URL of my own application, whereas this is user input. Is my only option to maintain a whitelist here?

Solution

The problem you're facing here is that there are different rules for encoding different parts of a URL. From memory, there are 4 sections in a URL that have different encoding rules. First, understand why, in Java, you need to build URLs using the UriBuilder class. The URL specification will help with the nitty-gritty details.

"Now since the problem is that since the entire URL is coming as input from the user, I cannot simply parse out the Query parameters and sanitize them individually, since malicious input can be created combining the two query parameters and sanitizing them individually won't work in that case."

The only real option here is java.net.URI.

Try this:
URI dirtyURI = new URI("http://example.com/alpha?abc=def&phil=key%3dbdj");

String cleanURIStr = enc.canonicalize( dirtyURI.getPath() );
The call to URI.getPath() should give you the decoded (non-percent-encoded) path component of the URL, and if enc.canonicalize() detects double-encoding after that stage then you really DO have a double-encoded string and should inform the caller that you will only accept single-encoded URL strings. java.net.URI is smart enough to apply the right decoding rules to each part of the URL string.
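
As a minimal sketch of that decoding step, using only java.net.URI from the JDK, the decoded and raw forms of the path can be compared directly. The class name UriPathDemo and the sample path with an encoded space are illustrative assumptions, not part of the original answer.

```java
import java.net.URI;

// Hypothetical demo class; the name and the sample input are assumptions for illustration.
class UriPathDemo {

    // Returns the path component with percent-escapes decoded.
    static String decodedPath(String url) {
        return URI.create(url).getPath();
    }

    // Returns the path component exactly as it appeared in the input string.
    static String rawPath(String url) {
        return URI.create(url).getRawPath();
    }

    public static void main(String[] args) {
        String url = "http://example.com/alpha%20beta?abc=def&phil=key%3dbdj";
        System.out.println(decodedPath(url)); // path only, escapes decoded
        System.out.println(rawPath(url));     // path only, escapes preserved
    }
}
```

Note that getPath() covers only the path component; the query string has to be fetched separately.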

If it's still giving you some trouble, the API reference has other methods that will extract other parts of the URL, in the event that you need to do different things with different parts of the URL. If you ever need to manually parse parameters on a GET request, for example, you can just have it return the query string itself, and it will already have done a decoding pass on it.
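
To illustrate the query accessors, here is a small sketch that relies only on java.net.URI. The class name UriQueryDemo is a hypothetical name chosen for this example.

```java
import java.net.URI;

// Hypothetical demo class; the name is an assumption for illustration.
class UriQueryDemo {

    // Query string as written in the input, percent-escapes preserved.
    static String rawQuery(String url) {
        return URI.create(url).getRawQuery();
    }

    // Query string after java.net.URI has decoded the percent-escapes.
    static String decodedQuery(String url) {
        return URI.create(url).getQuery();
    }

    public static void main(String[] args) {
        String url = "http://example.com/alpha?abc=def&phil=key%3dbdj";
        System.out.println(rawQuery(url));     // abc=def&phil=key%3dbdj
        System.out.println(decodedQuery(url)); // abc=def&phil=key=bdj
    }
}
```

One design point worth keeping in mind: once the query is decoded, an escaped "=" or "&" can no longer be told apart from a real delimiter, so splitting into parameters is usually safer on the raw form, decoding each piece afterwards.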

=============JUNIT Test Case============
package org.owasp.esapi;

import java.net.URI;
import java.net.URISyntaxException;

import org.junit.Test;

public class TestURLValidation {

    @Test
    public void test() throws URISyntaxException {
        Encoder enc = ESAPI.encoder();
        String input = "http://example.com/alpha?abc=def&phil=key%3dbdj";
        URI dirtyURI = new URI(input);
        enc.canonicalize(dirtyURI.getQuery());

    }

}
=================Answer for updated question=====================

There's no way around it: Encoder.canonicalize() is intended to reduce escaped character sequences into their reduced, native-to-Java form. URLs are most likely considered a special case so they were most likely deliberately excluded from consideration. Here's the way I would handle your case--without a whitelist, and it will guarantee that you are protected by Encoder.canonicalize().

Use the code above to get a URI representation of your input.

Step 1: Canonicalize all of the URI parts except URI.getQuery().

Step 2: Use a library parser to parse the query string into a data structure. I would use httpclient-4.3.3.jar and httpcore-4.3.3.jar from Apache HttpComponents. You'll then do something like this:
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Iterator;
import java.util.List;

import javax.ws.rs.core.UriBuilder;

import org.apache.http.client.utils.URLEncodedUtils;
import org.junit.Test;
import org.owasp.esapi.ESAPI;
import org.owasp.esapi.Encoder;

public class TestURLValidation
{

  @Test
  public void test() throws URISyntaxException {
    Encoder enc = ESAPI.encoder();
    String input = "http://example.com/alpha?abc=def&phil=key%3dbdj";
    URI dirtyURI = new URI(input);
    UriBuilder uriData = UriBuilder.fromUri(enc.canonicalize(dirtyURI.getScheme()));
    uriData.path(enc.canonicalize(enc.canonicalize(dirtyURI.getAuthority() + dirtyURI.getPath())));
    println(uriData.build().toString());
    List<org.apache.http.NameValuePair> params = URLEncodedUtils.parse(dirtyURI, "UTF-8");
    Iterator<org.apache.http.NameValuePair> it = params.iterator();
    while(it.hasNext()) {
      org.apache.http.NameValuePair nValuePair = it.next();
      uriData.queryParam(enc.canonicalize(nValuePair.getName()), enc.canonicalize(nValuePair.getValue()));
    }
    String canonicalizedUrl = uriData.build().toString();
    println(canonicalizedUrl);
  }

  public static void println(String s) {
    System.out.println(s);
  }

}
What we're really doing here is using standard libraries to parse the input URL (thus taking all the burden off of us) and then canonicalizing the parts after we've parsed each section.
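
The same parse-first idea can be sketched with only the JDK, for cases where HttpClient is not on the classpath. This is an assumption-laden sketch, not the answer's method: QueryParamParser and parse are hypothetical names, and the ESAPI call is left out so that the block runs without ESAPI configuration. In real use, Encoder.canonicalize() would be applied to each decoded name and value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; class and method names are assumptions for illustration.
class QueryParamParser {

    // Splits the RAW query on '&' and on the first '=' of each part,
    // then percent-decodes every name and value separately.
    static List<String[]> parse(String url) {
        List<String[]> pairs = new ArrayList<>();
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return pairs; // no query component present
        }
        for (String part : rawQuery.split("&")) {
            if (part.isEmpty()) {
                continue; // skip empty segments such as "a=1&&b=2"
            }
            int eq = part.indexOf('=');
            String name = eq < 0 ? part : part.substring(0, eq);
            String value = eq < 0 ? "" : part.substring(eq + 1);
            pairs.add(new String[] { decode(name), decode(value) });
        }
        return pairs;
    }

    // URLDecoder follows form-encoding rules, so '+' is also turned into a space.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        for (String[] p : parse("http://example.com/alpha?abc=def&phil=key%3dbdj")) {
            System.out.println(p[0] + " -> " + p[1]);
        }
    }
}
```

Because the split happens before decoding, the escaped "=" in the second value stays inside that value instead of being mistaken for a delimiter.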

Please note that the code I've listed won't work for all URL types: there are more parts to a URL than scheme/authority/path/query. (Missing is the possibility of userInfo or port; if you need those, modify this code accordingly.)
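
For reference, the remaining components can be read from java.net.URI with its standard getters. The sketch below is only an illustration; the class name UriPartsDemo, the describe method, and the sample URL with a user, port and fragment are assumptions, not part of the original answer.

```java
import java.net.URI;

// Hypothetical demo class; names and sample input are assumptions for illustration.
class UriPartsDemo {

    // Lists the components that the scheme/authority/path/query code does not handle separately.
    static String describe(String url) {
        URI u = URI.create(url);
        return "scheme=" + u.getScheme()
                + " userInfo=" + u.getUserInfo()   // null when absent
                + " host=" + u.getHost()
                + " port=" + u.getPort()           // -1 when absent
                + " path=" + u.getPath()
                + " fragment=" + u.getFragment();  // null when absent
    }

    public static void main(String[] args) {
        System.out.println(describe("http://user@example.com:8080/alpha?abc=def#top"));
        System.out.println(describe("http://example.com/alpha?abc=def"));
    }
}
```

Each of these values could then be canonicalized on its own before the URL is rebuilt.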
