Escaping output safely for both HTML and input fields

Question

In my web app, users can enter text data. This data can be shown to other users, and the original author can also go back and edit it. I'm looking for the correct way to escape this data safely.

I only sanitize for SQL on the way in, so everything is stored exactly as entered. Let's say I have "déjà vu" in the database. Or, to be more extreme, a <script> tag. This may well be valid input, and not even maliciously intended.

I'm using htmlentities() on the way out to make sure everything is escaped. The problem is that HTML and input fields treat things differently. I want the output to be safe in HTML, but I also want the author, when editing the text, to see exactly what they typed in the input fields. I'm also using jQuery to fill form fields with the data dynamically.

If I do this:

 <p><?=htmlentities("déjà vu");?></p>
 <input type=text value="<?=htmlentities("déjà vu");?>">

The page source contains d&eacute;j&agrave; vu in both places (I had to put that in backticks, or you would see "déjà vu"!). The problem is that the output in the <p> is correct, but the input just shows the escaped text. If the user resubmits the form, the text is escaped a second time and their input is ruined.

I know I still have to sanitize text that goes into the field; otherwise someone can close the value quote and do bad things. The only solution I have found is this. Again, I'm using jQuery.

var temp = $("<div></div>").html("<?=htmlentities("déjà vu");?>");
$("input").val(temp.html());

This works: the div reads the escaped text as encoded characters, and jQuery then copies those characters into the input tag, properly preserved.

So my question: is this still safe, or is there a security hole somewhere? And more importantly, is this the only (or the correct) way to do this? Am I missing something about how HTML and character encoding work that would make this a trivial issue to solve?

This is actually wrong: I oversimplified my example to the point where it no longer shows the problem. The problem actually arises because I'm using jQuery's val() to insert the text into the field.

<input>
<script>$("input").val("<?=htmlentities("déjà vu");?>");</script>

The reason for this is that the form is dynamic: the user can add or remove fields at will, so they are generated after page load.

So it seems that jQuery is escaping the data that goes into the input, but that's not quite good enough: if I don't do anything myself, a user can still put in a </script> tag, killing my code and inserting malicious code. But there's another argument to be made here. Since only the original author can see the text in an input box anyway, should I even bother? Basically, the only person they could execute an XSS attack against is themselves.

Recommended answer

I'm sorry, but I cannot reproduce the behaviour you describe. I've always used htmlspecialchars() (which does essentially the same task as htmlentities()) and it has never led to any sort of double-encoding. The page source shows d&eacute;j&agrave; vu in both places (of course, that's the point!), but the rendered page shows the appropriate values, and that's what is sent back to the server.
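The round trip described above can be sketched in a few lines. This is only an illustration in JavaScript, not the PHP from the thread: escapeHtml and decodeEntities are hypothetical helpers that handle just the five HTML special characters (roughly what htmlspecialchars() covers with quotes enabled), and decodeEntities merely stands in for the single decoding pass a browser performs when it parses text or attribute values.

```javascript
// Hypothetical helper: escapes the five characters that are special in HTML.
// The ampersand goes first so that later replacements are not escaped again.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical stand-in for the browser's single decoding pass.
// It only understands the entities produced by escapeHtml above.
function decodeEntities(markup) {
  const table = { "&amp;": "&", "&lt;": "<", "&gt;": ">", "&quot;": '"', "&#39;": "'" };
  return markup.replace(/&(?:amp|lt|gt|quot|#39);/g, (entity) => table[entity]);
}

const stored = 'déjà vu <script> "quoted"';
const once = escapeHtml(stored);   // what belongs in the page source
const twice = escapeHtml(once);    // what a double escape would produce

// Escaped once: after decoding, the user sees exactly what was stored.
console.log(decodeEntities(once) === stored);
// Escaped twice: after decoding, the user sees the entity text instead.
console.log(decodeEntities(twice) === once);
```

If the stored value is escaped exactly once on output, the form field submits the original text again, so no double encoding should build up across edits.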

Can you post a full, self-contained code snippet that exhibits this behaviour?

Update: some test code:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><title></title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>

<?php

$default_value = 'déjà vu <script> ¿foo?';

if( !isset($_GET['foo']) ){
    $_GET['foo'] = $default_value;
}

?>

<form action="" method="get">
    <p><?php echo htmlentities($_GET['foo']); ?></p>
    <input type="text" name="foo" value="<?php echo htmlentities($_GET['foo']); ?>">
    <input type="submit" value="Submit">
</form>

</body>
</html>



Answer to updated question

The htmlentities() function, as its name suggests, is used when generating HTML output. That's why it's of little use in your second example: JavaScript is not HTML. It's a language of its own with its own syntax.
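A small sketch of what that means in practice (the variable names are made up for illustration): inside a JavaScript string literal an HTML entity is not decoded, it is just a run of ordinary characters, while JavaScript's own escape syntax uses backslashes.

```javascript
// An HTML entity inside a JavaScript string stays as literal text.
const htmlEscaped = "d&eacute;j&agrave; vu";

// JavaScript string syntax has its own escapes, written with backslashes.
const jsEscaped = "d\u00e9j\u00e0 vu";

console.log(htmlEscaped.includes("&eacute;")); // true: nothing decoded the entity
console.log(htmlEscaped.length);               // 21
console.log(jsEscaped.length);                 // 7
```

This is why a value passed to val() from a script shows up in the field with its entities still visible: nothing on that path parses the string as HTML.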

Now, the problem you want to fix is how to generate output that follows these two rules:


  1. It's a valid string in JavaScript.

  2. It can be safely embedded in an HTML document.

The closest PHP function for #1 that I'm aware of is json_encode(). Since JSON syntax is a subset of JavaScript, if you feed it a PHP string it will output a JavaScript string.

As for #2, once the browser enters a JavaScript block, it expects a </script> tag to leave it. The json_encode() function takes care of this: by default it escapes the forward slash, so the tag comes out as <\/script>.
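For comparison, here is a rough sketch of the same two rules when the page is generated from JavaScript rather than PHP. It is not what json_encode() does internally: JSON.stringify() does not escape the forward slash, so this sketch swaps in a different technique and escapes the angle brackets instead. toScriptLiteral is a hypothetical helper name.

```javascript
// Hypothetical helper: serialises a value as a JavaScript literal (rule 1)
// that cannot close an inline <script> block early (rule 2).
function toScriptLiteral(value) {
  return JSON.stringify(value)
    .replace(/</g, "\\u003c")  // "<" becomes a backslash-u escape
    .replace(/>/g, "\\u003e"); // ">" likewise
}

const userInput = "déjà vu </script> ¿foo?";
const literal = toScriptLiteral(userInput);

console.log(literal.includes("</script>"));     // false: the tag is neutralised
console.log(JSON.parse(literal) === userInput); // true: the value itself is unchanged
```

The resulting literal could then be written into the page in the same spot where the PHP example echoes the json_encode() result. In PHP itself, passing the JSON_HEX_TAG flag to json_encode() produces a similar effect for the angle brackets.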

My modified test code:

<?php

$default_value = 'déjà vu </script> ¿foo?';

if( !isset($_GET['foo']) ){
    $_GET['foo'] = $default_value;
}

?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><title></title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript"><!--
$(function(){
    $("input[type=text]").val(<?php echo json_encode(utf8_encode($_GET['foo'])); ?>);
});
//--></script>
</head>
<body>


<form action="" method="get">
    <p><?php echo htmlentities($_GET['foo']); ?></p>
    <input type="text" name="foo" value="(to be replaced)">
    <input type="submit" value="Submit">
</form>

</body>
</html>

Note: utf8_encode() converts from ISO-8859-1 to UTF-8, and it isn't required if your data is already in UTF-8 (recommended).
