When securing a comment form and related API endpoint, should input be sanitized, validated and encoded in browser, server or both?

Question

I am trying to secure, as best as possible, a comment form in a non-CMS environment with no user authentication.

The form should be secure against both browser and curl/postman type requests.

Environment

Backend - Node.js, MongoDB Atlas and Azure web app.
Frontend - jQuery.

Below is a detailed, but hopefully not too overwhelming, overview of my current working implementation.

Following that are my questions about the implementation.

Relevant libraries used

Helmet - helps secure Express apps by setting various HTTP headers, including Content Security Policy
reCaptcha v3 - protects against spam and other types of automated abuse
DOMPurify - an XSS sanitizer
validator.js - a library of string validators and sanitizers
he - an HTML entity encoder/decoder

The general flow of data is:

/*
on click event:  
- get sanitized data
- perform some validations
- html encode the values
- get recaptcha v3 token from google
- send all data, including token, to server
- send token to google to verify
- if the response 'score' is above 0.5, add the submission to the database  
- return the entry to the client and populate the DOM with the submission   
*/ 

POST request - browser

// test input:  
// <script>alert("hi!")</script><h1>hello there!</h1> <a href="">link</a>

// sanitize the input  
var sanitized_input_1_text = DOMPurify.sanitize($input_1.val().trim(), { SAFE_FOR_JQUERY: true });
var sanitized_input_2_text = DOMPurify.sanitize($input_2.val().trim(), { SAFE_FOR_JQUERY: true });

// validation - make sure input is between 1 and 140 characters
var input_1_text_valid_length = validator.isLength(sanitized_input_1_text, { min: 1, max: 140 });
var input_2_text_valid_length = validator.isLength(sanitized_input_2_text, { min: 1, max: 140 });

// if validations pass
if (input_1_text_valid_length === true && input_2_text_valid_length === true) {

/* 
encode the sanitized input 
not sure if i should encode BEFORE adding to MongoDB  
or just add to database "as is" and encode BEFORE displaying in the DOM with $("#output").html(html_content);
*/  
var sanitized_encoded_input_1_text = he.encode(sanitized_input_1_text);
var sanitized_encoded_input_2_text = he.encode(sanitized_input_2_text);

// define parameters to send to database  
var parameters = {};
parameters.input_1_text = sanitized_encoded_input_1_text; 
parameters.input_2_text = sanitized_encoded_input_2_text; 

// get token from google and send token and input to database
// see:  https://developers.google.com/recaptcha/docs/v3#programmatically_invoke_the_challenge
grecaptcha.ready(function() {
    grecaptcha.execute('site-key-here', { action: 'submit' }).then(function(token) {
        parameters.token = token;
        jquery_ajax_call_to_my_api(parameters);
    });
});
}

POST request - server

var secret_key = process.env.RECAPTCHA_SECRET_SITE_KEY;
var token = req.body.token;
var url = `https://www.google.com/recaptcha/api/siteverify?secret=${secret_key}&response=${token}`;

// verify recaptcha token with google
var response = await fetch(url);
var response_json = await response.json();
var score = response_json.score;
var document = {};

/*
if google's response 'score' is greater than 0.5, 
add submission to the database and populate client DOM with $("#output").prepend(html); 
see: https://developers.google.com/recaptcha/docs/v3#interpreting_the_score
*/
if (score >= 0.5) {

    // add submission to database 
    // return submission to client to update the DOM
    // DOM will just display this text:  <h1>hello there!</h1> <a href="">link</a>
}

GET request on page load

Logic/assumptions:

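  • Get all submissions, return them to the client and add them to the DOM with $("#output").html(html_content);
  • No need to encode the values before populating the DOM, because the values are already encoded in the database?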

POST requests from curl, postman etc

Logic/assumptions:

  • They won't have a Google token, so it can't be verified from the server, and no entry can be added to the database?

Helmet configuration on server

app.use(
    helmet({
        contentSecurityPolicy: {
            directives: {
                defaultSrc: ["'self'"],
                scriptSrc: ["'self'", "https://somedomain.io", "https://maps.googleapis.com", "https://www.google.com", "https://www.gstatic.com"],
                styleSrc: ["'self'", "fonts.googleapis.com", "'unsafe-inline'"],
                fontSrc: ["'self'", "fonts.gstatic.com"],
                imgSrc: ["'self'", "https://maps.gstatic.com", "https://maps.googleapis.com", "data:"],
                frameSrc: ["'self'", "https://www.google.com"]
            }
        },
    })
);

Questions

  1. Should I add values to the MongoDB database as HTML encoded entities OR store them "as is" and just encode them before populating the DOM with them?

  2. If the values were to be saved as HTML entities in MongoDB, would this make searching the database for content difficult? Searching for, for example, <h1>hello there!</h1> <a href="">link</a> wouldn't return any results because the value in the database was &#x3C;h1&#x3E;hello there!&#x3C;/h1&#x3E; &#x3C;a href=&#x22;&#x22;&#x3E;link&#x3C;/a&#x3E;.

  3. In my reading about securing web forms, much has been said about client-side practices being fairly redundant, as anything can be changed in the DOM, JavaScript can be disabled, and requests can be made directly to the API endpoint using curl or postman, bypassing any client-side approach.

With that said, should sanitization (DOMPurify), validation (validator.js) and encoding (he) be performed: 1) client side only, 2) client side and server side, or 3) server side only?

For thoroughness, here is another related question:

Do any of the following components do any automatic escaping or HTML encoding when sending data from client to server? I ask because if they do, it may make some manual escaping or encoding unnecessary.

  • jQuery ajax() requests
  • Node.js
  • Express
  • Helmet
  • bodyParser (node package)
  • MongoDB native driver
  • MongoDB

Recommended answer

After reading more around the topic, this is the approach I came up with:

On click event:

  • Sanitize data (DOMPurify)
  • Validate data (validator.js)
  • Get recaptcha v3 token from google (reCaptcha v3)
  • Send all data, including token, to server
  • Server is using Helmet
  • Server is using Express Rate Limit and Rate Limit Mongo to limit POST requests on a certain route to X per X milliseconds (by IP address)
  • Server is behind Cloudflare proxy which provides some security and caching features (requires setting app.set('trust proxy', true) in node server file in order for rate limiter to pick up the user's actual IP address - see Express behind proxies)
  • Send token to google from server to verify (reCaptcha v3)
  • If the response 'score' is above 0.5, perform the same sanitization and validations again
  • If the validations pass, add entry to database with a moderated flag value of false
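
The rate-limiting step above relies on Express Rate Limit with a Mongo store. To show the idea behind "X requests per X milliseconds by IP address" without reproducing that library's API, here is a minimal in-memory fixed-window sketch. The names `createRateLimiter`, `windowMs` and `max` are hypothetical and chosen only for illustration; a real deployment would more likely use the libraries named above, with a shared store.

```javascript
// Hypothetical in-memory fixed-window limiter (illustration only,
// not the express-rate-limit API).
function createRateLimiter({ windowMs, max }) {
  const hits = new Map(); // ip -> { count, windowStart }

  // Returns true if the request from `ip` at time `now` is allowed.
  return function isAllowed(ip, now = Date.now()) {
    const entry = hits.get(ip);

    // First request from this IP, or the previous window has expired:
    // start a new window.
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(ip, { count: 1, windowStart: now });
      return true;
    }

    // Same window: count the request and compare against the limit.
    entry.count += 1;
    return entry.count <= max;
  };
}

// Example: allow at most 2 POSTs per minute per IP address.
const isAllowed = createRateLimiter({ windowMs: 60 * 1000, max: 2 });
```

Behind a proxy such as Cloudflare, the `ip` passed in has to be the client's address rather than the proxy's, which is why the 'trust proxy' setting matters.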

Rather than immediately return entries to the browser, I decided instead to require a process of manual moderation which involves changing the moderated value of an entry to true. Whilst it takes away the immediacy of the response for the user, it makes it less tempting for spammers etc if responses aren't immediately published.

  • The GET request on page load then returns all entries that are moderated: true
  • HTML encode the values before displaying them (he)
  • Populate the DOM with the HTML encoded entries

The code looks like this:

POST request - browser

// sanitize the input  
var sanitized_input_1_text = DOMPurify.sanitize($input_1.val().trim(), { SAFE_FOR_JQUERY: true });
var sanitized_input_2_text = DOMPurify.sanitize($input_2.val().trim(), { SAFE_FOR_JQUERY: true });

// validation - make sure input is between 1 and 140 characters
var input_1_text_valid_length = validator.isLength(sanitized_input_1_text, { min: 1, max: 140 });
var input_2_text_valid_length = validator.isLength(sanitized_input_2_text, { min: 1, max: 140 });

// validation - regex to only allow certain characters
// for pattern, see:  https://stackoverflow.com/q/63895992
var pattern = /^(?!.*([ ,'-])\1)[a-zA-Z]+(?:[ ,'-]+[a-zA-Z]+)*$/;
var input_1_text_valid_characters = validator.matches(sanitized_input_1_text, pattern, "gm");
var input_2_text_valid_characters = validator.matches(sanitized_input_2_text, pattern, "gm");

// if validations pass
if (input_1_text_valid_length === true && input_2_text_valid_length === true && input_1_text_valid_characters === true && input_2_text_valid_characters === true) {

// define parameters to send to database  
var parameters = {};
parameters.input_1_text = sanitized_input_1_text; 
parameters.input_2_text = sanitized_input_2_text; 

// get token from google and send token and input to database
// see:  https://developers.google.com/recaptcha/docs/v3#programmatically_invoke_the_challenge
grecaptcha.ready(function() {
    grecaptcha.execute('site-key-here', { action: 'submit_entry' }).then(function(token) {
        parameters.token = token;
        jquery_ajax_call_to_my_api(parameters);
    });
});
}

POST request - server

var secret_key = process.env.RECAPTCHA_SECRET_SITE_KEY;
var token = req.body.token;
var url = `https://www.google.com/recaptcha/api/siteverify?secret=${secret_key}&response=${token}`;

// verify recaptcha token with google
var response = await fetch(url);
var response_json = await response.json();
var score = response_json.score;
var document = {};

// if google's response 'score' is greater than 0.5, 
// see: https://developers.google.com/recaptcha/docs/v3#interpreting_the_score  

if (score >= 0.5) {

// perform all the same sanitizations and validations to protect against
// POST requests direct to the API via curl or postman etc  
// if validations pass, add entry to the database with `moderated: false` property   


}
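
The server-side checks left as comments above could look roughly like the sketch below. It is an assumption-laden illustration, not the exact code used: it replaces validator.js with plain string and regex checks so it runs on its own, the helper names (`isValidText`, `isAcceptableRecaptcha`, `buildDocument`) are hypothetical, and it assumes the reCAPTCHA verification response has already been parsed into an object with `success`, `score` and `action` fields.

```javascript
// Same character pattern as the browser-side validation.
const TEXT_PATTERN = /^(?!.*([ ,'-])\1)[a-zA-Z]+(?:[ ,'-]+[a-zA-Z]+)*$/;

// Reject anything that is not a plain string of 1 to 140 allowed characters.
// The typeof check also rejects objects such as { $gt: "" } sent by a
// hand-crafted request.
function isValidText(value) {
  if (typeof value !== "string") return false;
  const text = value.trim();
  return text.length >= 1 && text.length <= 140 && TEXT_PATTERN.test(text);
}

// Accept only a successful verification for the expected action
// with a high enough score.
function isAcceptableRecaptcha(result, expectedAction, minScore = 0.5) {
  return Boolean(result) &&
    result.success === true &&
    result.action === expectedAction &&
    typeof result.score === "number" &&
    result.score >= minScore;
}

// Returns the document to insert, or null if the request should be rejected.
function buildDocument(body, recaptchaResult) {
  if (!isAcceptableRecaptcha(recaptchaResult, "submit_entry")) return null;
  if (!isValidText(body.input_1_text) || !isValidText(body.input_2_text)) return null;
  return {
    input_1_text: body.input_1_text.trim(),
    input_2_text: body.input_2_text.trim(),
    moderated: false, // entries stay hidden until manually approved
  };
}
```

A request made with curl or postman that lacks a valid token fails the first check, and one that carries markup or an over-long value fails the second, so nothing is written to the database in either case.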

GET request - browser

Logic:

  • Get all entries that have the moderated: true property
  • HTML encode the values before populating the DOM
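
As a rough illustration of encoding at output time, the sketch below escapes each stored value just before it is placed into markup. The answer uses the he library for this step; the small `escapeHtml` function here is only a stand-in so the example runs without dependencies, and `renderEntry` is a hypothetical helper.

```javascript
// Minimal stand-in for an HTML encoder: escapes the five characters
// that are significant in HTML text and attribute values.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Build the markup for one moderated entry; only the escaped values
// are ever concatenated into HTML.
function renderEntry(entry) {
  return "<li>" + escapeHtml(entry.input_1_text) + ": " +
    escapeHtml(entry.input_2_text) + "</li>";
}
```

The resulting string can then be passed to something like $("#output").html(...), since any markup in the stored values has been turned into inert text.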

Helmet configuration on server

app.use(
    helmet({
        contentSecurityPolicy: {
            directives: {
                defaultSrc: ["'self'"],
                scriptSrc: ["'self'", "https://maps.googleapis.com", "https://www.google.com", "https://www.gstatic.com"],
                connectSrc: ["'self'", "https://some-domain.com", "https://some.other.domain.com"],
                styleSrc: ["'self'", "fonts.googleapis.com", "'unsafe-inline'"],
                fontSrc: ["'self'", "fonts.gstatic.com"],
                imgSrc: ["'self'", "https://maps.gstatic.com", "https://maps.googleapis.com", "data:", "https://another-domain.com"],
                frameSrc: ["'self'", "https://www.google.com"]
            }
        },
    })
);

In answer to my questions in the OP:

  1. Should I add values to the MongoDB database as HTML encoded entities OR store them "as is" and just encode them before populating the DOM with them?

As long as the input is sanitised and validated on both client and server, you should only need to HTML encode just before populating the DOM.

  2. If the values were to be saved as HTML entities in MongoDB, would this make searching the database for content difficult? Searching for, for example, <h1>hello there!</h1> <a href="">link</a> wouldn't return any results because the value in the database was &#x3C;h1&#x3E;hello there!&#x3C;/h1&#x3E; &#x3C;a href=&#x22;&#x22;&#x3E;link&#x3C;/a&#x3E;.

I figured it would make database entries look messy if they were filled with HTML encoded values, so I store the sanitized, validated entries "as is".
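
The search concern raised in the question can be shown with a small sketch. It uses in-memory arrays in place of a MongoDB collection and a hypothetical `toHexEntities` helper that mimics the hexadecimal entity style shown in the example value, so it illustrates the mismatch rather than actual database behaviour.

```javascript
// Encode HTML-significant characters as hexadecimal entities,
// e.g. "<" becomes "&#x3C;" (a stand-in for an entity encoder).
function toHexEntities(text) {
  return text.replace(/[&<>"']/g, (ch) =>
    "&#x" + ch.charCodeAt(0).toString(16).toUpperCase() + ";");
}

// Naive substring search over stored values.
function search(entries, term) {
  return entries.filter((entry) => entry.includes(term));
}

const raw = '<h1>hello there!</h1> <a href="">link</a>';
const storedAsIs = [raw];                    // values stored unencoded
const storedEncoded = [toHexEntities(raw)];  // values stored as entities

const foundAsIs = search(storedAsIs, "<h1>hello");       // one match
const foundEncoded = search(storedEncoded, "<h1>hello"); // no match
```

With encoded storage, every search term would also have to be encoded the same way before querying, which is one more reason to store values "as is" and encode only on output.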

  3. In my reading about securing web forms, much has been said about client-side practices being fairly redundant, as anything can be changed in the DOM, JavaScript can be disabled, and requests can be made directly to the API endpoint using curl or postman, bypassing any client-side approach.

With that said should sanitization (DOMPurify), validation (validator.js) and encoding (he) be performed either: 1) client side only 2) client side and server side or 3) server side only?

Option 2, sanitize and validate input on client and server.
