What characters are valid in a URL?


Question

I'm trying to remove the non-URL parts of a large string. Most of the regexes I found look like [A-Za-z0-9-_.!~*'()], but a URL can contain more than that, for example http://127.0.0.1:8080/test?v=123#this

So what is the current set of characters that are valid in a URL?

Recommended answer

All the gory details can be found in the current RFC on the topic: RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax).

Based on this related answer, the list you are looking for is: A-Z, a-z, 0-9, -, ., _, ~, :, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, %, and =. Everything else must be URL-encoded. Also, some of these characters may only appear in specific places in a URI and must be URL-encoded anywhere else (for example, % may only be used as part of a percent-encoded sequence such as %20). The RFC has all of these specifics.
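As a rough illustration of the list above, here is a minimal Python sketch that turns those characters into a regex character class and uses it to pull URL-like runs out of a larger string. The names URI_CHARS, URL_CANDIDATE, find_url_candidates and encode_path are made up for this example, and restricting matches to the http and https schemes is an assumption, not something the RFC requires. It only checks which characters appear, not whether each one is in a position where the RFC allows it.

```python
import re
from urllib.parse import quote

# Characters from the list above: unreserved, reserved, and "%".
# "-", "[" and "]" are escaped so they are literal inside the class.
URI_CHARS = r"A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%"

# Assumption for this sketch: only http:// and https:// URLs are wanted.
URL_CANDIDATE = re.compile(r"https?://[" + URI_CHARS + r"]+")


def find_url_candidates(text):
    """Return every run of allowed characters that starts with http(s)://."""
    return URL_CANDIDATE.findall(text)


def encode_path(path):
    """Percent-encode characters that are not allowed as-is in a URL path.

    urllib.parse.quote leaves letters, digits, "_", ".", "-", "~" and "/"
    untouched and encodes everything else as UTF-8 percent escapes.
    """
    return quote(path)


if __name__ == "__main__":
    sample = "see http://127.0.0.1:8080/test?v=123#this for details"
    print(find_url_candidates(sample))
    print(encode_path("/my files/ü.txt"))
```

Because characters such as . and , and ) are valid in a URL, a match can swallow trailing sentence punctuation ("... at http://example.com." keeps the final dot), so real text usually needs an extra trimming step after matching.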
