Remove Characters from URL with htaccess


Problem description

Hopefully someone can see what I'm doing wrong, but here's the story...

My current site's URLs are auto-generated by the ecommerce software from the product and category names, so if a product or category name includes a non-alphanumeric character, it is encoded in the URL, which is a pain. For example:

mysite.com/Shop/Furniture-Set-Large-Table%2C-4-Chairs.html

I am moving to a new ecommerce solution, which also auto-generates the URLs from the product name but is clever enough to remove all non-alphanumeric characters. It also converts the URL to lowercase, and I have managed to find an htaccess solution for redirecting uppercase to lowercase. It also does not have the 'Shop' part of the URL, which I have also managed to solve via htaccess. For example:

mysite.com/furniture-set-large-table-4-chairs.html

To remove the 'Shop' part:

RedirectMatch 301 ^/Shop/(.*)$ http://www.mysite.com/$1

To replace uppercase with lowercase to prevent a 404 error:

RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]

Both of these work perfectly.

So I need an htaccess rule, or possibly several, to remove these encoded characters from the URL. I don't need to replace them, just remove them, because the software creates the URL as "Table%2C-4-Chairs", so only the %2C needs to be removed.

I need to remove certain character encodings from the URL, such as:

comma (%2C), apostrophe (%27), colon (%3A), etc.

Can anyone advise a suitable htaccess rule or rules for this?

Thanks in advance.

Recommended answer

The URI is URL-decoded before it is sent through the rewrite engine, so you want to match the actual characters and not their encoded counterparts:

RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]
RewriteRule ^(.*)\'(.*)$ /$1$2 [L]
RewriteRule ^(.*)\"(.*)$ /$1$2 [L]
# etc...

RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

The REDIRECT_STATUS check tells mod_rewrite whether any of the rules above were applied (an internal rewrite sets the internal redirect status to 200). If so, we need to issue the external redirect, but processing won't reach that rule until the URI has passed all of the special-character checks.
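As a possible variation on the per-character rules above, several characters can be handled by one rule with a character class. This is only a sketch: the class lists just the comma and colon as examples, and it relies on the same looping behaviour described in this answer, removing at least one character per pass.

```apache
# Sketch only: one rule for several characters instead of one rule per character.
# The class lists comma and colon as examples; extend it to suit your URLs.
RewriteRule ^(.*)[,:]+(.*)$ /$1$2 [L]

# Unchanged from the rules above: redirect once an internal rewrite has happened.
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]
```

Each pass is an internal redirect, so a URL with many such characters may run into Apache's internal recursion limit (LimitInternalRecursion, 10 by default).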

You'd want all of these rules before any of the redirects, so that the rules can loop and remove multiple instances of any of those characters. Then, once there are no more special characters, the rewrite engine can trickle down to where your redirects are.

I'd suggest that you remove the mod_alias RedirectMatch directive and replace it with a rewrite rule. Combining the two modules and having both of them affect a single URI can sometimes lead to unexpected results. So before all of the above rules, you'd have:

RewriteRule ^Shop/(.*)$ /$1 [L]

This adds the removal of /Shop/ to the chain of special-character rules. Your last rule would then follow:

RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]
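Put together in the order described, the whole file might look like the sketch below. It assumes the .htaccess file sits in the document root, that `RewriteEngine On` is not already set elsewhere, and that the `lc` map used by the last rule is declared in the server or virtual host configuration (RewriteMap is not permitted in .htaccess).

```apache
RewriteEngine On

# 1. Drop the leading Shop/ segment (internal rewrite, replaces RedirectMatch).
RewriteRule ^Shop/(.*)$ /$1 [L]

# 2. Strip special characters; the pattern is matched against the decoded URI.
RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]
RewriteRule ^(.*)\'(.*)$ /$1$2 [L]
RewriteRule ^(.*)\"(.*)$ /$1$2 [L]

# 3. If any rule above fired, send one external redirect to the cleaned URL.
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

# 4. Lowercase whatever is left. Assumes "RewriteMap lc int:tolower" exists
#    in the server or virtual host config.
RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]
```

With these rules, a request for /Shop/Furniture-Set-Large-Table%2C-4-Chairs.html should first be redirected to the URL without Shop/ and the comma, and then redirected again to the lowercase form.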
