Remove Characters from URL with htaccess

Question

Hopefully someone can see what I'm doing wrong, but here's the story...

My current site's URLs are auto-generated by the ecommerce software from the product and category names, so if a product or category name includes a non-alphanumeric character, it is encoded in the URL, which is a pain. E.g.:

mysite.com/Shop/Furniture-Set-Large-Table%2C-4-Chairs.html

I am moving to a new ecommerce solution, which also auto-generates the URLs from the product name, but is clever enough to remove all non-alphanumeric characters. It also converts to lowercase, and I have managed to find an htaccess solution for redirecting uppercase to lowercase. It also does not have the 'Shop' part of the URL, which I have also managed to solve via htaccess. E.g.:

mysite.com/furniture-set-large-table-4-chairs.html

To remove the 'Shop' part:

RedirectMatch 301 ^/Shop/(.*)$ http://www.mysite.com/$1

To replace uppercase with lowercase to prevent a 404 error:

RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]

Both of these work well.

So I need an htaccess rule, or possibly several, to remove these encoded characters from the URL. I don't need to replace them, just remove them, because the software creates the URL as "Table%2C-4-Chairs", so only the %2C needs to be removed.

I need to remove certain character encodings from the URL, such as:

comma (%2C), apostrophe (%27), colon (%3A), etc.

Can anyone advise a suitable htaccess rule or rules for this?

Thanks in advance.

Recommended answer

The URI is URL-decoded before it's sent through the rewrite engine, so you want to match the actual characters and not their encoded counterparts:

RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]
RewriteRule ^(.*)\'(.*)$ /$1$2 [L]
RewriteRule ^(.*)\"(.*)$ /$1$2 [L]
# etc...

RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

The REDIRECT_STATUS check tells mod_rewrite that if any of the rules above was applied (which sets the internal redirect status to 200), the request needs an external redirect. That rule is not reached until the URI has passed all of the special-character checks.

You'd want all of these rules before any of the redirects so that the rules can loop and remove multiple instances of any of those characters. Then, once there are no more special characters, the rewrite engine can trickle down to where your redirects are.

I'd suggest that you remove the mod_alias RedirectMatch directive and replace it with a rewrite rule. Combining the two modules and having both of them affect a single URI can sometimes lead to unexpected results. So before all of the above rules, you'd have:

RewriteRule ^Shop/(.*)$ /$1 [L]

This adds the removal of /Shop/ to the chain of special-character rules. Your last rule would then follow:

RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]
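
Putting the pieces together in the order described above, the complete .htaccess might look roughly like the sketch below. It is untested and simply assembles the rules from this answer. The `RewriteEngine On` line and the comments are additions. It also assumes the `lc` map used by the lowercase rule is already declared elsewhere (for example `RewriteMap lc int:tolower` in the server or virtual host config, since RewriteMap cannot be declared in .htaccess).

```apache
RewriteEngine On

# 1. Strip the /Shop/ prefix internally (replaces the mod_alias RedirectMatch).
RewriteRule ^Shop/(.*)$ /$1 [L]

# 2. Remove one special character per pass; [L] ends the pass and the
#    rewritten URI is processed again, so repeated characters are removed too.
RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]
RewriteRule ^(.*)\'(.*)$ /$1$2 [L]
# etc...

# 3. If any rule above rewrote the URI, send a single external redirect.
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

# 4. Redirect any remaining uppercase URI to lowercase.
#    Assumes a RewriteMap named "lc" is defined outside .htaccess.
RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]
```

With this ordering, a URL that contains both special characters and uppercase letters would likely be redirected twice: once after the characters are stripped, and once more to lowercase.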
