UTF-8 encode URLs


Problem description


Info:

I've a program which generates XML sitemaps for Google Webmaster Tools (among other things).
GWTs is giving me errors for some sitemaps because the URLs contain character sequences like ã¾, ã‹, ã€, etc. **

GWTs says:

We require your Sitemap file to be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters: &, ', ", <, >.

The special characters are escaped in the XML files (with HTML entities).
XML file snippet:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://domain/folder/listing-&#227;&#129;.shtml</loc>
        ...


Are my URLs UTF-8 encoded?

If not, how do I do this in Java?
The following is the line in my program where I add the URL to the sitemap:

    siteMap.addUrl(StringEscapeUtils.escapeXml(countryName+"/"+twoCharFile.getRelativeFileName().toLowerCase()));


** = I'm not sure which ones are causing the error, probably the first two examples.

I apologize for all the editing.

Solution

Try using URLEncoder.encode(stringToBeEncoded, "UTF-8") to encode the URL.
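As a rough illustration of that suggestion, here is a minimal sketch using only the JDK. One caveat worth knowing: URLEncoder targets form data, so it turns a space into "+" and would also encode the "/" separators if given a whole path. The sketch therefore encodes each path segment on its own and swaps "+" for "%20". The class name SitemapUrlEncoder and the helpers encodeSegment and encodePath are hypothetical names made up for this example; they are not part of the asker's program or of any library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, for illustration only.
final class SitemapUrlEncoder {

    // Percent-encode one path segment as UTF-8.
    // URLEncoder emits "+" for a space, so replace it with "%20" for use in a path.
    // A literal "+" in the input has already become "%2B", so the replace is safe.
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Encode each segment separately so the "/" separators are kept as-is.
    static String encodePath(String path) {
        String[] segments = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            out.append(encodeSegment(segments[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\u00e3" is the character ã, written as an escape so the
        // source file encoding does not matter.
        System.out.println(encodePath("folder/listing-\u00e3.shtml"));
        // prints folder/listing-%C3%A3.shtml
    }
}
```

In the asker's line, this would presumably be applied to countryName + "/" + twoCharFile.getRelativeFileName().toLowerCase() before the result is passed to siteMap.addUrl. The encoded output contains only ASCII letters, digits, "%", "/", and a few punctuation characters, so characters such as & or < no longer appear in it literally.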
