Quotation marks turn to question marks
Question
I have a Ruby script that parses HTML pages and saves the extracted strings into a DB, but I'm getting weird characters (usually question marks) instead of plain text.
E.g.: ‘SOME TEXT’ instead of 'Some Text'
I've tried HTML entities and CGI::unescape, but to no avail. After some googling I set $KCODE = 'u' and added require 'jcode', but it's still not working.
Any suggestions or pointers would be great.
Thanks.
PS: using MySQL 5.1
Recommended answer
Your script is storing Unicode typographic (curly) quotation marks, encoded as multi-byte UTF-8 sequences, in the database instead of ASCII quotation marks.
That's actually good: it shows that the DB itself is working fine. For best results, though, you should ensure that the table uses a UTF-8 collation such as utf8_general_ci or utf8_unicode_ci so that string sorting works properly.
The fact that the output is displayed as "‘" just means that your terminal (and/or web page) output encoding is incorrect.
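The garbling can be reproduced in Ruby 1.9 or later, which tracks an encoding per string (the $KCODE approach from the question belongs to Ruby 1.8). What follows is a minimal sketch, assuming the display layer is decoding the UTF-8 bytes as Windows-1252; the helper names misread_as_cp1252 and repair_mojibake are made up for illustration and are not part of any library.

```ruby
# Hypothetical helpers for illustration; not part of the original answer.

# Reinterpret the UTF-8 bytes of +text+ as Windows-1252, the way a
# misconfigured terminal or browser would, and return the result as UTF-8.
def misread_as_cp1252(text)
  text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# Reverse the damage: map the garbled characters back to the original bytes
# and label those bytes as UTF-8 again.
def repair_mojibake(garbled)
  garbled.encode("Windows-1252").force_encoding("UTF-8")
end

curly   = "\u2018Some Text\u2019"      # text with typographic quotes
garbled = misread_as_cp1252(curly)     # what the wrong decoder displays
puts garbled
puts repair_mojibake(garbled) == curly
```

The round trip only illustrates the cause. The stored data is already correct, so the real fix is in the output encoding rather than in the strings. A few bytes have no Windows-1252 mapping, so the first helper may raise a conversion error for other characters.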
If it's terminal output, make sure that the LANG environment variable (ENV['LANG'] in Ruby) is set to an appropriate UTF-8 locale (e.g. en_US.UTF-8) and that the terminal emulator itself is set the same way.
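A quick way to check the terminal side from Ruby is sketched below. It assumes a POSIX-style environment in which LC_ALL overrides LC_CTYPE, which in turn overrides LANG; the helper name utf8_locale? is hypothetical.

```ruby
# Hypothetical helper for illustration; not part of the original answer.
# Returns true when the effective locale setting names a UTF-8 charset.
# Assumed POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
def utf8_locale?(env = ENV)
  value = %w[LC_ALL LC_CTYPE LANG]
            .map { |name| env[name] }
            .find { |v| v && !v.empty? }
  !value.nil? && value.match?(/utf-?8/i)
end

puts "locale is UTF-8: #{utf8_locale?}"
puts "Ruby external encoding: #{Encoding.default_external}"
```

If the first line reports false, exporting a UTF-8 locale in the shell before running the script is the usual remedy; the terminal emulator's own character-encoding setting still has to match.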
If it's HTML output, make sure that the page encoding is set to UTF-8 as well, i.e.:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
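When the Ruby script generates the page itself, the same declaration can be emitted from code. This is a minimal sketch rather than a full CGI setup; the helper name utf8_page is hypothetical, and the header string is what a CGI-style script would typically send ahead of the body.

```ruby
# Hypothetical helper (not from the original answer): wraps body text in a
# minimal HTML page that declares UTF-8 and returns it as a UTF-8 string.
def utf8_page(body)
  html = <<~HTML
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    </head>
    <body>#{body}</body>
    </html>
  HTML
  html.encode("UTF-8")
end

page = utf8_page("\u2018Some Text\u2019")
# A CGI-style script would send this header line before the page body.
header = "Content-Type: text/html; charset=UTF-8"
puts header
puts
puts page
```

Declaring the charset in both the HTTP header and the meta tag keeps the browser from guessing the encoding.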