Reading Chinese characters in a file and sending them to a browser


Question

I'm trying to make a program that:


  • reads a list of Chinese characters from a file, makes a dictionary from them (associating a sign with its meaning).
  • picks a random character and sends it to the browser using the BaseHTTPServer module when it gets a GET request.

Once I managed to read and store the signs properly (I tried writing them into another file to check that I got them right and it worked) I couldn't figure out how to send them to my browser.

I connect to 127.0.0.1:4321 and the best I've managed is to get a (supposedly) url-encoded Chinese character, with its translation.

Code:

# -*- coding: utf-8 -*-
import codecs
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn
import threading
import random
import urllib

source = codecs.open('./signs_db.txt', 'rb', encoding='utf-16')

# Checking utf-16 works fine with chinese characters and stuff :
#out = codecs.open('./test.txt', 'wb', encoding='utf-16')
#for line in source:
#   out.write(line)

db = {}
next(source)
for line in source:
    if not line.isspace():
            tmp = line.split('\t')
            db[tmp[0]] = tmp[1].strip()

class Handler(BaseHTTPRequestHandler):

    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        message =  threading.currentThread().getName()
        rKey = random.choice(db.keys())
        self.wfile.write(urllib.quote(rKey.encode("utf-8")) + ' : ' + db[rKey])
        self.wfile.write('\n')
        return

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle requests in a separate thread."""

if __name__ == '__main__':
    server = ThreadedHTTPServer(('localhost', 4321), Handler)
    print 'Starting server, use <Ctrl-C> to stop'
    server.serve_forever()

If I don't URL-encode the Chinese character, I get an error from Python:

self.wfile.write(rKey + ' : ' + db[rKey])

This gives me:


UnicodeEncodeError : 'ascii' codec can't encode character u'\u4e09' in position 0 : ordinal not in range(128)
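
The error above comes from an implicit conversion: in Python 2, writing a `unicode` object to a byte stream makes Python encode it with the default ASCII codec, which cannot represent U+4E09. The sketch below is written for Python 3 (an assumption; the code in the question is Python 2) and reproduces the same failure explicitly, then shows that encoding to UTF-8 succeeds. The variable names are illustrative only.

```python
sign = "\u4e09"  # the character 三 from the error message

# Encoding with the ASCII codec fails; this is what Python 2 attempts
# implicitly when a unicode object is written to a byte stream.
try:
    sign.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding with UTF-8 succeeds and produces three bytes for this character.
utf8_bytes = sign.encode("utf-8")

print(ascii_failed)  # True
print(utf8_bytes)    # b'\xe4\xb8\x89'
```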

I've also tried encoding/decoding with 'utf-16', and I still get the same kind of error message.

Here is my test file:

Sign    Translation

一   One
二   Two
三   Three
四   Four
五   Five
六   Six
七   Seven
八   Eight
九   Nine
十   Ten

So, my question is: "How can I get the Chinese characters coming from my script to display properly in my browser"?

Recommended answer

Declare the encoding of your page by writing a meta tag and make sure to encode the entire Unicode string in UTF-8:

self.wfile.write(u'''\
    <html>
    <head>
    <meta http-equiv="content-type" content="text/html;charset=UTF-8">
    </head>
    <body>
    {} : {}
    </body>
    </html>'''.format(rKey,db[rKey]).encode('utf8'))
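
The same idea can be sketched as a reusable function. This version is written for Python 3 (an assumption; the snippet above targets Python 2), and the names `PAGE_TEMPLATE` and `render_page` are hypothetical. It escapes both values, formats them into a page that declares UTF-8 in a meta tag, and returns bytes that could be passed to `wfile.write`.

```python
import html

PAGE_TEMPLATE = """\
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8">
</head>
<body>
{sign} : {meaning}
</body>
</html>"""


def render_page(sign, meaning):
    """Return the page for one sign as UTF-8 encoded bytes (hypothetical helper)."""
    text = PAGE_TEMPLATE.format(sign=html.escape(sign), meaning=html.escape(meaning))
    # Encode the whole page once, after formatting, so every character
    # goes through the same codec that the meta tag declares.
    return text.encode("utf-8")


page = render_page("\u4e09", "Three")
```

The result is a `bytes` object. Its length in bytes is larger than the number of characters in the page, because 三 takes three bytes in UTF-8.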

And/or declare the HTTP content type:

self.send_response(200)
self.send_header('Content-Type','text/html; charset=utf-8')
self.end_headers()
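
Putting the header and the encoded body together, a complete handler might look like the sketch below. It is written for Python 3 with `http.server` (an assumption; the question uses the Python 2 `BaseHTTPServer` module), the small `db` dictionary stands in for the one built from signs_db.txt, and `fetch_once` is a hypothetical helper that starts the server on a free port and requests one page to check the result.

```python
import http.client
import random
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Stand-in for the dictionary that the question builds from signs_db.txt.
db = {"\u4e00": "One", "\u4e8c": "Two", "\u4e09": "Three"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        sign = random.choice(list(db))
        page = "<html><body>{} : {}</body></html>".format(sign, db[sign])
        body = page.encode("utf-8")  # the socket needs bytes, not text
        self.send_response(200)
        # Tell the browser which encoding the bytes in the body use.
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start the server on a free port, fetch one page, and shut it down."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: OS picks one
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        content_type = response.getheader("Content-Type")
        text = response.read().decode("utf-8")
        conn.close()
        return content_type, text
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    content_type, text = fetch_once()
    print(content_type)
    print(text)
```

For a long-running server, the call to `fetch_once` would be replaced by `serve_forever()` on a fixed port such as the 4321 used in the question.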
