UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 34: ordinal not in range(128)

Problem description

I have been working on a program that retrieves questions from Stack Overflow. Until yesterday the program worked fine, but since today I have been getting the following error:

"Message    File Name   Line    Position    
Traceback               
<module>    C:\Users\DPT\Desktop\questions.py   13      
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 34: ordinal not in range(128)"

Currently the questions are displayed on screen, but I can't seem to copy the output to a new text file.

import re
import sys
import time
sys.path.append('.')
import stackexchange

so = stackexchange.Site(stackexchange.StackOverflow)
term = raw_input("Enter the keyword for Stack Exchange")
print 'Searching for %s...' % term,
sys.stdout.flush()
qs = so.search(intitle=term)
print '\r--- questions with "%s" in title ---' % (term)
for q in qs:
    print '%8d %s' % (q.id, q.title)
    with open(r'E:\questi.txt', 'a+') as question:
        # The traceback points at this write (line 13 of the original script)
        question.write(q.title)

time.sleep(10)
with open(r'E:\questi.txt') as intxt:
    data = intxt.read()

# Extract runs of ASCII letters from the saved titles
regular = re.findall('[A-Za-z]+', data)
print(regular)

tokens = set(regular)

# Load the keyword list, one set entry per whitespace-separated word
with open(r'D:\Dictionary.txt', 'r') as keywords:
    keyset = set(keywords.read().split())

# Write every keyword that also occurs in the question titles
with open(r'D:\Questionmatches.txt', 'w') as matches:
    for word in keyset:
        if word in tokens:
            matches.write(word + '\n')

Recommended answer

q.title is a Unicode string. Before writing it to a file, you need to encode it, preferably with a fully Unicode-capable encoding such as UTF-8. If you don't, Python falls back to the ASCII codec, which can't encode any code point above 127.

question.write(q.title.encode("utf-8"))

That should fix the problem.
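As an alternative to calling .encode() on every write, the file itself can be opened with an explicit encoding. The following is only a sketch under assumptions: the helper name append_title, the temporary path, and the sample title are made up for illustration and are not part of the original program. io.open is available in Python 2.6+ and is the same function as open in Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def append_title(path, title):
    """Append one Unicode title plus a newline to a UTF-8 text file."""
    # io.open encodes text as it is written, so no explicit .encode() is needed.
    with io.open(path, 'a', encoding='utf-8') as handle:
        handle.write(title + u'\n')


# Hypothetical title containing curly quotes (U+201C and U+201D).
path = os.path.join(tempfile.mkdtemp(), 'questi.txt')
append_title(path, u'How do I print \u201chello\u201d?')

# Read the file back with the same encoding to check the round trip.
with io.open(path, encoding='utf-8') as handle:
    saved = handle.read()
```

With this approach the same encoding has to be passed when the file is read back, otherwise the non-ASCII characters may be decoded incorrectly.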

By the way, the program tripped up on the character “ (U+201C, LEFT DOUBLE QUOTATION MARK).
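To see where the "position 34" in the message comes from, here is a small sketch that reproduces the failure without the Stack Exchange API. The title string is invented: it simply places U+201C at index 34, as in the traceback.

```python
# -*- coding: utf-8 -*-
# Hypothetical title: 34 ASCII characters followed by a left curly quote.
title = u'x' * 34 + u'\u201c' + u' rest of the title'

failed_at = None
bad_char = None
try:
    # The ASCII codec only covers code points 0..127, so this raises.
    title.encode('ascii')
except UnicodeEncodeError as error:
    failed_at = error.start          # index of the first unencodable character
    bad_char = title[error.start]    # the character itself

# UTF-8 can represent every Unicode code point, so this succeeds.
encoded = title.encode('utf-8')
```

The exception's start attribute holds the index reported in the error message, which helps locate the offending character in the original text.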
