python3 UnicodeEncodeError: 'charmap' codec can't encode characters in position 95-98: character maps to <undefined>

Views: 149 | Published: 2020/10/28 18:37:43 | Tags: python-3.x, elasticsearch-plugin, python-unicode


Problem description

A month ago I came across this GitHub project: https://github.com/taraslayshchuk/es2csv

I installed the package with pip3 on Ubuntu Linux. When I tried to use it, I found that the package was written for Python 2. I dug into the code and soon found the problem.

            for line in open(self.tmp_file, 'r'):
                timer += 1
                bar.update(timer)
                line_as_dict = json.loads(line)
                line_dict_utf8 = {k: v.encode('utf8') if isinstance(v, unicode) else v for k, v in line_as_dict.items()}
                csv_writer.writerow(line_dict_utf8)
            output_file.close()
            bar.finish()
        else:
            print('There is no docs with selected field(s): %s.' % ','.join(self.opts.fields))

The code checks for unicode, which is not necessary in Python 3: the unicode name no longer exists there, and every str is already Unicode. I therefore changed the code as shown below, and the package then worked properly under Ubuntu 16.

            for line in open(self.tmp_file, 'r'):
                timer += 1
                bar.update(timer)
                line_as_dict = json.loads(line)
                # line_dict_utf8 = {k: v.encode('utf8') if isinstance(v, unicode) else v for k, v in line_as_dict.items()}
                csv_writer.writerow(line_as_dict)
            output_file.close()
            bar.finish()
        else:
            print('There is no docs with selected field(s): %s.' % ','.join(self.opts.fields))

A month later, however, I needed to get the es2csv package working on Windows 10. After making exactly the same adjustments to es2csv under Windows 10, I received the following error message when I tried to run it:

    PS C:\> es2csv -u 192.168.230.151:9200 -i scrapy -o database.csv -q '*'
Found 218 results
Run query [#######################################################################################################################] [218/218] [100%] [0:00:00] [Time: 0:00:00] [  2.3 Kidocs/s]
Write to csv [#                                                                                                                     ] [2/218] [  0%] [0:00:00] [ETA: 0:00:00] [  3.9 Kilines/s]
Traceback (most recent call last):
  File "C:\Users\admin\AppData\Local\Programs\Python\Python36\Scripts\es2csv-script.py", line 11, in <module>
    load_entry_point('es2csv==5.2.1', 'console_scripts', 'es2csv')()
  File "c:\users\admin\appdata\local\programs\python\python36\lib\site-packages\es2csv.py", line 284, in main
    es.write_to_csv()
  File "c:\users\admin\appdata\local\programs\python\python36\lib\site-packages\es2csv.py", line 238, in write_to_csv
    csv_writer.writerow(line_as_dict)
  File "c:\users\admin\appdata\local\programs\python\python36\lib\csv.py", line 155, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "c:\users\admin\appdata\local\programs\python\python36\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 95-98: character maps to <undefined>

Does anyone have an idea how to fix this error?

Recommended answer

This is due to the default behaviour of open in Python 3. By default, Python 3 opens files in text mode, which means it must also apply a text encoding, such as UTF-8 or ASCII, to every character it reads or writes.

Python uses your locale to determine the most suitable encoding. On OS X and Linux this is usually UTF-8. On Windows it uses an 8-bit character set, such as Windows-1252, to match the behaviour of Notepad.
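To see which encoding applies on a given machine, a small check like the following can help. It is only a sketch: it reads the locale's preferred encoding and, for comparison, the encoding that open actually picks for a throwaway temporary file when no encoding argument is passed. The variable names are illustrative.

```python
import locale
import os
import tempfile

# Encoding suggested by the current locale.
preferred = locale.getpreferredencoding(False)

# Encoding that open() really selects when no encoding= is given.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as handle:
    actual = handle.encoding
os.remove(path)

print("locale preferred encoding:", preferred)
print("open() default encoding:  ", actual)
```

On a Western European Windows installation this would typically report cp1252, which matches the cp1252.py frame in the traceback above.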

Because an 8-bit character set has only a limited number of characters, it is very easy to end up trying to write a character that the character set does not support, for example a Hebrew character with Windows-1252, the Western European character set.
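The failure can be reproduced without any files. The sketch below encodes a short Hebrew sample string (chosen purely for illustration and written with escapes so the snippet itself stays ASCII) first with cp1252 and then with UTF-8.

```python
# Four Hebrew letters; none of them exist in Windows-1252.
text = "\u05e9\u05dc\u05d5\u05dd"

try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as exc:
    cp1252_ok = False
    # exc.reason carries the same message as the traceback in the question.
    print("cp1252 failed:", exc.reason)

# UTF-8 can represent every Unicode character, so this always succeeds.
utf8_bytes = text.encode("utf-8")
print("UTF-8 length in bytes:", len(utf8_bytes))
```

The cp1252 attempt raises the same 'charmap' error seen in the question, while the UTF-8 encode succeeds.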

To resolve your problem, override the automatic encoding selection in open and hardcode UTF-8. The traceback fails inside csv_writer.writerow, so the open call that creates the output CSV needs the same encoding argument as the read shown here:

for line in open(self.tmp_file, 'r', encoding='utf-8'):
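As a rough, self-contained illustration of the idea (not the actual es2csv code), the sketch below converts a JSON-lines file to CSV with UTF-8 forced on both the read and the write side. The function name jsonl_to_csv, the file name hits.jsonl and the sample record are hypothetical; only database.csv is taken from the command in the question.

```python
import csv
import json
import os
import tempfile


def jsonl_to_csv(tmp_path, csv_path, fields):
    """Hypothetical helper: copy JSON lines into a CSV, UTF-8 on both ends."""
    with open(tmp_path, "r", encoding="utf-8") as src, \
            open(csv_path, "w", encoding="utf-8", newline="") as dst:
        # newline="" lets the csv module control line endings itself.
        writer = csv.DictWriter(dst, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for line in src:
            writer.writerow(json.loads(line))


# Demo data: one record containing Hebrew text that cp1252 cannot encode.
workdir = tempfile.mkdtemp()
tmp_path = os.path.join(workdir, "hits.jsonl")
csv_path = os.path.join(workdir, "database.csv")
with open(tmp_path, "w", encoding="utf-8") as handle:
    handle.write(json.dumps({"id": 1, "title": "\u05e9\u05dc\u05d5\u05dd"}) + "\n")

jsonl_to_csv(tmp_path, csv_path, ["id", "title"])

# Read the result back with the same explicit encoding.
with open(csv_path, "r", encoding="utf-8", newline="") as handle:
    rows = list(csv.DictReader(handle))
print(len(rows), "row(s) written")
```

Because neither open call relies on the locale default, the same code should behave identically on Ubuntu and on Windows 10.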
