Python/Pandas: how to read a CSV in cp1252 with a first row to delete?

Views: 536 | Published: 2020/10/12 20:39:06 | Tags: python, pandas, csv


Problem description

See the answer: the file was not encoded in CP1252 but in UTF-16. The solution code is:

import pandas as pd

df = pd.read_csv('my_file.csv', sep='\t', header=1, encoding='utf-16')

It also works with encoding='utf-16-le'.
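To try the solution without the original file, here is a minimal sketch that writes a small UTF-16 sample and reads it back the same way. The sample content and the file name `sample_utf16.csv` are made up for illustration, and the title is a plain single line rather than the quoted multi-line title of the real file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: one title line, then a tab-separated table.
sample = (
    "Report for 05 juin 2019\n"
    "Code MCU\tEntrée\tEtat\n"
    "A1\t1\tOK\n"
    "B2\t0\tKO\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample_utf16.csv")  # hypothetical file name
    # encoding="utf-16" writes a byte-order mark followed by the encoded text.
    with open(path, "w", encoding="utf-16", newline="") as f:
        f.write(sample)

    # header=1 uses the second line as the header and drops the line before it.
    df = pd.read_csv(path, sep="\t", header=1, encoding="utf-16")

print(df.columns.tolist())
print(df)
```

The title line has fewer fields than the table, which is fine here because rows before the header row are discarded.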

Update: output of the first 3 lines in bytes:

In : import itertools 
...:  print(list(itertools.islice(open('file_T.csv', 'rb'), 3)))

Out : [b'\xff\xfe"\x00D\x00u\x00 \x00m\x00e\x00r\x00c\x00r\x00e\x00d\x00i\x00 \x000\x005\x00 \x00j\x00u\x00i\x00n\x00 \x002\x000\x001\x009\x00 \x00a\x00u\x00 \x00m\x00e\x00r\x00c\x00r\x00e\x00d\x00i\x00 \x000\x005\x00 \x00j\x00u\x00i\x00n\x00 \x002\x000\x001\x009\x00\n', b'\x00"\x00\t\x00\t\x00\t\x00\t\x00\t\x00\t\x00\t\x00\t\x00\t\x00\n', b'\x00C\x00o\x00d\x00e\x00 \x00M\x00C\x00U\x00\t\x00I\x00m\x00m\x00a\x00t\x00r\x00i\x00c\x00u\x00l\x00a\x00t\x00i\x00o\x00n\x00\t\x00D\x00a\x00t\x00e\x00\t\x00h\x00e\x00u\x00r\x00e\x00\t\x00V\x00i\x00t\x00e\x00s\x00s\x00e\x00\t\x00L\x00a\x00t\x00i\x00t\x00u\x00d\x00e\x00\t\x00L\x00o\x00n\x00g\x00i\x00t\x00u\x00d\x00e\x00\t\x00T\x00y\x00p\x00e\x00\t\x00E\x00n\x00t\x00r\x00\xe9\x00e\x00\t\x00E\x00t\x00a\x00t\x00\n']
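The leading `\xff\xfe` in that output is the UTF-16 little-endian byte-order mark, which is what gives the encoding away. As a rough sketch, the same check can be automated with the BOM constants from the standard `codecs` module. The helper name `sniff_bom` is hypothetical, and it only recognises files that actually start with a BOM.

```python
import codecs


def sniff_bom(first_bytes: bytes):
    """Return an encoding name if first_bytes starts with a known BOM, else None."""
    # UTF-32 is checked before UTF-16 because BOM_UTF32_LE begins with BOM_UTF16_LE.
    candidates = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in candidates:
        if first_bytes.startswith(bom):
            return name
    return None


# First bytes of the file, copied from the output above.
head = b'\xff\xfe"\x00D\x00u\x00 \x00'
print(sniff_bom(head))  # utf-16
```

In practice the bytes would come from something like `open(path, 'rb').read(4)`, and a `None` result means the encoding has to be determined some other way.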

I'm working with CSV files whose raw form is:

The problem is that it has two features that together cause trouble:

The first row is not the header.

There is an accent in the header "Entrée", which raises a UnicodeDecodeError if I don't specify the encoding cp1252.
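The second point can be checked on the bytes of the "Entrée" header as they appear in the update at the top. A small sketch, using only the standard library: decoding as UTF-8 fails, cp1252 succeeds but leaves a NUL character after every letter, and UTF-16 little-endian gives the clean text. This suggests that cp1252 only silenced the error rather than decoding the file correctly.

```python
# Bytes of the "Entrée" header, copied from the byte dump in the update.
header_fragment = b'E\x00n\x00t\x00r\x00\xe9\x00e\x00'

try:
    header_fragment.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0xe9 starts a multi-byte UTF-8 sequence, but 0x00 is not a valid continuation byte.
    utf8_ok = False

# cp1252 maps every one of these bytes to some character, so no error is raised.
as_cp1252 = header_fragment.decode("cp1252")
as_utf16 = header_fragment.decode("utf-16-le")

print(utf8_ok)          # False
print(repr(as_cp1252))  # 'E\x00n\x00t\x00r\x00é\x00e\x00'
print(as_utf16)         # Entrée
```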

I'm using Python 3.X and pandas to deal with these files.

But when I try to read the file with:

import pandas as pd 

df_T = pd.read_csv('file_T.csv', header=1, sep=';', encoding = 'cp1252')
print(df_T)

I get the following output (the same with header=0):

In order to read the CSV correctly, I need to:

eliminate the accent

and ignore/delete the first row (which I don't need anyway).
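Once the file is read with the right encoding the accent no longer causes an error, so removing it is optional. If unaccented column names are still wanted, one possible sketch uses `unicodedata` from the standard library. The helper name `strip_accents` and the column list are illustrative only.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks so that 'Entrée' becomes 'Entree'."""
    # NFD splits an accented letter into its base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Illustrative subset of the header names from the file.
columns = ["Code MCU", "Type", "Entrée", "Etat"]
print([strip_accents(c) for c in columns])  # ['Code MCU', 'Type', 'Entree', 'Etat']
```

On a DataFrame this could be applied with `df.columns = [strip_accents(c) for c in df.columns]`.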

How can I achieve that?

PS: I know I could write a VBA program or something similar for this, but I'd rather not. I'm interested in including it in my Python program, or in knowing for sure that it is not possible.
