python pandas special character as delimiter
Problem description
I have a text file that uses a special character [˛] as the delimiter. I copy-pasted this character as the separator in my read_csv call, and I get the following message:
ParserWarning: Falling back to the 'python' engine because the
separator encoded in utf-8 is > 1 char long, and the 'c' engine does
not support such separators; you can avoid this warning by specifying
engine='python'.
"""Entry point for launching an IPython kernel.
Any idea how to use a special character as the delimiter when reading a text file?
Recommended answer
You are only getting a warning, and removing it is easy: add engine='python' to the read_csv call.
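To see that this is a warning rather than a failure, the sketch below reads the same data twice and counts the ParserWarning messages each call emits. The helper name read_and_count_warnings and the inline sample data are assumptions made for illustration. On a system whose file system encoding is UTF-8, the default call is expected to warn and the engine='python' call is not, while both return the same DataFrame.

```python
import warnings
from io import StringIO

import pandas as pd

# Hypothetical sample data using the special delimiter from the question
data = "a˛b˛c\n1˛3˛5\n7˛8˛1\n"


def read_and_count_warnings(**kwargs):
    """Read the sample data; return the DataFrame and the number of ParserWarnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        df = pd.read_csv(StringIO(data), sep="˛", **kwargs)
    parser_warnings = [
        w for w in caught if issubclass(w.category, pd.errors.ParserWarning)
    ]
    return df, len(parser_warnings)


# Default engine: pandas falls back to the Python parser and warns about it
df_default, n_default = read_and_count_warnings()

# Explicit Python engine: same result, no fallback warning
df_python, n_python = read_and_count_warnings(engine="python")

print("warnings with default engine:", n_default)
print("warnings with engine='python':", n_python)
print("same result:", df_default.equals(df_python))
```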
Under the hood, pandas uses a fast and efficient parser implemented in C as well as a Python implementation which is currently more feature-complete. Where possible, pandas uses the C parser (specified as engine='c'), but it may fall back to Python if C-unsupported options are specified. Currently, C-unsupported options include:
- sep other than a single character (e.g. regex separators)
- skipfooter
- sep=None with delim_whitespace=False
Specifying any of the above options will produce a ParserWarning unless the Python engine is selected explicitly using engine='python'.
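The separator in the question is a single character, yet it still triggers the fallback: as the warning text says, what counts is its length once encoded. A minimal sketch using only the standard library, assuming UTF-8 (the encoding named in the warning) is the encoding in effect, shows the difference between the two lengths:

```python
sep = "˛"  # the delimiter from the question

# One character as far as Python strings are concerned...
n_chars = len(sep)

# ...but more than one byte once encoded as UTF-8, which is what the C engine checks
n_bytes = len(sep.encode("utf-8"))

print("characters:", n_chars)
print("UTF-8 bytes:", n_bytes)
```

Any non-ASCII delimiter behaves the same way, so plain ASCII separators such as ',' or ';' stay on the C engine while this one does not.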
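import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed in newer pandas versions

# Sample data using the special character as the delimiter
temp = u"""a˛b˛c
1˛3˛5
7˛8˛1
"""
# After testing, replace StringIO(temp) with the file name, e.g. 'filename.csv'
df = pd.read_csv(StringIO(temp), sep="˛", engine='python')
print(df)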
   a  b  c
0  1  3  5
1  7  8  1