How do you handle column names having spaces in them when using pd.read_clipboard?

Problem description

This is a real problem I've faced for a long time.

Take this dataframe:

         A         B  THRESHOLD
       NaN       NaN        NaN
 -0.041158 -0.161571   0.329038
  0.238156  0.525878   0.110370
  0.606738  0.854177  -0.095147
  0.200166  0.385453   0.166235

It is easy enough to copy using pd.read_clipboard. However, if one of the column names contains a space:

         A         B     Col #3
       NaN       NaN        NaN
 -0.041158 -0.161571   0.329038
  0.238156  0.525878   0.110370
  0.606738  0.854177  -0.095147
  0.200166  0.385453   0.166235

Then it is read like this:

          A         B       Col  #3
0       NaN       NaN       NaN NaN
1 -0.041158 -0.161571  0.329038 NaN
2  0.238156  0.525878  0.110370 NaN
3  0.606738  0.854177 -0.095147 NaN
4  0.200166  0.385453  0.166235 NaN

How can I prevent this?

Recommended answer

What I do in this situation is make sure all my columns are two or more spaces apart, then use sep='\s\s+' as the delimiter. That way, a column heading that contains a single space, such as Col #3 above, is treated as one column.

         A         B     Col #3
       NaN       NaN        NaN
 -0.041158  -0.161571   0.329038
  0.238156   0.525878   0.110370
  0.606738   0.854177  -0.095147
  0.200166   0.385453   0.166235

import pandas as pd
df = pd.read_clipboard(sep=r'\s\s+')

You do get the warning below, but you can ignore it, since the data has been parsed correctly. Or you could pass engine='python' if your OCD gets the best of you. :)

C:\Program Files\Anaconda3\lib\site-packages\pandas\io\clipboards.py:63: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
  return read_table(StringIO(text), sep=sep, **kwargs)
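
Since the warning shows read_clipboard handing the text to read_table(StringIO(text), ...), the effect of engine='python' can be checked without a clipboard. The following is a minimal sketch, assuming that read_csv over an io.StringIO buffer behaves like the clipboard path. The shortened sample text and the names text, df_quiet and parser_warnings are made up for illustration.

```python
import io
import warnings

import pandas as pd

# Stand-in for the clipboard contents (columns two or more spaces apart).
text = """\
         A         B     Col #3
       NaN       NaN        NaN
 -0.041158  -0.161571   0.329038
  0.238156   0.525878   0.110370
"""

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The engine is named explicitly, so pandas has nothing to fall back from.
    df_quiet = pd.read_csv(io.StringIO(text), sep=r"\s\s+", engine="python")

# Keep only the parser warnings, if any were recorded.
parser_warnings = [
    w for w in caught if issubclass(w.category, pd.errors.ParserWarning)
]
print(len(parser_warnings))
print(list(df_quiet.columns))
```

With a real clipboard the equivalent call would presumably be pd.read_clipboard(sep=r'\s\s+', engine='python'), since extra keyword arguments are forwarded to the parser. Depending on the pandas version, the warning may not appear even without the engine argument.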

print(df)

          A         B    Col #3
0       NaN       NaN       NaN
1 -0.041158 -0.161571  0.329038
2  0.238156  0.525878  0.110370
3  0.606738  0.854177 -0.095147
4  0.200166  0.385453  0.166235
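
To try this without copying anything, the clipboard text can be replaced by a string. This is a rough sketch, assuming that read_csv with sep=r'\s+' mirrors what read_clipboard does by default. The names sample, by_whitespace and by_two_spaces are illustrative only.

```python
import io

import pandas as pd

# The table from above, with columns two or more spaces apart.
sample = """\
         A         B     Col #3
       NaN       NaN        NaN
 -0.041158  -0.161571   0.329038
  0.238156   0.525878   0.110370
  0.606738   0.854177  -0.095147
  0.200166   0.385453   0.166235
"""

# Any run of whitespace separates fields, so "Col #3" splits into two names.
by_whitespace = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Only runs of two or more whitespace characters separate fields.
by_two_spaces = pd.read_csv(io.StringIO(sample), sep=r"\s\s+", engine="python")

print(list(by_whitespace.columns))  # ['A', 'B', 'Col', '#3']
print(list(by_two_spaces.columns))  # ['A', 'B', 'Col #3']
```

The approach only works when every pair of adjacent columns really is separated by at least two spaces, and no header or value contains two consecutive spaces itself.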
