Read data (.dat file) with Pandas

Views: 1916 | Published: 2020/5/23 21:40:54 | Tags: python pandas dataframe


Question

How do I read the following two-column data (from a .dat file) with Pandas?

TIME                      XGSM
2004 006 01 00 01 37 600  1
2004 006 01 00 02 32 800  5
2004 006 01 00 03 28 000  8
2004 006 01 00 04 23 200  11
2004 006 01 00 05 18 400  17

The column separator is at least 2 spaces.

I tried:

df = pd.read_table("test.dat", sep="\s+", usecols=['TIME', 'XGSM'])
print df

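but the printed result is not the two columns I expected.

Recommended answer

You can pass column positions to the usecols parameter. With a "\s+" separator every whitespace-separated field becomes its own column, so position 0 is the first field of TIME and position 7 is XGSM. Note that TIME then holds only that first field (2004), not the full timestamp: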

import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in current pandas

temp=u"""TIME             XGSM
2004 006 01 00 01 37 600  1
2004 006 01 00 02 32 800  5
2004 006 01 00 03 28 000  8
2004 006 01 00 04 23 200  11
2004 006 01 00 05 18 400  17"""
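# after testing, replace StringIO(temp) with the filename
df = pd.read_csv(StringIO(temp),
                 sep=r"\s+",       # raw string avoids an invalid-escape warning
                 skiprows=1,       # skip the two-name header line
                 usecols=[0, 7],   # first TIME field and the XGSM field
                 names=['TIME', 'XGSM'])

print(df)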
   TIME  XGSM
0  2004     1
1  2004     5
2  2004     8
3  2004    11
4  2004    17
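To see why positions 0 and 7 are the ones selected, it can help to look at how many fields "\s+" produces per row. The following is only a minimal sketch on the same sample data; the names `raw` and `n_fields` are illustrative and not part of the original answer.

```python
from io import StringIO

import pandas as pd

temp = """TIME                      XGSM
2004 006 01 00 01 37 600  1
2004 006 01 00 02 32 800  5
2004 006 01 00 03 28 000  8
2004 006 01 00 04 23 200  11
2004 006 01 00 05 18 400  17
"""

# Skip the header line and let pandas number the columns 0..N-1.
raw = pd.read_csv(StringIO(temp), sep=r"\s+", skiprows=1, header=None)

# Each row splits into seven TIME fields plus one XGSM field.
n_fields = raw.shape[1]
print(n_fields)

# Equivalent selection to usecols=[0, 7] with names=['TIME', 'XGSM'].
df = raw[[0, 7]].rename(columns={0: "TIME", 7: "XGSM"})
print(df)
```

Because the whole timestamp is spread over seven columns here, this approach is mainly useful when only one of the TIME fields is needed.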

Alternatively, use a regex separator that matches 2 or more whitespace characters (r'\s{2,}'), which keeps the whole timestamp in the TIME column. Add engine='python' to avoid this warning:

ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.

import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in current pandas

temp=u"""TIME              XGSM
2004 006 01 00 01 37 600   1
2004 006 01 00 02 32 800   5
2004 006 01 00 03 28 000   8
2004 006 01 00 04 23 200   11
2004 006 01 00 05 18 400   17"""
# after testing, replace StringIO(temp) with the filename
df = pd.read_csv(StringIO(temp), sep=r'\s{2,}', engine='python')

print(df)
                       TIME  XGSM
0  2004 006 01 00 01 37 600     1
1  2004 006 01 00 02 32 800     5
2  2004 006 01 00 03 28 000     8
3  2004 006 01 00 04 23 200    11
4  2004 006 01 00 05 18 400    17
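The comment in the code above suggests swapping StringIO(temp) for a filename once testing is done. Here is a hedged sketch of that step: it writes the sample to a temporary test.dat (the temporary directory and the names `sample`, `path` and `parts` are assumptions made for illustration), reads it back with the same separator, and then splits the TIME string into its space-separated pieces without assuming what each piece means.

```python
import os
import tempfile

import pandas as pd

sample = """TIME                      XGSM
2004 006 01 00 01 37 600  1
2004 006 01 00 02 32 800  5
2004 006 01 00 03 28 000  8
2004 006 01 00 04 23 200  11
2004 006 01 00 05 18 400  17
"""

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the real .dat file on disk.
    path = os.path.join(tmp, "test.dat")
    with open(path, "w") as fh:
        fh.write(sample)

    # Two or more whitespace characters separate the columns.
    df = pd.read_csv(path, sep=r"\s{2,}", engine="python")

# TIME is read as text, so its pieces keep their leading zeros.
parts = df["TIME"].str.split(expand=True)
print(df)
print(parts)
```

Keeping TIME as a single string leaves the decision about how to interpret its fields to a later step.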
