Reading and doing calculations from a .dat file in Python


Problem description


I need to read a .dat file in Python that has 12 columns and millions of rows. I need to divide columns 2, 3 and 4 by column 1 for my calculation. Do I need to delete all the unwanted columns before I load the .dat file? If not, how do I select only the columns I need and have Python do the math?

An example of the .dat file is data.dat.

I am new to Python, so a little instruction on how to open the file, read it and do the calculation would be appreciated.

I have added the code I am using as a starter from your suggestion:

from sys import argv

import pandas as pd



script, filename = argv

txt = open(filename)

print "Here's your file %r:" % filename
print txt.read()

def your_func(row):
    return row['x-momentum'] / row['mass']

columns_to_keep = ['mass', 'x-momentum']
dataframe = pd.read_csv('~/Pictures', delimiter="," , usecols=columns_to_keep)
dataframe['new_column'] = dataframe.apply(your_func, axis=1)

And here is the error I get when I run it:

Traceback (most recent call last):
  File "flash.py", line 18, in <module>
    dataframe = pd.read_csv('~/Pictures', delimiter="," , usecols=columns_to_keep)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 529, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 295, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 612, in __init__
    self._make_engine(self.engine)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 1119, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 518, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:5030)
ValueError: No columns to parse from file

Solution

After looking at your flash.dat file, it's clear you need to do a little cleanup before you can process it. The following code converts it to a CSV file:

import csv

# read flash.dat to a list of lists
datContent = [i.strip().split() for i in open("./flash.dat").readlines()]

# write it as a new CSV file
with open("./flash.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(datContent)
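
The snippet above is written for Python 2 (note the "wb" mode) and holds the whole file in memory as a list of lists. A minimal Python 3 sketch of the same conversion, streaming one line at a time, might look like the following. The function name `dat_to_csv` is hypothetical, and the sketch assumes the fields are separated by whitespace, as the `split()` call above implies. Unlike the original, it also skips blank lines.

```python
import csv


def dat_to_csv(dat_path, csv_path):
    """Convert a whitespace-delimited text file to CSV, one line at a time.

    Hypothetical helper; returns the number of rows written.
    """
    rows_written = 0
    # newline="" lets the csv module control line endings (Python 3 idiom).
    with open(dat_path, "r") as src, open(csv_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            fields = line.split()  # split on any run of whitespace
            if not fields:         # assumption: blank lines carry no data
                continue
            writer.writerow(fields)
            rows_written += 1
    return rows_written
```

Calling `dat_to_csv("./flash.dat", "./flash.csv")` would produce the same kind of CSV file as the code above without loading every row at once, which may matter for a file with millions of rows.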

Now use pandas to compute the new column:

import pandas as pd

def your_func(row):
    return row['x-momentum'] / row['mass']

columns_to_keep = ['#time', 'x-momentum', 'mass']
dataframe = pd.read_csv("./flash.csv", usecols=columns_to_keep)
dataframe['new_column'] = dataframe.apply(your_func, axis=1)

print dataframe
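
As a possible alternative to the intermediate CSV step, pandas can often parse whitespace-delimited text directly, and `usecols` means the unwanted columns never have to be deleted from the file. The sketch below is written for Python 3 and rests on several assumptions: that flash.dat has a header row containing the column names used above, that the fields are separated by whitespace, and that the hypothetical helper `load_and_divide` fits the calculation you want. It divides whole columns at once instead of calling `apply` row by row, which is usually much faster on millions of rows.

```python
import pandas as pd


def load_and_divide(path, numerators, denominator):
    """Read only the needed columns and divide each numerator by the denominator.

    Hypothetical helper: `numerators` is a list of column names and
    `denominator` is a single column name, all taken from the header row.
    """
    wanted = list(numerators) + [denominator]
    # sep=r"\s+" splits on any run of whitespace;
    # usecols tells the parser to keep only the listed columns.
    df = pd.read_csv(path, sep=r"\s+", usecols=wanted)
    for name in numerators:
        # Column arithmetic is vectorized, so no row-by-row apply() is needed.
        df[name + "_per_" + denominator] = df[name] / df[denominator]
    return df
```

For example, `load_and_divide("./flash.dat", ["x-momentum"], "mass")` would return a frame with the `mass` and `x-momentum` columns plus a new `x-momentum_per_mass` column. To divide several columns by the same one, pass more names in the list.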
