Reading and doing calculations from a .dat file in Python

Views: 14904 Published: 2017/2/24 21:34:57 Tags: python csv


Problem description

I need to read a .dat file in Python that has 12 columns in total and millions of rows. I need to divide columns 2, 3 and 4 by column 1 for my calculation. So before I load the .dat file, do I need to delete all the other unwanted columns? If not, how do I selectively load the columns I want and ask Python to do the math?

An example of the .dat file would be
data.dat

I am new to Python, so a little instruction on how to open the file, read it and do the calculation would be appreciated.

I have added the code I am using as a starter, based on your suggestion:
from sys import argv

import pandas as pd



script, filename = argv

txt = open(filename)

print "Here's your file %r:" % filename
print txt.read()

def your_func(row):
    return row['x-momentum'] / row['mass']

columns_to_keep = ['mass', 'x-momentum']
dataframe = pd.read_csv('~/Pictures', delimiter="," , usecols=columns_to_keep)
dataframe['new_column'] = dataframe.apply(your_func, axis=1)

And this is the error I get from it:
Traceback (most recent call last):
  File "flash.py", line 18, in <module>
    dataframe = pd.read_csv('~/Pictures', delimiter="," , usecols=columns_to_keep)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 529, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 295, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 612, in __init__
    self._make_engine(self.engine)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 1119, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 518, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:5030)
ValueError: No columns to parse from file

Solution
After looking at your flash.dat file, it's clear you need to do a little cleanup before you process it. The following code converts it to a CSV file:
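import csv

# read flash.dat into a list of lists, splitting each line on whitespace
with open("./flash.dat") as dat:
    datContent = [line.strip().split() for line in dat]

# write it as a new CSV file
# ("wb" is the Python 2 mode; on Python 3 use open("./flash.csv", "w", newline=""))
with open("./flash.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(datContent)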
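For a file with millions of rows, building the whole list of lists in memory before writing may be wasteful. A minimal sketch of a streaming variant follows, written for Python 3 and assuming (as the conversion code above does) that flash.dat is whitespace-delimited. The function name `dat_to_csv` is hypothetical and not part of the original answer.

```python
import csv


def dat_to_csv(dat_path, csv_path):
    """Convert a whitespace-delimited text file to CSV, one line at a time."""
    # newline="" is what the csv module expects for output files on Python 3
    with open(dat_path) as src, open(csv_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            fields = line.split()  # split on any run of whitespace
            if fields:  # skip blank lines
                writer.writerow(fields)


# Usage (paths taken from the answer):
# dat_to_csv("./flash.dat", "./flash.csv")
```

Only one line is held in memory at a time, so memory use should stay roughly constant regardless of file size.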
Now use pandas to compute the new column:
import pandas as pd

def your_func(row):
    return row['x-momentum'] / row['mass']

columns_to_keep = ['#time', 'x-momentum', 'mass']
dataframe = pd.read_csv("./flash.csv", usecols=columns_to_keep)
dataframe['new_column'] = dataframe.apply(your_func, axis=1)

print(dataframe)  # parenthesized form works on both Python 2 and Python 3
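With millions of rows, `apply` with `axis=1` calls the Python function once per row, which tends to be slow. Below is a sketch of a column-wise alternative that divides whole columns at once and handles several numerator columns, as the question asks. The helper `add_ratio_columns` and the `_per_` naming are illustrative assumptions; the column names in the usage comment are the ones used in the code above.

```python
import pandas as pd


def add_ratio_columns(df, numerators, denominator):
    """Return a copy of df with one '<name>_per_<denominator>' column per numerator."""
    out = df.copy()
    for name in numerators:
        # Column-wise division: pandas divides the two columns element by element
        out[name + "_per_" + denominator] = out[name] / out[denominator]
    return out


# Usage, assuming flash.csv was produced by the conversion step:
# dataframe = pd.read_csv("./flash.csv", usecols=["#time", "x-momentum", "mass"])
# dataframe = add_ratio_columns(dataframe, ["x-momentum"], "mass")
```

The result matches what `apply` would produce for a plain division, so this is a drop-in replacement only for calculations that can be written as arithmetic on whole columns.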


                        