Read .dat file using Python


Problem description


I've got a simulation involving internal waves and moving particles in the water, using the MITgcm. The output of this looks something like this for each time step:

   -9999 0.0000000000000000000  #Time step (0.00000000000000 seconds)
 1308.2021183321899       -14.999709364517091 # Particle 1 (X,Z)
 1308.2020142528656       -24.999521595698688 # Particle 2 (X,Z)
 1308.2018600072618       -34.999345597877536 # .
 1308.2016593336587       -44.999185870669805 # .
 1308.2014165588744       -54.999046508237896 # .
 1308.2011370083103       -64.998931076248894
 1308.2008269116873       -74.998842490305705
 1308.2004933548124       -84.998782925797485
 1308.2001441978532       -94.998753764086956
 1308.1997879652938       -104.99875557384759
 1308.1994336881464       -114.99878812280582
 1308.1990906721119       -124.99885041328211
 1308.1987681881285       -134.99894073461562
 1308.1984750963150       -144.99905672694641
 1308.1982194336249       -154.99919545294702
 1308.1980080134056       -164.99935347476733
 1308.1978461242272       -174.99952693694112
 1308.1977378137256       -184.99971163492469
 1308.2000000000000       -195.00000000000000
 5232.8000000000002       -15.000038916290352
 5232.8000000000002       -25.000064153684303
 5232.8000000000002       -35.000089286157163
 5232.8000000000002       -45.000114270293523
 5232.8000000000002       -55.000139061712051 # Particle 57

Here, a line starting with -9999 marks a new time step, and the number after it is the time in seconds. On every other line, the left column is the X position and the right column is the Z position (in meters), and each line is a different particle. So the file contains an enormous number of lines like these, one block per time step and one line per particle.

I would like to plot the time-evolution of the position of my particles. How can I do it? If that's too hard, I would be happy with static plots of all particle positions at different time steps.

Thank you so much.

Edit1: What I tried to do is this, but I didn't show it before because it is far from proper:

 from matplotlib import numpy
 import matplotlib.pyplot as plot
 plot.plot(*np.loadtxt('data.dat',unpack=True), linewidth=2.0)

or this:

 plot.plotfile('data.dat', delimiter=' ', cols=(0, 1), names=('col1', 'col2'), marker='o')

Solution

I would use numpy.loadtxt for reading input, but only because post-processing would also need numpy. You can read all your data to memory, then find the separator lines, then reshape the rest of your data to fit your number of particles. The following assumes that none of the particles ever reach exactly x=-9999, which should be a reasonable (although not foolproof) assumption.

import numpy as np
filename = 'input.dat'
indata = np.loadtxt(filename, usecols=(0,1)) # make sure the rest is ignored
tlines_bool = indata[:,0]==-9999
Nparticles = np.diff(np.where(tlines_bool)[0][:2])[0] - 1
# TODO: error handling: diff(np.where(tlines_bool)) should be constant
times = indata[tlines_bool,1]
positions = indata[np.logical_not(tlines_bool),:].reshape(-1,Nparticles,2)

The above code produces an Nt-element array times and an array positions of shape (Nt,Nparticles,2) holding each particle's 2d position at each time step. Because we compute the number of particles ourselves, we can let numpy determine the size of the first dimension (this is what the -1 index in reshape() is for).
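The TODO in the code above (checking that the spacing between separator lines is constant) could be handled roughly as follows. This is only a sketch under assumptions: the helper name split_time_steps and the tiny in-memory sample are made up for illustration, and the sample stands in for the real input file.

```python
import io
import numpy as np

# Synthetic stand-in for the real file: 3 time steps, 2 particles each.
sample = io.StringIO(
    "-9999 0.0  # time step\n"
    "1.0 -15.0\n"
    "2.0 -25.0\n"
    "-9999 10.0\n"
    "1.1 -15.1\n"
    "2.0 -25.0\n"
    "-9999 20.0\n"
    "1.2 -15.2\n"
    "2.0 -25.0\n"
)

def split_time_steps(indata, marker=-9999):
    """Split an (N, 2) array into times and (Nt, Nparticles, 2) positions.

    Hypothetical helper: raises ValueError if the blocks between
    separator lines do not all contain the same number of particles.
    """
    is_marker = indata[:, 0] == marker
    marker_rows = np.flatnonzero(is_marker)
    if marker_rows.size == 0 or marker_rows[0] != 0:
        raise ValueError("data must start with a separator line")
    # Number of particle rows that follow each separator line.
    block_sizes = np.diff(np.append(marker_rows, len(indata))) - 1
    if not np.all(block_sizes == block_sizes[0]):
        raise ValueError(f"inconsistent particle counts: {np.unique(block_sizes)}")
    n_particles = int(block_sizes[0])
    times = indata[is_marker, 1]
    positions = indata[~is_marker].reshape(-1, n_particles, 2)
    return times, positions

indata = np.loadtxt(sample, usecols=(0, 1))  # trailing '#' comments are ignored
times, positions = split_time_steps(indata)
print(times.shape, positions.shape)
```

Unlike the original snippet, this also checks the last block, so a file that was truncated mid-write fails loudly instead of producing a confusing reshape error.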

For plotting you just have to slice into your positions array to extract what you exactly need. In case of 2d x data and 2d y data, matplotlib.pyplot.plot() will automatically try to plot the columns of the input arrays as a function of each other. Here's an example of how you can visualize, using your actual input data:

import matplotlib.pyplot as plt
myslice = slice(None,None,500)    # every 500th time step
particle_indices = slice(None)    # every particle
#particle_indices = slice(None,5)   # first 5 particles
#particle_indices = slice(-5,None)  # last 5 particles

plt.figure()
_ = plt.plot(times[myslice],positions[myslice,particle_indices,0])
plt.xlabel('t')
plt.ylabel('x')

plt.figure()
_ = plt.plot(times[myslice],positions[myslice,particle_indices,1])
plt.xlabel('t')
plt.ylabel('z')

plt.figure()
_ = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1])
plt.xlabel('x')
plt.ylabel('z')
plt.show()

Each line corresponds to a single particle. The first two plots show the time-evolution of the x and z components, respectively, and the third plot shows the z(x) trajectories. Note that there are a lot of particles in your data that don't move at all:

>>> sum([~np.diff(positions[:,k,:],axis=0).any() for k in range(positions.shape[1])])
15

(This computes the time-oriented difference of both coordinates for each particle, one after the other, and counts the number of particles for which every difference in both dimensions is 0, i.e. the particle doesn't move.) This explains all those horizontal lines in the first two plots; these stationary particles don't show up at all in the third plot (since their trajectory is a single point).
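The same count can also be written without the Python loop. The sketch below uses a small made-up positions array (4 time steps, 3 particles) rather than the real data, so the numbers are purely illustrative:

```python
import numpy as np

# Hypothetical data: 4 time steps, 3 particles, (x, z) coordinates.
positions = np.zeros((4, 3, 2))
positions[:, 0, 0] = [0.0, 0.1, 0.2, 0.3]            # particle 0 drifts in x
positions[:, 1, :] = [5.0, -15.0]                    # particle 1 never moves
positions[:, 2, 1] = [-25.0, -25.0, -25.5, -25.5]    # particle 2 moves in z

# A particle has moved if any step-to-step difference in x or z is nonzero.
moved = np.diff(positions, axis=0).any(axis=(0, 2))
stationary = ~moved
print(stationary.sum())  # number of particles that never move
```

The boolean mask stationary can also be used directly as an index, for example positions[:, ~stationary, :], to plot only the particles that actually move.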

I intentionally introduced a bit of fancy indexing, which makes it easier to play around with your data. As you can see, indexing looks like this: times[myslice], positions[myslice,particle_indices,0], where both slices are defined in terms of...well, a slice. You should look at the documentation, but the short story is that arr[slice(start,stop,step)] is equivalent to arr[start:stop:step], and if any of the variables is None, then the corresponding index is empty: arr[slice(-5,None)] is equivalent to arr[-5:], i.e. it will slice the final 5 elements of the array.
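A quick way to convince yourself of that equivalence is to compare both spellings on a toy array. The array and variable names below are made up for the demonstration:

```python
import numpy as np

arr = np.arange(20)

every_fifth = slice(None, None, 5)   # same as arr[::5]
last_three = slice(-3, None)         # same as arr[-3:]

same_stride = np.array_equal(arr[every_fifth], arr[::5])
same_tail = np.array_equal(arr[last_three], arr[-3:])
print(same_stride, same_tail)

# Slice objects combine with ordinary indices in several dimensions,
# just like positions[myslice, particle_indices, 0] in the answer.
grid = np.arange(24).reshape(4, 3, 2)
sub = grid[slice(None, None, 2), slice(None, 2), 0]   # grid[::2, :2, 0]
print(sub)
```

Keeping the slices in named variables means you only change one line to switch between, say, every particle and the first five.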

So, in case you use a reduced number of trajectories for plotting (since 57 is a lot), you might consider adding a legend (this only makes sense as long as the default color cycle of matplotlib lets you distinguish between particles, otherwise you have to either set manual colors or change the default color cycle of your axes). For this you will have to keep the handles that are returned from plot:

particle_indices = slice(None,5)   # first 5 particles
plt.figure()
lines = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1])
plt.xlabel('x')
plt.ylabel('z')
plt.legend(lines,['particle {}'.format(k) for k in range(len(lines))])
plt.show()
