Read file and plot CDF in Python

Question

I need to read a long file of timestamps (in seconds) and plot their CDF using numpy or scipy. I tried with numpy, but the output is not what it is supposed to be. My code is below; any suggestions are appreciated.

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('Filename.txt')
sorted_data = np.sort(data)
cumulative = np.cumsum(sorted_data)

plt.plot(cumulative)
plt.show()

Recommended answer

You have two options:

1: You can bin the data first. This can be done easily with the numpy.histogram function:

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('Filename.txt')

# Choose how many bins you want here
num_bins = 20

# Use the histogram function to bin the data (raw counts per bin)
counts, bin_edges = np.histogram(data, bins=num_bins)

# Now find the cdf: running total of the counts, scaled so it ends at 1
cdf = np.cumsum(counts) / counts.sum()

# And finally plot the cdf
plt.plot(bin_edges[1:], cdf)

plt.show()
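
As a quick sanity check on the binned approach, here is a minimal sketch that computes the CDF from raw bin counts and confirms that it rises monotonically to 1. The helper name binned_cdf and the synthetic exponential inter-arrival times are assumptions made for illustration; they are not part of the original answer, and real data would come from np.loadtxt as above.

```python
import numpy as np

def binned_cdf(data, num_bins=20):
    """Hypothetical helper: return (right bin edges, cumulative fraction of samples)."""
    # Raw counts per bin; no density scaling, so bin width does not matter
    counts, bin_edges = np.histogram(data, bins=num_bins)
    # Running total of the counts, divided by the total so the last value is 1
    cdf = np.cumsum(counts) / counts.sum()
    return bin_edges[1:], cdf

# Synthetic stand-in for a file of timestamps in seconds (assumption)
rng = np.random.default_rng(0)
timestamps = np.cumsum(rng.exponential(scale=2.0, size=1000))

x, cdf = binned_cdf(timestamps, num_bins=20)
```

Passing x and cdf to plt.plot(x, cdf) gives the same kind of plot as the code above.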

2: Rather than using numpy.cumsum, just plot the sorted_data array against the (normalized) number of items smaller than each element in the array (see this answer for more details: https://stackoverflow.com/a/11692365/588071):

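import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('Filename.txt')

# Sort the samples so they can serve as the x-axis
sorted_data = np.sort(data)

# Evenly spaced cumulative fractions from 0 to 1, one per sorted sample
yvals = np.arange(len(sorted_data)) / float(len(sorted_data) - 1)

plt.plot(sorted_data, yvals)
plt.show()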
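
The second option amounts to an empirical CDF. Below is a minimal sketch of that idea as a reusable function, assuming a hypothetical helper name ecdf and the k/n convention (the fraction of samples less than or equal to each value) rather than the n-1 denominator used above. The four-element input is made up purely to show the shape of the result.

```python
import numpy as np

def ecdf(data):
    """Hypothetical helper: return sorted samples and their cumulative fractions."""
    # Sorted samples form the x-axis
    x = np.sort(np.asarray(data, dtype=float))
    # The k-th smallest sample (1-based) has cumulative fraction k / n
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Tiny made-up input (assumption) to illustrate the output
x, y = ecdf([3.0, 1.0, 2.0, 2.0])
# x is [1.0, 2.0, 2.0, 3.0] and y is [0.25, 0.5, 0.75, 1.0]
```

With matplotlib, plt.step(x, y, where='post') would draw this as a staircase, which may represent the jumps at each sample more faithfully than a straight-line plot.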
