Using matplotlib/pandas/python, I cannot visualize data as values per 30 minutes and per day
Problem description
I am analysing a CSV file with Matplotlib/Python.
This is the CSV file. https://github.com/camenergydatalab/EnergyDataSimulationChallenge/blob/master/challenge2/data/total_watt.csv
After importing the CSV file, I successfully plotted a graph and visualised energy consumption per 30 minutes with the following code. (Thank you guys! See: Using Matplotlib, visualize CSV data)

from matplotlib import style
from matplotlib import pylab as plt
import numpy as np

style.use('ggplot')

filename='total_watt.csv'
date=[]
number=[]

import csv
with open(filename, 'rb') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in csvreader:
        if len(row) ==2 :
            date.append(row[0])
            number.append(row[1])

number=np.array(number)

import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')

plt.plot(date,number)

plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

But the thing is, I cannot visualize the energy consumption per day...
------------Edited (Thank you Florian!!)------------
I installed pandas and added some pandas code to my script.
Now my code looks like this:

from matplotlib import style
from matplotlib import pylab as plt
import numpy as np
import pandas as pd

style.use('ggplot')

filename='total_watt.csv'
date=[]
number=[]

import csv
with open(filename, 'rb') as csvfile:
    df = pd.read_csv('total_watt.csv', parse_dates=[0], index_col=[0])
    df.resample('1D', how='sum')

    for row in df:
        if len(row) == 2 :
            date.append(row[0])
            number.append(row[1])

number=np.array(number)

import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')

plt.plot(date,number)

plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

When I ran this code, I got no error, but nothing is drawn in my graph. How can I solve it?
Solution
Using pandas and the resample function could make your life easier.
Data
import io
import pandas as pd

content = '''timestamp              value
2011-04-18 16:52:00  152.684299188514
2011-04-18 17:22:00  327.579073188405
2011-04-18 17:52:00  156.826945856169
2011-04-18 18:22:00  330.202764488018
2011-04-18 18:52:00  1118.60404324133
2011-04-18 19:22:00  243.972250782998
2011-04-18 19:52:00  852.88815851216
2011-04-18 20:22:00  491.859992982456
2011-04-18 20:52:00  466.738983617709
2011-04-18 21:22:00  659.670303375527
2011-04-18 21:52:00  576.304871428571
2011-04-18 22:22:00  2497.20620579196
2011-04-18 22:52:00  2790.20392088608
2011-04-18 23:22:00  1092.20906629318
2011-04-18 23:52:00  825.994417375886
2011-04-19 00:22:00  2397.16672089666
2011-04-19 00:52:00  1411.66659265233
2011-04-19 01:22:00  2379.18391111111
2011-04-19 01:52:00  841.224212511672
2011-04-19 02:22:00  471.520308479532
2011-04-19 02:52:00  1189.78122544232
2011-04-19 03:22:00  343.7574197609
2011-04-19 03:52:00  336.486834795322
2011-04-19 04:22:00  541.401434220355
2011-04-19 04:52:00  316.106452883263
2011-04-19 05:22:00  502.502274561404
2011-04-19 05:52:00  314.832323976608
'''

df = pd.read_table(io.BytesIO(content.encode('UTF-8')), sep='\s{2,}', parse_dates=[0], index_col=[0], engine='python')

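To load the actual file instead of an inline string, something like the following sketch might work. It assumes total_watt.csv has no header line and two comma-separated columns (a timestamp and a value), which is what the question's csv.reader loop suggests. The column names "timestamp" and "value" and the variable df_csv are chosen here for illustration, and the two sample rows are copied from the data above.

```python
import io
import pandas as pd

# Two sample rows in the assumed layout of total_watt.csv:
# no header line, a timestamp column, then a value column.
sample = io.StringIO(
    "2011-04-18 16:52:00,152.684299188514\n"
    "2011-04-18 17:22:00,327.579073188405\n"
)

# header=None stops pandas from treating the first reading as column names.
# The names "timestamp" and "value" are illustrative, not taken from the file.
df_csv = pd.read_csv(
    sample,  # pass 'total_watt.csv' here to read the real file
    header=None,
    names=["timestamp", "value"],
    parse_dates=["timestamp"],
    index_col="timestamp",
)
print(df_csv)
```

Without header=None, read_csv would use the first reading as the column labels, so that row would be missing from the data.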
Using the resample function
See the documentation here: http://pandas-docs.github.io/pandas-docs-travis/
Per 30 min
df = df.resample('30min', how='sum')

Out[496]:
                           value
timestamp
2011-04-18 16:30:00   152.684299
2011-04-18 17:00:00   327.579073
2011-04-18 17:30:00   156.826946
2011-04-18 18:00:00   330.202764
2011-04-18 18:30:00  1118.604043
2011-04-18 19:00:00   243.972251
2011-04-18 19:30:00   852.888159
2011-04-18 20:00:00   491.859993
2011-04-18 20:30:00   466.738984
2011-04-18 21:00:00   659.670303
2011-04-18 21:30:00   576.304871
2011-04-18 22:00:00  2497.206206
2011-04-18 22:30:00  2790.203921
2011-04-18 23:00:00  1092.209066
2011-04-18 23:30:00   825.994417
2011-04-19 00:00:00  2397.166721
2011-04-19 00:30:00  1411.666593
2011-04-19 01:00:00  2379.183911
2011-04-19 01:30:00   841.224213
2011-04-19 02:00:00   471.520308
2011-04-19 02:30:00  1189.781225
2011-04-19 03:00:00   343.757420
2011-04-19 03:30:00   336.486835
2011-04-19 04:00:00   541.401434
2011-04-19 04:30:00   316.106453
2011-04-19 05:00:00   502.502275
2011-04-19 05:30:00   314.832324

Per day
df = df.resample('1D', how='sum')

Out[497]:
                   value
timestamp
2011-04-18  12582.945297
2011-04-19  11045.629711

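The how= keyword used above was deprecated and later removed from resample in newer pandas releases. There the aggregation is chained as a method call instead. A minimal sketch of the equivalent calls, using three rows copied from the data above (the names per_30min and per_day are illustrative):

```python
import io
import pandas as pd

# Three readings copied from the data above, spanning midnight.
content = (
    "timestamp,value\n"
    "2011-04-18 23:22:00,1092.20906629318\n"
    "2011-04-18 23:52:00,825.994417375886\n"
    "2011-04-19 00:22:00,2397.16672089666\n"
)
df = pd.read_csv(io.StringIO(content), parse_dates=["timestamp"], index_col="timestamp")

# resample() returns a new object, so the result has to be assigned;
# calling it without keeping the result leaves df unchanged.
per_30min = df.resample("30min").sum()
per_day = df.resample("1D").sum()

print(per_30min)
print(per_day)
```

Keeping the results in separate variables, rather than overwriting df as the answer does, lets both resolutions be computed from the same raw data.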
Plot
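A minimal sketch of one way to plot both resolutions with matplotlib, assuming a DataFrame with a datetime index and a value column like the one in the Data section. The sample rows are copied from the data above. The figure layout, titles and axis labels are illustrative choices. Plotting straight from the DataFrame also avoids the loop in the question: iterating over a DataFrame with for row in df yields column labels, not rows, so the date and number lists there stayed empty and nothing was drawn.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt

# Four readings copied from the data above, two on each day.
content = (
    "timestamp,value\n"
    "2011-04-18 23:22:00,1092.20906629318\n"
    "2011-04-18 23:52:00,825.994417375886\n"
    "2011-04-19 00:22:00,2397.16672089666\n"
    "2011-04-19 00:52:00,1411.66659265233\n"
)
df = pd.read_csv(io.StringIO(content), parse_dates=["timestamp"], index_col="timestamp")

# Aggregate with the method-chaining form of resample.
per_30min = df["value"].resample("30min").sum()
per_day = df["value"].resample("1D").sum()

# One axes per resolution: a line for the 30-minute sums, bars for the daily sums.
fig, (ax_30, ax_day) = plt.subplots(2, 1, figsize=(8, 6))
per_30min.plot(ax=ax_30, title="Energy consumption per 30 minutes")
per_day.plot(kind="bar", ax=ax_day, title="Energy consumption per day")
ax_30.set_ylabel("value")
ax_day.set_ylabel("value")
fig.tight_layout()
```

Calling plt.show() afterwards displays the figure interactively, and fig.savefig with a file name of your choice writes it to disk.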
Hope it helps!