Using matplotlib/pandas/python, I cannot visualize data as values per 30mins and per days

Views: 339 Published: 2017/2/24 22:12:07 Tags: python csv pandas matplotlib


Question

I am analysing a CSV file with Matplotlib/Python.

This is the CSV file.
https://github.com/camenergydatalab/EnergyDataSimulationChallenge/blob/master/challenge2/data/total_watt.csv

After importing the CSV file, I successfully plotted a graph visualising energy consumption per 30 minutes with the following code. (Thank you, guys! See "Using Matplotlib, visualize CSV data".)

from matplotlib import style
from matplotlib import pylab as plt
import numpy as np

style.use('ggplot')

filename='total_watt.csv'
date=[]
number=[]

import csv
with open(filename, 'rb') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in csvreader:
        if len(row) ==2 :
            date.append(row[0])
            number.append(row[1])

number=np.array(number)

import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')

plt.plot(date,number)

plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

But the thing is, I cannot visualize the energy consumption per day.

------------Edited (Thank you Florian!!)------------

I installed pandas and added pandas code to my script.

Now my code looks like this:

from matplotlib import style
from matplotlib import pylab as plt
import numpy as np
import pandas as pd

style.use('ggplot')

filename='total_watt.csv'
date=[]
number=[]

import csv
with open(filename, 'rb') as csvfile:

    df = pd.read_csv('total_watt.csv', parse_dates=[0], index_col=[0])
    df.resample('1D', how='sum')



for row in df:
        if len(row) == 2 :
            date.append(row[0])
            number.append(row[1])

number=np.array(number)

import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')

plt.plot(date,number)

plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

When I ran this code, I got no error, but nothing is drawn in my graph. How can I solve this?
Solution
Using pandas and the resample function could make your life easier.

Data

import io
import pandas as pd
content = '''timestamp  value
2011-04-18 16:52:00     152.684299188514
2011-04-18 17:22:00     327.579073188405
2011-04-18 17:52:00     156.826945856169
2011-04-18 18:22:00     330.202764488018
2011-04-18 18:52:00     1118.60404324133
2011-04-18 19:22:00     243.972250782998
2011-04-18 19:52:00     852.88815851216
2011-04-18 20:22:00     491.859992982456
2011-04-18 20:52:00     466.738983617709
2011-04-18 21:22:00     659.670303375527
2011-04-18 21:52:00     576.304871428571
2011-04-18 22:22:00     2497.20620579196
2011-04-18 22:52:00     2790.20392088608
2011-04-18 23:22:00     1092.20906629318
2011-04-18 23:52:00     825.994417375886
2011-04-19 00:22:00     2397.16672089666
2011-04-19 00:52:00     1411.66659265233
2011-04-19 01:22:00     2379.18391111111
2011-04-19 01:52:00     841.224212511672
2011-04-19 02:22:00     471.520308479532
2011-04-19 02:52:00     1189.78122544232
2011-04-19 03:22:00     343.7574197609
2011-04-19 03:52:00     336.486834795322
2011-04-19 04:22:00     541.401434220355
2011-04-19 04:52:00     316.106452883263
2011-04-19 05:22:00     502.502274561404
2011-04-19 05:52:00     314.832323976608
'''

# Columns are separated by two or more spaces; the raw string keeps the regex backslash intact.
df = pd.read_table(io.StringIO(content), sep=r'\s{2,}', parse_dates=[0], index_col=[0], engine='python')


Using resample function

See the documentation here: http://pandas-docs.github.io/pandas-docs-travis/

Per 30 min

# Older pandas accepted how='sum'; current versions use the method call instead.
df = df.resample('30min').sum()
Out[496]: 
                           value
timestamp                       
2011-04-18 16:30:00   152.684299
2011-04-18 17:00:00   327.579073
2011-04-18 17:30:00   156.826946
2011-04-18 18:00:00   330.202764
2011-04-18 18:30:00  1118.604043
2011-04-18 19:00:00   243.972251
2011-04-18 19:30:00   852.888159
2011-04-18 20:00:00   491.859993
2011-04-18 20:30:00   466.738984
2011-04-18 21:00:00   659.670303
2011-04-18 21:30:00   576.304871
2011-04-18 22:00:00  2497.206206
2011-04-18 22:30:00  2790.203921
2011-04-18 23:00:00  1092.209066
2011-04-18 23:30:00   825.994417
2011-04-19 00:00:00  2397.166721
2011-04-19 00:30:00  1411.666593
2011-04-19 01:00:00  2379.183911
2011-04-19 01:30:00   841.224213
2011-04-19 02:00:00   471.520308
2011-04-19 02:30:00  1189.781225
2011-04-19 03:00:00   343.757420
2011-04-19 03:30:00   336.486835
2011-04-19 04:00:00   541.401434
2011-04-19 04:30:00   316.106453
2011-04-19 05:00:00   502.502275
2011-04-19 05:30:00   314.832324
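
As a small illustration of where the bin labels above come from (a sketch added for clarity, not part of the original answer): `resample('30min')` places each reading in the half-hour bin that contains it and labels the bin by its left edge, so the 16:52 reading lands in the 16:30 bin. The three readings below are copied from the sample data and rounded; the names `s` and `half_hourly` are only illustrative.

```python
import pandas as pd

# Three readings taken from the sample data above (values rounded).
s = pd.Series(
    [152.68, 327.58, 156.83],
    index=pd.to_datetime([
        "2011-04-18 16:52:00",
        "2011-04-18 17:22:00",
        "2011-04-18 17:52:00",
    ]),
)

# Each timestamp falls into the 30-minute bin that starts at or before it.
half_hourly = s.resample("30min").sum()
print(half_hourly)
```

Because every bin here holds exactly one reading, the sums equal the original values; only the timestamps change.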


Per day

# Older pandas accepted how='sum'; current versions use the method call instead.
df = df.resample('1D').sum()
Out[497]: 
                   value
timestamp               
2011-04-18  12582.945297
2011-04-19  11045.629711
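
A likely reason the edited code in the question draws nothing, shown as a hedged sketch with a tiny made-up frame (three rounded readings from the sample data): iterating over a DataFrame yields its column labels rather than its rows, so the `len(row) == 2` check never matches and the `date` and `number` lists stay empty. In addition, `resample` returns a new object, so its result has to be assigned to be used.

```python
import pandas as pd

# A tiny frame with the same shape as the sample data (values rounded).
df = pd.DataFrame(
    {"value": [152.68, 327.58, 2397.17]},
    index=pd.to_datetime([
        "2011-04-18 16:52:00",
        "2011-04-18 17:22:00",
        "2011-04-19 00:22:00",
    ]),
)

# Iterating over a DataFrame yields column labels, not rows.
labels = [item for item in df]
print(labels)

# resample() does not modify df in place; keep the returned object.
daily = df.resample("1D").sum()
print(len(df), len(daily))
```

Assigning the result (here to `daily`) and plotting it directly avoids the manual loops over `date` and `number` altogether.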


Plot
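
The figure from the original answer is not preserved here. As a minimal sketch of how the daily totals could be plotted with matplotlib, the code below uses a few rounded readings from the sample data; the axis labels, the title, and the output file name `daily_consumption.png` are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show() instead
import matplotlib.pyplot as plt
import pandas as pd

# A few rounded readings from the sample data, spanning two days.
df = pd.DataFrame(
    {"value": [152.68, 327.58, 2397.17, 1411.67]},
    index=pd.to_datetime([
        "2011-04-18 16:52:00",
        "2011-04-18 17:22:00",
        "2011-04-19 00:22:00",
        "2011-04-19 00:52:00",
    ]),
)

# Sum the readings per calendar day.
daily = df.resample("1D").sum()

fig, ax = plt.subplots()
ax.plot(daily.index, daily["value"], marker="o")
ax.set_title("Energy consumption per day")
ax.set_xlabel("Date")
ax.set_ylabel("Total value")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("daily_consumption.png")
```

The same pattern works for the 30-minute series: replace `"1D"` with `"30min"` in the `resample` call. With the full CSV loaded through `pd.read_csv(..., parse_dates=[0], index_col=[0])`, only the construction of `df` changes.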



Hope it helps!
