What is the best data format for a time series in a Python Visualization in Power BI?


Problem description

As of today, August 9 2018, Power BI supports Python Visualizations. They've had support for R Visualizations before, but I still find these integrations to be a bit awkward. Let me show you what I mean:


Let's say that you have a table with time series data, where the top row contains the names 'Date' and 'Value', and the contents are dates of the form yyyy-mm-dd and a number, respectively:

Date,Value
2017-01-12,1
2017-01-13,4
2017-01-14,2
2017-01-15,4
2017-01-16,2
2017-01-17,2
2017-01-18,2
2017-01-19,5
2017-01-20,5
2017-01-21,5
2017-01-22,5
2017-01-23,6
2017-01-24,3
2017-01-25,6
2017-01-26,6
2017-01-27,5
2017-01-28,8
2017-01-29,4
2017-01-30,2
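
To experiment with the same data outside Power BI, the sample can be loaded into a pandas DataFrame. This is only a sketch: inside Power BI the frame is supplied as `dataset`, and the name `df` and the inline `io.StringIO` copy of the first few rows are assumptions made for illustration.

```python
import io

import pandas as pd

# First rows of the sample above, inlined so the snippet runs on its own.
csv_text = """Date,Value
2017-01-12,1
2017-01-13,4
2017-01-14,2
2017-01-15,4
"""

# parse_dates converts the yyyy-mm-dd strings into datetime values.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Inspect the column types: Date should be a datetime type, Value an integer.
print(df.dtypes)
```

Checking the dtypes first makes it easier to tell a data format problem from a plotting problem.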

If you store that dataset as a text file like timerseries.csv and import it using Get Data | Text/CSV, you get a table under VISUALIZATIONS | FIELDS, like this:

You can inspect your table using VISUALIZATIONS | Table and get:

With this setup, you would think you were all set to unleash the power of a Py VISUALIZATION using this beautiful new feature:

If you click that, you get this:

And you're told to

Drag fields into the Values area in the Visualization pane to start scripting

If you start with Value, you get this default setup in the editor:

And if you follow the instructions given by the Power BI team in the August 2018 feature summary you should be able to make a matplotlib plot quite easily.

But this is where it ends for me at the time being.

If the default dataframe in the editor shares the features of a standard dataframe, you should be able to reference a column in that dataframe and easily make a plot with this snippet:

import matplotlib.pyplot as plt
plt.plot(dataset['Value'])
plt.show()

But when you run it, it only returns an error:

And the details are elaborate to say the least.

I've also tried to import both Dates and Values, and I've tried plotting the dataframe directly with dataset.plot(), but nothing seems to be working. I've also tried stripping the date hierarchy down to simple dates this way:

So, any ideas on the data format, the import method and/or the snippet?

Thank you for any suggestions!

EDIT 1 - Following the answer from Foxan Ng:

Add both columns in the Value field:

This still returns an error ending with:

TypeError: from_bounds() takes 4 positional arguments but 6 were given

Solution

I didn't encounter the errors you mentioned. Have you dropped both columns into Values?

import matplotlib.pyplot as plt
plt.plot(dataset['Date'], dataset['Value'])
plt.show()
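
If the Date column reaches the script as text rather than as real dates, converting it before plotting keeps the x-axis in chronological order. The following is a minimal sketch, not a confirmed fix for the error above: the hand-built `dataset` is a stand-in for the frame Power BI supplies, and the Agg backend is an assumption that lets the snippet run outside Power BI.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run outside Power BI

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the DataFrame Power BI passes in as `dataset`
# (rows deliberately out of order, dates stored as text).
dataset = pd.DataFrame({
    "Date": ["2017-01-14", "2017-01-12", "2017-01-13"],
    "Value": [2, 1, 4],
})

# Convert the text to datetime and sort so the line is drawn in time order.
dataset["Date"] = pd.to_datetime(dataset["Date"])
dataset = dataset.sort_values("Date")

fig, ax = plt.subplots()
ax.plot(dataset["Date"], dataset["Value"])
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
plt.show()
```

Inside Power BI the stand-in frame and the `matplotlib.use("Agg")` line would be dropped, since `dataset` is already defined by the visual.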


UPDATED with M query:

let
    Source = Csv.Document(File.Contents("C:\your-directory..\timerseries.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type date}, {"Value", Int64.Type}})
in
    #"Changed Type"
