如何使用Matplotlib绘制pyspark SQL结果 [英] How to use matplotlib to plot pyspark sql results

查看:395
本文介绍了如何使用Matplotlib绘制pyspark SQL结果的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是pyspark的新手.我想使用matplotlib绘制结果,但不确定要使用哪个函数.我搜索了一种将sql结果转换为pandas然后使用plot的方法.

I am new to pyspark. I want to plot the result using matplotlib, but not sure which function to use. I searched for a way to convert sql result to pandas and then use plot.

推荐答案

您好,我已经找到解决方案.我将sql数据框转换为pandas数据框,然后能够绘制图形.以下是示例代码.来自

Hi Team I have found the solution for this. I converted sql dataframe to pandas dataframe and then I was able to plot the graphs. below is the sample code.from

pyspark.sql import Row
from pyspark.sql import HiveContext
import pyspark
from IPython.display import display
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline 
sc = pyspark.SparkContext()
sqlContext = HiveContext(sc)
test_list = [(1, 'hasan'),(2, 'nana'),(3, 'dad'),(4, 'mon')]
rdd = sc.parallelize(test_list)
people = rdd.map(lambda x: Row(id=int(x[0]), name=x[1]))
schemaPeople = sqlContext.createDataFrame(people)
# Register it as a temp table
sqlContext.registerDataFrameAsTable(schemaPeople, "test_table")
df1=sqlContext.sql("Select * from test_table")
pdf1=df1.toPandas()
pdf1.plot(kind='barh',x='name',y='id',colormap='winter_r')

这篇关于如何使用Matplotlib绘制pyspark SQL结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆