向pyspark Dataframe添加新行 [英] Add new rows to pyspark Dataframe
本文介绍了向pyspark Dataframe添加新行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
是非常新的pyspark,但对熊猫很熟悉. 我有一个pyspark数据框
Am very new pyspark but familiar with pandas. I have a pyspark Dataframe
# instantiate Spark
spark = SparkSession.builder.getOrCreate()
# make some test data
columns = ['id', 'dogs', 'cats']
vals = [
(1, 2, 0),
(2, 0, 1)
]
# create DataFrame
df = spark.createDataFrame(vals, columns)
希望添加新的行(4,5,7),以便输出:
wanted to add new Row (4,5,7) so it will output:
df.show()
+---+----+----+
| id|dogs|cats|
+---+----+----+
| 1| 2| 0|
| 2| 0| 1|
| 4| 5| 7|
+---+----+----+
推荐答案
由于 thebluephantom 已经说工会是要走的路.我只是在回答您的问题,为您提供pyspark示例:
As thebluephantom has already said union is the way to go. I'm just answering your question to give you a pyspark example:
# if not already created automatically, instantiate Sparkcontext
spark = SparkSession.builder.getOrCreate()
columns = ['id', 'dogs', 'cats']
vals = [(1, 2, 0), (2, 0, 1)]
df = spark.createDataFrame(vals, columns)
newRow = spark.createDataFrame([(4,5,7)], columns)
appended = df.union(newRow)
appended.show()
也请查看databricks常见问题解答: https://kb.databricks.com/data/append-a-row-to-rdd-or-dataframe.html
Please have also a lookat the databricks FAQ: https://kb.databricks.com/data/append-a-row-to-rdd-or-dataframe.html
这篇关于向pyspark Dataframe添加新行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文