Spark/Scala flatten and flatMap is not working on DataFrame
Problem description
I have a DataFrame containing three DataFrames of the same type (same parquet schema). They differ only in the content/values they contain:
I want to flatten the structure so that the three DataFrames are merged into a single Parquet DataFrame containing all of the content/values.
I tried it with flatten and flatMap, but with that I always get the error:
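Error: No implicit view available from org.apache.spark.sql.DataFrame => Traversable[U]. parquetFiles.flatten
Error: not enough arguments for method flatten: (implicit asTrav: org.apache.spark.sql.DataFrame => Traversable[U], implicit m: scala.reflect.ClassTag[U]). Unspecified value parameters asTrav, m. parquetFiles.flatten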
I also converted it to a List and then tried to flatten it, which produces the same error. Do you have any idea how to solve this, or what the problem is? Thanks, Alex
Recommended answer
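So it seems like you want to combine these three DataFrames into one. flatten fails here because it expects a collection of collections, and a DataFrame is not a Traversable, which is what the missing implicit view in the error message refers to. The unionAll function works well for this. You could do parquetFiles.reduce((x, y) => x.unionAll(y)). Note that reduce throws on an empty list; if that can happen, use one of the folds instead of reduce.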
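A minimal sketch of the approach above, assuming Spark 2.0 or later (where union behaves like unionAll) and a local SparkSession. The three small in-memory DataFrames are hypothetical stand-ins for the parquet-backed ones, and unionAllFrames is an illustrative helper name, not part of Spark. It uses reduceOption rather than a fold to cover the empty-list case, so an empty input yields None instead of an exception.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UnionDataFrames {

  // Hypothetical helper: unions all frames pairwise.
  // reduceOption returns None for an empty collection instead of throwing,
  // which is the case a plain reduce does not handle.
  def unionAllFrames(frames: Seq[DataFrame]): Option[DataFrame] =
    frames.reduceOption((x, y) => x.union(y))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this illustration.
    val spark = SparkSession.builder()
      .appName("union-dataframes")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Assumption: stand-ins for the three DataFrames read from parquet,
    // all sharing the same schema.
    val parquetFiles: Seq[DataFrame] = Seq(
      Seq((1, "a")).toDF("id", "value"),
      Seq((2, "b")).toDF("id", "value"),
      Seq((3, "c")).toDF("id", "value")
    )

    // Print the merged DataFrame if there was at least one input frame.
    unionAllFrames(parquetFiles).foreach(_.show())

    spark.stop()
  }
}
```

union matches columns by position rather than by name, so this relies on all the frames sharing the same schema, as they do in the question.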