Spark - How to deal with columns that have blank space in the name
Question
I would like to know how to access an attribute of a Row whose name contains a blank space.
For example, I have this Row object:
Row(ONE CATEGORY=u'category')
How can I access the ONE CATEGORY value? Normally I would use row.oneCategory to access it, but that is not possible here because of the blank space. If possible, I would prefer suggestions in Python.
Thanks.
Recommended answer
In Python you can use the getattr function:
row = Row("ONE CATEGORY")("category")
row
## Row(ONE CATEGORY='category')
getattr(row, u"ONE CATEGORY")
## 'category'
or the Row.asDict method:
row.asDict()["ONE CATEGORY"]
## 'category'
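To see why getattr succeeds where dot syntax cannot, here is a minimal sketch that runs without Spark. It uses a plain types.SimpleNamespace as a stand-in for Row (an assumption made only for illustration; it is not Spark's Row class), and vars() plays the role that Row.asDict plays above.

```python
from types import SimpleNamespace

# Stand-in for a Row (assumption: illustration only, not pyspark.sql.Row).
row_like = SimpleNamespace()
setattr(row_like, "ONE CATEGORY", "category")

# Dot syntax requires a valid Python identifier; a name with a space is not one.
is_valid_identifier = "ONE CATEGORY".isidentifier()

# getattr receives the name as an ordinary string, so the space is harmless.
value_by_name = getattr(row_like, "ONE CATEGORY")

# A dict keyed by field name works for the same reason.
value_from_dict = vars(row_like)["ONE CATEGORY"]

print(is_valid_identifier, value_by_name, value_from_dict)
```

The same reasoning applies to a real Row: any lookup that takes the field name as a string is unaffected by the space.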
Since you cannot use dot syntax on a Row in Scala, this is not really an issue there, but if you want to access fields by name you can use Row.getAs:
val row = sc.parallelize(Tuple1("category") :: Nil).toDF("ONE CATEGORY").first
row.getAs[String]("ONE CATEGORY")
or Row.getValuesMap:
row.getValuesMap[String](Seq("ONE CATEGORY"))("ONE CATEGORY")
In both Python and Scala, you can access values by index:
row[0]
## 'category'
row(0)
// Any = category
row.getString(0)
// String = category
Finally, you can use the alias method during select to avoid the issue completely:
df.select(col("ONE CATEGORY").alias("ONE_CATEGORY"))
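When many columns contain spaces, aliasing them one by one gets tedious. The sketch below shows one possible way to compute replacement names in plain Python. The helpers sanitize_column_name and sanitize_column_names are hypothetical names introduced here, and the underscore convention is an assumption rather than anything Spark requires.

```python
import re

def sanitize_column_name(name):
    """Hypothetical helper: trim the name and turn each whitespace run into one underscore."""
    return re.sub(r"\s+", "_", name.strip())

def sanitize_column_names(names):
    """Sanitize a list of names, refusing to continue if two names would collide."""
    cleaned = [sanitize_column_name(n) for n in names]
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("sanitized column names collide: %r" % (cleaned,))
    return cleaned

# Example column list (made up for illustration).
new_names = sanitize_column_names(["ONE CATEGORY", "id", "two  words "])
print(new_names)
```

With a PySpark DataFrame, the resulting list could then be applied in one step with df.toDF(*sanitize_column_names(df.columns)), after which every column can be reached with ordinary dot syntax.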