How to convert RDD[(String, String)] into RDD[Array[String]]?


Problem description

I am trying to append the file name to each record in the file. I thought that if the RDD held arrays, this would be easy to do.

Some help with converting RDD type or solving this problem would be much appreciated!

With (String, String) elements:

scala> myRDD.first()(1)
<console>:24: error: (String, String) does not take parameters
       myRDD.first()(1)

With Array[String] elements:

scala> myRDD.first()(1)
res1: String = abcdefgh

My function:

import scala.util.matching.Regex

def appendKeyToValue(x: Array[Array[String]]): Unit = {
    for (i <- 0 to (x.length - 1)) {
        val key = x(i)(0)
        val pattern = new Regex("\\.")
        val key2 = pattern.replaceAllIn(key, "|")   // replace dots in the file name
        val tempvalue = x(i)(1)
        val finalval = tempvalue.split("\n")        // one entry per record
        for (ab <- 0 to (finalval.length - 1)) {
            val result = key2 + "|" + finalval(ab)  // file name prepended to one record
        }
    }
}

Solution

If you have an RDD[(String, String)], you can access the first field of the first tuple by calling:

val firstTupleField: String = myRDD.first()._1
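
The error in the question comes from indexing a tuple as if it were an array. A minimal sketch in plain Scala (no Spark needed) of the two access styles; the sample values and the object name TupleVsArray are made up for illustration:

```scala
object TupleVsArray {
  // Hypothetical sample data: the same two strings as a tuple and as an array.
  val pair: (String, String) = ("file.txt", "abcdefgh")
  val arr: Array[String] = Array("file.txt", "abcdefgh")

  // Tuple fields are accessed by name and are 1-based: _1, _2.
  val fromTuple: String = pair._2

  // Array elements are accessed by index and are 0-based: arr(0), arr(1).
  val fromArray: String = arr(1)

  // In Scala 2, pair(1) does not compile:
  // "(String, String) does not take parameters".
}
```

So the second field of an RDD element that is a tuple would be read with ._2 rather than (1).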

If you want to convert an RDD[(String, String)] into an RDD[Array[String]], you can do the following:

val arrayRDD: RDD[Array[String]] = myRDD.map(x => Array(x._1, x._2))

You can also use a pattern-matching anonymous function (a case block) to destructure the tuples:

val arrayRDD: RDD[Array[String]] = myRDD.map { case (a,b) => Array(a, b) }
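
For the underlying goal of prepending the file name to every record, converting to arrays may not be necessary at all. The following is only a sketch, assuming each tuple holds (file name, whole file content) and that records are separated by newlines, as in the question's function. The per-element logic is written as a plain function so it can be tried without Spark; the object name AppendFileName and the sample values are hypothetical:

```scala
object AppendFileName {
  // Prefix every line of `content` with the file name, using '|' as separator.
  // Dots in the file name are replaced by '|', mirroring the question's regex.
  def appendKeyToValue(fileName: String, content: String): Array[String] = {
    val key = fileName.replace(".", "|") // literal replacement, no regex needed
    content.split("\n").map(line => key + "|" + line)
  }

  // Local stand-in for one RDD element; the values are made up for illustration.
  val sample: (String, String) = ("data.txt", "row1\nrow2")
  val lines: Array[String] = appendKeyToValue(sample._1, sample._2)

  // With Spark, the same function could be applied to every element, e.g.:
  //   val result: RDD[String] =
  //     myRDD.flatMap { case (name, content) => appendKeyToValue(name, content) }
}
```

Using flatMap here yields one output element per record instead of one array per file, and unlike the loop in the question, the results are returned rather than discarded.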
