Reading files from Apache Spark textFileStream
Question
I'm trying to read/monitor txt files from a Hadoop file system directory, but I've noticed that all the txt files inside this directory are themselves directories, as shown in the example below:
/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt/_SUCCESS
/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt/part-00000
/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt/part-00001
I'd like to read all the data inside the part files. I'm trying the following code, as shown in this snippet:
val testData = ssc.textFileStream("/crawlerOutput/*/*")
But, unfortunately, it says that /crawlerOutput/*/* doesn't exist. Doesn't textFileStream accept wildcards? What should I do to solve this problem?
Accepted answer
textFileStream() is just a wrapper for fileStream() and does not support subdirectories (see https://spark.apache.org/docs/1.3.0/streaming-programming-guide.html).
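Since wildcards won't work, one way around this is to create one stream per known subdirectory and union them. A minimal sketch, assuming the subdirectory names are known up front (the directory list, app name, and batch interval here are illustrative, not from the original question):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MonitorKnownDirs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("crawler-output-monitor")
    val ssc  = new StreamingContext(conf, Seconds(30))

    // Each entry is one flat directory that textFileStream can watch.
    val dirs = Seq(
      "/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt"
    )

    // One stream per directory, unioned into a single DStream of lines.
    val streams  = dirs.map(ssc.textFileStream)
    val testData = ssc.union(streams)

    testData.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The obvious limitation is that the directory list is fixed at context creation time, which is why detecting new directories requires restarting the context, as described below.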
You would need to list the specific directories to monitor. If you need to detect new directories, a StreamingListener could be used to check for them, then stop the streaming context and restart it with the new values.
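A rough sketch of that listener idea, assuming Hadoop's FileSystem API is available on the driver (the class name, flag, and driver loop are hypothetical scaffolding): after each batch it re-lists /crawlerOutput, and when a subdirectory appears that isn't yet being monitored, it sets a flag so the driver can stop the context and rebuild it with the updated directory list.

```scala
import java.util.concurrent.atomic.AtomicBoolean

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Registered via ssc.addStreamingListener(...). `known` is the set of
// directories the current context already monitors.
class NewDirListener(fs: FileSystem, known: Set[String], restart: AtomicBoolean)
    extends StreamingListener {

  override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
    val current = fs.listStatus(new Path("/crawlerOutput"))
      .filter(_.isDirectory)
      .map(_.getPath.toString)
      .toSet
    // Any directory we haven't seen yet triggers a restart request;
    // the driver loop checks this flag, stops the context gracefully,
    // and builds a new one over the enlarged directory list.
    if ((current -- known).nonEmpty) restart.set(true)
  }
}
```

Note that stopping and restarting a StreamingContext is relatively heavyweight, so this only makes sense when new directories appear infrequently.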
Just thinking out loud: if you intend to process each subdirectory once and just want to detect these new directories, you could potentially key off another location that contains job info or a file token; once present, it could be consumed in the streaming context, and the appropriate textFile() could be called to ingest the new path.
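The token-file idea could look something like this sketch, assuming the crawler writes a small token file to a hypothetical flat directory (/crawlerOutput/tokens here) containing the path of each finished output directory. The streaming job watches only that directory; inside foreachRDD, which runs on the driver, each reported path is then read in full with textFile():

```scala
// Watch the flat token directory; each new token file contains one or
// more lines, each naming a finished crawler output directory.
val tokens = ssc.textFileStream("/crawlerOutput/tokens")

tokens.foreachRDD { rdd =>
  // collect() is safe here because token files are tiny (just paths).
  val newPaths = rdd.collect()
  newPaths.foreach { path =>
    // textFile on a directory reads all part-* files beneath it.
    val data = rdd.sparkContext.textFile(path)
    // Process `data` here, e.g. count it or write it elsewhere.
  }
}
```

This avoids restarting the context entirely: only the token directory needs to be flat, and arbitrary new subdirectories can be ingested as they are announced.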