How does Spark read a file whose name begins with an underscore?

Views: 389 | Published: 2020/9/4 4:31:00 | Tags: scala, apache-spark


Problem description

When I use Spark to parse log files, I notice that if the first character of the file name is _, the result is empty. Here is my test code:

SparkSession spark = SparkSession
  .builder()
  .appName("TestLog")
  .master("local")
  .getOrCreate();
JavaRDD<String> input = spark.read().text("D:\\_event_2.log").javaRDD();
System.out.println("size : " + input.count());

If I rename the file to event_2.log, the code runs correctly. I found that the text function is defined as:

@scala.annotation.varargs
def text(paths: String*): Dataset[String] = {
  format("text").load(paths : _*).as[String](sparkSession.implicits.newStringEncoder)
}

I think this could be because _ is Scala's placeholder. How can I avoid this problem?
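
One possible explanation, offered as an assumption and not something the question itself confirms: the underscore in "D:\\_event_2.log" is only a character inside a Java string, so Scala's placeholder syntax should not be involved. Hadoop-style input listing, which Spark builds on, commonly treats file names starting with _ or . as hidden and skips them, which would produce an empty result. Under that assumption, a minimal sketch of the rule and of a copy-to-a-visible-name workaround follows. It uses only the JDK, and HiddenNameWorkaround, looksHidden, visibleName, and visibleCopy are hypothetical names, not Spark or Hadoop APIs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of Spark or Hadoop.
final class HiddenNameWorkaround {

    // Assumed rule: names starting with '_' or '.' are treated as hidden
    // by the input listing and are skipped.
    static boolean looksHidden(String fileName) {
        return fileName.startsWith("_") || fileName.startsWith(".");
    }

    // Strips leading '_' and '.' characters so the name is no longer skipped.
    static String visibleName(String fileName) {
        String stripped = fileName.replaceFirst("^[_.]+", "");
        return stripped.isEmpty() ? "renamed_input" : stripped;
    }

    // Copies a hidden-looking file into targetDir under a visible name and
    // returns the path to read; returns the source unchanged otherwise.
    static Path visibleCopy(Path source, Path targetDir) throws IOException {
        String name = source.getFileName().toString();
        if (!looksHidden(name)) {
            return source;
        }
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(visibleName(name));
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Builds a throwaway file so the sketch runs without the original log.
        Path dir = Files.createTempDirectory("spark-input");
        Path hidden = Files.write(dir.resolve("_event_2.log"), "a\nb\n".getBytes());
        Path readable = visibleCopy(hidden, dir.resolve("visible"));
        System.out.println("read this path instead: " + readable);
    }
}
```

The returned path's toString() value would then be passed to spark.read().text(...) in place of the original path. Renaming the file directly, as already tried in the question, has the same effect without the extra copy.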
