Spark 1.5.2 and SLF4J StaticLoggerBinder


Problem Description

    While this doesn't stop my code from functioning, I am going insane just trying to understand why this warning occurs. I'm using Scala 2.11.7, ScalaIDE, SBT 0.13.9.

    15/11/20 12:17:05 INFO akka.event.slf4j.Slf4jLogger: Slf4jLogger started
    15/11/20 12:17:06 INFO Remoting: Starting remoting
    15/11/20 12:17:06 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@0.0.0.0:36509]
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    
    [Stage 0:=======================================================> (31 + 1) / 32]
    [Stage 0:=========================================================(32 + 0) / 32]
    

    Now I understand why this error normally occurs, but the problem is that I haven't messed with Spark's logging at all. Now if I add, say, slf4j-simple to my project, it complains of multiple SLF4J bindings, but not this warning. I cannot for the life of me figure out a way to make both of these things play nice. My code itself is using log4j 2.4 for my own logging.

    I have attempted the following, to no avail (a rough build.sbt sketch of these attempts follows the list):

    1. Excluding Spark's Logging and including my own.
    2. Using log4j2 to route the SLF4j calls to log4j2 and excluding Spark's SLF4j
    3. Including literally every SLF4j binding in an attempt to make one pick it up.
    4. Adding the SLF4J jars to my classpath, and to Spark's driver and executor classpaths
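    Roughly, in build.sbt terms, those attempts looked something like this; the coordinates, versions, and spark-submit flags below are best-guess illustrations, not a verified configuration:

    libraryDependencies ++= Seq(
      // Attempts 1 and 2: pull in spark-core but drop its SLF4J -> log4j 1.x binding.
      ("org.apache.spark" %% "spark-core" % "1.5.2")
        .exclude("org.slf4j", "slf4j-log4j12"),
      // Attempt 2: route SLF4J calls to log4j 2.x instead (log4j-slf4j-impl is
      // the SLF4J binding for log4j 2; versions here are illustrative).
      "org.apache.logging.log4j" % "log4j-api"        % "2.4.1",
      "org.apache.logging.log4j" % "log4j-core"       % "2.4.1",
      "org.apache.logging.log4j" % "log4j-slf4j-impl" % "2.4.1"
    )

    // Attempt 4: hand the binding jar to the driver and executors explicitly,
    // e.g. with spark-submit:
    //   --conf spark.driver.extraClassPath=/path/to/log4j-slf4j-impl-2.4.1.jar
    //   --conf spark.executor.extraClassPath=/path/to/log4j-slf4j-impl-2.4.1.jar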

    If I try to exclude Spark's logging I get ClassNotFound issues from Spark, but for the life of me I can't figure out what the hell is doing this.

    Just some more details: I am using Spark, but I am excluding its bundled Hadoop and including my own version of Hadoop (2.7.1).
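    For completeness, that Hadoop swap looks roughly like this in build.sbt (the exclusion rule and versions are illustrative assumptions):

    libraryDependencies ++= Seq(
      ("org.apache.spark" %% "spark-core" % "1.5.2")
        .exclude("org.apache.hadoop", "hadoop-client"),
      "org.apache.hadoop" % "hadoop-client" % "2.7.1"
    )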

    Here are the jars I think are relevant that are present on the system classloader.

    ~/.ivy2/cache/org.slf4j/slf4j-api/jars/slf4j-api-1.7.10.jar
    ~/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.10.jar
    ~/.ivy2/cache/log4j/log4j/bundles/log4j-1.2.17.jar
    ~/.ivy2/cache/org.slf4j/jul-to-slf4j/jars/jul-to-slf4j-1.7.10.jar
    ~/.ivy2/cache/org.slf4j/jcl-over-slf4j/jars/jcl-over-slf4j-1.7.10.jar
    ~/.ivy2/cache/com.typesafe.akka/akka-slf4j_2.11/jars/akka-slf4j_2.11-2.3.11.jar
    ~/.ivy2/cache/org.apache.logging.log4j/log4j-api/jars/log4j-api-2.4.1.jar
    ~/.ivy2/cache/org.apache.logging.log4j/log4j-core/jars/log4j-core-2.4.1.jar
    ~/.ivy2/cache/com.typesafe.akka/akka-slf4j_2.11/jars/akka-slf4j_2.11-2.4.0.jar
    

    Does anyone have any insight into this? I appreciate it.

    log4j: Trying to find [log4j.xml] using context classloader sun.misc.Launcher$AppClassLoader@42a57993.
    log4j: Trying to find [log4j.xml] using sun.misc.Launcher$AppClassLoader@42a57993 class loader.
    log4j: Trying to find [log4j.xml] using ClassLoader.getSystemResource().
    log4j: Trying to find [log4j.properties] using context classloader sun.misc.Launcher$AppClassLoader@42a57993.
    log4j: Using URL [file:/home/scarman/workspace-scala/Ingestions/ingestion/bin/log4j.properties] for automatic log4j configuration.
    log4j: Reading configuration from URL file:/home/scarman/workspace-scala/Ingestions/ingestion/bin/log4j.properties
    log4j: Parsing for [root] with value=[INFO, console].
    log4j: Level token is [INFO].
    log4j: Category root set to INFO
    log4j: Parsing appender named "console".
    log4j: Parsing layout options for "console".
    log4j: Setting property [conversionPattern] to [%d{yy/MM/dd HH:mm:ss} %p %c: %m%n].
    log4j: End of parsing for "console".
    log4j: Setting property [target] to [System.err].
    log4j: Parsed "console" options.
    log4j: Parsing for [org.spark-project.jetty] with value=[WARN].
    log4j: Level token is [WARN].
    log4j: Category org.spark-project.jetty set to WARN
    log4j: Handling log4j.additivity.org.spark-project.jetty=[null]
    log4j: Parsing for [org.spark-project.jetty.util.component.AbstractLifeCycle] with value=[ERROR].
    log4j: Level token is [ERROR].
    log4j: Category org.spark-project.jetty.util.component.AbstractLifeCycle set to ERROR
    log4j: Handling log4j.additivity.org.spark-project.jetty.util.component.AbstractLifeCycle=[null]
    log4j: Parsing for [org.apache.spark] with value=[WARN].
    log4j: Level token is [WARN].
    log4j: Category org.apache.spark set to WARN
    log4j: Handling log4j.additivity.org.apache.spark=[null]
    log4j: Parsing for [org.apache.hadoop.hive.metastore.RetryingHMSHandler] with value=[FATAL].
    log4j: Level token is [FATAL].
    log4j: Category org.apache.hadoop.hive.metastore.RetryingHMSHandler set to FATAL
    log4j: Handling log4j.additivity.org.apache.hadoop.hive.metastore.RetryingHMSHandler=[null]
    log4j: Parsing for [parquet] with value=[INFO].
    log4j: Level token is [INFO].
    log4j: Category parquet set to INFO
    log4j: Handling log4j.additivity.parquet=[null]
    log4j: Parsing for [org.apache.hadoop] with value=[WARN].
    log4j: Level token is [WARN].
    log4j: Category org.apache.hadoop set to WARN
    log4j: Handling log4j.additivity.org.apache.hadoop=[null]
    log4j: Parsing for [org.apache.spark.repl.SparkILoop$SparkILoopInterpreter] with value=[INFO].
    log4j: Level token is [INFO].
    log4j: Category org.apache.spark.repl.SparkILoop$SparkILoopInterpreter set to INFO
    log4j: Handling log4j.additivity.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=[null]
    log4j: Parsing for [org.apache.spark.repl.SparkIMain$exprTyper] with value=[INFO].
    log4j: Level token is [INFO].
    log4j: Category org.apache.spark.repl.SparkIMain$exprTyper set to INFO
    log4j: Handling log4j.additivity.org.apache.spark.repl.SparkIMain$exprTyper=[null]
    log4j: Parsing for [org.apache.parquet] with value=[ERROR].
    log4j: Level token is [ERROR].
    log4j: Category org.apache.parquet set to ERROR
    log4j: Handling log4j.additivity.org.apache.parquet=[null]
    log4j: Parsing for [org.apache.hadoop.hive.ql.exec.FunctionRegistry] with value=[ERROR].
    log4j: Level token is [ERROR].
    log4j: Category org.apache.hadoop.hive.ql.exec.FunctionRegistry set to ERROR
    log4j: Handling log4j.additivity.org.apache.hadoop.hive.ql.exec.FunctionRegistry=[null]
    log4j: Finished configuring
    

    Adding the class bindings that SLF4J locates when loading...

    jar:file:/home/scarman/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/Log4jLoggerFactory.class
    org.slf4j.impl.Log4jLoggerFactory@7cef4e59
    org.slf4j.impl.Log4jLoggerFactory
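    A listing like the one above can be produced with a small check along these lines (a sketch, not the exact code used here):

    import scala.collection.JavaConverters._

    object Slf4jBindingCheck {
      def main(args: Array[String]): Unit = {
        val cl = Thread.currentThread().getContextClassLoader
        // Every copy of the SLF4J binder class visible to this classloader.
        cl.getResources("org/slf4j/impl/StaticLoggerBinder.class")
          .asScala.foreach(println)
        // The logger factory SLF4J actually bound to (a NOPLoggerFactory means
        // the StaticLoggerBinder lookup failed).
        val factory = org.slf4j.LoggerFactory.getILoggerFactory
        println(factory)
        println(factory.getClass.getName)
      }
    }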
    

    Solution

    Update: This still applies to Spark 1.6.1

    Just a follow-up and answer to this in case anyone else was wondering. So I noticed that this warning was only happening during the use of Spark's Parquet interfaces. I tested this out to confirm it and also found someone had already written about it in SPARK-10057. The problem with that issue was that the other developers could not replicate it, but in all fairness the original reporter was rather vague when describing the problem.

    Either way, I decided to track it down for no reason other than to satisfy my OCD-ness about these issues.

    So I tested using files both in S3 and on my local disk. Text and JSON files did not trigger this warning, but Parquet usage triggered it whether the files were local or in S3. This was the case for both reading and writing Parquet files. Looking at ParquetRelation.scala, we see the only reference to SLF4J there.

      // Parquet initializes its own JUL logger in a static block which always prints to stdout.  Here
      // we redirect the JUL logger via SLF4J JUL bridge handler.
      val redirectParquetLogsViaSLF4J: Unit = {
        def redirect(logger: JLogger): Unit = {
          logger.getHandlers.foreach(logger.removeHandler)
          logger.setUseParentHandlers(false)
          logger.addHandler(new SLF4JBridgeHandler)
        }
    

    So it seems reasonable to me to assert that the bridge between Parquet's JUL logging and SLF4J is causing this warning to appear. I suppose it initializes the bridge, and something happens where it can't load the proper StaticLoggerBinder. I'd have to dig a bit more into Spark's code and test to find out, but that's at least what is causing it. I'll try to get a fix together if time permits.
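    For what it's worth, the effect can be sketched outside Spark with only slf4j-api and jul-to-slf4j on the classpath and no binding jar; this is an illustration of the mechanism, not Spark's actual code path:

    import java.util.logging.{Logger => JLogger}
    import org.slf4j.bridge.SLF4JBridgeHandler

    object JulBridgeSketch {
      def main(args: Array[String]): Unit = {
        val julLogger = JLogger.getLogger("org.apache.parquet")
        // Same shape as redirect() above: strip JUL's own handlers and route
        // records through the SLF4J bridge instead.
        julLogger.getHandlers.foreach(julLogger.removeHandler)
        julLogger.setUseParentHandlers(false)
        julLogger.addHandler(new SLF4JBridgeHandler)
        // The first record makes the bridge ask SLF4J for a logger; with no
        // StaticLoggerBinder on the classpath, SLF4J falls back to its NOP
        // logger and prints the warning seen here.
        julLogger.info("hello from JUL")
      }
    }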

    Finally, here is a code sample for local reproduction of the warning.

    scala> sc.setLogLevel("WARN")
    
    scala> val d = sc.parallelize(Array[Int](1,2,3,4,5))
    d: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:21
    
    scala> val ddf = d.toDF()
    ddf: org.apache.spark.sql.DataFrame = [_1: int]
    
    scala> ddf.write.parquet("/home/scarman/data/test.parquet")
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    
