How to suppress INFO messages for spark-sql running on EMR?
Question
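I'm running Spark on EMR as described in "Run Spark and Spark SQL on Amazon Elastic MapReduce":
"This tutorial walks you through installing and operating Spark, a fast and general engine for large-scale data processing, on an Amazon EMR cluster. You will also create and query a dataset in Amazon S3 using Spark SQL, and learn how to monitor Spark on an Amazon EMR cluster with Amazon CloudWatch."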
I'm trying to suppress the INFO logs by editing $HOME/spark/conf/log4j.properties, to no avail.
The output looks like this:
$ ./spark/bin/spark-sql
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/spark-1.1.1.e/lib/spark-assembly-1.1.1-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2014-12-14 20:59:01,819 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1009)) - mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2014-12-14 20:59:01,825 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1009)) - mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
2014-12-14 20:59:01,825 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1009)) - mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
2014-12-14 20:59:01,825 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1009)) - mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
How can I suppress the INFO messages above?
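If you know you want to suppress logging on a new EMR cluster, you can also just add a configuration option when you create the cluster.
EMR accepts configuration options as JSON, which you can enter directly into the AWS console or pass in via a file when using the CLI.
In this case, to change the log level to WARN, here's the JSON: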
[
{
"classification": "spark-log4j",
"properties": {"log4j.rootCategory": "WARN, console"}
}
]
In the console, you'd add this in the first step of cluster creation.
Or, if you're creating the cluster using the CLI:
aws emr create-cluster <options here> --configurations file://config_file.json
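If you'd rather generate the configuration file than write it by hand, the following is a minimal sketch in Python. It assumes the same spark-log4j classification and log4j.rootCategory property shown above. The helper names spark_log4j_config and write_config are hypothetical and are not part of any AWS tooling. The file:// prefix is how the AWS CLI is told to read a parameter value from a local file.

```python
import json
from pathlib import Path


def spark_log4j_config(level="WARN"):
    """Return the EMR configuration list that sets Spark's root log level.

    Hypothetical helper: it only mirrors the JSON shown in the answer.
    """
    return [
        {
            "classification": "spark-log4j",
            "properties": {"log4j.rootCategory": f"{level}, console"},
        }
    ]


def write_config(path, level="WARN"):
    """Write the configuration JSON to `path` and return a file:// argument.

    The returned string is meant to be passed to --configurations.
    """
    path = Path(path)
    path.write_text(json.dumps(spark_log4j_config(level), indent=2))
    return f"file://{path}"
```

Calling write_config("config_file.json") writes the file and returns the string file://config_file.json, which you can append after --configurations in the create-cluster command.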
You can read more in the EMR documentation.