How do I handle errors in mapped functions in AWS Glue?


Problem description

I'm using the map method of DynamicFrame (or, equivalently, the Map.apply method). I've noticed that any errors in the function that I pass to these functions are silently ignored and cause the returned DynamicFrame to be empty.

Say I have a job script like this:

import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.transforms import *

glueContext = GlueContext(SparkContext.getOrCreate())
dyF = glueContext.create_dynamic_frame.from_catalog(database="radixdemo", table_name="census_csv")

def my_mapper(rec):
    import logging
    logging.error("[RADIX] An error-log from in the mapper!")
    print("[RADIX] from in the mapper!")
    raise Exception("[RADIX] A bug!")
dyF = dyF.map(my_mapper, 'my_mapper')

print("Count:  ", dyF.count())
dyF.printSchema()
dyF.toDF().show()

If I run this script in my Glue Dev Endpoint with gluepython, I get output like this:

[glue@ip-172-31-83-196 ~]$ gluepython gluejob.py
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/aws/glue/etl/jars/glue-assembly.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/05/23 20:56:46 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Count:   0
root

++
||
++
++

Notes about this output:

  • I don't see the result of the print statement or the logging.error statement.
  • There's no indication that my_mapper raised an exception.
  • The printSchema call shows that there is no schema metadata on the produced DynamicFrame.
  • The show method also isn't producing any output, indicating that all the rows are gone.

Likewise, when I save this script as a job in the AWS Glue console and run it, the job doesn't indicate that any error occurred -- the Job Status is "Succeeded". Notably, I do get the print statements and logging.error calls output to the job logs, but only in the regular "Logs", not the "Error Logs".

What I want is to be able to indicate that my job has failed, and to be able to easily find these error logs. Most important is to just indicate that it has failed.

Is there a way to log an error within a mapped function in such a way that Glue will pick it up as an "Error Log" (and put it in that separate AWS CloudWatch Logs path)? If this happens, will it automatically mark the entire Job as Failing? Or is there some other way to explicitly fail the job from within a mapped function?

(My plan, if there is a way to log errors and/or mark the job as failed, is to create a decorator or other utility function that will automatically catch exceptions in my mapped functions and ensure that they are logged and marked as a failure.)

Answer

The only way I have discovered to make a Glue job show up as "Failed" is to raise an exception from the main script (not inside a mapper or filter function, as those seem to get spun out to the Data Processing Units).

Fortunately, there is a way to detect if an exception occurred inside of a map or filter function: using the DynamicFrame.stageErrorsCount() method. It will return a number indicating how many exceptions were raised while running the most recent transformation.

So the correct way to solve all the problems:

  • Make sure your map or transform function explicitly logs any exceptions that occur inside of it. This is best done with a decorator function or some other reusable mechanism, instead of relying on putting try/except statements in every single function you write.
  • After every transformation that you want to catch errors in, call the stageErrorsCount() method and check whether it's greater than 0. If you want to abort the job, just raise an exception.

For example:

import logging

def log_errors(inner):
    # Wrap a mapper so any exception is logged (with traceback) and then
    # re-raised, so Glue still records the record as a stage error.
    def wrapper(*args, **kwargs):
        try:
            return inner(*args, **kwargs)
        except Exception:
            logging.exception('Error in function: {}'.format(inner))
            raise
    return wrapper

@log_errors
def foo(record):
    1 / 0

Then, inside your job, you'd do something like:

df = df.map(foo, "foo")
if df.stageErrorsCount() > 0:
    raise Exception("Error in job! See the log!")

Note that even calling logging.exception from inside the mapper function still doesn't write the logs to the error log in AWS CloudWatch Logs, for some reason. It gets written to the regular success logs. However, with this technique you will at least see that the job failed and be able to find the info in the logs. Another caveat: Dev Endpoints don't seem to show ANY logs from the mapper or filter functions.
