Running Untrusted Python Code in AWS-LAMBDA using Exec or Eval


I've been doing a ton of research. I am a mere padawan; however, I have a project where I must run a user's untrusted Python 3 code from a website.

I also apologize in advance if this question has some moving parts.

• I am looking for an approach that is as safe as possible. This doesn't need to be 100% perfect unless there is a big risk of leaking extremely sensitive data.

Main questions:

• Does my AWS Lambda plan run an extreme risk of leaking sensitive data?
• Are there any other simple precautions I should take that would make this work safer in AWS Lambda?
• Are there ways for exec() to break out of the AWS Lambda container and make other network connections, if all I have connected to it is a single AWS API Gateway for the REST call?
• Do I even need to limit __builtins__ and locals, or are AWS Lambda containers safe enough?

Background

It seems most companies use Kubernetes and Docker containers to execute untrusted Python code (such as Leetcode, Programiz, or HackerRank).

See these helpful links:

My Plan

I am thinking that I can POST the user's arbitrary Python code to an AWS Lambda function as a microservice, using their containerization/scaling rather than building my own. In the Lambda container, I can just run the code through a simple exec or eval call, perhaps with some limitation like this:

    from math import *  # makes the math names in safe_list visible to locals() below

    safe_list = ['math', 'acos', 'asin', 'atan', 'print', 'atan2', 'ceil', 'cos',
                 'cosh', 'degrees', 'e', 'exp', 'fabs', 'floor', 'fmod', 'frexp',
                 'hypot', 'ldexp', 'log', 'log10', 'modf', 'pi', 'pow', 'radians',
                 'sin', 'sinh', 'sqrt', 'tan', 'tanh']
    # expose only the whitelisted names to the executed code
    safe_dict = dict([(k, locals().get(k, None)) for k in safe_list])
    safe_dict['abs'] = abs
    exec(userCode, {"__builtins__": None}, safe_dict)

Special Note:

• I am not too concerned about infinite loops or crashing things, because I will just time out and tell the user to try again.
• All I need to do is run pretty simple Python code (generally less than a few lines), return exceptions, stdout, and prints, and run a check on the result. It needs to handle:
  • Math operators, lists, loops, lambda functions, maps, filters, declaring methods, declaring classes with properties, and print.
• This doesn't need to be a perfect project for hundreds of thousands of users. I just want a live site as a resume booster and maybe to make a little money on ads to help with costs.
• If there are severe limitations, I can eventually implement it in Kubernetes (as in the above link), but hopefully this solution will work well enough.
• I just want this to work relatively well and not take too long to build or cost too much money.
• I do not want to leak any sensitive information.

Security things I am already planning on doing:

• AWS Lambda: limit the timeout to around 1-2 seconds.
• AWS Lambda: limit the memory usage to 128 MB.
• My own code: use regex to make sure no one is passing in double underscores badstuff.
• Keep this microservice as minimal as possible (only connecting a single AWS API Gateway).
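
Putting the plan together, here is a minimal sketch of what the Lambda side could look like, assuming the user's code arrives via the API Gateway proxy integration in a JSON body field named code, and that the function returns captured stdout plus any traceback as JSON (the field names and response shape are my assumptions, not anything fixed by AWS):

    import json
    import traceback
    import contextlib
    from io import StringIO

    def lambda_handler(event, context):
        # With the API Gateway proxy integration, the POSTed body arrives as a string.
        user_code = json.loads(event.get("body") or "{}").get("code", "")

        buf = StringIO()
        error = None
        try:
            # Capture the user's print() output; the restricted safe_dict from the
            # snippet above could be passed to exec() here instead of plain dicts.
            with contextlib.redirect_stdout(buf):
                exec(user_code, {}, {})
        except Exception:
            error = traceback.format_exc()

        # The 1-2 second timeout and 128 MB memory cap live in the function's
        # configuration, not in this code.
        return {
            "statusCode": 200,
            "body": json.dumps({"stdout": buf.getvalue(), "error": error}),
        }

API Gateway would route a single POST to this handler; per the points above, the function's timeout, memory cap, and IAM role are where the real limits live, not in the Python code itself.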

Other notes:

• I don't think I can use RestrictedPython or PyPy's sandbox feature in AWS Lambda, because I don't have access to those dependencies out of the box. I'm hoping they are not necessary for this use case.
• If it's impossible to do this with exec(), are there safe Python interpreters on GitHub or someplace that I could literally copy-paste into files in AWS Lambda and just call?
• I am planning on allowing the user to print from exec with something like this:

    import contextlib
    import sys
    from io import StringIO

    @contextlib.contextmanager
    def stdoutIO(stdout=None):
        # Temporarily swap sys.stdout so print() output from exec'd code is captured.
        old = sys.stdout
        if stdout is None:
            stdout = StringIO()
        sys.stdout = stdout
        try:
            yield stdout
        finally:
            sys.stdout = old

    with stdoutIO() as s:
        try:
            exec(userCode)
        except Exception as e:
            print("Something wrong with the code:", e)
    print(s.getvalue())

Please let me know if you have any questions or suggestions.

Edit: adding architecture diagram.

Solution

I'm looking into this too and am interested in whether AWS Lambda can be used to run untrusted code. Mostly your logic looks reasonable, but a couple of things stick out.

"Use regex to make sure no one is passing in double underscores badstuff"

Don't do this! It's impossible (or virtually impossible) to check whether a string containing Python code is malicious or not. Think of someone asking you to call this:

    eval(base64.b64decode(b'cHJpbnQoInRoaXMgaXNuJ3QgYmFkLCBidXQgeW91IGRvbid0IGtub3cgdGhhdCEiKQ=='))

Is that safe? Is it going to be picked up by a regex? Fine, you could mock eval(), but what next? I have absolutely no doubt that a half-competent Python developer could find a route around it.
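
To make that concrete with one well-known style of dodge (my own illustration, not something from the question's filter): dunder names can be assembled at runtime, so a regex or substring ban on __ says nothing about what the code will actually do.

    # Reconstructing "__class__" at runtime: the source contains no literal "__",
    # so a substring/regex ban on double underscores never fires.
    payload = 'getattr(getattr(1, chr(95) * 2 + "class" + chr(95) * 2), "mro")()'
    assert "__" not in payload    # the naive check passes
    print(eval(payload))          # prints [<class 'int'>, <class 'object'>], a step toward object and its subclasses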

Half secure is worse than not secure at all

You'll start to trust it, build another feature that relies on the semi-secure layer and end up getting hacked.


Solutions

1. Use a language or feature that lots of people and big companies rely on; that way the risk of a breach is small (and if someone does find a zero-day vulnerability, there will be someone more profitable to hack than you). JavaScript might be a good option here if you can use it; you might also try PyPy or RustPython compiled to WebAssembly.
2. Accept that your code is not at all secure, and rely on the OS/PaaS to protect you. In other words, let the user run whatever they like in the AWS Lambda, and let Amazon isolate it.

My Problem

I want to use the second option, but I'm worried about a hostile developer's capacity to taint a Lambda container, which is then used by another innocent developer who falls victim.

Let's say we have an AWS Lambda that effectively calls eval(user_written_code) and returns the result; it might also set a bunch of environment variables which can be referenced in user_written_code.

The Lambda prevents them from doing anything malicious when the code is called - e.g. the Lambda user credentials don't have access to anything scary. They can make requests, use up memory, eat CPU, hang - all of that is taken care of by AWS.

But how do I stop the following vector (sketched in code after this list):

1. A malicious user executes code which mocks a builtin method; when the method is called, the current code context and environment variables are posted to some server.
2. An innocent user executes their code on the same Lambda container, because AWS Lambda cleverly reuses the process to improve performance.
3. The innocent user's code and credentials (environment variables) get posted to the malicious user's server.
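
To illustrate the worry, here is a small self-contained sketch, simulating two invocations landing on one warm Python process rather than a real Lambda deployment (the handler shape and payloads are hypothetical): whatever the first payload patches into the interpreter, such as builtins.print, is still patched when the reused process serves the next caller.

    import builtins

    def handler(event, context):
        # Each invocation runs user code inside the same long-lived Python process.
        exec(event["code"], {})
        print("handler finished")      # uses whatever print() is bound to *now*
        return "ok"

    # Simulate two invocations landing on the same warm container:
    malicious = (
        "import builtins\n"
        "_real = builtins.print\n"
        "builtins.print = lambda *a, **k: _real('[intercepted]', *a, **k)\n"
    )
    handler({"code": malicious}, None)   # first user poisons the process
    handler({"code": "pass"}, None)      # second, innocent user inherits the patched print

A real attacker would exfiltrate os.environ instead of just tagging the output, which is exactly steps 1-3 above.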

I'd be interested to know if anyone else has confirmed, one way or the other, whether there's an effective way to prevent Lambda containers from getting tainted like this.
