Python's accessible class variables, sensitive data, and malicious coders (black-hat hackers)

Question

I was trying to make a variable inaccessible for a project I'm doing, and I ran across the SO post "Does Python have 'private' variables in classes?". It raised some interesting questions for me; to make this answerable, I'll label them Q1, Q2, etc. I've looked around, but I didn't find answers to the questions I'm asking, especially those about sensitive data.

I found useful stuff in that post, but it seems that the general consensus was something like if you see a variable with a _ before it, act like an adult and realize you shouldn't be messing with it. The same kind of idea was put forward for variables preceded by __. There, I got the general idea that you trust people not to use tricks like those described here and (in more detail) here. I also found some good information at this SO post.

This is all very good advice when you're talking about good coding practices.

I posted some thoughts in comments to the posts I've shared. My main question was posted as a comment.


I'm surprised there hasn't been more discussion of those who want to introduce malicious code. This is a real question: Is there no way in Python to prevent a black-hat hacker from accessing your variables and methods and inserting code/data that could deny service or reveal personal (or proprietary company) information (Q1)? If Python doesn't allow this type of security, should it ever be used for sensitive data (Q2)?

Am I totally missing something: Could a malicious coder even access variables and methods to insert code/data that could deny service or reveal sensitive data (Q3)?

I imagine I could be misunderstanding a concept, missing something, putting a problem in a place where it doesn't belong, or just being completely ignorant on what computer security is. However, I want to understand what's going on here. If I'm totally off the mark, I want an answer that tells me so, but I would also like to know how I'm totally off the mark and how to get back on it.

Another part of the question I'm asking here is from another comment I made on those posts/answers. @SLott said (somewhat paraphrased)


... I've found that private and protected are very, very important design concepts. But as a practical matter, in tens of thousands of lines of Java and Python, I've never actually used private or protected. ... Here's my question "protected [or private] from whom?"

To try and find out whether my concerns are anything to be concerned about, I commented on that post. Here it is, edited.


Q: "protected from whom?" A: "From malicious, black-hat hackers who would want to access variables and functions so as to be able to deny service, to access sensitive info, ..." It seems the A._no_touch = 5 approach would cause such a malicious coder to laugh at my "please don't touch this". My A.__get_SSN(self) seems to be just wishful hoping that B.H. (Black Hat) doesn't know the x = A(); x._A__get_SSN() trick (trick by @Zorf).

I could be putting the problem in the wrong place, and if so, I'd like someone to tell me so and also explain why. Are there ways of being secure with a class-based approach (Q4)? What other non-class-and-variable solutions are there for handling sensitive data in Python (Q5)?

Here's some code that shows why I see the answers to these questions as a reason for wondering if Python should ever be used for sensitive data (Q2). It's not complete code (why would I put these private values and methods down without using them anywhere?), but I hope it shows the type of thing I'm trying to ask about. I typed and ran all this at the Python interactive console.

## Type this into the interpreter to define the class.
class A():
  def __init__(self):
    self.name = "Nice guy."
    self.just_a_4 = 4
    self.my_number = 4
    self._this_needs_to_be_pi = 3.14
    self.__SSN = "I hope you do not hack this..."
    self.__bank_acct_num = 123
  def get_info(self):
    print("Name, SSN, bank account.")
  def change_my_number(self, another_num):
    self.my_number = another_num
  def _get_more_info(self):
    print("Address, health problems.")
  def send_private_info(self):
    print(self.name, self.__SSN, self.__bank_acct_num)
  def __give_20_bucks_to(self, ssn):
    self.__SSN += " has $20"
  def say_my_name(self):
    print("my name")
  def say_my_real_name(self):
    print(self.name)
  def __say_my_bank(self):
    print(str(self.__bank_acct_num))



>>> my_a = A()
>>> my_a._this_needs_to_be_pi
3.14
>>> my_a._this_needs_to_be_pi=4 # I just ignored begins-with-`_` 'rule'.
>>> my_a._this_needs_to_be_pi
4

## This next method could actually be setting up some kind of secure connection,  
## I guess, which could send the private data. I just print it, here.
>>> my_a.send_private_info()
Nice guy. I hope you do not hack this... 123

## Easy access and change a "private" variable
>>> my_a.__SSN
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute '__SSN'
>>> my_a.__dict__
{'name': 'Nice guy.', 'just_a_4': 4, 'my_number': 4, '_this_needs_to_be_pi': 4, 
'_A__SSN': 'I hope you do not hack this...', '_A__bank_acct_num': 123}
>>> my_a._A__SSN
'I hope you do not hack this...'

# (maybe) potentially more dangerous
>>> def give_me_your_money(self, bank_num):
...     print("I don't know how to inject code, but I can")
...     print("access your bank account number:")
...     print(self._A__bank_acct_num)  # name-mangled "private" attribute
...     print("and use my bank account number:")
...     print(bank_num)
...
>>> give_me_your_money(my_a, 345)
I don't know how to inject code, but I can
access your bank account number:
123
and use my bank account number:
345

At this point, I re-entered the class definition, which probably wasn't necessary.

>>> this_a = A()
>>> this_a.__give_20_bucks_to('unnecessary param')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute '__give_20_bucks_to'
>>> this_a._A__give_20_bucks_to('unnecessary param')
>>> this_a._A__SSN
'I hope you do not hack this... has $20'

## Adding a fake "private" variable, `this_a.__SSN`
>>> this_a.__SSN = "B.H.'s SSN"
>>> this_a.__dict__
{'name': 'Nice guy.', 'just_a_4': 4, 'my_number': 4, '_this_needs_to_be_pi': 3.14, 
'_A__SSN': 'I hope you do not hack this... has $20', '_A__bank_acct_num': 123, 
'__SSN': "B.H.'s SSN"}
>>> this_a.__SSN
"B.H.'s SSN"

## Now, changing the real one and "sending/stealing the money"
>>> this_a._A__SSN = "B.H.'s SSN"
>>> this_a._A__give_20_bucks_to('unnecessary param')
>>> this_a._A__SSN
"B.H.'s SSN has $20"

I've actually done some work at a previous contracting job with sensitive data: not SSNs and bank account numbers, but things like people's ages, addresses, phone numbers, personal history, marital and other relationship history, criminal records, etc. I wasn't involved in the programming to secure this data; I helped extract useful information by ground-truthing the data as preparation for machine learning. We had permission and legal go-aheads to work with such data. Another main question is this: How, in Python, could one collect, manage, analyze, and draw useful conclusions from this sensitive data (Q6)? From what I've discussed here, it doesn't seem that classes (or any of the other data structures, which I didn't go into here, but which seem to have the same problems) would allow this to be done securely (privately or in a protected manner). I imagine that a class-based solution probably has something to do with compilation. Is this true (Q7)?

Finally, since it was code reliability, not security, that brought me here, I'll share another post I found and the comment I made on it to complete my questions.

@Marcin posted:


[In response to the OP's words,] "The problem is simple. I want private variables to be accessed and changed only inside the class." [Marcin responded] So, don't write code outside the class that accesses variables starting with __. Use pylint or the like to catch style mistakes like that.

My goal with the following reply comment was to see whether my thoughts represent actual coding concerns. I hope it didn't come across as rude.


It seems this answer would be nice if you wrote code only for your own personal enjoyment and never had to hand it on to someone else to maintain it. Any time you're in a collaborative coding environment (any post-secondary education and/or work experience), the code will be used by many. Someone down the line will want to use an easy way to change your __you_really_should_not_touch_this variable. They may have a good reason for doing so, but it's possible you set up your code such that their "easy way" is going to break things.

Is mine a valid point, or do most coders respect the double underscore (Q8)? Is there a better way, using Python, to protect the integrity of the code, better than the __ strategy (Q9)?

Recommended answer

private and protected do not exist for security. They exist to enforce contracts within your code, namely logical encapsulation. If you mark a piece as protected or private, it means that it is a logical implementation detail of the implementing class, and no other code should touch it directly, since other code may not [be able to] use it correctly and may mess up state.

E.g., if your logical rule is that whenever you change self._a you must also update self._b with a certain value, then you don't want external code to modify those variables, as your internal state may get messed up if the external code does not follow this rule. You want only your one class to handle this internally since that localises the potential points of failure.
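A minimal sketch of that rule, assuming a hypothetical class `Pair` (not from the original post) whose invariant is that `_b` is always twice `_a`. The property setter is the one place where both values change together; writing to `_a` directly still works, which is exactly why the underscore is a contract rather than a lock.

```python
class Pair:
    """Hypothetical class whose invariant is: _b is always twice _a."""

    def __init__(self, a):
        self._a = a
        self._b = 2 * a

    @property
    def a(self):
        return self._a

    @a.setter
    def a(self, value):
        # The only sanctioned way to change _a: _b is updated in the same step.
        self._a = value
        self._b = 2 * value

    @property
    def b(self):
        # Read-only: no setter is defined, so `p.b = ...` raises AttributeError.
        return self._b

    def is_consistent(self):
        return self._b == 2 * self._a


p = Pair(3)
p.a = 5                   # goes through the setter, invariant holds
print(p.is_consistent())  # True

p._a = 100                # bypasses the setter; nothing stops this
print(p.is_consistent())  # False
```

External code that sticks to `p.a` cannot break the invariant; code that reaches for `p._a` can, and a linter can flag it.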

In the end all this gets compiled into a big ball of bytes anyway, and all the data is stored in memory at runtime. At that point there is no protection of individual memory offsets within the application's scope; it's all just byte soup. protected and private are constraints the programmer imposes on their own code to keep their own logic straight. For this purpose, more or less informal conventions like _ are perfectly adequate.

An attacker cannot attack at the level of individual properties. The running software is a black box to them; whatever goes on internally doesn't matter. If an attacker is in a position to actually access individual memory offsets, or actually inject code, then it's pretty much game over either way. protected and private don't matter at that point.
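To illustrate the "game over either way" point, here is a small sketch, assuming a hypothetical `make_account` factory (not from the original post) that tries to hide a value in a closure instead of an attribute. Code that already runs inside the same interpreter can still read it through standard introspection, so stronger hiding inside the process buys no real security.

```python
def make_account(secret):
    """Hypothetical factory that 'hides' secret in a closure, not an attribute."""
    def reveal_last_digit():
        return secret[-1]
    return reveal_last_digit


account = make_account("123-45-6789")

# No attribute named `secret` exists on the returned function...
print(hasattr(account, "secret"))  # False

# ...but code running in the same interpreter can still read the closure cell.
leaked = account.__closure__[0].cell_contents
print(leaked)  # 123-45-6789
```

The protection that matters sits at the process boundary (who may run code or read memory at all), not in how attributes are named.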
