Parsing a Chemistry Formula in Python

Question

I am trying to solve this problem: https://leetcode.com/articles/number-of-atoms/#approach-1-recursion-accepted.

The question is: given a formula like C(Mg2(OH)4)2, return a hash table that maps each element to its count. Element names always start with a capital letter and may be followed by a lowercase letter.

I decided to start by solving the simplest case: no brackets.

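def bracket_hash(formula):
    """Count atoms in a formula without brackets, e.g. "Mg2O4".

    Only single-digit counts are handled; "H12" is not parsed correctly.
    """
    element = ""
    element_hash = {}

    for x in formula:
        if x.isupper():
            # A new element starts; the previous one had no explicit count.
            if element != "":
                element_hash[element] = element_hash.get(element, 0) + 1
            element = x

        elif x.islower():
            element += x

        else:
            # A digit: the count of the element that was just read.
            element_hash[element] = element_hash.get(element, 0) + int(x)
            element = ""

    # The last element may have no explicit count.
    if element != "":
        element_hash[element] = element_hash.get(element, 0) + 1

    return element_hash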

This code works perfectly fine for cases like:

print(bracket_hash("H2O"))
print(bracket_hash("CO2"))
print(bracket_hash("Mg2O4"))
print(bracket_hash("OH"))

Now, I think a stack must somehow be used to handle nested brackets, as in OH(Ag3(OH)2)4. Here Ag's count has to be 3*4 = 12, and the counts of O and H will each be 2*4 + 1 = 9.
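As a sanity check on those numbers, the expansion can be done by hand, innermost group first. The snippet below only illustrates the arithmetic and is not a parser; the variable names are made up for this example.

```python
# Hand expansion of OH(Ag3(OH)2)4, innermost group first.
inner = {"O": 1, "H": 1}                   # the inner (OH)

middle = {"Ag": 3}                         # Ag3 ...
for element, count in inner.items():       # ... followed by (OH)2
    middle[element] = middle.get(element, 0) + count * 2

total = {"O": 1, "H": 1}                   # the leading OH
for element, count in middle.items():      # (Ag3(OH)2)4
    total[element] = total.get(element, 0) + count * 4

print(total)  # {'O': 9, 'H': 9, 'Ag': 12}
```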

So far I started with something like this:

def formula_hash(formula):
    stack = []
    final_hash = {}
    cur = ""
    i = 0

    while i < len(formula):
        if formula[i] == '(':
            j = i
            while formula[j]!=')':
                j = j + 1
            cur = formula[i:j]
            stack.append(bracket_hash(cur))
            cur = ""
            i = j + 1

But now I am stuck.

I tend to get stuck when coding problems get longer and require a mix of data structures to solve. Here the solution uses a hash table and a stack.

So my question is: how do I break this problem down into manageable parts and solve it? To really solve it, I have to map it to manageable code segments. Any help would be greatly appreciated.

Thanks.

Recommended answer

I think you can use recursion to solve this problem. Here is how your function should work:

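  • Proceed as in your first code until you encounter an opening parenthesis.
  • When you encounter an opening parenthesis, find the matching closing parenthesis. This can be done with a counter: initialize it to 1, increment it each time you meet a new opening parenthesis, and decrement it each time you meet a closing parenthesis. When the counter reaches 0, you have found the matching closing parenthesis.
  • Slice out the string between the two parentheses and call the same function on that substring (this is the recursive part).
  • Add the values from the returned dictionary to the current dictionary, multiplied by the number that follows the closing parenthesis.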
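The steps above could be sketched roughly as follows. This is one possible reading of them, not a reference solution: the names find_matching_paren, read_number and count_atoms are made up for this example, the input is assumed to be a well-formed formula, and a missing number is treated as 1. Unlike the first code, it also reads multi-digit counts.

```python
def find_matching_paren(formula, open_index):
    """Return the index of the ')' that matches the '(' at open_index."""
    depth = 1  # the counter from the steps above
    i = open_index + 1
    while i < len(formula):
        if formula[i] == "(":
            depth += 1
        elif formula[i] == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unbalanced parentheses in " + formula)


def read_number(formula, i):
    """Read the digits starting at i; return (value, next_index). No digits means 1."""
    j = i
    while j < len(formula) and formula[j].isdigit():
        j += 1
    return (int(formula[i:j]) if j > i else 1), j


def count_atoms(formula):
    counts = {}
    i = 0
    while i < len(formula):
        if formula[i] == "(":
            close = find_matching_paren(formula, i)
            inner = count_atoms(formula[i + 1:close])  # the recursive call
            multiplier, i = read_number(formula, close + 1)
            for element, count in inner.items():
                counts[element] = counts.get(element, 0) + count * multiplier
        elif formula[i].isupper():
            # Element name: one uppercase letter plus any lowercase letters.
            j = i + 1
            while j < len(formula) and formula[j].islower():
                j += 1
            element = formula[i:j]
            number, i = read_number(formula, j)
            counts[element] = counts.get(element, 0) + number
        else:
            raise ValueError("unexpected character " + repr(formula[i]))
    return counts


print(count_atoms("OH(Ag3(OH)2)4"))  # {'O': 9, 'H': 9, 'Ag': 12}
print(count_atoms("C(Mg2(OH)4)2"))   # {'C': 1, 'Mg': 4, 'O': 8, 'H': 8}
```

Each recursive call only ever sees a substring with balanced parentheses, so the top-level loop never has to deal with a stray closing parenthesis.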

If you have problems implementing some parts of this solution, tell me and I will give more details.

About the stack approach: it simply simulates recursion. Instead of calling the function again with a local counter, it keeps a stack of counters. When a parenthesis is opened, counting continues in a new context; when it is closed, that context is merged into the one that contains it, with the corresponding multiplicity.
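For comparison, a stack-based version might look like the sketch below. It is only an illustration of the idea just described: the name count_atoms_with_stack is made up for this example, and the input is assumed to be a well-formed formula with balanced parentheses.

```python
def count_atoms_with_stack(formula):
    # One counter (dict) per open parenthesis level; stack[0] is the top level.
    stack = [{}]
    i = 0
    while i < len(formula):
        char = formula[i]
        if char == "(":
            # Open a new context.
            stack.append({})
            i += 1
        elif char == ")":
            # Read the multiplier that follows the parenthesis (default 1).
            j = i + 1
            while j < len(formula) and formula[j].isdigit():
                j += 1
            multiplier = int(formula[i + 1:j]) if j > i + 1 else 1
            # Merge the finished context into the one that contains it.
            finished = stack.pop()
            for element, count in finished.items():
                stack[-1][element] = stack[-1].get(element, 0) + count * multiplier
            i = j
        elif char.isupper():
            # Element name: one uppercase letter plus any lowercase letters.
            j = i + 1
            while j < len(formula) and formula[j].islower():
                j += 1
            element = formula[i:j]
            # Optional count after the element name (default 1).
            k = j
            while k < len(formula) and formula[k].isdigit():
                k += 1
            number = int(formula[j:k]) if k > j else 1
            stack[-1][element] = stack[-1].get(element, 0) + number
            i = k
        else:
            raise ValueError("unexpected character " + repr(char))
    return stack[0]


print(count_atoms_with_stack("OH(Ag3(OH)2)4"))  # {'O': 9, 'H': 9, 'Ag': 12}
```

The dictionary on top of the stack plays the role of the local counter in the recursive version, and popping it corresponds to returning from the recursive call.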

I prefer by far the recursive approach, which is more natural.
