Deleting the previous token in a sentence if it is the same as the current token (Python)


Problem description

I have two dictionaries of key-value pairs, like:

tokenIDs2number = {(6, 7): 1000000000.0, (22,): 700.0, (12,): 3000.0}

tokenIDs2location = {(27, 28): u'South Asia'}

The keys are tuples of the index locations of number and location slots in the sentence:

GDP in 2007 totaled about $ 1 billion , or about $ 3,000 per capita -LRB- exceeding the average of about $ 700 in the rest of South Asia -RRB- .

I want to loop through all the tuples for both the numbers and locations, and remove values from the tuples if they are next to each other, e.g. make them:

tokenIDs2number = {(7,): 1000000000.0, (22,): 700.0, (12,): 3000.0}

tokenIDs2location = {(28,): u'South Asia'}

So that later on, I can fill the sentence's tokens in with location and number slots, so the sentence becomes:

GDP in 2007 totaled about $ NUMBER_SLOT , or about $ NUMBER_SLOT per capita -LRB- exceeding the average of about $ NUMBER_SLOT in the rest of LOCATION_SLOT -RRB- .

Instead of:

GDP in 2007 totaled about $ NUMBER_SLOT NUMBER_SLOT , or about $ NUMBER_SLOT per capita -LRB- exceeding the average of about $ NUMBER_SLOT in the rest of LOCATION_SLOT LOCATION_SLOT -RRB- .

Current code:

for locationTokenIDs, location in tokenIDs2location.items():
  for numberTokenIDs, number in tokenIDs2number.items():
    prevNoID=numberTokenIDs[0]
    prevLocID=locationTokenIDs[0]
    for numberTokenID in numberTokenIDs:
        for locationTokenID in locationTokenIDs:
            if numberTokenID==prevNoID+1:
                numberTokenIDs.remove(numberTokenIDs[prevNoID])
                if numberTokenID>0 and numberTokenID<(len(sampleTokens)-1):
                    prevNoID = numberTokenID
            if locationTokenID==prevLocID+1:
                locationTokenIDs.remove(locationTokenIDs[prevLocID])
                if locationTokenID>0 and locationTokenID<(len(sampleTokens)-1):
                    prevLocID = locationTokenID

However, it seems I cannot just remove numbers from a tuple, so I am struggling to figure out how to do this.

Recommended answer

Since tuples are immutable (and dict keys cannot be modified in place), you cannot change the keys directly. However, you can use a dictionary comprehension to rebuild your dict with the keys you need in one line:

tokenIDs2number = {(6, 7): 1000000000.0, (22,): 700.0, (12,): 3000.0}
tokenIDs2number = {(k[-1],): v for k, v in tokenIDs2number.items()}

Using k[-1] to always access the last element lets you handle tuples of any length the same way.
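For the slot-filling step described in the question, a minimal sketch might look like the following. It assumes the sentence is tokenized by splitting on whitespace, and `fill_slots` is a hypothetical helper name, not something from the original post. It works from the original, uncollapsed keys: the last index of each key receives the placeholder and the earlier indices are dropped, which is one way to get a single slot per multi-token span.

```python
sentence = ("GDP in 2007 totaled about $ 1 billion , or about $ 3,000 per capita "
            "-LRB- exceeding the average of about $ 700 in the rest of "
            "South Asia -RRB- .")
tokens = sentence.split()  # assumption: whitespace tokenization

tokenIDs2number = {(6, 7): 1000000000.0, (22,): 700.0, (12,): 3000.0}
tokenIDs2location = {(27, 28): 'South Asia'}


def fill_slots(tokens, spans, placeholder):
    """Return a copy of tokens where each span collapses to one placeholder.

    The last index of every span gets the placeholder; the other indices
    of the span are set to None so they can be filtered out afterwards.
    """
    out = list(tokens)
    for token_ids in spans:
        for token_id in token_ids[:-1]:
            out[token_id] = None
        out[token_ids[-1]] = placeholder
    return out


filled = fill_slots(tokens, tokenIDs2number, "NUMBER_SLOT")
filled = fill_slots(filled, tokenIDs2location, "LOCATION_SLOT")
result = " ".join(tok for tok in filled if tok is not None)
print(result)
```

Note that once the keys have been collapsed to `(k[-1],)`, they no longer record which neighbouring tokens belonged to the span, so this sketch deliberately takes the original keys as input. Like `k[-1]`, it assumes every key is a single run of adjacent indices.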
