Assigning a number to each word in a string in Python
Question
Hi, I have a compression task in Python: I need to develop code where, if the input is
hello its me, hello can you hear me, hello are you listening
then the output should be
1,2,3,1,4,5,6,3,1,7,5,8
Basically, each word is assigned a numerical value, and if a word is repeated, so is its number. The code should be written in Python. Please help, thank you.
Answer
An easy way is to use a dict: when you find a new word, add a key/value pair using an incrementing counter; when you have seen the word before, just output the value already stored in the dict:
s = 'hello its me, hello can you hear me, hello are you listening'

def cyc(s):
    # set i to 1
    i = 1
    # split into words on whitespace (assumes s contains at least one word)
    it = s.split()
    # create first key/value pair
    seen = {it[0]: i}
    # yield 1 for first word
    yield i
    # for all but the first word
    for word in it[1:]:
        # if we have seen this word already, use its value from our dict
        if word in seen:
            yield seen[word]
        # else first time seeing it so increment count
        # and create new k/v pairing
        else:
            i += 1
            yield i
            seen[word] = i

print(list(cyc(s)))
Output:
[1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
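The same dict idea can also be written more compactly. The following is only a sketch, not part of the original answer: the name number_words is hypothetical, and it relies on dict.setdefault, which stores the default only when the key is missing and returns the stored value either way. Because len(seen) + 1 is evaluated before the call, a new word always receives the next unused number.

```python
def number_words(text):
    """Hypothetical helper: number each word by order of first appearance."""
    seen = {}
    # setdefault returns the existing number for a known word,
    # or stores and returns len(seen) + 1 for a new one
    return [seen.setdefault(word, len(seen) + 1) for word in text.split()]


s = 'hello its me, hello can you hear me, hello are you listening'
print(number_words(s))  # [1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
```

Unlike the generator above, this variant returns an empty list for an empty string instead of failing on the first word.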
You can also avoid slicing by using iter and calling next to pop the first word. Also, if you want foo and foo! to count as the same word, we need to remove the trailing punctuation from each word, which can be done with str.rstrip:
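from string import punctuation

def cyc(s):
    i = 1
    it = iter(s.split())
    # pop the first word and strip its trailing punctuation
    # (assumes s contains at least one word)
    seen = {next(it).rstrip(punctuation): i}
    yield i
    for word in it:
        # strip trailing punctuation so that foo! is treated as foo
        word = word.rstrip(punctuation)
        if word in seen:
            yield seen[word]
        else:
            i += 1
            yield i
            seen[word] = i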
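Note that str.rstrip only removes punctuation from the end of a word, so a token such as "foo" (with a leading quote) would still be treated as different from foo. The sketch below is an assumption on my part rather than something from the original answer: the name number_words_stripped is hypothetical, it uses str.strip to trim punctuation from both ends, and it skips tokens that consist of punctuation only.

```python
from string import punctuation


def number_words_stripped(text):
    """Hypothetical helper: number words by first appearance,
    ignoring punctuation at either end of each word."""
    seen = {}
    result = []
    for token in text.split():
        # strip removes leading and trailing punctuation; rstrip only trailing
        word = token.strip(punctuation)
        if not word:
            # the token was punctuation only, e.g. a lone dash
            continue
        if word not in seen:
            seen[word] = len(seen) + 1
        result.append(seen[word])
    return result


print(number_words_stripped('foo foo! "foo" bar - foo'))  # [1, 1, 1, 2, 1]
print(number_words_stripped(''))  # []
```

With this variant the sample sentence from the question still produces the same numbering, because "me," is reduced to "me" in both places where it occurs.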