Function to count hashtags
Problem description
I'm trying to write a function that counts the hashtags in a list of posts and shows the result.
Example input:
["Hey, im in the #pool",
"beautiful #city",
"#city is nice",
"Have a nice #weekend",
"#weekend <3",
"Nice #"]
Output:
{"pool" : 1, "city" : 2, "weekend" : 2}
However, a # on its own, with no word after it, should not count as a hashtag.
The same applies to what comes before the #: something like "%#" must not count as a hashtag.
Hashtags consist of the characters a-z, A-Z and 0-9; any other character ends the hashtag.
My current code:
def analyze(posts):
    tag = {}
    for sentence in posts:
        words = sentence.split(' ')
        for word in words:
            if word.startswith('#'):
                if word[1:] in tag.keys():
                    tag[word[1:]] += 1
                else:
                    tag[word[1:]] = 1
    return(tag)

posts = ["Hey, im in the #pool",
         "beautiful #city",
         "#city is nice",
         "Have a nice #weekend",
         "#weekend <3",
         "Nice #"]
print(analyze(posts))
Recommended answer
In one pass, using a case-insensitive regex search and a collections.Counter object:
from collections import Counter
import re
lst = ["Hey, im in the #pool", "beautiful #city", "#city is nice",
"Have a nice #weekend", "#weekend <3", "Nice #"]
hash_counts = Counter(re.findall(r'#([a-z0-9]+)', ' '.join(lst), re.I))
print(dict(hash_counts))
Output:
{'pool': 1, 'city': 2, 'weekend': 2}
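Note that the pattern '#([a-z0-9]+)' also matches a # that directly follows another character, so input such as "50%#off" would still be counted. The following is a minimal sketch of a stricter variant, under the assumption that the "%#" rule means a # only counts at the start of a post or right after whitespace. The function name count_hashtags and the two extra sample posts are illustrative and not part of the original question or answer.

```python
import re
from collections import Counter

# Assumption: '#' must start the post or follow whitespace, so "%#tag" is ignored.
# (?<!\S) is a negative lookbehind: the character before '#' must not be non-whitespace.
# [A-Za-z0-9]+ requires at least one tag character and stops at any other character.
HASHTAG = re.compile(r'(?<!\S)#([A-Za-z0-9]+)')


def count_hashtags(posts):
    """Return a dict mapping each hashtag (without '#') to its number of occurrences."""
    counts = Counter()
    for post in posts:
        # Scan each post separately so tags are never matched across post boundaries.
        counts.update(HASHTAG.findall(post))
    return dict(counts)


posts = ["Hey, im in the #pool", "beautiful #city", "#city is nice",
         "Have a nice #weekend", "#weekend <3", "Nice #",
         "50%#off",   # hypothetical extra post: '#' follows '%', so it is skipped
         "#city!"]    # hypothetical extra post: '!' ends the tag, counted as "city"
print(count_hashtags(posts))
```

Like the answer above, this sketch treats tags as case-sensitive when counting, so "#City" and "#city" would be tallied separately; lowercasing each match before counting would merge them if that is the desired behaviour.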