python - How do I match each word of one text against the words in another text, and get each word's value?

Problem description

Question

Contents of ttt.txt:
president said would bill program loan farmers
corn committee department agriculture
usda house
Contents of sss.txt:
Topic 0th:

said   0.045193
would   0.028879
bill   0.011087
program   0.010718
loan   0.008395
farmers   0.008237
corn   0.008078
committee   0.007022
department   0.006811
agriculture   0.006653
usda   0.006547
house   0.006494
president 

Topic 1th:

said   0.044315
shares   0.031928
stock   0.028001
company   0.023888
group   0.017063
offer   0.016408
share   0.016268
dlrs   0.016034
corp   0.015520
common   0.013463
president  0.000047

For each word in ttt.txt, how can I find the matching word in sss.txt under each of the two topics, together with its value?

Solution


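# coding: utf8

result = {}
with open('ttt.txt') as f_t, open('sss.txt') as f_s:
    key_set = set(f_t.read().split())     # collect every word of ttt.txt into a set
    topic = None
    for line in f_s:
        line_split = line.split()
        if not line_split:                # skip blank lines (they cannot be unpacked)
            continue
        if line.startswith('Topic'):      # start a new dict for each topic
            topic = line.strip()
            result[topic] = {}
        elif topic is not None:           # ignore any rows before the first topic
            key = line_split[0]
            # a word listed without a value (e.g. "president") maps to None
            value = line_split[1] if len(line_split) > 1 else None
            if key in key_set:            # keep only words that appear in ttt.txt
                result[topic][key] = value
print(result)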
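
The solution above groups the matches by topic. Since the question asks for the value of each word under both topics, a word-first layout may be easier to read. The following is only a minimal sketch of that variant, not part of the original answer: the function name `match_words`, the `{word: {topic: value}}` layout, and the conversion of values to `float` are all assumptions made for illustration, and it reads from in-memory text (a shortened copy of the sample data) instead of the two files so that it runs on its own.

```python
import io


def match_words(words_file, topics_file):
    """Return {word: {topic: value}} for every word found in words_file.

    Hypothetical helper: the name and the return layout are illustrative.
    """
    # One empty dict per distinct word of the first text.
    table = {word: {} for word in words_file.read().split()}
    topic = None
    for line in topics_file:
        fields = line.split()
        if not fields:                    # skip blank lines
            continue
        if fields[0] == 'Topic':          # header line such as "Topic 0th:"
            topic = line.strip()
        elif topic is not None and fields[0] in table:
            # Assumption: values are numeric; a word without a value gets None.
            value = float(fields[1]) if len(fields) > 1 else None
            table[fields[0]][topic] = value
    return table


# Shortened sample data taken from the question.
ttt = "president said would bill\nusda house\n"
sss = (
    "Topic 0th:\n\n"
    "said   0.045193\n"
    "would   0.028879\n"
    "house   0.006494\n"
    "president \n\n"
    "Topic 1th:\n\n"
    "said   0.044315\n"
    "shares   0.031928\n"
    "president  0.000047\n"
)

table = match_words(io.StringIO(ttt), io.StringIO(sss))
for word, per_topic in table.items():
    print(word, per_topic)
```

A word that does not occur under a topic simply has no entry for that topic, so `table[word].get(topic)` can be used to tell "absent" apart from "present without a value" only if the caller checks membership first. With real files, pass the opened `ttt.txt` and `sss.txt` file objects in place of the `io.StringIO` wrappers.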
