Combine count of word pairs: python


Problem description

I wrote a mapper that prints out each word pair with a count of 1.

import sys
from itertools import tee


for line in sys.stdin:
    line = line.strip()
    words = line.split()

def pairs(lst):
    return zip(lst,lst[1:]+[lst[0]])

for i in pairs(words):
    print i,1

I tried writing a reducer that builds a dictionary, but I am a bit stuck on how to sum the counts.

import sys

mydict = dict()
for line in sys.stdin:
    (word,cnt) = line.strip().split('\t') #\t
    mydict[word] = mydict.get(word,0)+1

for word,cnt in mydict.items():
    print word,cnt

But it says there are not enough arguments in the .split line. Any thoughts? Thank you.

Recommended answer

I think the problem is this line: (word,cnt) = line.strip().split('\t') #\t
The split() method returns a list, and unpacking that list into (word, cnt) fails whenever the number of items is not exactly two (maybe there is sometimes only one word on a line).
Maybe you want to use something like

for word in line.strip().split('\t'):
    mydict[word] = mydict.get(word, 0) + 1
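
To illustrate the unpacking failure described above, here is a minimal sketch of a tolerant line parser. It is written for Python 3, whereas the snippets in the question use Python 2 print statements. The helper name parse_line is hypothetical and not part of the original answer. One likely cause, judging from the mapper in the question, is that `print i,1` separates its arguments with a space rather than a tab, so split('\t') returns a single-item list.

```python
def parse_line(line):
    """Split a 'key<TAB>count' line; return None if the line is malformed.

    parse_line is a hypothetical helper used only for illustration.
    """
    parts = line.strip().split("\t")
    if len(parts) != 2:
        # Fewer or more than two fields: unpacking into (key, cnt) would fail.
        return None
    key, cnt = parts
    try:
        return key, int(cnt)
    except ValueError:
        # The count field is not an integer.
        return None


# A tab-separated line parses; a space-separated line (as the mapper
# in the question would emit) does not.
good = parse_line("a b\t1\n")
bad = parse_line("('a', 'b') 1\n")
```

Returning None for malformed lines lets the reducer skip them instead of crashing; whether skipping is the right policy depends on the job.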

If empty list elements cause problems, use list(filter(None, list_name)) to remove them.
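
As a small illustration of the filter(None, ...) idiom (Python 3; the sample string is made up for this example), consecutive or trailing tabs produce empty strings that the filter drops:

```python
# Splitting on tabs leaves empty strings where tabs are adjacent or trailing.
fields = "a\t\tb\t".split("\t")

# filter(None, ...) keeps only truthy items, so empty strings are removed.
cleaned = list(filter(None, fields))
```

Note that filter(None, ...) drops every falsy item, not only empty strings, which is harmless here because split() only returns strings.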

Disclaimer: I didn't test the code. Also, this only refers to your second example.
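
For completeness, here is a sketch of how the mapper and reducer could fit together. It is not the answerer's code, and it assumes Python 3 and a "word1 word2<TAB>count" line format. The function names map_lines and reduce_lines are hypothetical. It keeps the wrap-around pairing from the question (the last word is paired with the first) and sums the emitted counts instead of adding 1 per line.

```python
from collections import Counter


def pairs(words):
    # Same wrap-around pairing as in the question: the last word is paired
    # with the first. words[:1] avoids an IndexError on an empty list.
    return zip(words, words[1:] + words[:1])


def map_lines(lines):
    """Yield one 'word1 word2<TAB>1' record per adjacent word pair."""
    for line in lines:
        words = line.split()
        for first, second in pairs(words):
            yield "%s %s\t1" % (first, second)


def reduce_lines(lines):
    """Sum the counts of identical pairs; skip lines that are malformed."""
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue
        key, cnt = parts
        counts[key] += int(cnt)
    return counts


# Chain the two stages in memory on a made-up input line.
totals = reduce_lines(map_lines(["a b a b\n"]))
```

In a streaming setup each function would run in its own script, with sys.stdin passed as lines and the results written out with print.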
