How to input a line word by word in Python?
Problem description
I have multiple files, each containing a single line of roughly 10M numbers. I want to check each file and print 0 for each file that contains a repeated number and 1 for each file that does not.
I am using a list to count frequencies. Because each line holds so many numbers, I want to update the frequency after reading each number and break as soon as I find a repeat. This is simple in C, but I have no idea how to do it in Python.
How do I read a line word by word without storing (or taking as input) the whole line?
I also need a way to do this from live input rather than a file.
Recommended answer
Read the line, split it, and copy the resulting list into a set. If the set is smaller than the list, the file contains repeated elements:
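with open('filename', 'r') as f:
    for line in f:
        numbers = line.split()
        # A set drops duplicates, so a smaller set means a number repeated
        print(0 if len(set(numbers)) < len(numbers) else 1)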
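Since the question involves several files, the same check can be wrapped in a function and applied to each path in turn. The following is only a minimal sketch: `file_has_repeats` and `report` are hypothetical names, and it assumes each file holds one line of whitespace-separated integers. It still loads the whole line into memory.

```python
import os
import tempfile


def file_has_repeats(path):
    """Return True if the single line stored in `path` contains a repeated integer."""
    with open(path, "r") as f:
        # Converting to int treats "7" and "07" as the same number.
        numbers = [int(token) for token in f.readline().split()]
    return len(set(numbers)) < len(numbers)


def report(paths):
    """Return one flag per file: 0 if it has repeats, 1 if it does not."""
    return [0 if file_has_repeats(path) else 1 for path in paths]


# Demo with two throwaway files; real code would pass its own paths.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, content in (("a.txt", "1 2 3 4\n"), ("b.txt", "1 2 3 2\n")):
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(content)
        paths.append(path)
    print(report(paths))  # prints [1, 0]
```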
To read the file word by word, try this:
import itertools

def readWords(file_object):
    """Yield whitespace-delimited words, reading one character at a time."""
    word = ""
    # Python 3: the built-in map is lazy (Python 2 needed itertools.imap)
    for ch in itertools.takewhile(bool, map(file_object.read, itertools.repeat(1))):
        if ch.isspace():
            if word:  # In case of multiple spaces
                yield word
                word = ""
            continue
        word += ch
    if word:
        yield word  # Handles last word before EOF
Then you can do:
seen = set()  # numbers read so far
with open('filename', 'r') as f:
    for num in map(int, readWords(f)):
        if num in seen:  # repeated number found, so stop reading
            break
        seen.add(num)
This method should also work for streams, because it reads only one character at a time (one byte for single-byte encodings) and yields one whitespace-delimited word at a time from the input stream.
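To make the stream case concrete, here is a self-contained sketch of the early-exit check on any text-mode file-like object. The names `read_words` and `has_repeat` are hypothetical, and `io.StringIO` merely stands in for a file or for `sys.stdin`. It is a simplified variant of the reader above, not the author's updated version.

```python
import io


def read_words(stream):
    """Yield whitespace-delimited words, reading one character at a time."""
    word = []
    # The two-argument form of iter() calls stream.read(1) until it returns "" (EOF).
    for ch in iter(lambda: stream.read(1), ""):
        if ch.isspace():
            if word:  # skip runs of whitespace
                yield "".join(word)
                word = []
        else:
            word.append(ch)
    if word:  # last word before EOF
        yield "".join(word)


def has_repeat(stream):
    """Return True as soon as a number is seen twice; nothing further is read."""
    seen = set()
    for number in map(int, read_words(stream)):
        if number in seen:
            return True
        seen.add(number)
    return False


# Print 0 for input with a repeated number and 1 otherwise, as the question asks.
print(0 if has_repeat(io.StringIO("3 1 4 1 5 9")) else 1)  # prints 0
print(0 if has_repeat(io.StringIO("3 1 4 5 9\n")) else 1)  # prints 1
```

For live input, passing `sys.stdin` in place of the `io.StringIO` object should behave the same way, although a terminal usually delivers typed text to the program only after each line is entered.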
Since giving this answer, I've updated this method quite a bit. Have a look: https://gist.github.com/smac89/bddb27d975c59a5f053256c893630cdc