Counting word occurrences in csv and determine row appearances
Problem description
I have a csv file with a single column, such as the following. The symbols and numbers are only there to show that the file does not contain just text. I have two objectives:
- Count the number of occurrences of a word;
- Determine how many rows a word appears in.
Stuff
I like apples. Sally likes apples.
Jim has 4 berries. !@#
John has 2 apples.
Ideally, the code should return something like: {apples: 3} {# of rows: 2}
I've written some code to try to count occurrences, but it isn't working properly (presumably because of the punctuation). Also, I do not know how to determine the number of rows a word appears in; this could be as simple as counting each row that contains the word at least once, but I'm unsure how to proceed. Here is the code I have so far, written in Python 3.6.1:
import csv
my_reader = csv.reader(open('file.csv', encoding = 'utf-8'))
ctr = 0
for record in my_reader:
    if record[0] == 'apples':
        ctr += 1
print(ctr)
The code merely returns 0 as the answer. Help?
Recommended answer
You are checking whether the row == 'apples', but what you need is if 'apples' in row. To count the occurrences you can use str.count(), for example:
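import csv

ctr = 0   # total occurrences of 'apples' across the file
rows = 0  # number of rows that contain 'apples' at least once
# newline='' is what the csv module recommends when opening files
with open('file.csv', encoding='utf-8', newline='') as f:
    my_reader = csv.reader(f)
    for record in my_reader:
        # skip blank lines, which csv.reader yields as empty lists
        if record and 'apples' in record[0]:
            rows += 1
            ctr += record[0].count('apples')
print('apples: {}, rows: {}'.format(ctr, rows))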
This way you check whether the row contains apples; if it does, you increment rows by one and increment ctr by the number of apples in that row.
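One caveat: both 'apples' in row and str.count() match substrings, so a word such as 'pineapples' would also be counted. If only whole words should count, a regular expression with word boundaries is one possible approach. The following is a minimal sketch, not part of the original answer: the helper name count_word is hypothetical, case-insensitive matching is an assumption, and the sample data from the question is read from an in-memory string instead of file.csv.

```python
import csv
import io
import re


def count_word(lines, word):
    """Return (total occurrences, rows containing the word), whole words only."""
    # \b anchors the match at word boundaries, so 'apples.' matches
    # but 'pineapples' does not. IGNORECASE is an assumption.
    pattern = re.compile(r'\b{}\b'.format(re.escape(word)), re.IGNORECASE)
    total = 0
    rows = 0
    for record in csv.reader(lines):
        if not record:  # skip blank lines
            continue
        hits = len(pattern.findall(record[0]))
        if hits:
            rows += 1
            total += hits
    return total, rows


# Sample data from the question, held in memory instead of file.csv.
sample = io.StringIO(
    "Stuff\n"
    "I like apples. Sally likes apples.\n"
    "Jim has 4 berries. !@#\n"
    "John has 2 apples.\n"
)
total, rows = count_word(sample, 'apples')
print('apples: {}, rows: {}'.format(total, rows))  # apples: 3, rows: 2
```

To run it against a real file, pass an open file object instead of the StringIO, for example open('file.csv', encoding='utf-8', newline='').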