How to read a CSV file subset by subset with Pandas?


Problem description

I have a data frame with 13000 rows and 3 columns:

('time', 'rowScore', 'label')

I want to read it subset by subset:

[[1..360], [360..712], ..., [12640..13000]]

I also tried using a list, but it's not working:

import pandas as pd
import math
import datetime

result = "data.csv"
dataSet = pd.read_csv(result)
TP = 0
count = 0
x = 0
df = pd.DataFrame(dataSet, columns=['rawScore', 'label'])
for i, row in df.iterrows():
    data = row.to_dict()

    ScoreX = data['rawScore']
    labelX = data['label']

for i in range(1, 13000, 360):
    x = x + 1
    for j in range(i, 360 * x, 1):
        if (ScoreX > 0.3) and (labelX == 0):
            count = count + 1
print("count=", count)

Recommended answer

You can also use the nrows and skiprows parameters of read_csv to break the file up into chunks as it is read. I would recommend against iterrows, since it is typically very slow. If you split the data while reading it in and keep each chunk separately, you can skip the iterrows step entirely. This covers the file-reading part, which seems to be an intermediate step in what you're trying to do.
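As a rough illustration of the nrows/skiprows idea, the sketch below reads one 360-row block at a time and counts matches with a vectorized mask instead of iterrows. The helper names read_block and count_hits, the synthetic CSV text, and the 1000-row size are assumptions made up for this example; the column names and the 0.3 threshold come from the question's code.

```python
import io

import pandas as pd

BLOCK_SIZE = 360  # subset length taken from the question


def read_block(csv_text, block_index, block_size=BLOCK_SIZE):
    """Read only the rows of one block, keeping the header line (hypothetical helper)."""
    first_data_line = 1 + block_index * block_size  # line 0 is the header
    return pd.read_csv(
        io.StringIO(csv_text),
        skiprows=range(1, first_data_line),  # skip earlier data rows, never the header
        nrows=block_size,                    # read at most one block
    )


def count_hits(block, threshold=0.3):
    """Vectorized stand-in for the iterrows loop (hypothetical helper)."""
    mask = (block["rawScore"] > threshold) & (block["label"] == 0)
    return int(mask.sum())


# Synthetic stand-in for data.csv (assumption: 1000 rows, made-up values).
rows = ["time,rawScore,label"]
for i in range(1000):
    rows.append(f"{i},{(i % 10) / 10},{i % 2}")
csv_text = "\n".join(rows)

block0 = read_block(csv_text, 0)  # data rows 0..359
block2 = read_block(csv_text, 2)  # data rows 720..999 (the file ends early)
print(len(block0), count_hits(block0))
print(len(block2), count_hits(block2))
```

With a real file, the path can be passed to pd.read_csv directly instead of wrapping text in io.StringIO. Each call rescans the file from the top, so this suits cases where only a few blocks are needed.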

Another way is to subset with a generator, checking which of these sets each row index belongs to: [[1..360], [360..712], ..., [12640..13000]]

So write a function that splits the index into chunks starting at multiples of 360 and, if an index falls within a chunk's range, selects that particular subset.
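One possible reading of this suggestion, sketched under assumptions: a generator yields 360-row slices whose start positions are multiples of 360, and a second function returns the slice that contains a given row position. The names iter_blocks and block_containing and the synthetic data are invented for illustration and are not from the original answer.

```python
import pandas as pd


def iter_blocks(df, block_size=360):
    """Yield (start, block) pairs; every start is a multiple of block_size."""
    for start in range(0, len(df), block_size):
        yield start, df.iloc[start:start + block_size]


def block_containing(df, row_position, block_size=360):
    """Return the block whose position range contains row_position."""
    for start, block in iter_blocks(df, block_size):
        if start <= row_position < start + block_size:
            return block
    raise IndexError(f"row position {row_position} is outside the data frame")


# Synthetic stand-in for the 13000-row data set (values are made up).
n_rows = 13000
df = pd.DataFrame({
    "time": range(n_rows),
    "rawScore": [(i % 10) / 10 for i in range(n_rows)],
    "label": [i % 2 for i in range(n_rows)],
})

# Per-block count of rows with rawScore > 0.3 and label == 0, without iterrows.
counts = [
    int(((block["rawScore"] > 0.3) & (block["label"] == 0)).sum())
    for _, block in iter_blocks(df)
]
print(len(counts), counts[0], counts[-1])

subset = block_containing(df, 400)  # the block covering positions 360..719
print(subset["time"].iloc[0], len(subset))
```

Because iter_blocks is a generator, blocks are sliced one at a time, and block_containing stops as soon as it finds the matching range.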

I wrote these approaches down as alternative ideas you might want to play around with, since in some cases you may only want one subset rather than all of the chunks for your calculation.
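For the case where only some subsets matter, a related option not mentioned in the answer is the chunksize parameter of read_csv, which returns an iterator of data frames. The sketch below, with a made-up set of wanted chunk numbers and synthetic data, processes only those chunks and stops reading once the last one has been seen.

```python
import io

import pandas as pd

# Synthetic stand-in for data.csv (assumption: 1000 rows, made-up values).
rows = ["time,rawScore,label"]
for i in range(1000):
    rows.append(f"{i},{(i % 10) / 10},{i % 2}")
csv_text = "\n".join(rows)

wanted = {0, 2}  # hypothetical: chunk numbers we care about
counts = {}

# chunksize makes read_csv return an iterator of 360-row data frames.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=360)
for index, chunk in enumerate(reader):
    if index in wanted:
        mask = (chunk["rawScore"] > 0.3) & (chunk["label"] == 0)
        counts[index] = int(mask.sum())
    if index >= max(wanted):
        break  # no later chunk is needed, so stop reading

print(counts)
```

Chunks that are not wanted are still parsed on the way to later ones, so this trades some wasted parsing for a single pass over the file.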

