Delete rows by threshold or category and save to multiple CSV files in pandas


Question

I am a beginner in Python. I have a large dataset that looks like this:

df
Mean       id
0.089394    1
0.389394    2
0.047313    3
0.047313    4
0.767004    5
0.767004    6
0.363154    7
0.363154    8
0.098941    9
1.578785    10
0           11
.....

I want to drop the rows whose Mean value does not exceed a threshold, for a series of thresholds from 0 to 2 (for example: Mean > 0, Mean > 0.1, Mean > 0.2, and so on up to Mean > 2). I used this code:

df = df[df.Mean > 0]

With this code, I have to write a separate line for every threshold. Is there an elegant way to filter by each threshold automatically and save each result to its own CSV file?

For example, the result I expect for > 0:

df>0
Mean       id
0.089394    1
0.389394    2
0.047313    3
0.047313    4
0.767004    5
0.767004    6
0.363154    7
0.363154    8
0.098941    9
1.578785    10

And for > 0.1:

df>0.1
Mean       id
0.389394    2
0.767004    5
0.767004    6
0.363154    7
0.363154    8
1.578785    10

and so on.

Recommended answer

Define a function that takes the Mean value and the threshold as arguments:

def helping_func(value, threshold):
    return (value > threshold)
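
To make it clearer what this helper returns when it is given a whole column, here is a minimal sketch using a few rows of the sample data from the question. The names `mask` and `filtered` are illustrative and do not come from the original answer. Comparing a pandas Series with a number gives a boolean Series, and indexing the DataFrame with it keeps only the rows marked True.

```python
import pandas as pd

def helping_func(value, threshold):
    return value > threshold

# A few rows of the sample data from the question
df = pd.DataFrame({
    "Mean": [0.089394, 0.389394, 0.047313, 0.767004, 0.0],
    "id": [1, 2, 3, 5, 11],
})

mask = helping_func(df["Mean"], 0.1)  # boolean Series, one entry per row
filtered = df[mask]                   # keeps only the rows where mask is True

print(mask.tolist())
print(filtered["id"].tolist())
```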

Use a for loop to perform the conditional check and store each result in its own CSV file:

import numpy as np

for i in np.arange(0, 21, 1):  # i = 0, 1, ..., 20
    threshold = i / 10  # divide integer steps by 10 to avoid accumulated floating-point error
    result_df = df[helping_func(df["Mean"], threshold)]
    csvFileName = "result" + str(i) + ".csv"  # name the individual CSV files in any format you like
    result_df.to_csv(csvFileName, sep=",")  # use whichever separator you prefer
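
The comment about floating-point inaccuracy may deserve a short illustration. The sketch below is not part of the original answer. It compares thresholds built by repeatedly adding 0.1 with thresholds computed as `i / 10`, and the list names `accumulated` and `divided` are made up for the example.

```python
# Build the thresholds by repeatedly adding 0.1: rounding error accumulates
acc = 0.0
accumulated = []
for _ in range(21):
    accumulated.append(acc)
    acc += 0.1

# Build the thresholds from integer steps: each value is rounded only once
divided = [i / 10 for i in range(21)]

print(accumulated[3])  # 0.30000000000000004
print(divided[3])      # 0.3
```

Because each `i / 10` is computed independently, the thresholds are the closest representable values to 0.0, 0.1, ..., 2.0, and the integer `i` also gives clean file names.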

Alternatively, apply the conditional check directly inside the loop:

import numpy as np

for i in np.arange(0, 21, 1):  # i = 0, 1, ..., 20
    threshold = i / 10  # divide integer steps by 10 to avoid accumulated floating-point error
    result_df = df[df["Mean"] > threshold]
    csvFileName = "result" + str(i) + ".csv"  # name the individual CSV files in any format you like
    result_df.to_csv(csvFileName, sep=",")  # use whichever separator you prefer
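
For readers who want to try the whole approach end to end, here is one possible self-contained sketch. It wraps the loop in a hypothetical function `save_filtered_csvs` (the name and its parameters are assumptions for illustration, not part of the original answer), uses the sample data from the question, and writes into a temporary directory. It also passes `index=False`, because `to_csv` writes the DataFrame index as an extra column by default. Drop that argument if the index should be kept.

```python
import tempfile
from pathlib import Path

import pandas as pd

def save_filtered_csvs(df, out_dir, column="Mean", steps=21, divisor=10):
    """Write one CSV per threshold i / divisor and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(steps):
        threshold = i / divisor                 # 0.0, 0.1, ..., 2.0 by default
        result_df = df[df[column] > threshold]  # keep rows above the threshold
        path = out_dir / f"result{i}.csv"
        result_df.to_csv(path, index=False)     # index=False omits the row index
        written.append(path)
    return written

# Sample data from the question
df = pd.DataFrame({
    "Mean": [0.089394, 0.389394, 0.047313, 0.047313, 0.767004, 0.767004,
             0.363154, 0.363154, 0.098941, 1.578785, 0.0],
    "id": list(range(1, 12)),
})

with tempfile.TemporaryDirectory() as tmp:
    paths = save_filtered_csvs(df, tmp)
    n_files = len(paths)
    rows_gt_0 = len(pd.read_csv(paths[0]))   # rows with Mean > 0
    rows_gt_01 = len(pd.read_csv(paths[1]))  # rows with Mean > 0.1

print(n_files, rows_gt_0, rows_gt_01)
```

With the sample data, the file for threshold 0 should contain every row except id 11, and the file for threshold 0.1 should contain only the rows whose Mean is above 0.1.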
