Split pandas dataframe in two if it has more than 10 rows


Question

I have a huge CSV containing many tables, each with many rows. I would like to split each dataframe into two if it contains more than 10 rows.

If it does, I would like the first dataframe to contain the first 10 rows and the second dataframe to contain the rest.

Is there a convenient function for this? I've looked around but found nothing useful...

split_dataframe(df, 2(if > 10))?

Recommended answer

This will return the two parts if the DataFrame has more than 10 rows; otherwise it returns the original DataFrame and None (which you would then need to handle separately). Note that this assumes each df only has to be split once, and that it is acceptable for the second part to be longer than 10 rows (which happens when the original has more than 20 rows).

df_new1, df_new2 = (df.iloc[:10, :], df.iloc[10:, :]) if len(df) > 10 else (df, None)
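As a minimal sketch, the same conditional split can be wrapped in a small helper. The name split_dataframe is borrowed from the pseudocode in the question and the max_rows parameter is an assumption made for illustration; neither is part of pandas.

```python
import pandas as pd


def split_dataframe(df, max_rows=10):
    """Return (first max_rows rows, remaining rows), or (df, None) if df is short enough."""
    if len(df) > max_rows:
        # iloc slices by position, regardless of the index labels
        return df.iloc[:max_rows], df.iloc[max_rows:]
    return df, None


big = pd.DataFrame({"x": range(25)})
small = pd.DataFrame({"x": range(4)})

first, rest = split_dataframe(big)
print(len(first), len(rest))  # 10 15

same, nothing = split_dataframe(small)
print(len(same), nothing)  # 4 None
```

Callers still have to check the second value for None before using it, as the answer points out.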

Note you can also use df.head(10) and df.tail(len(df) - 10) to get the front and back according to your needs. You can also use various indexing approaches: plain slicing such as df[:10] selects rows by position, and df.iloc[:10, :] does the same while stating both dimensions explicitly (which I prefer). A bare df[:10, :] without .iloc is not valid on a DataFrame and raises an error. Older answers also mention df.ix, but it was deprecated and then removed in pandas 1.0, so use df.iloc instead.
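The following sketch compares three positional spellings on a small made-up DataFrame (the column name x and the 13 rows are assumptions for illustration) and checks that they select the same rows.

```python
import pandas as pd

df = pd.DataFrame({"x": range(13)})

front_a, back_a = df.head(10), df.tail(len(df) - 10)
front_b, back_b = df[:10], df[10:]
front_c, back_c = df.iloc[:10, :], df.iloc[10:, :]

# All three spellings select the same rows
print(front_a.equals(front_b) and front_b.equals(front_c))  # True
print(back_a.equals(back_b) and back_b.equals(back_c))      # True
```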

Be careful about using df.loc, however: it is label-based, so the input will never be interpreted as an integer position. .loc would only "accidentally" work when your index labels happen to be integers starting at 0 with no gaps. Even then, .loc slices include the end label, so df.loc[:10] returns 11 rows rather than 10.
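A small sketch of the difference, using made-up data whose integer labels start at 100 rather than 0 (the data and variable names are assumptions for illustration):

```python
import pandas as pd

# Integer labels that do not start at 0 (hypothetical data)
df = pd.DataFrame({"x": range(12)}, index=range(100, 112))

by_position = df.iloc[:10]  # first 10 rows
by_label = df.loc[:10]      # labels up to and including 10: none exist here

print(len(by_position), len(by_label))  # 10 0

# With a default 0..n-1 index, .loc still includes the end label
default = pd.DataFrame({"x": range(12)})
print(len(default.loc[:10]), len(default.iloc[:10]))  # 11 10
```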

But you should also consider the options pandas provides for dumping the contents of a DataFrame to HTML (DataFrame.to_html) and possibly also LaTeX (DataFrame.to_latex), which make better designed tables for a presentation than copying and pasting. Searching for how to convert a DataFrame to these formats turns up lots of tutorials and advice for exactly this application.
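For instance, a minimal sketch of the HTML route on made-up data (the column names are assumptions for illustration). DataFrame.to_latex is called in a similar way, though recent pandas versions may need the optional jinja2 dependency for it.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# With no output path given, to_html returns the table as an HTML string
html = df.to_html(index=False)
print(html)
```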
