Does pandas automatically skip rows due to a size limit?


Question

We all know the usual question that comes up when you run into a memory error: Maximum size of pandas dataframe

I am trying to read 4 large CSV files with the following command:

import glob
import pandas as pd

files = glob.glob("C:/.../rawdata/*.csv")
dfs = [pd.read_csv(f, sep="\t", encoding='unicode_escape') for f in files]
df = pd.concat(dfs, ignore_index=True)

The only message I get is:

C:..\conda\conda\envs\DataLab\lib\site-packages\IPython\core\interactiveshell.py:3214: DtypeWarning: Columns (22,25,56,60,71,74) have mixed types. Specify dtype option on import or set low_memory=False.
  if (yield from self.run_code(code, result)):

That should not be a problem.
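For reference, the warning names two options, `dtype` and `low_memory=False`. Below is a minimal sketch of the first one. It assumes a small in-memory sample in place of the real files, and the column names `id` and `code` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample; the "code" column mixes numbers and text.
sample = "id\tcode\n1\t100\n2\tA7\n3\t205\n"

# Reading the mixed column as str gives it one consistent type, so pandas
# does not have to infer the type separately for each chunk of the file.
df_typed = pd.read_csv(io.StringIO(sample), sep="\t", dtype={"code": str})

print(df_typed.dtypes)
print(len(df_typed))
```

The warning concerns column types only. By itself it does not indicate that any rows were dropped.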

My combined dataframe has a shape of (6639037, 84).

Could there be a data size limit that applies without raising a memory error? In other words, could Python be silently skipping some lines without telling me? I ran into this with another program in the past. I don't think Python is that lazy, but you never know.
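One way to check for silently dropped rows is to count the records independently of pandas and compare that count with the dataframe length. The sketch below assumes a small temporary file in place of the real data. The file name `sample.csv` and its columns are hypothetical.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical sample file standing in for one of the real CSV files.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.csv")
with open(path, "w", encoding="utf-8", newline="") as fh:
    fh.write("id\tvalue\n1\ta\n2\tb\n3\tc\n")

# Count non-empty records with the csv module, excluding the header row.
with open(path, encoding="utf-8", newline="") as fh:
    expected_rows = sum(1 for row in csv.reader(fh, delimiter="\t") if row) - 1

df_check = pd.read_csv(path, sep="\t")

# If the two numbers differ, some lines were merged, split, or skipped.
print(expected_rows, len(df_check))
```

For several files, the same comparison can be repeated per file before concatenating, which narrows down where a mismatch comes from.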

Further information: later I save the dataframe to a SQLite file, but I don't think that should be a problem either:

import sqlite3

conn = sqlite3.connect('C:/.../In.db')
df.to_sql(name='rawdata', con=conn, if_exists='replace', index=False)
conn.commit()
conn.close()
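To confirm that the SQLite step keeps every row, the stored row count can be compared with the dataframe length. This is a minimal sketch that assumes an in-memory database and a small hypothetical dataframe `df_small` instead of the real `In.db` and `df`.

```python
import sqlite3

import pandas as pd

# Hypothetical small dataframe standing in for the real one.
df_small = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# An in-memory database is used here so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
df_small.to_sql(name="rawdata", con=conn, if_exists="replace", index=False)

# Ask SQLite how many rows it actually stored.
stored_rows = conn.execute("SELECT COUNT(*) FROM rawdata").fetchone()[0]
conn.close()

print(stored_rows == len(df_small))
```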

Recommended answer

It turned out that there was an error in how the files were read, so thanks to @Oleg O for the help and the tips on reducing memory use.

For now, I do not think that pandas skips lines automatically. It only happened with the wrong encoding. My example can be found here: Pandas read csv skips some lines
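As an illustration of how a codec choice can change the line structure of a file, the sketch below decodes the same bytes twice. It is an assumption about one possible mechanism, not a diagnosis of the original files, and the sample bytes are hypothetical. The `unicode_escape` codec interprets backslash sequences in the data, so a literal `\n` inside a value becomes a real line break and a literal `\t` becomes a real tab.

```python
# Hypothetical raw bytes: a header and two tab-separated rows whose values
# contain backslashes (Windows-style paths).
raw = b"id\tpath\n1\tC:\\new\n2\tD:\\temp\n"

# Decoded as UTF-8, the backslashes stay literal characters.
as_utf8 = raw.decode("utf-8")

# Decoded with unicode_escape, "\n" in "C:\new" turns into a line break
# and "\t" in "D:\temp" turns into an extra tab (an extra field).
as_escaped = raw.decode("unicode_escape")

print(len(as_utf8.splitlines()))
print(len(as_escaped.splitlines()))
```

With these bytes the first decode yields 3 lines and the second yields 4, so the row and column layout a parser sees depends on the encoding passed in.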
