pandas dataframe read_csv, specify columns and keep whole line as a string

Views: 73 Published: 2021/5/15 20:51:42 python pandas import


Problem description

In pandas read_csv, is there a way to specify e.g. col1, col15, wholeline?

I am trying to import about 700000 rows of data from a text file that uses carets ('^') as delimiters, has no text qualifiers, and uses a carriage return as the line delimiter.

From the text file I need column 1, column 15, and then the whole line, as three columns of a table/dataframe.

I've searched for how to do this in pandas, but I don't know it well enough to work out the logic. I can import all 26 columns fine, but that doesn't solve my problem.

import pandas as pd
my_df = pd.read_csv("tablefile.txt", sep="^", lineterminator="\r", low_memory=False)

Or I can use standard Python to put the data into a table, but that takes about 4 hours for the 700000 rows, which is far too long for me.

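import re

# cur and conn are an already-open database cursor and connection (setup not shown).
count_1 = 0
for line in open('tablefile.txt'):
    if count_1 > 70:  # limits the run to the first 71 lines
        break
    else:
        # column 1: the digits before the first '^'
        col1id = re.findall(r'^(\d+)\^', line)
        # intended to capture column 15 (a run of digits between carets)
        col15id = re.findall(r'^.*\^.*\^(\d+)\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*', line)
        line = line.strip()

        count_1 = count_1 + 1

        cur.execute('''INSERT INTO mytable (mycol1id, mycol15id, wholeline) VALUES (?, ?, ?)''',
                    (col1id[0], col15id[0], line))

        conn.commit()  # commits once per row
    print('row count_1=', count_1)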

In pandas read_csv, is there a way to specify e.g. col1, col15, wholeline?

As above, col1 and col15 are digits and wholeline is a string.

I don't want to rebuild the string after importing, because I might lose some characters in the process.

Thanks
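
One possible way to get both numeric columns together with the untouched line is sketched below. It is an illustration under assumptions, not an answer taken from the original thread: each line is split on '^' only to pick out fields 1 and 15, and the line as read is kept alongside them, so nothing has to be rebuilt. The function name `extract_rows` and the sample data are hypothetical.

```python
def extract_rows(lines, sep="^", first_index=0, second_index=14):
    """Yield (col1, col15, wholeline) for each non-empty line.

    Indexes are zero-based, so 0 and 14 pick columns 1 and 15.
    The line is kept as read (minus its terminator), never rebuilt.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")  # drop only the line terminator
        if not line:
            continue
        fields = line.split(sep)
        yield int(fields[first_index]), int(fields[second_index]), line


# Hypothetical sample: two lines of 26 '^'-separated fields each.
sample = [
    "^".join(["101"] + ["x"] * 13 + ["555"] + ["y"] * 11) + "\r",
    "^".join(["102"] + ["x"] * 13 + ["777"] + ["y"] * 11) + "\r",
]
rows = list(extract_rows(sample))
```

With a real file, Python's default text mode treats a lone carriage return as a line ending, so iterating over `open("tablefile.txt")` should feed this function line by line. If a DataFrame is wanted, the rows can be passed to `pd.DataFrame(rows, columns=["col1", "col15", "wholeline"])`.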

Committing to the database for each line was burning time.
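
A minimal sketch of the alternative, inserting all rows as one batch and committing once, might look like the following. It assumes the database is SQLite (the `?` placeholders in the question fit `sqlite3`, but the question does not say so); the in-memory database, the column types, and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite; ":memory:" keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE mytable (mycol1id INTEGER, mycol15id INTEGER, wholeline TEXT)"
)

# Hypothetical rows in (col1, col15, wholeline) form.
rows = [
    (101, 555, "101^a^555"),
    (102, 777, "102^b^777"),
]

# Insert the whole batch, then commit once instead of once per row.
cur.executemany(
    "INSERT INTO mytable (mycol1id, mycol15id, wholeline) VALUES (?, ?, ?)",
    rows,
)
conn.commit()

inserted = cur.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
```

For 700000 rows the same pattern applies; the rows could also be fed in chunks, with one commit per chunk, if holding them all in memory is a concern.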
