Converting pandas columns to PostgreSQL list?
Question
I'm working with a CSV of a few hundred columns, many of which are just enumerations, i.e.:
[
['code_1', 'code_2', 'code_3', ..., 'code_50'],
[1, 2, 3, ..., 50],
[2, 3, 4, ..., 51],
...
[400000, 400001, 400002, ..., 400049]
]
I'm importing this data into PostgreSQL and would like to concatenate these columns into a single array column, such as:
[
['codes'],
['{1, 2, 3, ..., 50}']
]
etc.
I'm aware of roundabout ways to accomplish this, such as
df['codes'] = pd.DataFrame(["{" + df['code_1'] + ", " + df['code_2'] + "}"]).T
but that's a lot of redundant code to write and maintain given the size of this CSV.
What I have to work with is a list of column names. I've already extracted the enumerated columns, such as:
codes = [
'code_1',
'code_2',
'code_3',
...
]
Before I begin writing my own custom "implode_columns(arr)" function, is there anything in pandas that already solves this problem, or that accommodates PostgreSQL arrays in a convenient way?
Recommended answer
This assumes that you are already connected to PostgreSQL and that the table already exists there. If not, see https://wiki.postgresql.org/wiki/Psycopg2_Tutorial
import psycopg2

try:
    conn = psycopg2.connect("host='localhost' dbname='template1' user='dbuser' password='dbpass'")
    cur = conn.cursor()  # cursor used by the INSERT examples below
except psycopg2.Error:
    print("I am unable to connect to the database")
First, open the .csv file.
>>> import csv
>>> with open('names.csv') as csvfile:
...     reader = csv.DictReader(csvfile)
...     for row in reader:
...         print(row['first_name'], row['last_name'])
...
That example is from https://docs.python.org/2/library/csv.html
Replace the print line with an insert into PostgreSQL.
>>> import psycopg2
>>> cur.execute("INSERT INTO test (num, data) VALUES (%s, %s)",
... (100, "abc'def"))
You can replace (100, "abc'def") with (variable1, variable2), where those are values taken from the current row. See http://initd.org/psycopg/docs/usage.html
Or, as a full code sample:
>>> import csv
>>> import psycopg2
>>> with open('names.csv') as csvfile:
...     reader = csv.DictReader(csvfile)
...     for row in reader:
...         # variable1 and variable2 stand for values taken from row
...         cur.execute("INSERT INTO test (num, data) VALUES (%s, %s)", (variable1, variable2))
...
>>> conn.commit()  # persist the inserted rows
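The snippets above insert scalar values only. To get the single "codes" array the question asks for, one option is to collect the enumerated columns of each row into a Python list before inserting. Below is a minimal sketch of that step, not a definitive implementation. The sample data, the CODE_COLUMNS list, and the helper names implode_columns and to_pg_array_literal are hypothetical and only illustrate the idea; the real column list would be the one the asker already extracted.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
SAMPLE_CSV = "id,code_1,code_2,code_3\n10,1,2,3\n11,2,3,4\n"

# Names of the enumerated columns, like the `codes` list in the question.
CODE_COLUMNS = ["code_1", "code_2", "code_3"]


def implode_columns(row, columns):
    """Collect the given columns of one CSV row into a list of ints."""
    return [int(row[name]) for name in columns]


def to_pg_array_literal(values):
    """Render a list of ints as a PostgreSQL array literal such as '{1,2,3}'."""
    return "{" + ",".join(str(v) for v in values) + "}"


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = [implode_columns(row, CODE_COLUMNS) for row in reader]
literals = [to_pg_array_literal(r) for r in rows]
print(literals)
```

With psycopg2 the text literal is usually unnecessary, because psycopg2 adapts a Python list to a PostgreSQL array when it is passed as a query parameter. Each list in rows could therefore be passed directly to cur.execute for an integer[] column. If the data is already in a pandas DataFrame, df[codes].values.tolist() should produce the same per-row lists, assuming codes holds the column names.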
Hope this helps.