How to update a Postgres table column using a pandas data frame?
Question
I am adding a single column to a Postgres table with 100+ columns via Django (a new migration). How can I update a column in a PostgreSQL table with the data from a pandas data_frame? The pseudo-code for the Postgres SQL UPDATE would be:
UPDATE wide_table wt
SET wt.z = df.z
WHERE date = 'todays_date'
The reason for doing it this way is that I am computing a column in the data_frame using a CSV that is in S3 (this is df.z). The docs for the Postgres UPDATE are straightforward to use, but I am unsure how to do this via Django, sqlalchemy, pyodbc, or the like.
I apologize if this is a bit convoluted. A small and incomplete example would be:
identifier | x | y | z | date
foo | 2 | 1 | 0.0 | ...
bar | 2 | 8 | 0.0 | ...
baz | 3 | 7 | 0.0 | ...
foo | 2 | 8 | 0.0 | ...
foo | 1 | 5 | 0.0 | ...
baz | 2 | 8 | 0.0 | ...
bar | 9 | 3 | 0.0 | ...
baz | 2 | 3 | 0.0 | ...
Example Python snippet
def apply_function(identifier):
    # Maps baz -> 15.0, bar -> 19.6, foo -> 10.0 for a single date
    df = pd.read_csv("s3_file_path/date_file_name.csv")
    # Compute 'z' based on the identifier and the S3 CSV
    return z

postgres_query = "SELECT identifier FROM wide_table"
df = pd.read_sql(sql=postgres_query, con=engine)
df['z'] = df.identifier.apply(apply_function)

# Python / SQL update logic here to update the Postgres column
???
Wide table (after updating column z)
identifier | x | y | z | date
foo | 2 | 1 | 10.0 | ...
bar | 2 | 8 | 19.6 | ...
baz | 3 | 7 | 15.0 | ...
foo | 2 | 8 | 10.0 | ...
foo | 1 | 5 | 10.0 | ...
baz | 2 | 8 | 15.0 | ...
bar | 9 | 3 | 19.6 | ...
baz | 2 | 3 | 15.0 | ...
NOTE: The values in z will change daily, so simply creating another table to hold these z values is not a great solution. Also, I'd really prefer to avoid deleting all of the data and adding it back.
Answer
I managed to cobble together a solution myself where I zip the id and z values and then execute a generic SQL UPDATE statement utilizing UPDATE ... FROM (VALUES ...).
Data preparation
sql_query = "SELECT id, a FROM wide_table"
df = pd.read_sql(sql=sql_query, con=engine)
df['z'] = df.a.apply(apply_function)

zipped_vals = zip(df.id, df.z)
tuple_to_str = str(tuple(zipped_vals))
entries_to_update = tuple_to_str[1:-1]  # strip the outer parentheses of the tuple-of-tuples
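To make the string construction above concrete, here is a tiny self-contained illustration with made-up ids and z values (plain Python lists stand in for the DataFrame columns; with a real DataFrame, `df.id.tolist()` / `df.z.tolist()` yield plain Python scalars and avoid numpy repr surprises):

```python
# Illustration with made-up data: how the zipped pairs become a
# SQL-ready VALUES list.
ids = [1, 2, 3]
zs = [10.0, 19.6, 15.0]

zipped_vals = zip(ids, zs)
tuple_to_str = str(tuple(zipped_vals))  # "((1, 10.0), (2, 19.6), (3, 15.0))"
entries_to_update = tuple_to_str[1:-1]  # strip the outer parentheses

print(entries_to_update)  # -> (1, 10.0), (2, 19.6), (3, 15.0)
```

Caution: plain string interpolation like this is only safe for trusted numeric data; string identifiers would need quoting, and untrusted input invites SQL injection.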
SQL query solution:
# Update column z by matching id between the SQL table and the pandas DataFrame
update_sql_query = f"""UPDATE wide_table t SET z = v.z
FROM (VALUES {entries_to_update}) AS v (id, z)
WHERE t.id = v.id;"""

with engine.begin() as conn:
    conn.execute(update_sql_query)
    conn.execute(sql_query)  # optional: re-read to verify the update
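As an alternative to interpolating values into the SQL string, a parameterized executemany-style update sidesteps quoting and injection concerns, at the cost of issuing one UPDATE per row instead of a single VALUES join. A minimal sketch, not from the original answer: the function name `update_z_column` is mine, and it assumes a SQLAlchemy engine plus a DataFrame with `identifier` and `z` columns, as in the question.

```python
import pandas as pd
from sqlalchemy import text

def update_z_column(engine, df):
    """Parameterized per-row update of wide_table.z, matched on identifier."""
    stmt = text("UPDATE wide_table SET z = :z WHERE identifier = :identifier")
    # One dict of bind parameters per DataFrame row
    params = df[["identifier", "z"]].to_dict(orient="records")
    with engine.begin() as conn:    # begin() commits on success, rolls back on error
        conn.execute(stmt, params)  # a list of dicts runs as executemany
```

For a daily refresh on a large table, the single UPDATE ... FROM (VALUES ...) statement above will generally be faster; the per-row form trades speed for safety and portability.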
Answer on updating PostgreSQL table column from values