Not able to add a column from a pandas DataFrame to MySQL in Python
Problem description
I have connected to MySQL from Python, and I can write a whole data frame to the database with df.to_sql. However, when I try to add or update a single column from a pd.DataFrame, I am not able to update or add it.
Here is some information about the dataset, result:
In [221]: result.shape
Out[221]: (226, 5)
In [223]: result.columns
Out[223]: Index([u'id', u'name', u'height', u'weight', u'categories'], dtype='object')
The table already exists in the database with every column except categories, so I only need to add that column to the table. Going by related posts (for example, "ProgrammingError: (1064, You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax"), I tried:
cursor.execute("ALTER TABLE content_detail ADD category VARCHAR(255)" % result["categories"])
This successfully adds the column, but with all NULL values. When I tried this instead:
cursor.execute("ALTER TABLE content_detail ADD category=%s VARCHAR(255)" % result["categories"])
it ended with the following error:
ProgrammingError Traceback (most recent call last)
<ipython-input-227-ab21171eee50> in <module>()
----> 1 cur.execute("ALTER TABLE content_detail ADD category=%s VARCHAR(255)" % result["categories"])
/usr/lib/python2.7/dist-packages/mysql/connector/cursor.pyc in execute(self, operation, params, multi)
505 self._executed = stmt
506 try:
--> 507 self._handle_result(self._connection.cmd_query(stmt))
508 except errors.InterfaceError:
509 if self._connection._have_next_result: # pylint: disable=W0212
/usr/lib/python2.7/dist-packages/mysql/connector/connection.pyc in cmd_query(self, query)
720 if not isinstance(query, bytes):
721 query = query.encode('utf-8')
--> 722 result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
723
724 if self._have_next_result:
/usr/lib/python2.7/dist-packages/mysql/connector/connection.pyc in _handle_result(self, packet)
638 return self._handle_eof(packet)
639 elif packet[4] == 255:
--> 640 raise errors.get_exception(packet)
641
642 # We have a text result set
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '=0 corporate
1 corporate
I think I am missing something about the data type. Please help me sort this out, thanks.
Recommended answer
You cannot add a column to your table and fill it with data in a single step. You need at least two separate statements: the DDL first (ALTER TABLE) and the DML second (UPDATE or INSERT ... ON DUPLICATE KEY UPDATE).
This means that adding a column with a NOT NULL constraint requires three steps:
- Add a nullable column
- Populate the column with values in every row
- Add the NOT NULL constraint to the column
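The three steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name add_category_column is hypothetical, and it assumes that id uniquely identifies each row in content_detail and that the cursor comes from a DB-API driver that uses %s placeholders, such as mysql.connector.

```python
def add_category_column(cursor, id_category_pairs):
    """Add a NOT NULL category column to content_detail in three steps.

    Assumptions (not from the original post): `id` is a unique key in
    content_detail, and `cursor` uses the %s placeholder style.
    """
    # Step 1 (DDL): add the column as nullable so existing rows are accepted.
    cursor.execute(
        "ALTER TABLE content_detail ADD COLUMN category VARCHAR(255) NULL"
    )

    # Step 2 (DML): fill in each row. The values are passed separately
    # from the SQL text, so the driver handles quoting and escaping.
    params = [(category, row_id) for row_id, category in id_category_pairs]
    cursor.executemany(
        "UPDATE content_detail SET category = %s WHERE id = %s", params
    )

    # Step 3 (DDL): tighten the constraint once every row has a value.
    cursor.execute(
        "ALTER TABLE content_detail MODIFY category VARCHAR(255) NOT NULL"
    )
```

With the data frame from the question, the pairs could be built as zip(result["id"].tolist(), result["categories"].tolist()), which yields plain Python values rather than NumPy scalars. Depending on the connection's autocommit setting, a commit on the connection may be needed for the updates to persist.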
Alternatively, by using a "dummy" default value, you can do it in two steps (just be careful not to leave any "dummy" values floating around, or use values that are meaningful and well documented):
- Add the column as NOT NULL DEFAULT '' (or use e.g. 0 for numeric types)
- Populate the column with values in every row
You can optionally alter the table again to remove the DEFAULT value. Personally, I prefer the first method because it doesn't introduce meaningless values into your table, and it's more likely to throw an error if the second step has a problem. I might go with the second method when a column lends itself to a natural DEFAULT value that I plan to keep in the final table definition.
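For comparison, the two-step variant with a placeholder default, including the optional removal of the DEFAULT afterwards, might look like the sketch below. As before, the function name and the drop_default flag are hypothetical, and id is assumed to be a unique key in content_detail.

```python
def add_category_column_with_default(cursor, id_category_pairs, drop_default=True):
    """Add a NOT NULL category column using a temporary '' default.

    Assumptions (not from the original post): `id` is a unique key in
    content_detail, and `cursor` uses the %s placeholder style.
    """
    # Step 1 (DDL): existing rows receive the dummy default ''.
    cursor.execute(
        "ALTER TABLE content_detail "
        "ADD COLUMN category VARCHAR(255) NOT NULL DEFAULT ''"
    )

    # Step 2 (DML): overwrite the dummy value in each row.
    params = [(category, row_id) for row_id, category in id_category_pairs]
    cursor.executemany(
        "UPDATE content_detail SET category = %s WHERE id = %s", params
    )

    # Optional: remove the default so later inserts must supply a value.
    if drop_default:
        cursor.execute(
            "ALTER TABLE content_detail ALTER COLUMN category DROP DEFAULT"
        )
```

Note that any row missed in step 2 silently keeps the empty string here, whereas in the three-step version the final ALTER TABLE would be expected to complain about remaining NULL values.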
Additionally, you are not parameterizing your query correctly. You should pass the parameter values to the method as a separate argument rather than formatting them into the string argument inside the method call. In other words:
cursor.execute("Query with %s, %s, ...", iterable_with_values) # Do this!
cursor.execute("Query with %s, %s, ..." % iterable_with_values) # NOT this!
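To see why the difference matters, it helps to look at what the % operator produces before the driver ever sees the statement. The snippet below is only an illustration: the category value is made up, and no database connection is involved.

```python
# Hypothetical value that happens to contain a single quote.
category = "arts' & crafts"

# NOT this: Python builds the final SQL text itself, so the quote inside
# the value ends up unescaped in the statement.
formatted = "UPDATE content_detail SET category = '%s' WHERE id = 1" % category
print(formatted)

# Do this: keep the statement and its values separate, and let the
# driver substitute and escape them.
query = "UPDATE content_detail SET category = %s WHERE id = %s"
params = (category, 1)
# cursor.execute(query, params)  # with a real DB-API cursor
```

In the question, the object being formatted was an entire Series, so its text representation (index labels and values) was pasted into the statement, which is consistent with the fragment near '=0 corporate shown in the error message. Also keep in mind that placeholders stand in for values only; they cannot supply column names or parts of an ALTER TABLE definition.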