Cannot append dataframes with pandas 0.17.1 but can with pandas 0.14.1

Question

I have two dataframes, c and h, shown below.

c pickle file: http://s000.tinyupload.com/?file_id=64255815375060941529
h pickle file: http://s000.tinyupload.com/?file_id=98284988001290720556

When I write c.append(h) I get TypeError: data type not understood, but ONLY when I run pandas 0.17.1. If I run the same code in pandas 0.14.1, the dataframes are appended correctly. What's going on, and how can I modify my dataframes so that they append correctly in 0.17.1?

Here are the first rows of each dataframe:

In [49]: h.head(3)
Out[49]: 
   report_id adv_firm_key  manager_id         filing_manager_name  \
0      45497       105129       20984  Bridgewater Associates, LP   
1      45497       105129       20984  Bridgewater Associates, LP   
2      45497       105129       20984  Bridgewater Associates, LP   

  report_period            issuer_name      cusip  position_value  quantity  \
0    2015-12-31            ABBOTT LABS  002824100         1745000     38857   
1    2015-12-31  ACCENTURE PLC IRELAND  G1151C101          512000      4900   
2    2015-12-31          ADOBE SYS INC  00724F101         9157000     97479   

  principal_type put_or_call                  sector  total_holding_value  \
0             SH           X             Health Care           7707722000   
1             SH           X  Information Technology           7707722000   
2             SH           X  Information Technology           7707722000   

   total_holding_value_calculated   market_cap  shares_float      beta symbol  \
0                      7707722000  66993140300    1488070000  0.924138    ABT   
1                      7707722000  67773564900     626355000  0.985543    ACN   
2                      7707722000  46848347700     496787000  1.099186   ADBE   

   allocation  portfolio_value  
0      300000          2000000  
1      300000          2000000  
2      300000          2000000  

In [50]: c.head(3)
Out[50]: 
  put_or_call  position_value report_date  fund_id report_period  \
0           X           10000  2015-11-02      502    2015-12-31   
1           X           10000  2015-11-02      502    2015-12-31   
2           X           10000  2015-11-02      502    2015-12-31   

   underlying_id  quantity   side                       created_at  report_id  \
0           1001         5  Short 2016-03-16 17:31:57.003792+00:00        NaN   
1           1001         5  Short 2016-03-16 17:31:57.003792+00:00        NaN   
2           1001         5  Short 2016-03-16 17:31:57.003792+00:00        NaN   

    ...     adv_firm_key                     filing_manager_name symbol  \
0   ...           155680  Davidson Kempner Capital Management LP   AAOI   
1   ...           155680  Davidson Kempner Capital Management LP   AAOI   
2   ...           155680  Davidson Kempner Capital Management LP   AAOI   

                       sector      cusip                     issuer_name  \
0  Telecommunication Services  03823U102     APPLIED OPTOELECTRONICS INC   
1  Telecommunication Services  03823U102  APPLIED OPTOELECTRONICSINC COM   
2  Telecommunication Services  03823U102     APPLIED OPTOELECTRONICS INC   

  principal_type market_cap  shares_float     beta  
0             SH  288734200      14566500  1.45758  
1             SH  288734200      14566500  1.45758  
2             SH  288734200      14566500  1.45758  

[3 rows x 21 columns]

Here is the stack trace:

In [11]: pd.concat([c,h])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-943f474750e7> in <module>()
----> 1 pd.concat([c,h])

/usr/local/miniconda/envs/analytics-env/lib/python2.7/site-packages/pandas/tools/merge.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
    833                        verify_integrity=verify_integrity,
    834                        copy=copy)
--> 835     return op.get_result()
    836 
    837 

/usr/local/miniconda/envs/analytics-env/lib/python2.7/site-packages/pandas/tools/merge.py in get_result(self)
   1023             new_data = concatenate_block_managers(
   1024                 mgrs_indexers, self.new_axes,
-> 1025                 concat_axis=self.axis, copy=self.copy)
   1026             if not self.copy:
   1027                 new_data._consolidate_inplace()

/usr/local/miniconda/envs/analytics-env/lib/python2.7/site-packages/pandas/core/internals.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
   4472                                                 copy=copy),
   4473                          placement=placement)
-> 4474               for placement, join_units in concat_plan]
   4475 
   4476     return BlockManager(blocks, axes)

/usr/local/miniconda/envs/analytics-env/lib/python2.7/site-packages/pandas/core/internals.py in concatenate_join_units(join_units, concat_axis, copy)
   4569     to_concat = [ju.get_reindexed_values(empty_dtype=empty_dtype,
   4570                                          upcasted_na=upcasted_na)
-> 4571                  for ju in join_units]
   4572 
   4573     if len(to_concat) == 1:

/usr/local/miniconda/envs/analytics-env/lib/python2.7/site-packages/pandas/core/internals.py in get_reindexed_values(self, empty_dtype, upcasted_na)
   4823             if self.is_null and not getattr(self.block, 'is_categorical',
   4824                                             None):
-> 4825                 missing_arr = np.empty(self.shape, dtype=empty_dtype)
   4826                 if np.prod(self.shape):
   4827                     # NumPy 1.6 workaround: this statement gets strange if all

TypeError: data type not understood

Recommended answer

There is a bug, 11351, where this case is handled incorrectly.

If you try to add a new column created_at, which is missing in h, and then concat:

h['created_at'] = np.nan
new = pd.concat([h,c])

you get this error:

AttributeError: 'numpy.ndarray' object has no attribute 'tz_localize'
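
Both errors involve created_at, which exists only in c and carries a UTC offset. As a rough diagnostic, the sketch below lists the columns that one frame has and the other lacks, and the columns with a timezone-aware dtype. The helper name tz_aware_columns and the two toy frames are made up for illustration; they stand in for the real c and h and are not taken from the pickle files.

```python
import pandas as pd


def tz_aware_columns(df):
    """Return the names of columns whose dtype is a timezone-aware datetime."""
    return [name for name, dtype in df.dtypes.items()
            if isinstance(dtype, pd.DatetimeTZDtype)]


# Toy stand-ins for c and h (hypothetical data, not the original frames).
c = pd.DataFrame({
    "symbol": ["AAOI", "AAOI"],
    "created_at": pd.to_datetime(
        ["2016-03-16 17:31:57.003792", "2016-03-16 17:31:57.003792"]
    ).tz_localize("UTC"),
})
h = pd.DataFrame({"symbol": ["ABT", "ACN"]})

# Columns present in c but absent from h; concat has to fill these for h's rows.
missing_in_h = list(c.columns.difference(h.columns))

# Timezone-aware columns in each frame.
tz_in_c = tz_aware_columns(c)
tz_in_h = tz_aware_columns(h)

print(missing_in_h, tz_in_c, tz_in_h)
```

A column that shows up in both missing_in_h and tz_in_c is a likely candidate for the conversion described next.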

One solution is to convert the datetime column to strings before concatenating, then convert it back:

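# Cast the datetime column to plain strings so concat has no tz-aware block to handle
c['created_at'] = c['created_at'].astype(str)
new = pd.concat([h,c])
# Parse the strings back into datetimes once the frames are combined
new['created_at'] = pd.to_datetime(new['created_at'])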
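
The same round trip can be wrapped in a small helper so the original frames are left untouched. This is only a sketch under stated assumptions: concat_via_strings is a hypothetical name, the toy frames stand in for the real c and h, and passing utc=True when parsing is a choice made here (the answer above calls pd.to_datetime without it) so that the result is timezone-aware again.

```python
import pandas as pd


def concat_via_strings(frames, column):
    """Concatenate frames after casting `column` to str, then parse it back.

    Frames that lack `column` are passed through unchanged; their rows end
    up as NaT in the result.
    """
    prepared = []
    for df in frames:
        df = df.copy()  # do not modify the caller's frame
        if column in df.columns:
            df[column] = df[column].astype(str)
        prepared.append(df)
    out = pd.concat(prepared, ignore_index=True)
    # utc=True is an assumption: it yields a timezone-aware UTC column.
    out[column] = pd.to_datetime(out[column], utc=True)
    return out


# Toy stand-ins for c and h (hypothetical data, not the original frames).
c = pd.DataFrame({
    "symbol": ["AAOI"],
    "created_at": pd.to_datetime(["2016-03-16 17:31:57.003792"]).tz_localize("UTC"),
})
h = pd.DataFrame({"symbol": ["ABT"]})

new = concat_via_strings([h, c], "created_at")
print(new)
```

Rows that came from h have no created_at value, so they hold NaT after parsing.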
