How to do intersection of dataframes in pandas

Problem description

I have a dataframe like this:

<table border="1" class="dataframe">  <thead>    <tr style="text-align: right;">      <th></th>      <th>Title</th>      <th>ASIN</th>      <th>State</th>      <th>SellerSKU</th>      <th>Quantity</th>      <th>FBAStock</th>      <th>QuantityToShip</th>    </tr>  </thead>  <tbody>    <tr>      <th>1</th>      <td>Daedal crafters- Pack of Two Gajra (Orange and...</td>      <td>B075T64ZWJ</td>      <td>WEST BENGAL</td>      <td>DC216</td>      <td>1</td>      <td>0</td>      <td>1</td>    </tr>    <tr>      <th>2</th>      <td>Daedal Dream Catchers - Intricate Web Design(B...</td>      <td>B06XBRRYVK</td>      <td>KARNATAKA</td>      <td>DDC63BB</td>      <td>1</td>      <td>24</td>      <td>0</td>    </tr>    <tr>      <th>3</th>      <td>Daedal Dream Catchers- Blue and White Four Rin...</td>      <td>B07428QBJ9</td>      <td>MAHARASHTRA</td>      <td>12-16RT-1H8B</td>      <td>1</td>      <td>4</td>      <td>0</td>    </tr>    <tr>      <th>4</th>      <td>Daedal dream catchers- Crescent wine DDC21</td>      <td>B01DI70P9W</td>      <td>UTTAR PRADESH</td>      <td>70-PK4Z-6VSP</td>      <td>1</td>      <td>10</td>      <td>0</td>    </tr>  </tbody></table>

The columns are:

Title   ASIN    State   SellerSKU   Quantity    FBAStock    QuantityToShip 

I have another dataframe that contains a subset of the rows of the dataframe above. Only the "Quantity" column is changed in it, and it has these columns:

ASIN State Quantity

How do I intersect or merge this smaller dataframe with the first one so that the Quantity of the smaller dataframe overwrites the original Quantity wherever the ASIN and State columns match?

If this can be done by merging, how? I'm not familiar with SQL-style merge terms such as 'inner' and 'left'.

I am modifying the original DF like this:

# Count the rows in each (State, ASIN, Quantity) group
new = originalDF.groupby(['State', 'ASIN', 'Quantity']).size().reset_index(name='Count')

# Total quantity per group = quantity * number of rows in the group
new['Quantity'] = new['Quantity'] * new['Count']
new = new.drop(columns='Count')

Now I want to bring the columns of originalDF into the new DF by matching on the ASIN and State columns of the new DF (the Quantity column of the new DF is what I want in the final dataframe).

Recommended answer

I believe you want transform, which returns the size of each group aligned to the original rows, and then multiply the Quantity column by it with *=:

import pandas as pd

originalDF = pd.DataFrame({'State':list('aaabbb'),
                           'ASIN':list('cfcccc'),
                           'Quantity':[100] * 6})

# Multiply each row's Quantity by the number of rows in its group
originalDF['Quantity'] *= (originalDF.groupby(['State' ,'ASIN' , 'Quantity'])['State']
                                    .transform('size'))

print(originalDF)
  State ASIN  Quantity
0     a    c       200
1     a    f       100
2     a    c       200
3     b    c       300
4     b    c       300
5     b    c       300
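
If the goal is one row per State/ASIN pair, as in the question's `new` frame, the repeated rows can be dropped after the transform. The following is a minimal sketch, not part of the original answer. It assumes the remaining columns are identical within a group, so keeping the first row of each pair loses nothing. The name `result` is only illustrative.

```python
import pandas as pd

originalDF = pd.DataFrame({'State': list('aaabbb'),
                           'ASIN': list('cfcccc'),
                           'Quantity': [100] * 6})

# Multiply each row's Quantity by the number of rows in its group
originalDF['Quantity'] *= (originalDF.groupby(['State', 'ASIN', 'Quantity'])['State']
                                     .transform('size'))

# Keep the first row of each (State, ASIN) pair; every other column is retained
# (assumption: the other columns do not differ within a pair)
result = originalDF.drop_duplicates(subset=['State', 'ASIN']).reset_index(drop=True)
print(result)
```

Because `drop_duplicates` keeps whole rows, any extra columns such as Title or SellerSKU come along without a separate merge step.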

Details:

print(originalDF.groupby(['State' ,'ASIN' , 'Quantity'])['State']
                .transform('size'))

0    2
1    1
2    2
3    3
4    3
5    3
Name: State, dtype: int64
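
The question also asks how this could be done with a merge. The sketch below is one possible approach, not the answerer's method. It assumes a smaller frame, here called `smallDF` (a hypothetical name), holding updated quantities for some ASIN/State pairs, and uses made-up sample values. With `how='left'` every row of `originalDF` is kept and unmatched rows get NaN in the incoming column. With `how='inner'` only rows present in both frames survive.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question
originalDF = pd.DataFrame({
    'Title': ['t1', 't2', 't3'],
    'ASIN': ['A1', 'A2', 'A3'],
    'State': ['KA', 'MH', 'UP'],
    'Quantity': [1, 1, 1],
})

# Smaller frame: a subset of the rows, with updated quantities
smallDF = pd.DataFrame({
    'ASIN': ['A1', 'A3'],
    'State': ['KA', 'UP'],
    'Quantity': [5, 7],
})

# Left merge: keep all rows of originalDF; the incoming column is
# named Quantity_new and is NaN where smallDF has no matching row
merged = originalDF.merge(smallDF, on=['ASIN', 'State'], how='left',
                          suffixes=('', '_new'))

# Use the new quantity where one exists, otherwise keep the original
merged['Quantity'] = merged['Quantity_new'].fillna(merged['Quantity']).astype(int)
merged = merged.drop(columns='Quantity_new')

# Inner merge: only rows whose (ASIN, State) appear in both frames
inner = originalDF.drop(columns='Quantity').merge(smallDF, on=['ASIN', 'State'],
                                                  how='inner')

print(merged)
print(inner)
```

The left merge suits the "overwrite where matched" case. The inner merge suits the case where the final frame should contain only the rows of the smaller frame, enriched with the original columns.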
