Subtracting values based on a relationship table


Problem description


I want to develop some code that will calculate the value of the target location (down gradient) by using a relationship table of targets and sources. The general formula is (value = down gradient - up gradient) or, given my relationship table, (value = target - all contributing source locations).

Operationally, what I want to do is similar to one of my other posts, only this time I want to use subtraction.

So, let's start with:

import pandas as pd
import networkx as nx
import numpy as np

df = pd.DataFrame({
"Site 1": np.random.rand(10),
"Site 2": np.random.rand(10),
"Site 3": np.random.rand(10),
"Site 4": np.random.rand(10),
"Site 5": np.random.rand(10),
"Site 6": np.random.rand(10)})

and the relationship table:

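# Each row is one edge: values at 'source' contribute to 'target'.
# None marks a site with no downstream target, so dropna() can remove it.
df_order = {'source': ["Site 1", "Site 2", "Site 3", "Site 4", "Site 5", "Site 6"],
            'target': ["Site 3", "Site 3", "Site 5", "Site 5", "Site 6", None]}
dfo = pd.DataFrame(df_order, columns=['source', 'target'])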

Visually, this looks like: [network diagram not reproduced]

With a sample calculation, I could manually perform the operation on "Site 3" via:

df_sum = df.loc[:,'Site 1':'Site 2'].sum(axis = 1)
df_3_sub = df.loc[:, 'Site 3'].subtract(df_sum)
print(df_3_sub)

In the example I linked, I ended up with a nice solution (thanks to the respondent!) where I used:

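import networkx as nx

# Build a directed graph from the edge list; rows without a target are dropped
G = nx.from_pandas_edgelist(dfo.dropna(),
                            source='source', target='target',
                            create_using=nx.DiGraph)
nx.draw(G, with_labels=True)

def all_preds(G, target):
    # Return target followed by every site up gradient of it
    preds = [target]
    for p in list(G.predecessors(target)):
        preds += all_preds(G, p)
    return preds

# One column per site: the site's values plus all of its up-gradient values
pd.concat([
    df[all_preds(G, target)].sum(axis=1).rename(target)
    for target in dfo['source'].unique()
    ], axis=1)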

Now, I want to be able to essentially call .diff(1) instead of sum(1). Is there a relatively simple way to accomplish this?
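One note on .diff(1): in pandas, DataFrame.diff subtracts each element from its neighbor a fixed number of rows (or columns, with axis=1) away, so it follows column position rather than the relationship table. The sketch below uses small made-up numbers (not from the original post) to contrast it with the intended target-minus-sources calculation; the names toy, by_position, and by_relationship are illustrative.

```python
import pandas as pd

# Made-up values for illustration only
toy = pd.DataFrame({
    "Site 1": [1.0, 2.0],
    "Site 2": [3.0, 4.0],
    "Site 3": [10.0, 20.0],
})

# diff along columns: each column minus the column immediately to its left
by_position = toy.diff(periods=1, axis=1)

# Intended calculation: target minus the sum of its contributing sources
by_relationship = toy["Site 3"] - toy[["Site 1", "Site 2"]].sum(axis=1)

print(by_position["Site 3"].tolist())   # Site 3 - Site 2 only
print(by_relationship.tolist())         # Site 3 - (Site 1 + Site 2)
```

The two results differ whenever a target has more than one source, which is why a graph-based lookup of the contributing sites is still needed.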

Additionally, the most up-gradient points (the starting points) have no values contributing to them, so they do not need to be carried over to the new DataFrame. The subtraction always reads from the original DataFrame and writes the result into a new DataFrame. To be clear, I am not slotting the newly subtracted values back into the original DataFrame in place of, for example, the original "Site 3" values.
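The starting points can be picked out from the graph itself, since they are the nodes with no incoming edges. This is a minimal sketch, assuming the relationship table used in the solution (Site 3 and Site 4 feeding Site 5) and assuming missing targets are stored as real None values; edges, start_sites, and keep are illustrative names.

```python
import networkx as nx
import pandas as pd

# Illustrative edge list; None marks a site with no downstream target
edges = pd.DataFrame({
    "source": ["Site 1", "Site 2", "Site 3", "Site 4", "Site 5", "Site 6"],
    "target": ["Site 3", "Site 3", "Site 5", "Site 5", "Site 6", None],
})

G = nx.from_pandas_edgelist(edges.dropna(), source="source", target="target",
                            create_using=nx.DiGraph)

# Sites with no incoming edges have nothing up gradient of them
start_sites = [n for n, deg in G.in_degree() if deg == 0]

# Sites that receive at least one contribution and so need a subtraction
keep = [n for n in edges["source"] if n in G and G.in_degree(n) > 0]

print(sorted(start_sites))
print(keep)
```

Looping over keep rather than over every column would leave the starting points out of the new DataFrame.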

EDIT:

It is not super pretty or efficient, but I think I figured out how to go about this with a for loop:

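result = pd.DataFrame()

for site in df2.columns:
    # Sum every up-gradient site, excluding the site itself
    upgradient = df2[all_preds(G, site)].drop(site, axis=1).sum(axis=1)
    downgradient = df2[site]
    calc = downgradient.subtract(upgradient)
    # DataFrame.append returns a new object (and was removed in pandas 2.0),
    # so this call never updates result
    result.append(calc, ignore_index=True)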

I think I just need help with the last part of the for loop so that the result is a cohesive DataFrame and the column names match the name in df2[site] at each step in the for loop. I welcome any thoughts, comments or modifications to my code!

Solution

Well, I think I found one way to accomplish what I wanted to. I am sure there is a more efficient way, but this seems to work for me at the moment. I am still open to suggestions if there is a more elegant/efficient solution out there.

import pandas as pd
import networkx as nx
import numpy as np

df2 = pd.DataFrame({
    "Site 1": np.random.rand(10),
    "Site 2": np.random.rand(10),
    "Site 3": np.random.rand(10),
    "Site 4": np.random.rand(10),
    "Site 5": np.random.rand(10),
    "Site 6": np.random.rand(10)})
print(df2)

# None marks a site with no downstream target, so dropna() can remove it
df_order2 = {'source': ["Site 1", "Site 2", "Site 3", "Site 4", "Site 5", "Site 6"],
             'target': ["Site 3", "Site 3", "Site 5", "Site 5", "Site 6", None]}
dfo2 = pd.DataFrame(df_order2, columns=['source', 'target'])
print(dfo2)

# Directed graph of the relationships (edges point down gradient)
G = nx.from_pandas_edgelist(dfo2.dropna(),
                            source='source', target='target',
                            create_using=nx.DiGraph)

def all_preds(G, target):
    # Return target followed by every site up gradient of it
    preds = [target]
    for p in list(G.predecessors(target)):
        preds += all_preds(G, p)
    return preds

result = []

for site in df2.columns:
    # Sum of all up-gradient sites (zero for starting points)
    upgradient = df2[all_preds(G, site)].drop(site, axis=1).sum(axis=1)
    downgradient = df2[site]
    result.append(downgradient.subtract(upgradient))

# Combine the per-site Series into one DataFrame with matching column names
rfinal = pd.concat(result, axis=1)
rfinal.columns = df2.columns.values
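A somewhat more compact variant is sketched here as one possible alternative, not a definitive answer. It swaps the recursive all_preds helper for nx.ancestors to collect every up-gradient site, uses fixed made-up values instead of random data so the arithmetic can be checked by hand, and drops the starting points as described in the question. The names values, edges, and net are illustrative.

```python
import networkx as nx
import pandas as pd

# Made-up values (two rows per site) so the arithmetic is easy to verify
values = pd.DataFrame({
    "Site 1": [1.0, 2.0],
    "Site 2": [3.0, 4.0],
    "Site 3": [10.0, 20.0],
    "Site 4": [5.0, 6.0],
    "Site 5": [30.0, 50.0],
    "Site 6": [100.0, 200.0],
})

edges = pd.DataFrame({
    "source": ["Site 1", "Site 2", "Site 3", "Site 4", "Site 5", "Site 6"],
    "target": ["Site 3", "Site 3", "Site 5", "Site 5", "Site 6", None],
})

G = nx.from_pandas_edgelist(edges.dropna(), source="source", target="target",
                            create_using=nx.DiGraph)

# For each site with up-gradient contributors, subtract their summed values
# from the site's own values; the dict keys become the column names.
net = pd.DataFrame({
    site: values[site] - values[sorted(nx.ancestors(G, site))].sum(axis=1)
    for site in values.columns
    if site in G and nx.ancestors(G, site)
})

print(net)
```

Because nx.ancestors returns a set, a site reachable by more than one path is counted once, whereas the recursive all_preds would list it once per path. For a network where every site has a single target, the two approaches select the same up-gradient sites.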
