From a Pandas DataFrame, build a networkx chart or flow chart between different rows with common values in certain columns

Problem description

I'm working with data that shows order flow across multiple rows, with each row being an independent stop/station. Sample data looks like this:

  Firm           event_type   id previous_id
0    A                 send  111            
1    B     receive and send  222         111
2    C  receive and execute  333         222
3    D  receive and execute  444         222
4    E   receive and cancel  123         100

The link here is decided by the two fields "id" and "previous_id". For instance, in the sample data, the previous_id of Firm B is the same as the id of Firm A, 111. Therefore order flows from Firm A to Firm B.

And for Firm E, since its previous_id doesn't match the id of any row, I intend it to be a standalone part in the flow.
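
The linking rule described above can be sketched in pandas alone. This is a minimal illustration, not part of the original post: the DataFrame is reconstructed from the sample table, the ids are assumed to be stored as strings, and the names `id_to_firm`, `previous_firm` and `edges` are made up for the example.

```python
import pandas as pd

# Sample data reconstructed from the table above. Ids are kept as strings
# (an assumption) so that "id" and "previous_id" compare equal when they match.
df = pd.DataFrame({
    "Firm": ["A", "B", "C", "D", "E"],
    "event_type": ["send", "receive and send", "receive and execute",
                   "receive and execute", "receive and cancel"],
    "id": ["111", "222", "333", "444", "123"],
    "previous_id": ["", "111", "222", "222", "100"],
})

# For each row, look up the Firm whose id equals this row's previous_id.
# Rows without a match get NaN. This lookup assumes the ids are unique.
id_to_firm = df.set_index("id")["Firm"]
df["previous_firm"] = df["previous_id"].map(id_to_firm)

# Rows with a match are links in the flow (upstream firm, downstream firm).
matched = df.dropna(subset=["previous_firm"])
edges = list(zip(matched["previous_firm"], matched["Firm"]))
print(edges)

# Rows without a match have no upstream firm in the data.
print(df.loc[df["previous_firm"].isna(), "Firm"].tolist())
```

In this sketch, Firm A (which starts a flow) and Firm E (whose previous_id matches nothing) both end up without an upstream firm; only E is also never referenced by another row, which is what makes it standalone.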

Therefore what I want to achieve based on the sample data is something like this:

(Color is just for illustration purposes, not a must have).

I have been trying to build on the answer from @Dinari in this related question, but couldn't get it to work. I would like the node labels of the networkx directed graph to come from a column other than the columns with shared values.

Thanks.

Recommended answer

import networkx as nx
import matplotlib.pyplot as plt

# convert datatypes to str so that dictionary lookups by id work
df['id'] = df['id'].astype(str)
df['previous_id'] = df['previous_id'].astype(str)

# create a mapping from ids to Firms
replace_dict = dict(df[['id', 'Firm']].values)

# apply that mapping; if no Firm can be found, use the placeholders 'no_source' and 'no_target'
df['source'] = df['previous_id'].apply(lambda x: replace_dict.get(x, 'no_source'))
df['target'] = df['id'].apply(lambda x: replace_dict.get(x, 'no_target'))

# make the graph
G = nx.from_pandas_edgelist(df, source='source', target='target')

# drop all placeholder nodes
G.remove_nodes_from(['no_source', 'no_target'])

# draw graph
nx.draw_networkx(G, node_shape='s')
plt.show()

To include arrows, create a directed graph (DiGraph):

# make the graph
G = nx.from_pandas_edgelist(df, source='source', target='target', create_using=nx.DiGraph)
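
For reference, here is a self-contained sketch of the whole flow on the sample data. It is an alternative to the placeholder approach above, not the answer's own method: it adds every Firm as a node first and only adds an edge when previous_id matches an id. The DataFrame is reconstructed from the question, ids are assumed to be strings, and the `labels` dictionary (Firm plus event_type) is an illustrative choice.

```python
import networkx as nx
import pandas as pd

# Sample data reconstructed from the question; ids assumed to be strings.
df = pd.DataFrame({
    "Firm": ["A", "B", "C", "D", "E"],
    "event_type": ["send", "receive and send", "receive and execute",
                   "receive and execute", "receive and cancel"],
    "id": ["111", "222", "333", "444", "123"],
    "previous_id": ["", "111", "222", "222", "100"],
})

# Mapping from id to Firm, used to resolve each previous_id.
id_to_firm = dict(zip(df["id"], df["Firm"]))

G = nx.DiGraph()

# Add every Firm as a node first so that unmatched rows (E) stay in the graph.
# event_type is stored as a node attribute for labelling.
for firm, event in zip(df["Firm"], df["event_type"]):
    G.add_node(firm, event_type=event)

# Add an edge only when previous_id matches the id of some row.
for firm, prev in zip(df["Firm"], df["previous_id"]):
    source = id_to_firm.get(prev)
    if source is not None:
        G.add_edge(source, firm)

# Node labels built from columns other than the shared id columns.
labels = {node: f"{node}\n{data['event_type']}" for node, data in G.nodes(data=True)}

print(sorted(G.edges()))
print(list(nx.isolates(G)))
```

Passing `labels=labels` together with `node_shape='s'` to `nx.draw_networkx(G, ...)` should then draw each box with the Firm and its event_type, with Firm E as a node that has no edges.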
