Calculating a relationship between unique pairs of observations about a column in R
Question
I am trying to calculate the probability, subject to proportion, of two commodities appearing in the same group.
I have the following data:
data <- data.frame(group = c(1,1,1,1,2,2,2,2,3,3,3,3),
commodity = c("Wheat", "Coal", "Steel", "Iron", "Wheat", "Coal", "Steel", "Iron", "Wheat", "Coal", "Steel", "Iron"),
quantity = c(5,10,0,5,20,5,10,0,0,10,15,15),
proportion = c(0.25,0.5,0,0.25,0.57,0.14,0.29,0,0,0.25,0.375,0.375))
For each unique pair of commodities, I would like to compute a value from the proportions: the sum over groups of the product of the two proportions, divided by 2.
The result should look like this:
result <- data.frame(commodity1 = c("Wheat", "Wheat", "Wheat", "Coal", "Coal", "Steel"),
commodity2 = c("Coal", "Steel", "Iron", "Steel", "Iron", "Iron"),
result = c(0.103,0.082,0.031,0.067,0.109,0.070))
For example, the result for Wheat - Coal would be calculated as (0.25 * 0.5/2)+(0.57 * 0.14/2)+(0 * 0.25/2)=0.103
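As a quick check of the arithmetic above, here is a small base R sketch. The names `wheat`, `coal`, `wheat_q`, `coal_q` and `totals` are illustrative and not from the original post. With the rounded proportions listed in the data the sum comes to about 0.1024, while recomputing the proportions from the quantities gives about 0.1033, which appears to be where the 0.103 figure comes from.

```r
# Proportions of Wheat and Coal in groups 1, 2, 3 (rounded, as listed in `data`)
wheat <- c(0.25, 0.57, 0)
coal  <- c(0.50, 0.14, 0.25)

# Sum over groups of (product of the two proportions) / 2
rounded <- sum(wheat * coal) / 2

# Same calculation with proportions recomputed from the quantities
wheat_q <- c(5, 20, 0)
coal_q  <- c(10, 5, 10)
totals  <- c(20, 35, 40)  # total quantity per group
exact <- sum((wheat_q / totals) * (coal_q / totals)) / 2

print(c(rounded = rounded, exact = exact))
```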
I have isolated the commodity pairs into a separate data.frame to mutate the result into, and I attempted a rowwise() operation.
Any suggestions would be greatly appreciated.
Recommended answer
It is not very clean, but it seems to work:
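library(tidyverse)

# Build an intermediate data.frame `dd`: one row per commodity, with that
# commodity's proportions (in group order) nested in the `data` column
data %>%
  select(-quantity) %>%
  pivot_longer(proportion) %>%
  select(-name, -group) %>%
  group_by(commodity) %>%
  nest(data = c(value)) -> dd

# Enumerate every unique commodity pair, then sum the products of the two
# commodities' proportions across groups and divide by 2
t(combn(unique(data$commodity), 2)) %>%
  as.data.frame() %>%
  mutate(result = map2_dbl(V1, V2,
    ~ sum(unlist(dd$data[match(.x, dd$commodity)]) *
          unlist(dd$data[match(.y, dd$commodity)])) / 2
  ))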
V1 V2 result
1 Wheat Coal 0.1024000
2 Wheat Steel 0.0826500
3 Wheat Iron 0.0312500
4 Coal Steel 0.0671750
5 Coal Iron 0.1093750
6 Steel Iron 0.0703125
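For comparison, here is a base R sketch of the same calculation using a cross-product. This is an alternative, not the answerer's method, and the names `m`, `cp` and `pairs` are illustrative. It assumes every commodity appears exactly once in every group, as in the sample data.

```r
data <- data.frame(
  group = rep(1:3, each = 4),
  commodity = rep(c("Wheat", "Coal", "Steel", "Iron"), times = 3),
  proportion = c(0.25, 0.5, 0, 0.25, 0.57, 0.14, 0.29, 0, 0, 0.25, 0.375, 0.375)
)

commodities <- unique(data$commodity)

# Group-by-commodity matrix of proportions (rows: groups, columns: commodities)
m <- tapply(data$proportion, list(data$group, data$commodity), sum)
m <- m[, commodities]  # restore the original commodity order

# cp[i, j] is the sum over groups of proportion_i * proportion_j
cp <- crossprod(m)

# All unique pairs, looked up in `cp` by (row name, column name)
pairs <- t(combn(commodities, 2))
result <- data.frame(
  commodity1 = pairs[, 1],
  commodity2 = pairs[, 2],
  result = cp[pairs] / 2
)
print(result)
```

On the sample data this should give the same six values as the output above. Because the matrix holds one column per commodity, the approach relies on the data being complete; missing group and commodity combinations would show up as NA and would need to be filled with 0 first.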