Heatmap-like plot for three categorical variables
Question
I'm dealing with a data frame of categorical variables in case form, made up of three variables (color, shape and size) and their corresponding frequencies. An example of the data frame looks like this:
    Color  Shape    Size   Freq
1   Yellow Square   Big      10
2   Yellow Square   Medium    6
3   Yellow Square   Small     3
4   Yellow Triangle Big       4
5   Yellow Triangle Medium    6
6   Yellow Triangle Small     8
7   Red    Square   Big       2
8   Red    Square   Medium    6
9   Red    Square   Small     5
10  Red    Triangle Big      12
.......
The "Color" variable is measured against the "Shape" and "Size" variables, with a frequency for each case.
From this data frame, I'm struggling to create a heatmap-like plot that shows only the relation between "Color" and "Shape", with each cell filled according to the "Size" that has the highest frequency. A bit tricky, isn't it?
For example, for the "Yellow" - "Square" cases I should display only "Big", since "Big" is the size with the highest frequency. Each size should have its own fill color (i.e. "red" for big, "green" for medium, and "orange" for small).
Frank
Recommended answer
How about this?
library(dplyr)
library(ggplot2)

# For each Color/Shape pair, keep the row with the highest Freq
df_max <- df %>%
  group_by(Color, Shape) %>%
  slice(which.max(Freq))

head(df_max)
# Source: local data frame [4 x 4]
# Groups: Color, Shape [4]
#
#    Color    Shape   Size  Freq
#    (chr)    (chr)  (chr) (int)
# 1    Red   Square Medium     6
# 2    Red Triangle    Big    12
# 3 Yellow   Square    Big    10
# 4 Yellow Triangle  Small     8

# One tile per Color/Shape pair, filled by the most frequent Size
ggplot(df_max, aes(x = Color, y = Shape, fill = Size)) +
  geom_tile()
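
The question also asks for a specific fill color per size, which the default ggplot2 palette does not provide. A minimal sketch of one way to get that, assuming the data frame contains only the ten rows printed in the question (the remaining rows are elided there and are not reconstructed here). The name size_colours is a hypothetical helper introduced for illustration, and scale_fill_manual() is my addition, not part of the code above:

```r
library(dplyr)
library(ggplot2)

# Assumption: only the ten rows shown in the question; the rest are elided there.
df <- data.frame(
  Color = c(rep("Yellow", 6), rep("Red", 4)),
  Shape = c(rep("Square", 3), rep("Triangle", 3), rep("Square", 3), "Triangle"),
  Size  = c(rep(c("Big", "Medium", "Small"), 3), "Big"),
  Freq  = c(10, 6, 3, 4, 6, 8, 2, 6, 5, 12),
  stringsAsFactors = FALSE
)

# Keep the most frequent Size for each Color/Shape pair
# (which.max() returns the first maximum, so ties go to the earlier row).
df_max <- df %>%
  group_by(Color, Shape) %>%
  slice(which.max(Freq)) %>%
  ungroup()

# Hypothetical name; the colors themselves are the ones named in the question.
size_colours <- c(Big = "red", Medium = "green", Small = "orange")

p <- ggplot(df_max, aes(x = Color, y = Shape, fill = Size)) +
  geom_tile(colour = "white") +               # white borders separate the tiles
  scale_fill_manual(values = size_colours)    # fixed color per Size level

print(p)
```

Naming the entries of the values vector ties each color to a size level by name rather than by position, so the mapping should stay the same even if a size happens to be absent from df_max.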