Scatter plot in ggplot, one numeric variable across two groups
Question
I would like to create a scatter plot in ggplot2 that displays male test_scores on the x-axis and female test_scores on the y-axis, using the dataset below. I can easily create a geom_line plot that splits male and female and puts the date ("dts") on the x-axis.
library(tidyverse)
# create data
set.seed(1)  # make the random scores reproducible
dts <- c("2011-01-02","2011-01-02","2011-01-03","2011-01-04","2011-01-05",
         "2011-01-02","2011-01-02","2011-01-03","2011-01-04","2011-01-05")
sex <- c("M","F","M","F","M","F","M","F","M","F")
test <- round(runif(10, .5, 1), 2)
semester <- data.frame(dts = as.Date(dts), sex = sex, test_scores = test)
# show the geom_line plot (map y to the data frame column, not the global vector)
ggplot(semester, aes(x = dts, y = test_scores, color = sex)) + geom_line()
It seems that with only one time series, ggplot2 does better with the data in wide format than in long format. For instance, I could easily create two columns, "male_scores" and "female_scores", and plot those against each other, but I would like to keep my data tidy and in long format.
Cheers, and thanks.
Recommended answer
You've over-tidied. Tidying data isn't just the mechanism of making it as long as possible; it's making it as wide as necessary.
For example, if you had location as X and Y for animal sightings, you wouldn't have two rows per sighting, one with "X" in a "label" column and the X coordinate in a "value" column, and another with "Y" in the "label" column and the Y coordinate in the "value" column. You would only do that if you really were storing the data in a key-value store, but that's another story...
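To make the animal-sightings analogy concrete, here is a minimal sketch in base R. The data frames, column names, and coordinate values are all hypothetical and invented purely for illustration.

```r
# Hypothetical animal sightings (made-up values): one row per sighting,
# with X and Y stored as separate columns. This is "as wide as necessary".
sightings <- data.frame(
  sighting_id = 1:3,
  x = c(10.2, 11.5, 9.8),
  y = c(4.1, 3.7, 5.0)
)

# The same data "over-tidied" into label/value pairs: two rows per sighting.
# rbind() interleaves x and y so the values line up with the X/Y labels.
sightings_long <- data.frame(
  sighting_id = rep(sightings$sighting_id, each = 2),
  label = rep(c("X", "Y"), times = 3),
  value = as.vector(rbind(sightings$x, sightings$y))
)
```

In the first layout each row is a complete observation that can be plotted directly; in the second, the two coordinates of one sighting have to be re-paired before they can be used together.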
Widen your data and put the test scores for male and female into test_score_male and test_score_female; those columns are then the x and y aesthetics for your scatter plot.
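One possible way to do the widening is with tidyr's pivot_wider(), sketched below. This is an illustration under assumptions, not necessarily what the answerer had in mind. It uses a hypothetical version of the data with exactly one male and one female score per date (the dataset in the question repeats some date/sex combinations, which would need to be resolved first), and the resulting columns are named test_score_M and test_score_F rather than test_score_male and test_score_female.

```r
library(tidyr)
library(ggplot2)

# Hypothetical long-format data: exactly one M and one F score per date
set.seed(42)
semester <- data.frame(
  dts = rep(as.Date("2011-01-02") + 0:4, each = 2),
  sex = rep(c("M", "F"), times = 5),
  test_scores = round(runif(10, 0.5, 1), 2)
)

# Widen: one row per date, one score column per sex
semester_wide <- pivot_wider(
  semester,
  names_from = sex,
  values_from = test_scores,
  names_prefix = "test_score_"
)

# Scatter plot of male scores (x) against female scores (y)
p <- ggplot(semester_wide, aes(x = test_score_M, y = test_score_F)) +
  geom_point() +
  labs(x = "Male test score", y = "Female test score")
print(p)
```

Each point then corresponds to one date, which is the pairing that a male-versus-female scatter plot implicitly relies on.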