ggplot: Boxplot of multiple column values
Problem description
Here is the type of data that I'm importing as a csv file:
RPID mm ID Time Freq Freq.1 Freq.2
RPO483 1 B6AC 5 23301 30512
RPO483 1 B6AC 25 19 17
RPO244 1 B6C 5 14889 20461
RPO244 1 B6C 25 81 86
RPO876 1 G3G3A 5 106760 59950 103745
RPO876 1 G3G3A 25 4578 38119 37201
RPO876 7 F3G3A 5 205803 148469 173580
RPO876 7 F3G3A 25 28648 30321 26454
RPO939 7 F3E324A 5 242285
RPO939 7 F3E324A 25 42837
RPO934 7 F3E325A 5 242001 129272 112371
RPO934 7 F3E325A 25 73057 58685 66582
For each "ID", I'd like to generate a boxplot for values in columns "Freq", "Freq.1" and "Freq.2". However, currently I'm only able to successfully plot one Y value -- for example:
dataset <- read.csv("~/R/dataset.csv")
library(ggplot2)
p <- ggplot(dataset)
p + geom_boxplot(aes(x=ID, y=Freq, color=mm))
I've tried something like y=c(Freq,Freq.1,Freq.2), but this results in the following:
Error: Aesthetics must either be length one, or the same length as the data
Problems: ID
I'm sure there is a simple solution to this, but as I am very new to R, I can't tell if it is a problem of wrong data format, wrong syntax, wrong package or something else entirely.
Any help would be greatly appreciated!
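You need to reshape the data from wide to long format before plotting: ggplot expects a single column for the y aesthetic, so the three Freq columns have to be stacked into one.
First I read your data. Note that it contains some NA values, because some rows have fewer than three frequencies.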
dat <- read.table(text = '
RPID mm ID Time Freq Freq.1 Freq.2
RPO483 1 B6AC 5 23301 30512
RPO483 1 B6AC 25 19 17
RPO244 1 B6C 5 14889 20461
RPO244 1 B6C 25 81 86
RPO876 1 G3G3A 5 106760 59950 103745
RPO876 1 G3G3A 25 4578 38119 37201
RPO876 7 F3G3A 5 205803 148469 173580
RPO876 7 F3G3A 25 28648 30321 26454
RPO939 7 F3E324A 5 242285
RPO939 7 F3E324A 25 42837
RPO934 7 F3E325A 5 242001 129272 112371
RPO934 7 F3E325A 25 73057 58685 66582', header = TRUE, fill = TRUE)
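To see where those NA values end up, a quick check along the following lines may help. This is only a sketch on three rows copied from the data above; the name na_counts is introduced here for illustration. It relies on fill = TRUE padding short rows with NA on the right, so in whitespace-separated text a value missing from an earlier column would shift the remaining values left rather than leave a gap.

```r
# Three rows taken from the data above, with 6, 7 and 5 fields respectively.
dat <- read.table(text = "
RPID mm ID Time Freq Freq.1 Freq.2
RPO483 1 B6AC 5 23301 30512
RPO876 1 G3G3A 5 106760 59950 103745
RPO939 7 F3E324A 5 242285", header = TRUE, fill = TRUE)

# Count the missing values in each frequency column.
na_counts <- colSums(is.na(dat[c("Freq", "Freq.1", "Freq.2")]))
print(na_counts)
```

If the real CSV file has empty cells in the middle of a row, read.csv keeps them in place, so the NA pattern there may differ from this pasted version.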
Then melt the three frequency columns into one, using reshape2 for example:
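library(reshape2)
library(ggplot2)

# Wide to long: one row per ID and frequency column, with the measurement in `value`
dat.m <- melt(dat, id.vars = 'ID', measure.vars = c('Freq', 'Freq.1', 'Freq.2'))

# One box per frequency column within each ID
p <- ggplot(dat.m) +
  geom_boxplot(aes(x = ID, y = value, color = variable))
p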
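To make the reshaping step less opaque, here is a rough base R equivalent of the wide-to-long conversion, using stack() instead of melt(). It is a sketch on four rows of the data above, not the answer's own method; the names freq_cols, stacked and dat_long are introduced here for illustration, and the column names variable and value are chosen to mirror what melt() produces.

```r
# Four rows from the data above; the B6AC rows have no Freq.2 value.
dat <- read.table(text = "
RPID mm ID Time Freq Freq.1 Freq.2
RPO483 1 B6AC 5 23301 30512
RPO483 1 B6AC 25 19 17
RPO876 1 G3G3A 5 106760 59950 103745
RPO876 1 G3G3A 25 4578 38119 37201", header = TRUE, fill = TRUE)

freq_cols <- c("Freq", "Freq.1", "Freq.2")

# stack() concatenates the selected columns into `values` and records the
# source column name in the factor `ind`.
stacked <- stack(dat[freq_cols])

# Repeat the ID column once per stacked column so the rows line up.
dat_long <- data.frame(
  ID = rep(dat$ID, times = length(freq_cols)),
  variable = stacked$ind,
  value = stacked$values
)

# Drop the NA cells that came from padding the short rows.
dat_long <- dat_long[!is.na(dat_long$value), ]
print(dat_long)
```

The resulting dat_long has one row per measurement and can be passed to ggplot with x = ID, y = value and color = variable in the same way as the melted data.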