R - ordering in boxplot


Question

I am trying to produce a series of box plots in R, grouped by two factors. I've managed to make the plot, but I cannot get the boxes to appear in the correct order.

The data frame I am using looks like this:

Nitrogen    Species    Treatment
2           G          L
3           R          M
4           G          H
4           B          L
2           B          M
1           G          H

I tried:

boxplot(mydata$Nitrogen~mydata$Species*mydata$Treatment)

This ordered the boxes alphabetically (the first three were the "High" treatments, and within those three they were ordered alphabetically by species name).

I want the box plot ordered Low > Medium > High, and then within each of those groups G > R > B for the species.

So I tried using a factor in the formula:

f = ordered(interaction(mydata$Treatment, mydata$Species), 
            levels = c("L.G","L.R","L.B","M.G","M.R","M.B","H.G","H.R","H.B"))

and then:

boxplot(mydata$Nitrogen~f)

However, the boxes are still showing up in the same order. The labels are now different, but the boxes have not moved.

I have also pulled out each set of data and plotted them all together individually:

lg = mydata[mydata$Treatment == "L" & mydata$Species == "G", "Nitrogen"]
mg = mydata[mydata$Treatment == "M" & mydata$Species == "G", "Nitrogen"]
hg = mydata[mydata$Treatment == "H" & mydata$Species == "G", "Nitrogen"]
# etc. for lr, lb, mr, mb, hr, hb

boxplot(lg, lr, lb, mg, mr, mb, hg, hr, hb)

This gives what I want, but I would prefer a more elegant way, so that I don't have to pull each group out individually for larger data sets.

Loadable data:

mydata <-
structure(list(Nitrogen = c(2L, 3L, 4L, 4L, 2L, 1L), Species = structure(c(2L, 
3L, 2L, 1L, 1L, 2L), .Label = c("B", "G", "R"), class = "factor"), 
    Treatment = structure(c(2L, 3L, 1L, 2L, 3L, 1L), .Label = c("H", 
    "L", "M"), class = "factor")), .Names = c("Nitrogen", "Species", 
"Treatment"), class = "data.frame", row.names = c(NA, -6L))


Recommended answer

The following commands will create the ordering you need by rebuilding the Treatment and Species factors, with explicit manual ordering of the levels:

mydata$Treatment = factor(mydata$Treatment,c("L","M","H"))

mydata$Species = factor(mydata$Species,c("G","R","B"))
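
As a rough end-to-end sketch of the two commands above, the snippet below rebuilds the question's sample table with data.frame() (rather than the dput output), relevels both factors, and draws the plot. The formula-with-data call and the bp variable are illustrative choices, not part of the original answer. Because Species is the first term in the formula, it varies fastest, so the boxes run G, R, B within Low, then Medium, then High.

```r
# Sample data from the question, rebuilt by hand for a self-contained example
mydata <- data.frame(
  Nitrogen  = c(2, 3, 4, 4, 2, 1),
  Species   = c("G", "R", "G", "B", "B", "G"),
  Treatment = c("L", "M", "H", "L", "M", "H")
)

# Rebuild both factors with the level order the plot should follow
mydata$Treatment <- factor(mydata$Treatment, levels = c("L", "M", "H"))
mydata$Species   <- factor(mydata$Species,   levels = c("G", "R", "B"))

# Species varies fastest within each Treatment level;
# the value returned by boxplot() records the left-to-right group names
bp <- boxplot(Nitrogen ~ Species * Treatment, data = mydata)
bp$names
```

Combinations with no observations in this small sample still get a (blank) position on the axis, which keeps the spacing regular.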

Edit 1: Oops, I had set it to HML instead of LMH. Fixed.

Edit 2: What factor(X,Y) does:

If you run factor(X,Y) on an existing factor, it uses the ordering of the values in Y to enumerate the values present in the factor X. Here are some examples with your data.

> mydata$Treatment
[1] L M H L M H
Levels: H L M
> as.integer(mydata$Treatment)
[1] 2 3 1 2 3 1
> factor(mydata$Treatment,c("L","M","H"))
[1] L M H L M H                               <-- not changed
Levels: L M H                                 <-- changed
> as.integer(factor(mydata$Treatment,c("L","M","H")))
[1] 1 2 3 1 2 3                               <-- changed

It does NOT change what the factor looks like at first glance, but it does change how the data is stored.

What's important here is that many plot functions will plot the lowest enumeration leftmost, followed by the next, etc.
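
One way to check this without drawing anything is to ask boxplot() for its summary only (plot = FALSE) and inspect the group names it returns. This is a small sketch; the vectors treatment and nitrogen are stand-ins that mirror the question's data, not objects from the original post.

```r
# Stand-in vectors mirroring the Treatment and Nitrogen columns
treatment <- c("L", "M", "H", "L", "M", "H")
nitrogen  <- c(2, 3, 4, 4, 2, 1)

alpha  <- factor(treatment)                            # default: alphabetical levels
manual <- factor(treatment, levels = c("L", "M", "H")) # explicit level order

# Left-to-right order of the boxes follows the level order
names_alpha  <- boxplot(nitrogen ~ alpha,  plot = FALSE)$names
names_manual <- boxplot(nitrogen ~ manual, plot = FALSE)$names

names_alpha    # "H" "L" "M"
names_manual   # "L" "M" "H"
```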

If you create factors simply using factor(X), the enumeration is usually based on the alphabetical order of the factor levels (e.g. "H","L","M"). If your labels have a conventional ordering that differs from alphabetical (e.g. "H","M","L"), this can make your graphs look strange.

At first glance, it may seem like the problem is due to the ordering of the rows in the data frame, i.e. if only we could place all the "H" rows at the top and the "L" rows at the bottom, then it would work. It doesn't: the level order does not depend on row order. But if you want your labels to appear in the same order as their first occurrence in the data, you can use this form:

 mydata$Treatment = factor(mydata$Treatment, unique(mydata$Treatment))
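
To illustrate both points on a small scale, here is a sketch with a made-up vector (treatment is hypothetical, not the question's data). Reordering the values leaves the default levels alphabetical, while passing unique() as the levels follows first appearance.

```r
# Hypothetical values whose first-appearance order is M, L, H
treatment <- c("M", "L", "H", "M", "L")

by_default    <- factor(treatment)                             # alphabetical levels
reordered     <- factor(rev(treatment))                        # row order changed, levels unchanged
by_first_seen <- factor(treatment, levels = unique(treatment)) # levels in order of first occurrence

levels(by_default)      # "H" "L" "M"
levels(reordered)       # "H" "L" "M"
levels(by_first_seen)   # "M" "L" "H"
```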
