How to plot columns of CSV data in R using boxplots
Question
I have a sample data frame that is 3600 rows long by 6 columns wide. I want to create a plot in R that shows six boxplots, one for each of the 6 columns of data. I am using ggplot. I can create them in Excel easily enough (shown below), but I want to be able to do it in R because my future data frames are going to be much larger and R seems to handle large datasets far more easily.
Using the code below I can plot the first column fine, but I can't figure out how to add the data from the other 5 columns.
ggplot(data=df)+
geom_boxplot(aes(x="Label", y=col1))
Recommended answer
Using geom_boxplot from ggplot2
To get a boxplot for each of your 6 columns with ggplot2, you first need to reshape your data frame into a longer format that matches the grammar of ggplot2 (one column for x values, one column for y values, and one or more columns for categorical values). Then you can use ggplot2 and the geom_boxplot function:
Here is an example using the built-in iris dataset:
> head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
Using the pivot_longer function from the tidyr package, you can reshape the first 4 columns of this dataset into a longer format:
library(tidyr)
library(dplyr)
# Reshape the four measurement columns into name/value pairs
iris2 <- iris %>%
  pivot_longer(cols = Sepal.Length:Petal.Width, names_to = "Var", values_to = "val")
iris2
# A tibble: 600 x 3
Species Var val
<fct> <chr> <dbl>
1 setosa Sepal.Length 5.1
2 setosa Sepal.Width 3.5
3 setosa Petal.Length 1.4
4 setosa Petal.Width 0.2
5 setosa Sepal.Length 4.9
6 setosa Sepal.Width 3
7 setosa Petal.Length 1.4
8 setosa Petal.Width 0.2
9 setosa Sepal.Length 4.7
10 setosa Sepal.Width 3.2
# … with 590 more rows
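The 600 rows follow from the reshape itself: each of the 150 original rows contributes one row per reshaped column. A quick check along these lines can confirm that the long table has the expected shape before plotting (a minimal sketch; the object name iris2 matches the code above):

```r
library(tidyr)

# Same reshape as above, written without the pipe.
iris2 <- pivot_longer(iris, cols = Sepal.Length:Petal.Width,
                      names_to = "Var", values_to = "val")

# Number of rows and columns in the long table.
dim(iris2)

# Number of values recorded for each original column.
table(iris2$Var)
```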
Then you can use this new dataset in ggplot2 to get a boxplot for each value of Var:
library(ggplot2)
ggplot(iris2, aes(x = Var, y = val, fill = Var))+
geom_boxplot()
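Applied to the six-column data frame from the question, the same two steps might look like the sketch below. The data here is simulated with rnorm as a stand-in; the column names col1 to col6 and the file name in the comment are assumptions, not taken from the original post. Passing cols = everything() asks pivot_longer to reshape every column.

```r
library(tidyr)
library(ggplot2)

# Stand-in for the real data: 3600 rows by 6 numeric columns.
# With an actual file, a call such as df <- read.csv("data.csv") would
# replace these lines ("data.csv" is a hypothetical file name).
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# Reshape all six columns into two: the column name and its value.
df_long <- pivot_longer(df, cols = everything(),
                        names_to = "Var", values_to = "val")

# One box per original column.
p <- ggplot(df_long, aes(x = Var, y = val, fill = Var)) +
  geom_boxplot()
print(p)
```

If the real file also contains non-numeric columns (an ID or a date, say), those would need to be excluded from cols before reshaping.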
Alternative using base R
Without reshaping your data frame, you can get the boxplots right away with the boxplot function in base R:
boxplot(iris[,c(1:4)], col = c("red","green","blue","orange"))
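When every column of the data frame is numeric, as in the question, boxplot can take the whole data frame without any column selection. A minimal sketch with simulated stand-in data (the names col1 to col6 are assumed for illustration):

```r
# Stand-in for the real data: 3600 rows by 6 numeric columns.
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# boxplot() draws one box per column and invisibly returns
# the summary statistics it used for the plot.
b <- boxplot(df, col = rainbow(6), ylab = "Value")
```

The returned list (b here) holds the five-number summaries in b$stats, one column per box, which can be handy for checking the plot against the numbers.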
Does this answer your question?