How to find the mean for a subset using R?


Problem description

Using mtcars, the dataset pre-installed in R, I'm trying to find the mean of the "mpg" variable for only the Mercedes cars. I am new to R and learning on my own. I've figured out the average mpg across all cars using the following:

read.csv("mtcars.csv")
mean(mtcars$mpg)

I thought of using something like a GROUP BY to group only the Mercedes cars, but I can't seem to figure it out. I'm sure it's really simple, so I'm a little frustrated that I'm not seeing what to do next.

The file looks like this: https://gist.github.com/seankross/a412dfbd88b3db70b74b

Recommended answer

In base R, mtcars is a built-in data frame, so there is no need to read it from a file. You can type mtcars in the console to view it.

Here I print the first 10 rows of the mtcars data frame.

head(mtcars, 10)
#                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
# Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
# Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
# Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
# Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4

The information you need, the model, is stored in the row names. To access it, we can use the rownames function.

rownames(mtcars)
# [1] "Mazda RX4"           "Mazda RX4 Wag"       "Datsun 710"         
# [4] "Hornet 4 Drive"      "Hornet Sportabout"   "Valiant"            
# [7] "Duster 360"          "Merc 240D"           "Merc 230"           
# [10] "Merc 280"            "Merc 280C"           "Merc 450SE"         
# [13] "Merc 450SL"          "Merc 450SLC"         "Cadillac Fleetwood" 
# [16] "Lincoln Continental" "Chrysler Imperial"   "Fiat 128"           
# [19] "Honda Civic"         "Toyota Corolla"      "Toyota Corona"      
# [22] "Dodge Challenger"    "AMC Javelin"         "Camaro Z28"         
# [25] "Pontiac Firebird"    "Fiat X1-9"           "Porsche 914-2"      
# [28] "Lotus Europa"        "Ford Pantera L"      "Ferrari Dino"       
# [31] "Maserati Bora"       "Volvo 142E"
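
Because the model names live in the row names rather than in a regular column, it can be convenient to copy them into a column before filtering. A minimal sketch, assuming you want a separate data frame; the names `cars` and `model` are illustrative and not part of the original answer:

```r
# Copy the row names into an ordinary column.
# `cars` and `model` are hypothetical names chosen for this example.
cars <- data.frame(model = rownames(mtcars), mtcars,
                   row.names = NULL, stringsAsFactors = FALSE)

# Show the model next to its mpg value for the first few rows
head(cars[, c("model", "mpg")], 3)
```

The rest of the answer works on the row names directly, so this step is optional.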

Next, we need to check which row names match "Merc". We can use grepl for this; it returns a logical vector with TRUE for each element that matches. Here "^Merc" is a regular expression that matches strings beginning with "Merc".

grepl("^Merc", rownames(mtcars))
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
# [14]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# [27] FALSE FALSE FALSE FALSE FALSE FALSE
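
A quick way to sanity-check the logical vector is to count the TRUE values and list the names they select. The sketch below stores the vector under the hypothetical name `is_merc`, and also compares it with startsWith, a base R function (available since R 3.3.0) that does a plain prefix match without a regular expression:

```r
# `is_merc` is an illustrative name, not part of the original answer
is_merc <- grepl("^Merc", rownames(mtcars))

sum(is_merc)                # how many row names matched
rownames(mtcars)[is_merc]   # which row names matched

# For a simple prefix, startsWith() should give the same logical vector
identical(is_merc, startsWith(rownames(mtcars), "Merc"))
```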

Finally, we can use the logical vector to subset the mtcars data frame, and then calculate the mean of mpg for that subset.

mtcars_merc <- mtcars[grepl("^Merc", rownames(mtcars)), ]
mean(mtcars_merc$mpg)
# [1] 19.01429
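
Since the question mentions GROUP BY, here is a rough base R analogue that computes the mean mpg for every make at once using tapply. It is only a sketch: it assumes the make is the text before the first space in each row name, which is a guess about how the names are formatted (a name without a space, such as "Valiant", is kept whole). The names `make` and `mpg_by_make` are hypothetical.

```r
# Derive a grouping label from the row names.
# Assumption: everything before the first space is the make.
make <- sub(" .*", "", rownames(mtcars))

# Mean mpg per make, similar in spirit to
# SELECT make, AVG(mpg) ... GROUP BY make
mpg_by_make <- tapply(mtcars$mpg, make, mean)

# Pick out the Mercedes group by name
mpg_by_make["Merc"]
```

This gives the same Mercedes average as the subset approach, and the other makes come along for free.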
