R: add column to data.frame to split into ranges low, medium, high


Problem description

I have a data.frame of galaxies and their distances (z):

> head(sdss16, 10)
                 SDSS  RAJ2000   DEJ2000   MJD Class QSO        z    umag    gmag    rmag    imag    zmag    e_umag    e_gmag    e_rmag    e_imag    e_zmag
1  000000.15+353104.2 0.000629 35.517841 58402     0   1 0.845435 18.9640 18.6307 18.4295 18.4118 18.2555 0.0248228 0.0138142 0.0173684 0.0171765 0.0281816
2  000000.33+310325.3 0.001415 31.057048 58073     0   1 2.035491 22.0825 21.7871 21.5621 21.3595 20.9340 0.1381920 0.0461832 0.0504525 0.0603687 0.1857780
3  000000.36+070350.8 0.001535  7.064129 58449     0   1 1.574227 22.5173 22.1028 21.8542 21.6380 21.8888 0.2093710 0.0641275 0.0674263 0.0829677 0.2956540
4  000000.36+274356.2 0.001526 27.732283 57654     0   1 1.770552 22.3475 21.9031 21.7528 21.6635 21.9946 0.1889810 0.0556878 0.0731551 0.0841880 0.3567380
5  000000.45+092308.2 0.001914  9.385637 58450     0   1 2.024146 18.7664 18.6627 18.4998 18.3365 18.1586 0.0261839 0.0309531 0.0179315 0.0260643 0.0214897
6  000000.45+174625.4 0.001898 17.773739 56945     3   1 2.309000 22.4403 21.9089 22.0700 21.9268 21.3725 0.2871240 0.0677072 0.1153900 0.1489100 0.3854550
7  000000.47-002703.9 0.001978 -0.451088 55477     3   1 0.250000 21.6832 21.1946 20.5092 20.1535 19.8793 0.1288200 0.0415909 0.0301123 0.0290315 0.0765198
8  000000.57+055630.8 0.002375  5.941903 57367     0   1 2.102771 22.3606 21.6176 21.3399 21.2840 20.7872 0.3101850 0.0539608 0.0710789 0.1014390 0.2420300
9  000000.62+311944.3 0.002595 31.328982 58073     0   1 1.991313 19.6818 19.4060 19.3189 19.0364 18.8358 0.0299476 0.0160732 0.0150661 0.0247494 0.0376382
10 000000.66+145828.8 0.002756 14.974675 56268     3   1 2.497000 21.9420 21.2236 20.8861 20.7823 20.6592 0.1638730 0.0360871 0.0372218 0.0509094 0.2107500

I want to add a new column that describes z as 'Low', 'Medium', or 'High', based on which quartile range the galaxy falls in:

summary(z)

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
-0.002643  1.177832  1.692103  1.740606  2.260000  7.023917         4

I can split the data into three subsets using:

lowz <- sdss16 %>% filter(z < quantile(z, 0.25))
midz <- sdss16 %>% filter(z >= quantile(z, 0.25) & z < quantile(z, 0.75))
hiz <-  sdss16 %>% filter(z >= quantile(z, 0.75))

So my question is: how can I add a new column based on the quartiles, as described?

Recommended answer

Maybe this works?

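library(tidyverse)

# summary(z) reports NAs, so quantile() needs na.rm = TRUE (it errors otherwise).
# Rows where z is NA match no condition and get NA in z_category.
sdss16 %>%
  mutate(z_category = case_when(
    z < quantile(z, 0.25, na.rm = TRUE) ~ "Low",
    z >= quantile(z, 0.25, na.rm = TRUE) & z <= quantile(z, 0.75, na.rm = TRUE) ~ "Medium",
    z > quantile(z, 0.75, na.rm = TRUE) ~ "High"
  ))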
