How can I sum values of columns with dcast()?


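Question

I am stuck with the dcast function; I am trying to create a table of summed individuals for several species in each counting year.

I have a data frame with 3 columns: (1) Year (factor), (2) Species name (factor), and (3) Counts (numeric).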

Year    Species Counts
2002    SP1 2
2002    SP1 3
2004    SP1 2
2002    SP2 8
2002    SP2 2
2002    SP3 1
2002    SP3 1
2003    SP3 2
2004    SP3 1

I am trying to get a table of sums like this:

     2002  2003  2004
SP1     5     0     2
SP2    10     0     0
SP3     2     2     1

aggregate() does not do what I want. I am calling dcast as follows:

dcast( DF, Species ~ Year , sum)

Whatever I try, sum cannot be applied to the factors. I get the following error message:

Error in Summary.factor(integer(0), na.rm = FALSE): sum not meaningful for factors

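When I use the default function (length), I get the number of rows rather than the sum of individuals. When I try to make sum apply to my "Counts" column instead of the factor columns, it either does not work or I get the same error message.

How can I get a table containing the sums of the counts?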


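Edit:

The function computes wrong sums and invents individuals. Below is a comparison of the sums obtained with a pivot table in Excel and with dcast() in R: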

EXCEL                   2003  2004  2005  2006  2007  2008  2009
Anthocharis_cardamines     1     0     2     2     0     1     0
Apatura_ilia               0     0     0     0     1     0     0
Aporia_crataegi            2     3    29    26     6     4     3
Brintesia_circe          126   217   199   303    64    99    55
Brintesia_circe         126 217 199 303 64  99  55


DCAST                   2003  2004  2005  2006  2007  2008  2009
Anthocharis_cardamines     2     0     4     4     0     2     0
Apatura_ilia               0     0     0     0     2     0     0
Aporia_crataegi            4     6   258   205    25     8     6
Brintesia_circe          883  1334  1050  1770   490   848   354

The computed numbers do not even match the row-count totals I got yesterday. How are these sums calculated?

Edit 2:

> dput(head(counts, 10))
structure(list(year = structure(c(16L, 16L, 16L, 16L, 16L, 16L, 
16L, 16L, 16L, 15L), .Label = c("1994", "1995", "1996", "1997", 
"1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005", 
"2006", "2007", "2008", "2009"), class = "factor"), species = structure(c(146L, 
146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L), .Label = c("Aglais_urticae", 
"Anthocharis_cardamines", "Anthocharis_euphenoides", "Apatura_ilia", 
"Apatura_iris", "Aphantopus_hyperantus", "Aporia_crataegi", "Araschnia_levana", 
"Arethusana_arethusa", "Argynnis_adippe", "Argynnis_aglaja", 
"Argynnis_paphia", "Aricia_agestis", "Boloria_dia", "Boloria_euphrosyne", 
"Boloria_selene", "Brenthis_daphne", "Brenthis_ino", "Brintesia_circe", 
"Callophrys_rubi", "Carcharodus_alceae", "Carcharodus_floccifera", 
"Carcharodus_lavatherae", "Carterocephalus_palaemon", "Celastrina_argiolus", 
"Charaxes_jasius", "Chazara_briseis", "Clossiana_dia", "Coenonympha_arcania", 
"Coenonympha_dorus", "Coenonympha_glycerion", "Coenonympha_oedippus", 
"Coenonympha_pamphilus", "Coenonympha_tullia", "Colias_alfacariensis", 
"Colias_croceus", "Colias_hyale", "Colias_palaeno", "Cupido_alcetas", 
"Cupido_argiades", "Cupido_minimus", "Cupido_osiris", "Diacrisia_sannio", 
"Erebia_aethiops", "Erebia_euryale", "Erebia_ligea", "Erebia_medusa", 
"Erebia_meolans", "Erynnis_tages", "Euchloe_crameri", "Euclidia_glyphica", 
"Euphydryas_aurinia", "Euplagia_quadripunctaria", "Everes_argiades", 
"Fabriciana_adippe", "Glaucopsyche_alcon", "Glaucopsyche_alexis", 
"Glaucopsyche_arion", "Glaucopsyche_melanops", "Glaucopsyche_nausithous", 
"Glaucopsyche_teleius", "Gonepteryx_cleopatra", "Gonepteryx_rhamni", 
"Hamearis_lucina", "Hesperia_comma", "Heteropterus_morpheus", 
"Hipparchia_fidia", "Hipparchia_semele", "Hyles_euphorbiae", 
"Hyponephele_lupinus", "Inachis_io", "Iphiclides_podalirius", 
"Issoria_lathonia", "Lampides_boeticus", "Lasiommata_maera", 
"Lasiommata_megera", "Leptidea_sinapis", "Leptotes_pirithous", 
"Libelloides_coccajus", "Libelloides_longicornis", "Limenitis_camilla", 
"Limenitis_populi", "Limenitis_reducta", "Lopinga_achine", "Lycaena_alciphron", 
"Lycaena_dispar", "Lycaena_helle", "Lycaena_phlaeas", "Lycaena_tityrus", 
"Macroglossum_stellatarum", "Maculinea_arion", "Maniola_jurtina", 
"Melanargia_arge", "Melanargia_galathea", "Melanargia_lachesis", 
"Melanargia_occitanica", "Melitaea_cinxia", "Melitaea_diamina", 
"Melitaea_didyma", "Melitaea_phoebe", "Mesoacidalia_aglaja", 
"Minois_dryas", "Neohipparchia_statilinus", "Neozephyrus_quercus", 
"Nymphalis_antiopa", "Nymphalis_polychloros", "Ochlodes_sylvanus", 
"Ochlodes_venatus", "Palaeochrysophanus_hippothoe", "Papilio_machaon", 
"Pararge_aegeria", "Pieris_napi", "Plebeius_agestis", "Plebeius_argyrognomon", 
"Polygonia_c-album", "Polyommatus_bellargus", "Polyommatus_coridon", 
"Polyommatus_escheri", "Polyommatus_icarus", "Polyommatus_semiargus", 
"Polyommatus_thersites", "Pontia_daplidice", "Pseudopanthera_macularia", 
"Pseudophilotes_baton", "Pseudotergumia_fidia", "Pyrgus_malvae", 
"Pyronia_bathseba", "Pyronia_cecilia", "Pyronia_tithonus", "Quercusia_quercus", 
"Satyrium_acaciae", "Satyrium_esculi", "Satyrium_ilicis", "Satyrium_pruni", 
"Satyrium_spini", "Satyrium_w-album", "Smerinthus_ocellatus", 
"Speyeria_aglaja", "Spialia_sertorius", "Thecla_betulae", "Thymelicus_acteon", 
"Thymelicus_lineola", "Thymelicus_sylvestris", "Vanessa_atalanta", 
"Vanessa_cardui", "Zerynthia_polyxena", "Zygaena_carniolica", 
"Zygaena_ephialtes", "Zygaena_erythrus", "Zygaena_fausta", "Zygaena_filipendulae", 
"Zygaena_hilaris", "Zygaena_loti", "Zygaena_occitanica", "Zygaena_purpuralis", 
"Zygaena_sarpedon", "Zygaena_transalpina"), class = "factor"), 
    Counts = c(2, 2, 2, 2, 2, 17, 52, 2, 2, 17)), .Names = c("year", 
"species", "Counts"), row.names = 5479:5488, class = "data.frame")

> str(counts)
'data.frame':   3161 obs. of  3 variables:
 $ year   : Factor w/ 16 levels "1994","1995",..: 16 16 16 16 16 16 16 16 16 15 ...
 $ species: Factor w/ 157 levels "Aglais_urticae",..: 146 146 146 146 146 146 146 146 146 146 ...
 $ Counts : num  2 2 2 2 2 17 52 2 2 17 ...

I hope this helps...

Answer

A dcast() version

This works for me:

require("reshape2")
dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)

> dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)
  Year SP1 SP2 SP3
1 2002   5  10   2
2 2003   0   0   2
3 2004   2   0   1
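
The call above puts years in rows and species in columns, whereas the question asks for species in rows and years in columns. A minimal sketch of that layout follows, assuming the reshape2 package is installed; the name `wide` is illustrative only. Note that the dput() output in the question shows the real columns named `year` and `species` in lowercase, so for that data the formula would presumably be `species ~ year`.

```r
library(reshape2)

# Same example data as in the question
counts <- read.table(text = "Year Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1", header = TRUE)

# Species in rows, years in columns; naming value.var explicitly
# stops dcast() from guessing which column holds the values
wide <- dcast(counts, Species ~ Year,
              value.var = "Counts", fun.aggregate = sum)
wide
```

Species/year combinations that never occur in the data are filled with sum() of an empty vector, which is 0.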

Check that counts$Counts is numeric; look at the output of str(counts), where counts is your DF. I created counts via:

counts <- read.table(text = "Year    Species Counts
2002    SP1 2
2002    SP1 3
2004    SP1 2
2002    SP2 8
2002    SP2 2
2002    SP3 1
2002    SP3 1
2003    SP3 2
2004    SP3 1", header = TRUE)
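
The check on counts$Counts suggested above can be sketched in base R as follows. The data frame `bad` is hypothetical: it imitates a case where the counts column was read in as a factor, which is one way to end up with the "sum not meaningful for factors" error. It is not taken from the asker's data.

```r
# Hypothetical data frame in which Counts is a factor by mistake
bad <- data.frame(
  Year    = factor(c(2002, 2002, 2004)),
  Species = factor(c("SP1", "SP1", "SP1")),
  Counts  = factor(c("2", "3", "2"))
)

is.numeric(bad$Counts)  # FALSE here, so sum() would fail on this column

# Convert via character first: as.numeric() applied directly to a
# factor returns the internal level codes, not the printed values
bad$Counts <- as.numeric(as.character(bad$Counts))

is.numeric(bad$Counts)
sum(bad$Counts)
```

Once the column is numeric, it can be passed to dcast() through value.var as shown earlier.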

This was run using:

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8      stringr_0.6.2 tools_3.0.1

An alternative base R version using xtabs()

You might also want to try the xtabs() function that comes with base R:

xtabs(Counts ~ Year + Species, data = counts)

> xtabs(Counts ~ Year + Species, data = counts)
      Species
Year   SP1 SP2 SP3
  2002   5  10   2
  2003   0   0   2
  2004   2   0   1
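
To match the layout requested in the question (species in rows, years in columns), the order of the terms on the right-hand side of the formula can be swapped. The sketch below uses only base R; converting the result with as.data.frame.matrix() is an optional extra step, and the names `tab` and `wide` are illustrative only.

```r
# Same example data as in the question
counts <- read.table(text = "Year Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1", header = TRUE)

# The first term after ~ becomes the rows, the second the columns
tab <- xtabs(Counts ~ Species + Year, data = counts)

# Optional: turn the contingency table into a plain wide data frame
wide <- as.data.frame.matrix(tab)
wide
```

The resulting values agree with the table of sums the question asks for: 5, 0, 2 for SP1; 10, 0, 0 for SP2; and 2, 2, 1 for SP3.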
