R function/loop to add a column and value to multiple data frames

Problem description

I have 8 data frames to which I want to add a column called 'park', then fill this column with a value that comes from the last four characters of the data frame name. Here are two of my eight data frames:

water_land_by_ownname_apis <- structure(list(OWNERNAME = c("Forest Service (USFS)", "Fish and Wildlife Service (FWS)", 
"State Department of Natural Resources", "Private Landowner", 
"National Park Service (NPS)", "Unknown", "Private Institution", 
"Native American Land"), WATER = c(696600, 9900, 1758600, 26100, 
112636800, 1586688300, 0, 11354400), LAND = c(258642900, 997200, 
41905800, 2536200, 165591900, 1075917600, 461700, 314052300)), class = "data.frame", .Names = c("OWNERNAME", 
"WATER", "LAND"), data_types = c("C", "F", "F"), row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8"))

water_land_by_ownname_indu <- structure(list(OWNERNAME = c("The Nature Conservancy (TNC)", 
"Other State Land", "Private Institution", "State Department of Transportation", 
"State Department of Natural Resources", "Unknown", "National Park Service (NPS)", 
"Private Landowner", "Joint Ownership", "Private Non-profit", 
"Land Trust"), WATER = c(24300, 1018800, 5282100, 0, 12600, 19192500, 
802800, 139500, 0, 0, 0), LAND = c(719100, 10045800, 12556800, 
900, 2018700, 1446426000, 42484500, 5769900, 38700, 852300, 70200
)), class = "data.frame", .Names = c("OWNERNAME", "WATER", "LAND"
), data_types = c("C", "F", "F"), row.names = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11"))

They look like this:

> water_land_by_ownname_apis
                              OWNERNAME      WATER       LAND
1                 Forest Service (USFS)     696600  258642900
2       Fish and Wildlife Service (FWS)       9900     997200
3 State Department of Natural Resources    1758600   41905800
4                     Private Landowner      26100    2536200
5           National Park Service (NPS)  112636800  165591900
6                               Unknown 1586688300 1075917600
7                   Private Institution          0     461700
8                  Native American Land   11354400  314052300
> water_land_by_ownname_indu
                               OWNERNAME    WATER       LAND
1           The Nature Conservancy (TNC)    24300     719100
2                       Other State Land  1018800   10045800
3                    Private Institution  5282100   12556800
4     State Department of Transportation        0        900
5  State Department of Natural Resources    12600    2018700
6                                Unknown 19192500 1446426000
7            National Park Service (NPS)   802800   42484500
8                      Private Landowner   139500    5769900
9                        Joint Ownership        0      38700
10                    Private Non-profit        0     852300
11                            Land Trust        0      70200

For each dataframe, I want to add a column ('park') and fill this column with the last four characters of the data frame name. For example...

water_land_by_ownname_apis$park <- 'apis'
water_land_by_ownname_indu$park <- 'indu'

Resulting in this...

> water_land_by_ownname_apis
                              OWNERNAME      WATER       LAND park
1                 Forest Service (USFS)     696600  258642900 apis
2       Fish and Wildlife Service (FWS)       9900     997200 apis
3 State Department of Natural Resources    1758600   41905800 apis
4                     Private Landowner      26100    2536200 apis
5           National Park Service (NPS)  112636800  165591900 apis
6                               Unknown 1586688300 1075917600 apis
7                   Private Institution          0     461700 apis
8                  Native American Land   11354400  314052300 apis
> water_land_by_ownname_indu
                               OWNERNAME    WATER       LAND park
1           The Nature Conservancy (TNC)    24300     719100 indu
2                       Other State Land  1018800   10045800 indu
3                    Private Institution  5282100   12556800 indu
4     State Department of Transportation        0        900 indu
5  State Department of Natural Resources    12600    2018700 indu
6                                Unknown 19192500 1446426000 indu
7            National Park Service (NPS)   802800   42484500 indu
8                      Private Landowner   139500    5769900 indu
9                        Joint Ownership        0      38700 indu
10                    Private Non-profit        0     852300 indu
11                            Land Trust        0      70200 indu

Then, rbind them together:

water_land_by_ownname <- rbind(water_land_by_ownname_apis, water_land_by_ownname_indu)

Then, remove the prior data frames from memory:

rm(water_land_by_ownname_apis, water_land_by_ownname_indu)

Solution

You can do this, for example:

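# Names of all objects in the workspace that begin with "water"
nms <- ls(pattern = '^water')
do.call(rbind, lapply(nms, function(x) {
  dat <- get(x)                          # look up the data frame by name
  dat$park <- sub('.*_(.*)$', '\\1', x)  # text after the last underscore
  dat
}))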

  1. ls returns the names of all objects matching a pattern; here I assume your data frame names begin with the word water. The names come back as a character vector, which is handy to pass to lapply.
  2. sub extracts the last part of the name (everything after the final underscore).
  3. do.call + rbind is applied to the resulting list to get a single big data.frame.
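
The three steps above can also be sketched with mget and Map, which fetch all the data frames in one call and pair each with its name. This is only a minimal sketch under stated assumptions: the two toy data frames and the scratch environment `env` are made up for illustration and stand in for the real workspace, and the stricter pattern with a trailing underscore is my own choice so that an already-combined `water_land_by_ownname` would not be picked up again.

```r
# Toy stand-ins for the per-park data frames (values are made up for illustration)
env <- new.env()
env$water_land_by_ownname_apis <- data.frame(OWNERNAME = c("A", "B"), WATER = c(1, 2))
env$water_land_by_ownname_indu <- data.frame(OWNERNAME = c("C", "D", "E"), WATER = c(3, 4, 5))

# Names of the per-park data frames; the trailing underscore in the pattern
# keeps an already-combined `water_land_by_ownname` out of the match
nms <- ls(env, pattern = "^water_land_by_ownname_")

# Fetch all of them at once as a named list
frames <- mget(nms, envir = env)

# Add the park column: everything after the last underscore of the name
tagged <- Map(function(dat, nm) {
  dat$park <- sub(".*_", "", nm)
  dat
}, frames, nms)

# Stack the pieces; unname() keeps the list names out of the row names
combined <- do.call(rbind, unname(tagged))
```

If the park code is always exactly four characters, `substr(nm, nchar(nm) - 3, nchar(nm))` would do the same job as the `sub` call. In the global workspace the `env` argument can simply be dropped.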

Using your 2 data.frames I get:

                               OWNERNAME      WATER       LAND park
1                  Forest Service (USFS)     696600  258642900 apis
2        Fish and Wildlife Service (FWS)       9900     997200 apis
3  State Department of Natural Resources    1758600   41905800 apis
4                      Private Landowner      26100    2536200 apis
5            National Park Service (NPS)  112636800  165591900 apis
6                                Unknown 1586688300 1075917600 apis
7                    Private Institution          0     461700 apis
8                   Native American Land   11354400  314052300 apis
12          The Nature Conservancy (TNC)      24300     719100 indu
21                      Other State Land    1018800   10045800 indu
31                   Private Institution    5282100   12556800 indu
41    State Department of Transportation          0        900 indu
51 State Department of Natural Resources      12600    2018700 indu
61                               Unknown   19192500 1446426000 indu
71           National Park Service (NPS)     802800   42484500 indu
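
The question also asks to remove the original data frames afterwards. One possible way, sketched here under assumptions, is to pass the matched names to `rm(list = ...)`. The scratch environment `env` and the one-row toy data frames are hypothetical stand-ins for the real workspace; the pattern ends in an underscore so the combined data frame itself is not deleted.

```r
# Toy setup in a scratch environment (assumption: stands in for the workspace)
env <- new.env()
env$water_land_by_ownname_apis <- data.frame(WATER = 1)
env$water_land_by_ownname_indu <- data.frame(WATER = 2)
env$water_land_by_ownname <- rbind(env$water_land_by_ownname_apis,
                                   env$water_land_by_ownname_indu)

# Names of the per-park inputs only; the combined frame has no "_xxxx" suffix
old <- ls(env, pattern = "^water_land_by_ownname_")

# rm(list = ...) takes a character vector of object names to delete
rm(list = old, envir = env)

# Only the combined data frame should be left
remaining <- ls(env)
```

When working directly in the global workspace, the `env` and `envir` arguments can be omitted.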
