R function/loop to add column and value to multiple dataframes
Problem description
I have 8 data frames to which I want to add a column called 'park', then fill this column with a value that comes from the last four characters of the data frame name. Here are two of my eight data frames:
water_land_by_ownname_apis <- structure(list(OWNERNAME = c("Forest Service (USFS)", "Fish and Wildlife Service (FWS)",
"State Department of Natural Resources", "Private Landowner",
"National Park Service (NPS)", "Unknown", "Private Institution",
"Native American Land"), WATER = c(696600, 9900, 1758600, 26100,
112636800, 1586688300, 0, 11354400), LAND = c(258642900, 997200,
41905800, 2536200, 165591900, 1075917600, 461700, 314052300)), class = "data.frame", .Names = c("OWNERNAME",
"WATER", "LAND"), data_types = c("C", "F", "F"), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8"))
water_land_by_ownname_indu <- structure(list(OWNERNAME = c("The Nature Conservancy (TNC)",
"Other State Land", "Private Institution", "State Department of Transportation",
"State Department of Natural Resources", "Unknown", "National Park Service (NPS)",
"Private Landowner", "Joint Ownership", "Private Non-profit",
"Land Trust"), WATER = c(24300, 1018800, 5282100, 0, 12600, 19192500,
802800, 139500, 0, 0, 0), LAND = c(719100, 10045800, 12556800,
900, 2018700, 1446426000, 42484500, 5769900, 38700, 852300, 70200
)), class = "data.frame", .Names = c("OWNERNAME", "WATER", "LAND"
), data_types = c("C", "F", "F"), row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11"))
Which look like this...
> water_land_by_ownname_apis
                              OWNERNAME      WATER       LAND
1                 Forest Service (USFS)     696600  258642900
2       Fish and Wildlife Service (FWS)       9900     997200
3 State Department of Natural Resources    1758600   41905800
4                     Private Landowner      26100    2536200
5           National Park Service (NPS)  112636800  165591900
6                               Unknown 1586688300 1075917600
7                   Private Institution          0     461700
8                  Native American Land   11354400  314052300
> water_land_by_ownname_indu
                               OWNERNAME    WATER       LAND
1           The Nature Conservancy (TNC)    24300     719100
2                       Other State Land  1018800   10045800
3                    Private Institution  5282100   12556800
4     State Department of Transportation        0        900
5  State Department of Natural Resources    12600    2018700
6                                Unknown 19192500 1446426000
7            National Park Service (NPS)   802800   42484500
8                      Private Landowner   139500    5769900
9                        Joint Ownership        0      38700
10                    Private Non-profit        0     852300
11                            Land Trust        0      70200
For each dataframe, I want to add a column ('park') and fill this column with the last four characters of the data frame name. For example...
water_land_by_ownname_apis$park <- 'apis'
water_land_by_ownname_indu$park <- 'indu'
Resulting in this...
> water_land_by_ownname_apis
                              OWNERNAME      WATER       LAND park
1                 Forest Service (USFS)     696600  258642900 apis
2       Fish and Wildlife Service (FWS)       9900     997200 apis
3 State Department of Natural Resources    1758600   41905800 apis
4                     Private Landowner      26100    2536200 apis
5           National Park Service (NPS)  112636800  165591900 apis
6                               Unknown 1586688300 1075917600 apis
7                   Private Institution          0     461700 apis
8                  Native American Land   11354400  314052300 apis
> water_land_by_ownname_indu
                               OWNERNAME    WATER       LAND park
1           The Nature Conservancy (TNC)    24300     719100 indu
2                       Other State Land  1018800   10045800 indu
3                    Private Institution  5282100   12556800 indu
4     State Department of Transportation        0        900 indu
5  State Department of Natural Resources    12600    2018700 indu
6                                Unknown 19192500 1446426000 indu
7            National Park Service (NPS)   802800   42484500 indu
8                      Private Landowner   139500    5769900 indu
9                        Joint Ownership        0      38700 indu
10                    Private Non-profit        0     852300 indu
11                            Land Trust        0      70200 indu
Then, rbind them together...
water_land_by_ownname <- rbind(water_land_by_ownname_apis, water_land_by_ownname_indu)
Then, remove prior data frames from memory...
rm(water_land_by_ownname_apis, water_land_by_ownname_indu)
Recommended answer
You can do this, for example:
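# Bind all data frames whose names start with 'water', tagging each with its name suffix
do.call(rbind, lapply(ls(pattern = '^water'), function(x) {
  dat <- get(x)                          # fetch the data frame by its name
  dat$park <- sub('.*_(.*)$', '\\1', x)  # keep the text after the last underscore
  dat
}))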
ls will extract the names of all objects matching a certain pattern; here I assume your data.frame names begin with the word water. The names are returned as a character vector, which is handy for lapply.
sub will extract the last part of the name (everything after the last underscore).
do.call + rbind is applied to the resulting list to get a single big data.frame.
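To make the name-extraction step concrete, here is a small check run on the two names from the question. The variable names nms, suffix and last4 are only illustrative, and the substring variant is an alternative that follows the question's wording ("last four characters") rather than something the answer itself uses:

```r
nms <- c("water_land_by_ownname_apis", "water_land_by_ownname_indu")

# The greedy '.*_' consumes everything up to the last underscore,
# so the captured group is whatever follows it
suffix <- sub('.*_(.*)$', '\\1', nms)

# Alternative: take literally the last four characters of each name
last4 <- substring(nms, nchar(nms) - 3)

print(suffix)
print(last4)
```

Both approaches give apis and indu for these names. They would differ only if a suffix were not exactly four characters long.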
Using your 2 data.frames, I get:
                               OWNERNAME      WATER       LAND park
1                  Forest Service (USFS)     696600  258642900 apis
2        Fish and Wildlife Service (FWS)       9900     997200 apis
3  State Department of Natural Resources    1758600   41905800 apis
4                      Private Landowner      26100    2536200 apis
5            National Park Service (NPS)  112636800  165591900 apis
6                                Unknown 1586688300 1075917600 apis
7                    Private Institution          0     461700 apis
8                   Native American Land   11354400  314052300 apis
12          The Nature Conservancy (TNC)      24300     719100 indu
21                      Other State Land    1018800   10045800 indu
31                   Private Institution    5282100   12556800 indu
41    State Department of Transportation          0        900 indu
51 State Department of Natural Resources      12600    2018700 indu
61                               Unknown   19192500 1446426000 indu
71           National Park Service (NPS)     802800   42484500 indu
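The question also asks to remove the original data frames afterwards, which the code above does not cover. The following is a minimal sketch of one way to do both steps in a single pass. It assumes every per-park data frame is named water_land_by_ownname_ followed by exactly four characters; the two small data frames here are made-up stand-ins, not the question's data, and nms and dfs are illustrative names:

```r
# Tiny stand-in data frames (hypothetical values), named like those in the question
water_land_by_ownname_apis <- data.frame(OWNERNAME = c("Unknown", "Private Landowner"),
                                         WATER = c(10, 20), LAND = c(30, 40))
water_land_by_ownname_indu <- data.frame(OWNERNAME = "Land Trust", WATER = 0, LAND = 70200)

# Collect the names once so the same vector can be reused for rm()
nms <- ls(pattern = "^water_land_by_ownname_.{4}$")
dfs <- mget(nms)  # named list of the matching data frames

# Add the park column from the last four characters of each name
dfs <- Map(function(d, n) {
  d$park <- substring(n, nchar(n) - 3)
  d
}, dfs, nms)

# Stack everything and reset the row names to 1..n
water_land_by_ownname <- do.call(rbind, unname(dfs))
rownames(water_land_by_ownname) <- NULL

# Remove the original per-park data frames
rm(list = nms)
```

Anchoring the pattern at both ends keeps the combined water_land_by_ownname object from being picked up if the code is run a second time.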