Adding an extra data column to a shapefile using convert.to.shapefile in R's shapefiles package
Question
My goal is simple: add one column of statistical data to a shapefile so that I can use it, for example, to colour a geographical area. The data are a country file from GADM. To this end I usually use the foreign package in R, thus:
library(foreign)
newdbf <- read.dbf("CHN_adm1.dbf") #original shape file
incrdata <- read.csv("CHN_test.csv") #.csv file with same region names column + new data column
mergedbf <- merge(newdbf,incrdata)
write.dbf(mergedbf,"CHN_New")
This achieves what I want in almost all circumstances, but one piece of software I use outside R will only recognize .shp files and will not read the .dbf (although in a sense that statement is slightly contradictory). I'm not sure why it won't. In any case, it leaves me needing to do the same thing as above, but with the shapefile itself. According to the notes on the shapefiles package, I think the process should run something like this:
library(shapefiles)
shaper <- read.shp("CHN_adm1.shp")
simplified <- convert.to.simple(shaper)
simplified <- change.id(simplified,incrdata$DataNew) #DataNew being new column of data from the .csv
simpleAsList <- by(simplified,simplified[,1],function(x)x)
####This is where I hit problems####
backToShape <- convert.to.shapefile(simplified,
data.frame(index=c("20","30","40","50","60","70","80")),"index",5)
write.shapefile(backToShape,"CHN_TestShape")
I'm afraid I can't get my head around shapefiles, since I can't unpick or visualize them the way I can with data frames, and so the resulting shape is garbled when it goes back to the external charting package.
To be clear: in 'backToShape' I just want to add the column of data and reconstruct the shapefile. The data I have happens to look like a factor (20, 30, 40, etc.), but it could just as easily be continuous. I'm sure I don't need to type in every possible value, but that was the only way I could get it to be accepted. Can somebody please put me on the right track? If I'm missing a simpler way, I'd also be extremely grateful for a suggestion. Many thanks in advance.
Recommended answer
Stop using the shapefiles package.
Install the sp and rgdal packages.
Read the shapefile with:
library(rgdal) # loading rgdal also attaches sp
chn = readOGR(".", "CHN_adm1") # first arg is the directory, second is the shapefile name w/o .shp
Now chn is like a data frame. In fact chn@data is a data frame. Do what you like to that data frame, but keep its rows in the same order, and then you can save the updated shapefile with the new data by:
writeOGR(chn, ".", "CHN_new", driver="ESRI Shapefile")
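The "same order" caveat matters because each row of the attribute table belongs to the polygon at the same position, and merge(), as used in the question, sorts its result by the key column by default. A minimal sketch of an order-preserving join using match() follows. The toy data frames stand in for chn@data and the CSV; the key column name NAME_1 and the region values are assumptions for illustration, and DataNew is the column name from the question.

```r
# Toy stand-in for chn@data: one row per polygon, in polygon order.
attrs <- data.frame(ID_1 = 1:4,
                    NAME_1 = c("Beijing", "Anhui", "Yunnan", "Gansu"),
                    stringsAsFactors = FALSE)

# Toy stand-in for the CSV: same region names, different row order.
incrdata <- data.frame(NAME_1 = c("Anhui", "Beijing", "Gansu", "Yunnan"),
                       DataNew = c(20, 30, 40, 50),
                       stringsAsFactors = FALSE)

# merge() sorts by the key, so its rows no longer line up with the polygons.
merged <- merge(attrs, incrdata)
print(merged$NAME_1)

# match() finds each polygon's row in the CSV and keeps the original order.
idx <- match(attrs$NAME_1, incrdata$NAME_1)
attrs$DataNew <- incrdata$DataNew[idx]
print(attrs)
```

With the real object, the same idea would read chn$DataNew <- incrdata$DataNew[match(chn$NAME_1, incrdata$NAME_1)], again assuming NAME_1 is the shared key. Regions missing from the CSV get NA rather than shifting the other rows.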
Note that you shouldn't really manipulate the chn@data data frame directly. You can work with chn as if it were a data frame in many respects: for example, chn$foo gets the column named foo, and chn$popden = chn$pop/chn$area would create a new column of population density, provided you have population and area columns.
spplot(chn, "popden")
will map by the popden column you just created, and:
head(as.data.frame(chn))
should show you the first few lines of the shapefile data.
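The rgdal package was retired from CRAN in 2023, so on a current R installation the same workflow may be easier to run with the sf package. This is a swapped-in alternative, not the method given in the answer above. The sketch below builds a toy two-polygon layer in place of CHN_adm1.shp so that it runs without any input files; the key column NAME_1, the region names, and the output file name are assumptions for illustration.

```r
library(sf)

# Helper (hypothetical) that builds a unit square starting at x0.
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Toy layer standing in for CHN_adm1.shp: two polygons with a name column.
regions <- st_sf(NAME_1 = c("Beijing", "Anhui"),
                 geometry = st_sfc(sq(0), sq(2), crs = 4326))

# Toy stand-in for CHN_test.csv.
incrdata <- data.frame(NAME_1 = c("Anhui", "Beijing"),
                       DataNew = c(20, 30))

# Add the new column in polygon order, then write a new shapefile.
regions$DataNew <- incrdata$DataNew[match(regions$NAME_1, incrdata$NAME_1)]
out <- tempfile(fileext = ".shp")
st_write(regions, out, quiet = TRUE)

# Read it back to confirm the column survived the round trip.
back <- st_read(out, quiet = TRUE)
print(st_drop_geometry(back))
```

With real data, st_read("CHN_adm1.shp") would replace the toy layer and read.csv("CHN_test.csv") the toy table. Shapefile field names are limited to 10 characters, so longer column names are shortened on write.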