How to identify locations from text
Problem description
Here is a sample of my function that gets the codes:
df <- read.csv("secondary.csv", header = TRUE)
Solution
S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
I recommend generating every suffix of the string: for a string of length N, drop the first x characters for each x from 0 to N - 1.
allchr <- unlist(strsplit(S, ""))
listsubstr <- sapply(1:length(allchr), function(I) paste0(allchr[I:length(allchr)], collapse=""))
# [1] "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
# [2] " / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
# [3] "/ O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
# [4] " O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
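The same list of suffixes can also be built without splitting the string into characters, because base R's substring() is vectorised over its start position. A minimal sketch, with S redefined so the snippet runs on its own; `suffixes` is a name introduced here for illustration:

```r
S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"

# substring() recycles S across every start position,
# so this returns one suffix per character position
suffixes <- substring(S, seq_len(nchar(S)))

head(suffixes, 4)
```

This should give the same vector as listsubstr, one element per character of S.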
You can iterate through this list to check for valid geocodes. I have to provide pseudocode since I'm not sure how to check if a string is a valid geocode.
sapply(listsubstr, function(I) is.geocode(I)) # contains pseudocode
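To make the line above runnable for experimentation, is.geocode() needs some definition. The sketch below is only a stand-in, not a real geocoder: it assumes a tiny hand-written gazetteer (`known_places`, hypothetical) and treats a string as a "geocode" when it starts with one of those names. The address is a shortened version of S, and in practice the check would more likely be a lookup against a reference dataset or a geocoding service.

```r
# Hypothetical gazetteer, for illustration only
known_places <- c("LUCKNOW", "UTTAR PRADESH")

# Stand-in for the is.geocode() placeholder:
# TRUE when the trimmed string starts with a known place name
is.geocode <- function(x) {
  any(startsWith(trimws(x), known_places))
}

# Shortened sample address (assumption, not the full string from the question)
S <- "MODEL HOUSE LUCKNOW UTTAR PRADESH 226001"
listsubstr <- substring(S, seq_len(nchar(S)))

# Keep only the suffixes that the stand-in accepts
hits <- listsubstr[sapply(listsubstr, is.geocode)]
hits
```

Suffixes that differ only by a leading space are both accepted here, because the stand-in trims whitespace before matching.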
You could also do this with recursion though.
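myfun <- function(x) {
  if (nchar(x) == 0) {
    return(NA_character_)  # nothing left to test, so stop recursing
  }
  if (is.geocode(x)) {  # is.geocode() is still a placeholder, as above
    return(x)
  } else {
    myfun(substr(x, 2, nchar(x)))  # drop the first character and retry
  }
}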
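The recursive idea can also be written as a loop, which avoids deep call stacks on long strings. This is a sketch under stated assumptions: `first_geocode_suffix` and `starts_with_lucknow` are hypothetical names, and the predicate is a toy that stands in for a real geocode check.

```r
# Return the longest suffix of x accepted by `pred`, or NA if none is.
# `pred` is a hypothetical predicate standing in for is.geocode().
first_geocode_suffix <- function(x, pred) {
  while (nchar(x) > 0) {
    if (pred(x)) {
      return(x)
    }
    x <- substr(x, 2, nchar(x))  # drop the first character and try again
  }
  NA_character_
}

# Toy predicate for demonstration only: accepts strings starting with "LUCKNOW"
starts_with_lucknow <- function(s) startsWith(s, "LUCKNOW")

first_geocode_suffix("MODEL HOUSE LUCKNOW 226001", starts_with_lucknow)
```

Passing the predicate as an argument keeps the search logic separate from whatever geocode check is eventually used.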