Create a dictionary and use it to replace Latin words in R


Question

I have a dataset with Latin words:

text<-c("TESS",
"MAG")

I want to transliterate them from Latin to Cyrillic:

library(stringi)
d=stri_trans_general(mydat$text, "latin-cyrillic")

But I want to create the transliteration dictionary manually. For example:

dictionary<-c("Tess"="ТЕСС",
"MAG"="МАГ"
# .......
# ......
)

Once the dictionary is created, all Latin words in mydat$text must be replaced by the Cyrillic words I set. Something like this:

d=dictionary(mydat$text)

How can I perform this kind of replacement?

text<-c("TESS",
"MAG")



File with the transliterations:

dict=path.csv

which contains

dict=

structure(list(old = structure(c(2L, 1L), .Label = c("mag", "tess"
), class = "factor"), new = structure(c(2L, 1L), .Label = c("маг", 
"тесс"), class = "factor")), .Names = c("old", "new"), class = "data.frame", row.names = c(NA, 
-2L))

#output

text<-c("ТЕСС",
"МАГ")

Only this.

Recommended answer

Here you go!

dict <- structure(list(
  old = structure(c(2L, 1L), .Label = c("mag", "tess"),class = "factor"),
  new = structure(c(2L, 1L), .Label = c("маг", "тесс"), class = "factor")),
  .Names = c("old", "new"), class = "data.frame", row.names = c(NA, -2L))

input<-c("TESS","MAG")

output <- with(lapply(dict,as.character), new[match(tolower(input),old)])
output
# [1] "тесс" "маг"
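
Note that the result above is lower case because dict stores lower-case replacements, while the question asked for "ТЕСС" and "МАГ", and any word missing from the dictionary would come back as NA. As a minimal sketch of one way to handle both points, the dictionary can be kept as a named character vector, as in the question, with the upper-case Cyrillic values stored directly. The helper name replace_by_dictionary and the extra word "FOO" are assumptions made for illustration; they are not part of the original answer.

```r
# Dictionary as a named character vector: names are the Latin words
# (lower case), values are the hand-picked Cyrillic replacements.
dictionary <- c(tess = "ТЕСС", mag = "МАГ")

# Hypothetical helper (illustration only): look each word up
# case-insensitively; words without an entry are returned unchanged.
replace_by_dictionary <- function(words, dictionary) {
  hit <- match(tolower(words), names(dictionary))  # NA where no entry exists
  out <- unname(dictionary[hit])                   # replacements, NA for misses
  out[is.na(hit)] <- words[is.na(hit)]             # keep unmatched words as-is
  out
}

# "FOO" is a made-up word with no dictionary entry.
input <- c("TESS", "MAG", "FOO")
output <- replace_by_dictionary(input, dictionary)
print(output)  # the three strings ТЕСС, МАГ and FOO
```

Storing the replacements in the desired case avoids calling toupper() on Cyrillic text, whose behaviour on non-ASCII characters depends on the locale.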
