Convert character to html in R


Problem description

What's the preferred way in R to convert a character (vector) containing non-ASCII characters to HTML? I would, for example, like to convert

  "ü"

to

  "&uuml;"

I am aware that this is possible with a clever use of gsub (but has anyone done it once and for all?). I thought that the R2HTML package would do this, but it doesn't.

Here is what I ended up using; it can be extended by adding entries to the dictionary:

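char2html <- function(x){
  # Lookup table: each symbol and the HTML entity that replaces it
  dictionary <- data.frame(
    symbol = c("ä","ö","ü","Ä", "Ö", "Ü", "ß"),
    html = c("&auml;","&ouml;", "&uuml;","&Auml;",
             "&Ouml;", "&Uuml;","&szlig;"),
    stringsAsFactors = FALSE)
  # Replace every occurrence of each symbol in turn (literal match, no regex)
  for(i in seq_len(nrow(dictionary))){
    x <- gsub(dictionary$symbol[i], dictionary$html[i], x, fixed = TRUE)
  }
  x
}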

x <- c("Buschwindröschen", "Weißdorn")
char2html(x)

Recommended answer

The XML package has a function insertEntities for this, but that function is internal. So you use it at your own risk: there is no guarantee that it will keep working like this in future versions.

Right now, your code could be written as

char2html <- function(x) XML:::insertEntities(x, c("ä"="auml", "ö"="ouml", …))

Using a named character vector instead of a data.frame feels somewhat more elegant, but doesn't change the core of things. Under the hood, insertEntities calls gsub in much the same way your code does.
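To illustrate the point about the named vector driving gsub, here is a minimal base-R sketch of the same idea. The helper name replace_entities is made up for this example, and the code is not the XML package's actual source; it only mirrors the approach described above without relying on an internal function.

```r
# Hypothetical helper (not part of the XML package): replaces each name of
# `entities` found in `x` with the corresponding "&name;" HTML entity.
replace_entities <- function(x, entities) {
  for (sym in names(entities)) {
    # fixed = TRUE treats the symbol as a literal string, not a regex
    x <- gsub(sym, paste0("&", entities[[sym]], ";"), x, fixed = TRUE)
  }
  x
}

# Named character vector: names are the symbols, values the entity names
entities <- c("ä" = "auml", "ö" = "ouml", "ü" = "uuml",
              "Ä" = "Auml", "Ö" = "Ouml", "Ü" = "Uuml", "ß" = "szlig")

replace_entities(c("Buschwindröschen", "Weißdorn"), entities)
# [1] "Buschwindr&ouml;schen" "Wei&szlig;dorn"
```

Because this depends only on base R, it keeps working regardless of changes to the internals of the XML package, at the cost of maintaining the entity table yourself.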

If numeric HTML entities are valid in your environment, you could probably convert all of your text into them using utf8ToInt and then turn the safely printable ASCII characters back into unescaped form. This would save you the trouble of maintaining a dictionary of entities.
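The numeric-entity idea could be sketched roughly as follows. The function name char2numeric_html is hypothetical, and this variant simplifies the description above: instead of escaping everything and un-escaping ASCII afterwards, it leaves code points up to 127 untouched and escapes only the rest. It does not escape HTML-special ASCII characters such as & or <.

```r
# Hypothetical helper: encode every non-ASCII character as a numeric
# HTML entity (&#N;), leaving ASCII characters unchanged.
char2numeric_html <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    codes <- utf8ToInt(enc2utf8(s))            # Unicode code points
    chars <- intToUtf8(codes, multiple = TRUE) # one string per code point
    out <- ifelse(codes > 127L, paste0("&#", codes, ";"), chars)
    paste(out, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

char2numeric_html(c("Buschwindröschen", "Weißdorn"))
# [1] "Buschwindr&#246;schen" "Wei&#223;dorn"
```

No lookup table is needed here, since the entity number is just the character's Unicode code point.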
