rpy2 Error: "unrecognized escape in character string"

Question

I have a chunk of code in R that I would like to embed in my Python code, and I am using rpy2 for that. The R code involves many regular expressions, and it seems that either rpy2 is not handling them correctly or I am not escaping them properly.

Here is an example of one piece of code that works and another that does not:

1) It works: a very trivial removeStopWords function:

import rpy2.robjects as robjects
from rpy2.robjects.packages import importr

robjects.r('''
library(data.table)
library(tm)

removeStopWords <- function(x) gsub("  ", " ", removeWords(x, stopwords("english")))

''')

In [4]: r_f = robjects.r['removeStopWords']
In [5]: r_f('I want to dance')[0]
Out[5]: 'I want dance'

2) It does not work: an equally trivial function to remove leading and trailing spaces:

robjects.r('''
library(data.table)
library(tm)

trim <- function (x) gsub("^\\s+|\\s+$", "", x)

''')

 Error: '\s' is an unrecognized escape in character string starting ""^\s"
p = rinterface.parse(string)
Abort

and then I am thrown out of IPython.

I also tried calling the parser directly:

import rpy2.rinterface as ri
exp = ri.parse('trim <- function (x) gsub("^\\s+|\\s+$", "", x)') 

but the result is the same: Abort, and then IPython exits.

At this stage I don't really know what to try. The R code is quite large, so porting all of it from R to Python would take me some time, and I would prefer not to have to do that.

Any help is much appreciated!

Thanks in advance for your time.

Recommended answer

When you write \\ in an ordinary Python string, it is stored as a single \, because \ is the escape character. So when R parses the code, it sees "^\s+|\s+$". But \ is also the escape character in R string literals, and \s is not a recognized escape sequence there, hence the error.
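One way to check this without involving R at all is to print the Python string and look at what comes out. The sketch below uses only plain Python; the variable names `r_code` and `n_backslashes` are just for illustration and do not come from the question.

```python
# The same R snippet as in the question, written as an ordinary Python string.
r_code = 'trim <- function (x) gsub("^\\s+|\\s+$", "", x)'

# print() shows the characters R will actually receive:
# a single backslash before each "s", not two.
print(r_code)

# Count the backslashes in the stored string (one before each "s").
n_backslashes = r_code.count("\\")
print(n_backslashes)
```

The source text contains four backslashes, but the stored string contains only two, which is why R ends up parsing "\s".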

If you want R to receive "^\\s+|\\s+$", you need to write "^\\\\s+|\\\\s+$" in Python (twice the number of backslashes).
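As a rough illustration of that fix, the sketch below compares the doubled-backslash spelling with a Python raw string literal. The raw string is an alternative not mentioned in the answer above, and the names `doubled` and `raw` are hypothetical. The rpy2 calls are left as comments so that the snippet runs without R installed.

```python
# Option 1: double every backslash so R still sees "\\s" after Python's own pass.
doubled = 'trim <- function (x) gsub("^\\\\s+|\\\\s+$", "", x)'

# Option 2: a raw string literal, in which Python leaves backslashes untouched.
raw = r'trim <- function (x) gsub("^\\s+|\\s+$", "", x)'

# Both spellings hand exactly the same text to R.
assert doubled == raw
print(raw)

# With rpy2 installed, either string could then be passed on, for example:
#   import rpy2.robjects as robjects
#   robjects.r(raw)
#   trim = robjects.r['trim']
```

For a large block of R code, the raw form (for instance `r'''...'''` around the whole block) may be the less error-prone choice, since the R source can then be pasted in unchanged.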
