What are the caveats of using source versus parse & eval?
Question
Can I replace
source(filename, local = TRUE, encoding = 'UTF-8')
with
eval(parse(filename, encoding = 'UTF-8'))
without any risk of breakage, to make UTF-8 source files work on Windows?
I am currently loading files via
source(filename, local = TRUE, encoding = 'UTF-8')
However, it is well known that this does not work on Windows, full stop.
As a workaround, Joe Cheng suggested using instead
eval(parse(filename, encoding = 'UTF-8'))
This seems to work quite well [1], but even after consulting the source code of source, I don't understand how they differ in one crucial detail:
Neither source nor sys.source simply parses and then evals the file content. Instead, they parse the file content, then iterate manually over the parsed expressions and eval them one by one. I do not understand why this would be necessary in sys.source (source at least uses it to show verbose diagnostics, if so instructed; but sys.source does nothing of the kind):
for (i in seq_along(exprs)) eval(exprs[i], envir)
What is the purpose of evaluating statements separately? And why does it iterate over indices instead of directly over the sub-expressions? What other caveats are there?
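For reference, the two evaluation strategies being compared can be sketched side by side. This is only an illustration: the snippet in txt is a made-up stand-in for a file's contents, and env_all, env_each, res_all and res_each are hypothetical names. It relies on eval evaluating each element of an expression vector in order and returning the last value.

```r
# Made-up stand-in for the contents of a source file
txt <- "a <- 1\nb <- a + 1\nb * 10"
exprs <- parse(text = txt, keep.source = FALSE)

# Strategy 1: hand the whole expression vector to eval() in one go
env_all <- new.env()
res_all <- eval(exprs, env_all)

# Strategy 2: evaluate one element at a time, as the loop in sys.source does
env_each <- new.env()
res_each <- NULL
for (i in seq_along(exprs)) res_each <- eval(exprs[i], env_each)

# Compare the last value and the variables each strategy created
same_value <- identical(res_all, res_each)
same_vars <- identical(mget(c("a", "b"), envir = env_all),
                       mget(c("a", "b"), envir = env_each))
```

For a simple, error-free snippet like this one the two strategies should be indistinguishable; the sketch does not cover echoing, srcrefs or error handling.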
To clarify: I am not concerned about the additional parameters of source and parse, some of which may be set via options.
[1] The reason that source is tripped up by the encoding but parse isn't boils down to the fact that source attempts to convert the input text. parse does no such thing: it reads the file's byte content as-is and simply marks its Encoding as UTF-8 in memory.
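A minimal sketch of the marking behaviour described in this footnote, assuming a throwaway temp file (tmp, env and s are hypothetical names). The non-ASCII character is written with a \u escape so the snippet itself stays ASCII. The result may vary with locale and R version, so treat it as something to try rather than a guarantee.

```r
# Throwaway file standing in for a UTF-8 encoded script
tmp <- tempfile(fileext = ".R")

# "\u00e9" is e-acute; useBytes = TRUE writes its UTF-8 bytes unchanged
writeLines("s <- '\u00e9'", tmp, useBytes = TRUE)

# parse() is told to treat input strings as UTF-8; no re-encoding is requested
env <- new.env()
eval(parse(tmp, encoding = "UTF-8"), env)

# Inspect how the resulting string is marked in memory
marked <- Encoding(env$s)
unlink(tmp)
```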
Recommended answer
This is not a full answer, as it primarily addresses the seq_along part of the question, but it is too lengthy to include as comments.
One key difference between seq_along followed by [ and just using for (i in x) (which I believe is similar to seq_along followed by [[ instead of [) is that the former preserves the expression. Here is an example to illustrate the difference:
> txt <- "x <- 1 + 1
+ # abnormal expression
+ 2 *
+ 3
+ "
> x <- parse(text=txt, keep.source=TRUE)
>
> for(i in x) print(i)
x <- 1 + 1
2 * 3
> for(i in seq_along(x)) print(x[i])
expression(x <- 1 + 1)
expression(2 *
3)
Alternatively:
> attributes(x[[2]])
NULL
> attributes(x[2])
$srcref
$srcref[[1]]
2 *
3
Whether this has any practical impact compared to eval(parse(..., keep.source = TRUE)), I can only say that it could, but I can't imagine a situation where it does.
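As a quick check of that claim, a sketch along the following lines compares the evaluated results of the two subsetting styles. The snippet in txt and the names via_single and via_double are made up for illustration; it shows only that the values agree in this simple case, not that they always must.

```r
# Made-up input: two top-level expressions, the second spanning two lines
txt <- "x <- 1 + 1\n2 *\n3\n"
exprs <- parse(text = txt, keep.source = TRUE)

env <- new.env()

# `[` keeps a length-one expression vector (with its srcref attribute)
via_single <- lapply(seq_along(exprs), function(i) eval(exprs[i], env))

# `[[` extracts the bare call, without attributes
via_double <- lapply(seq_along(exprs), function(i) eval(exprs[[i]], env))

results_match <- identical(via_single, via_double)
```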
Note that subsetting the expression separately also leads to the srcref business getting subset, which could conceivably be useful (...maybe?).
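One conceivable use, sketched under assumptions: when evaluating expression by expression, the subset srcref can be used to report the source text of the statement that failed. The input in txt and the names env and failed_source are hypothetical, and this is not something source or sys.source is claimed to do.

```r
# Made-up input whose second statement raises an error
txt <- "x <- 1\nstop('boom')\ny <- 2"
exprs <- parse(text = txt, keep.source = TRUE)

env <- new.env()
failed_source <- NULL

for (i in seq_along(exprs)) {
  # Evaluate one statement; record whether it succeeded
  ok <- tryCatch({
    eval(exprs[i], env)
    TRUE
  }, error = function(e) FALSE)

  if (!ok) {
    # exprs[i] carries the srcref for just this statement
    failed_source <- as.character(attr(exprs[i], "srcref")[[1]])
    break
  }
}
```

After the loop, failed_source holds the text of the failing statement, x has been assigned, and y has not.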