R skips lines from /dev/stdin

Views: 19 Published: 2021/9/5 20:30:04 Tags: r terminal


Problem description

I have a file containing a list of numbers (to make it yourself: for x in $(seq 10000); do echo $x; done > file).

$> R -q -e "x <- read.csv('file', header=F); summary(x);"

> x <- read.csv('file', header=F); summary(x);
       V1       
 Min.   :    1  
 1st Qu.: 2501  
 Median : 5000  
 Mean   : 5000  
 3rd Qu.: 7500  
 Max.   :10000

Now, one might expect that piping the file through cat and reading from /dev/stdin would give the same output, but it does not:

$> cat file | R -q -e "x <- read.csv('/dev/stdin', header=F); summary(x);"
> x <- read.csv('/dev/stdin', header=F); summary(x);
       V1       
 Min.   :    1  
 1st Qu.: 3281  
 Median : 5520  
 Mean   : 5520  
 3rd Qu.: 7760  
 Max.   :10000

Using table(x) shows that a bunch of lines were skipped:

    1  1042  1043  1044  1045  1046  1047  1048  1049  1050  1051  1052  1053 
    1     1     1     1     1     1     1     1     1     1     1     1     1 
 1054  1055  1056  1057  1058  1059  1060  1061  1062  1063  1064  1065  1066 
    1     1     1     1     1     1     1     1     1     1     1     1     1
 ...

It looks like R is doing something funny with stdin, because this Python one-liner properly prints all the lines in the file:

cat file | python -c 'with open("/dev/stdin") as f: print(f.read())'


This question seems related, but it is more about skipping lines in a malformed CSV file, whereas my input is just a list of numbers.

Recommended answer

head --bytes=4K file |tail -n 3

yields this:

1039
1040
104

This suggests that R creates a 4KB input buffer on /dev/stdin and fills it during initialisation. When your R code then reads /dev/stdin, it starts reading the file from the point where that buffer ended, partway through line 1041.
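The 4KB hypothesis can be checked without R. The following is only a sketch that assumes exactly 4096 bytes were consumed before read.csv started; the variable names (workdir, skipped, eaten_lines, first_three, stats) are made up for illustration. It discards the first 4096 bytes of the file and looks at what is left:

```shell
# Rebuild the question's input in a scratch directory (one number per line).
workdir=$(mktemp -d)
seq 10000 > "$workdir/file"

# Assumption: exactly 4096 bytes were consumed before read.csv started.
skipped=4096

# Number of complete lines that fit inside the consumed region.
eaten_lines=$(head -c "$skipped" "$workdir/file" | wc -l | tr -d ' ')

# First three values a reader starting at byte 4097 would see.
first_three=$(tail -c +"$((skipped + 1))" "$workdir/file" | head -n 3 | paste -sd' ' -)

# Row count and mean of the remaining stream.
stats=$(tail -c +"$((skipped + 1))" "$workdir/file" |
  awk '{ s += $1 } END { printf "%d rows, mean %.0f", NR, s / NR }')

echo "complete lines eaten: $eaten_lines"
echo "first values seen: $first_three"
echo "$stats"
```

On the file from the question this reports 1040 complete lines eaten, first values 1 1042 1043, and 8960 rows with a mean of 5520, which agrees with the summary(x) and table(x) output R produced when reading from /dev/stdin.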

Indeed, if you replace line 1041 of file with 1043, you get a "3" instead of a "1" in table(x):
    3  1042  1043  1044  1045  1046  1047  1048  1049  1050  1051  1052  1053 
    1     1     1     1     1     1     1     1     1     1     1     1     1 
...

The first 1 in table(x) is actually the last digit of 1041. The first 4KB of the file have been eaten.
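The replacement experiment can be simulated in the shell as well. This is a sketch under the same assumption that exactly 4096 bytes go missing; file_edited, leftover_original and leftover_edited are hypothetical names used only here:

```shell
# Rebuild the input and make an edited copy where line 1041 reads 1043.
workdir=$(mktemp -d)
seq 10000 > "$workdir/file"
sed '1041s/.*/1043/' "$workdir/file" > "$workdir/file_edited"

# Byte 4097 is the last digit of line 1041, so the first "line" seen after
# a 4096-byte skip is whatever that digit happens to be.
leftover_original=$(tail -c +4097 "$workdir/file" | head -n 1)
leftover_edited=$(tail -c +4097 "$workdir/file_edited" | head -n 1)

echo "original file, first value after the skip: $leftover_original"
echo "edited file, first value after the skip: $leftover_edited"
```

The first value after the skip is 1 for the original file and 3 for the edited one, mirroring the change seen in the first column of table(x).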
