要在 R 中列出的文本文件 [英] Text file to list in R

查看:17
本文介绍了要在 R 中列出的文本文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个大文本文件,每行中有可变数量的字段.每行中的第一个条目对应一个生物途径,随后的每个条目对应该途径中的一个基因.前几行可能看起来像这样

I have a large text file with a variable number of fields in each row. The first entry in each row corresponds to a biological pathway, and each subsequent entry corresponds to a gene in that pathway. The first few lines might look like this

path1   gene1 gene2
path2   gene3 gene4 gene5 gene6
path3   gene7 gene8 gene9

我需要把这个文件作为一个列表读入R,每个元素是一个字符向量,列表中每个元素的名字是该行的第一个元素,例如:

I need to read this file into R as a list, with each element being a character vector, and the name of each element in the list being the first element on the line, for example:

> pathways <- list(
+     path1=c("gene1","gene2"), 
+     path2=c("gene3","gene4","gene5","gene6"),
+     path3=c("gene7","gene8","gene9")
+ )
> 
> str(pathways)
List of 3
 $ path1: chr [1:2] "gene1" "gene2"
 $ path2: chr [1:4] "gene3" "gene4" "gene5" "gene6"
 $ path3: chr [1:3] "gene7" "gene8" "gene9"
> 
> str(pathways$path1)
 chr [1:2] "gene1" "gene2"
> 
> print(pathways)
$path1
[1] "gene1" "gene2"

$path2
[1] "gene3" "gene4" "gene5" "gene6"

$path3
[1] "gene7" "gene8" "gene9"

...但我需要为数千行自动执行此操作.我看到了一个 类似的问题发布以前在这里,但我不知道如何从那个线程中做到这一点.

...but I need to do this automatically for thousands of lines. I saw a similar question posted here previously, but I couldn't figure out how to do this from that thread.

提前致谢.

推荐答案

这是一种方法:

# Read in the data
x <- scan("data.txt", what="", sep="
")
# Separate elements by one or more whitepace
y <- strsplit(x, "[[:space:]]+")
# Extract the first vector element and set it as the list element name
names(y) <- sapply(y, `[[`, 1)
#names(y) <- sapply(y, function(x) x[[1]]) # same as above
# Remove the first vector element from each list element
y <- lapply(y, `[`, -1)
#y <- lapply(y, function(x) x[-1]) # same as above

这篇关于要在 R 中列出的文本文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆