How to make a variable by extracting specific lines?
Problem description
I have data like the following, in which SNP names (an rs number or c#_pos#) are listed under gene names (e.g. ABCB9). In SNPs named c#_pos000000, # ranges from 1 to 22 (the chromosome number).
ABCB9
rs11057374
rs7138100
c22_pos41422393
rs12309481
END
ABCC10
rs1214748
END
HDAC9
rs928578
rs10883039
END
HCN2
rs12428035
rs9561933
c2_pos102345
rs3848077
rs3099362
END
Using this data, I want to produce the output below:
rs11057374 ABCB9
rs7138100 ABCB9
c22_pos41422393 ABCB9
rs12309481 ABCB9
rs1214748 ABCC10
rs928578 HDAC9
rs10883039 HDAC9
rs12428035 HCN2
rs9561933 HCN2
c2_pos102345 HCN2
rs3848077 HCN2
rs3099362 HCN2
Blank lines and the "END" markers are not needed in the output.
How can I produce this output in R or Linux?
Recommended answer
We can do this slightly differently. After reading the file with readLines and removing the leading/trailing whitespace (trimws), split 'lines1' using a grouping vector created from the blank values (""), and remove the "" and "END" strings from each list element. Then set the names of the list to the first element of each list element (sapply(lst1, `[`, 1)), keep all elements except the first, and stack the result.
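# Trim whitespace; the grouping below relies on a blank line between gene blocks
lines1 <- trimws(lines)
# Split into one element per block and drop the "" and "END" entries
lst1 <- lapply(split(lines1, cumsum(lines1 == "")),
               function(x) x[!x %in% c("", "END")])
# Name each block by its first line (the gene) and stack the remaining SNPs
stack(setNames(lapply(lst1, `[`, -1), sapply(lst1, `[`, 1)))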
# values ind
#1 rs11057374 ABCB9
#2 rs7138100 ABCB9
#3 c22_pos41422393 ABCB9
#4 rs12309481 ABCB9
#5 rs1214748 ABCC10
#6 rs928578 HDAC9
#7 rs10883039 HDAC9
#8 rs12428035 HCN2
#9 rs9561933 HCN2
#10 c2_pos102345 HCN2
#11 rs3848077 HCN2
#12 rs3099362 HCN2
Data
lines <- readLines("yourdata.txt")
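Since the question also asks about Linux, the same reshaping can be sketched with awk. This is not part of the original answer, only a minimal sketch under stated assumptions: each block starts with the gene name and is closed by an END line, so it does not depend on blank lines being present. The function name snp_to_gene is hypothetical and chosen for illustration.

```shell
# Sketch: pair every SNP line with the gene name that opens its block.
# Assumes each block starts with a gene name and ends with a line "END".
# snp_to_gene is a hypothetical helper name, not from the original answer.
snp_to_gene() {
  awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim leading/trailing whitespace
    $0 == ""    { next }              # skip blank lines
    $0 == "END" { gene = ""; next }   # a block ends; forget the current gene
    gene == ""  { gene = $0; next }   # first line of a block is the gene name
    { print $0, gene }                # every other line is a SNP
  ' "$@"
}

# Example with the first two blocks from the question, fed through stdin
printf '%s\n' ABCB9 rs11057374 rs7138100 c22_pos41422393 rs12309481 END \
              ABCC10 rs1214748 END | snp_to_gene
```

To process a file instead of stdin, pass its name as an argument, for example snp_to_gene yourdata.txt. Keying on END rather than on blank lines is a design choice here: it keeps working whether or not the blocks are separated by empty lines.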