Create two columns with multiple separators
Problem description
I have a dataframe such as
COl1
scaffold_97606_2-BACs_-__SP1_1
UELV01165908.1_2-BACs_+__SP2_2
UXGC01046554.1_9-702_+__SP3_3
scaffold_12002_1087-1579_-__SP4_4
and I would like to split it into two columns and get:
COL1 COL2
scaffold_97606 2-BACs_-__SP1_1
UELV01165908.1 2-BACs_+__SP2_2
UXGC01046554.1 9-702_+__SP3_3
scaffold_12002 1087-1579_-__SP4_4
As you can see, the separator varies: it can be .Number_ or Number_Number.
So far I have written:
df2 <- df1 %>%
  separate(COl1, paste0('col', 1:2), sep = " the separator patterns ", extra = "merge")
but I do not know what separator to use in the " the separator patterns " part.
Solution
You may use
> df1 %>%
separate(COl1, paste0('col', 1:2), sep = "(?<=\\d)_(?=\\d+-)", extra = "merge")
col1 col2
1 scaffold_97606 2-BACs_-__SP1_1
2 UELV01165908.1 2-BACs_+__SP2_2
3 UXGC01046554.1 9-702_+__SP3_3
4 scaffold_12002 1087-1579_-__SP4_4
See the regex demo
Pattern details
(?<=\d) - a positive lookbehind that requires a digit immediately to the left of the current location
_ - an underscore
(?=\d+-) - a positive lookahead that requires one or more digits and then a - immediately to the right of the current location.
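To check where the pattern matches without tidyr, the same split can be sketched in base R. This is only an illustration, not the method from the answer: it swaps in regexpr() and substr(), and it assumes every value contains exactly one position where the pattern matches. The names x, pattern, pos and result are made up for this example.

```r
# Sample values copied from the question.
x <- c(
  "scaffold_97606_2-BACs_-__SP1_1",
  "UELV01165908.1_2-BACs_+__SP2_2",
  "UXGC01046554.1_9-702_+__SP3_3",
  "scaffold_12002_1087-1579_-__SP4_4"
)

# Same pattern as in the answer; perl = TRUE enables lookarounds in base R.
pattern <- "(?<=\\d)_(?=\\d+-)"

# Position of the first matching underscore in each string (-1 if none).
pos <- regexpr(pattern, x, perl = TRUE)

# Assumption: every value has a match; stop early if that does not hold.
stopifnot(all(pos > 0))

# Text before the matched underscore goes to col1, text after it to col2.
result <- data.frame(
  col1 = substr(x, 1, pos - 1),
  col2 = substr(x, pos + 1, nchar(x)),
  stringsAsFactors = FALSE
)
print(result)
```

Only the underscore that has a digit on its left and digits plus a hyphen on its right is consumed, so the other underscores (for example in scaffold_97606 or SP1_1) stay inside the resulting columns.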