Union of two columns of a TSV file

Views: 204  Posted: 2018/5/25 17:50:37  Tags: linux graph cut tsv

Problem description

I have a file that stores a directed graph. Each line has the form

node1 TAB node2 TAB weight

I want to find the set of nodes, that is, the union of the first two columns. Is there a better way to compute this union? My current solution involves creating temporary files:

    cut -f1 input_graph | sort | uniq > nodes1
    cut -f2 input_graph | sort | uniq > nodes2
    cat nodes1 nodes2 | sort | uniq > nodes

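Solution

    { cut -f1 input_graph; cut -f2 input_graph; } | sort | uniq

There is no need to sort each column separately: both columns go into a single stream that is sorted and de-duplicated once.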

The { cmd1; cmd2; } syntax is equivalent to (cmd1; cmd2) but may avoid a subshell.
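To see the grouped pipeline on concrete data, here is a minimal sketch. The three-edge sample graph and the variable names `graph` and `nodes` are made up for illustration, and `sort -u` is used as a shorthand for `sort | uniq`.

```shell
# Hypothetical sample graph: node1 TAB node2 TAB weight
graph=$(mktemp)
printf 'a\tb\t1\nb\tc\t2\na\tc\t5\n' > "$graph"

# The brace group writes both columns to the same pipe,
# so the combined stream is sorted and de-duplicated once.
nodes=$( { cut -f1 "$graph"; cut -f2 "$graph"; } | sort -u )
printf '%s\n' "$nodes"

rm -f "$graph"
```

With this sample input the node set is `a`, `b`, `c`, one per line.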

In another language (e.g. Perl), you could slurp the first column into a hash and then process the second column sequentially, adding any node that is not already present.
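The hash idea can also be sketched without leaving the shell, using awk in place of Perl. This is a substitution, not what the answer itself shows: awk's associative array `seen` plays the role of the hash, and the sketch handles both columns in a single pass instead of reading the first column fully before the second. The sample data is hypothetical.

```shell
# Hypothetical sample graph: node1 TAB node2 TAB weight
graph=$(mktemp)
printf 'a\tb\t1\nb\tc\t2\na\tc\t5\n' > "$graph"

# seen[] is an awk associative array (a hash). A node is printed only
# the first time it appears in either column, so no sort is needed.
nodes=$(awk -F'\t' '
    !seen[$1]++ { print $1 }
    !seen[$2]++ { print $2 }
' "$graph")
printf '%s\n' "$nodes"

rm -f "$graph"
```

Nodes come out in order of first appearance rather than sorted, and memory use grows with the number of distinct nodes.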

If you can rely on Bash, you can avoid temporary files with process substitution, using the syntax cat <(cmd1) <(cmd2). Bash takes care of creating temporary file descriptors and setting up pipelines.
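Applied to the graph file, process substitution might look like the sketch below. Because `<( )` is not POSIX sh syntax, the sketch invokes `bash -c` explicitly so that it also runs when called from a plain sh script. The sample data and variable names are hypothetical.

```shell
# Hypothetical sample graph: node1 TAB node2 TAB weight
graph=$(mktemp)
printf 'a\tb\t1\nb\tc\t2\na\tc\t5\n' > "$graph"

# <(cmd) expands to a file name connected to cmd's output,
# so cat reads both column streams as if they were files.
nodes=$(bash -c 'cat <(cut -f1 "$1") <(cut -f2 "$1") | sort -u' bash "$graph")
printf '%s\n' "$nodes"

rm -f "$graph"
```

Inside a script that already runs under Bash, the inner command can be written directly, without the `bash -c` wrapper.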

In a script (where you may want to avoid requiring bash), if you end up needing temporary files, use mktemp to create them rather than fixed file names.
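A minimal sketch of the temporary-file variant with `mktemp` follows. It assumes a `mktemp` that works without arguments (as in GNU coreutils). The sample data, the variable names, and the `trap` cleanup are illustrative additions.

```shell
# Hypothetical sample graph: node1 TAB node2 TAB weight
graph=$(mktemp)
printf 'a\tb\t1\nb\tc\t2\na\tc\t5\n' > "$graph"

# mktemp creates uniquely named files, which avoids clashes with
# fixed names such as nodes1 and nodes2.
tmp1=$(mktemp) || exit 1
tmp2=$(mktemp) || exit 1

# Remove the temporary files when the script exits.
trap 'rm -f "$tmp1" "$tmp2" "$graph"' EXIT

cut -f1 "$graph" | sort -u > "$tmp1"
cut -f2 "$graph" | sort -u > "$tmp2"

# sort -u over both files merges them and drops duplicates.
nodes=$(sort -u "$tmp1" "$tmp2")
printf '%s\n' "$nodes"
```

This keeps the structure of the original three-step solution while avoiding leftover files in the working directory.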
