How to filter out a set of strings A from a set of strings B using Bash


Question

I have a list of strings that I want to remove from a superset of other strings, in no particular order, thereby constructing a new set. Is that doable in Bash?

Recommended answer

It looks like you're looking for something with better than O(nm) running time, so here's an answer to that. fgrep, or grep -F, can use the Aho-Corasick algorithm to build a single FSM from a list of fixed strings (the exact matcher depends on the grep implementation and version), so checking each word in SET2 takes O(length of word) time. This means the whole running time of this script is O(n+m), where n and m are the sizes of the two sets.

(Obviously, the running times also depend on the length of the words.)
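
Separately from the answer's script, the three grep flags it relies on can be seen in isolation. This is a minimal sketch with made-up sample data (the variable names `patterns` and `words` are illustrative, not from the original answer): the patterns are passed one per line, and only whole-line matches are filtered out.

```shell
#!/bin/bash
# Hypothetical sample data, not taken from the original answer.
patterns=$'hello\nworld'
words=$'hello\nhelloworld\nworld\nrocks'

# -F: treat each pattern line as a fixed string, not a regular expression
# -x: a pattern only counts if it matches the entire input line
# -v: print the lines that match none of the patterns
grep -Fxv "$patterns" <<< "$words"
```

Because of -x, `helloworld` survives even though it contains both patterns as substrings; only the exact lines `hello` and `world` are dropped.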

[meatmanek@yggdrasil ~]$ cat subtract.sh 
#!/bin/bash
subtract()
{
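  # Split both arguments into arrays on whitespace (default IFS).
  SET1=( $1 )
  SET2=( $2 )
  OLDIFS="$IFS"
  # With IFS set to a newline, "${ARR[*]}" joins the elements one per line.
  IFS=$'\n'
  # -F: fixed strings, -x: whole-line match, -v: keep non-matching lines.
  SET3=( $(grep -Fxv "${SET1[*]}" <<< "${SET2[*]}") )
  IFS="$OLDIFS"
  echo "${SET3[*]}"
  # SET3 = SET2-SET1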
}
subtract "$@"
[meatmanek@yggdrasil ~]$ . subtract.sh 

[meatmanek@yggdrasil ~]$ subtract "package-x86 test0 hello world" "computer hello sizeof compiler world package-x86 rocks"
computer sizeof compiler rocks
[meatmanek@yggdrasil ~]$
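
If the two sets are already held in Bash arrays, the same grep -Fxv idea can be applied without changing IFS. The following is a sketch under a few assumptions rather than part of the original answer: it needs Bash 4 or later for `mapfile`, the array names `remove`, `superset` and `result` are illustrative, and no element may contain a newline.

```shell
#!/bin/bash
# Sample data reused from the transcript above; the array names are hypothetical.
remove=( package-x86 test0 hello world )
superset=( computer hello sizeof compiler world package-x86 rocks )

# Print each element on its own line, hand the patterns to grep via -f,
# and read the surviving lines back into an array (one element per line).
mapfile -t result < <(
  printf '%s\n' "${superset[@]}" |
    grep -Fxv -f <(printf '%s\n' "${remove[@]}")
)

echo "${result[*]}"
```

Since every element travels as a whole line, elements containing spaces are not split apart the way they would be with an unquoted `$1`.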
