Binary grep on Linux?

Question

Say I have generated the following binary file:

# generate file:
python -c 'import sys;[sys.stdout.write(chr(i)) for i in (0,0,0,0,2,4,6,8,0,1,3,0,5,20)]' > mydata.bin

# get file size in bytes
stat -c '%s' mydata.bin

# 14

And say, I want to find the locations of all zeroes (0x00), using a grep-like syntax.

The best I can do so far is:

$ hexdump -v -e "1/1 \" %02x\n\"" mydata.bin | grep -n '00'

1: 00
2: 00
3: 00
4: 00
9: 00
12: 00

However, this implicitly converts each byte in the original binary file into a multi-byte ASCII representation, on which grep operates; not exactly the prime example of optimization :)

Is there something like a binary grep for Linux? Possibly also something that supports a regular expression-like syntax, but for byte "characters": that is, I could write something like 'a(\x00*)b' and match zero or more occurrences of byte 0 between the bytes 'a' (97) and 'b' (98)?

The context is that I'm working on a driver, where I capture 8-bit data; something goes wrong in the data, which can be kilobytes up to megabytes, and I'd like to check for particular signatures and where they occur. (so far, I'm working with kilobyte snippets, so optimization is not that important - but if I start getting some errors in megabyte long captures, and I need to analyze those, my guess is I would like something more optimized :) . And especially, I'd like something where I can "grep" for a byte as a character - hexdump forces me to search strings per byte)

same question, different forum :) grepping through a binary file for a sequence of bytes

Thanks to the answer by @tchrist, here is also an example with 'grepping' and matching, and displaying results (although not quite the same question as OP):

$ perl -ln0777e 'print unpack("H*",$1), "\n", pos() while /(.....\0\0\0\xCC\0\0\0.....)/g' /path/to/myfile.bin

ca000000cb000000cc000000cd000000ce     # Matched data (hex)
66357                                  # Offset (dec)

To have the matched data grouped as one byte (two hex characters) each, "H2 H2 H2 ..." needs to be specified for as many bytes as there are in the matched string; as my match '.....\0\0\0\xCC\0\0\0.....' covers 17 bytes, I can write '"H2"x17' in Perl. Each of these "H2" returns a separate value (as in a list), so join also needs to be used to add spaces between them. Eventually:

$ perl -ln0777e 'print join(" ", unpack("H2 "x17,$1)), "\n", pos() while /(.....\0\0\0\xCC\0\0\0.....)/g' /path/to/myfile.bin

ca 00 00 00 cb 00 00 00 cc 00 00 00 cd 00 00 00 ce
66357
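
For comparison, roughly the same search can be sketched in Python 3, whose `re` module accepts `bytes` patterns. This is only an illustration under a few assumptions: the helper name `grep_bytes` and the sample buffer are made up, and `bytes.hex` with a separator argument needs Python 3.8 or later.

```python
import re

def grep_bytes(data, pattern):
    """Return (spaced hex of match, offset just past the match) pairs."""
    # re.DOTALL lets "." match every byte value, including 0x0A (newline).
    regex = re.compile(pattern, re.DOTALL)
    return [(m.group(0).hex(" "), m.end()) for m in regex.finditer(data)]

# Hypothetical sample buffer: 3 filler bytes, then a 17-byte signature.
sample = b"\xff\xff\xff" + bytes.fromhex(
    "ca000000cb000000cc000000cd000000ce"
)

# Same shape as the Perl pattern: 5 bytes, 3 NULs, 0xCC, 3 NULs, 5 bytes.
for hex_text, end_offset in grep_bytes(sample, rb".{5}\x00{3}\xcc\x00{3}.{5}"):
    print(hex_text)
    print(end_offset)
```

Like `pos()` in the Perl one-liner, `m.end()` is the offset just past the match, not the offset where it starts; `m.start()` would give the latter.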

Well.. indeed Perl is a very nice 'binary grepping' facility, I must admit :) As long as one learns the syntax properly :)

Recommended answer

One-Liner Input

Here’s the shorter one-liner version:

% perl -ln0e 'print tell' < inputfile

Here’s a slightly longer one-liner:

% perl -e '($/,$\) = ("\0","\n"); print tell while <STDIN>' < inputfile

The way to connect those two one-liners is by uncompiling the first one’s program:

% perl -MO=Deparse,-p -ln0e 'print tell'
BEGIN { $/ = "\000"; $\ = "\n"; }
LINE: while (defined(($_ = <ARGV>))) {
    chomp($_);
    print(tell);
}
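
Read as pseudocode, the deparsed loop reads one NUL-terminated record at a time and prints the file position after each read. A rough Python 3 equivalent might look like the sketch below; the helper name `offsets_past_nul` is hypothetical. One assumed difference: this version reports only real NUL bytes, whereas the Perl loop would also seem to print the end-of-file offset when the data does not end in a NUL (14 for the sample file).

```python
def offsets_past_nul(data):
    """Return the offset just past each 0x00 byte in data."""
    offsets = []
    start = 0
    while True:
        index = data.find(b"\x00", start)
        if index == -1:
            return offsets
        offsets.append(index + 1)
        start = index + 1

# The 14 bytes from the question's test file.
sample = bytes([0, 0, 0, 0, 2, 4, 6, 8, 0, 1, 3, 0, 5, 20])
print(offsets_past_nul(sample))  # [1, 2, 3, 4, 9, 12]
```

These are the same positions that the hexdump-and-grep pipeline in the question reports.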

Programmed Input

If you want to put that in a file instead of calling it from the command line, here’s a somewhat more explicit version:

#!/usr/bin/env perl

use English qw[ -no_match_vars ];

$RS  = "\0";    # input  separator for readline, chomp
$ORS = "\n";    # output separator for print

while (<STDIN>) {
    print tell();
}

And here’s a much longer version:

#!/usr/bin/env perl

use strict;
use autodie;  # for perl5.10 or better
use warnings qw[ FATAL all  ];

use IO::Handle;

IO::Handle->input_record_separator("\0");
IO::Handle->output_record_separator("\n");

binmode(STDIN);   # just in case

while (my $null_terminated = readline(STDIN)) {
    # this is just *past* the null we just read:
    my $seek_offset = tell(STDIN);
    print STDOUT $seek_offset;
}

close(STDIN);
close(STDOUT);
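
The scripts above look for a single byte. For the regex-style pattern mentioned in the question, `a(\x00*)b`, a minimal Python 3 sketch could use a `bytes` pattern with the standard `re` module. The function name `find_runs` and the sample data are invented for illustration.

```python
import re

# "a", zero or more NUL bytes, then "b"; the group captures the NUL run.
PATTERN = re.compile(rb"a(\x00*)b")

def find_runs(data):
    """Return (start offset, number of NULs between 'a' and 'b') pairs."""
    return [(m.start(), len(m.group(1))) for m in PATTERN.finditer(data)]

# Hypothetical capture: "ab", then "a" + 3 NULs + "b", then "a" + NUL + "c".
sample = b"\x01ab\x02a\x00\x00\x00b\x03a\x00c"
print(find_runs(sample))  # [(1, 0), (4, 3)]
```

For megabyte-sized captures, the same compiled pattern should also work on an `mmap` object, so the whole file does not have to be read into memory first.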

One-Liner Output

BTW, to create the test input file, I didn’t use your big, long Python script; I just used this simple Perl one-liner:

% perl -e 'print 0.0.0.0.2.4.6.8.0.1.3.0.5.20' > inputfile

You’ll find that Perl often winds up being 2-3 times shorter than Python to do the same job. And you don’t have to compromise on clarity; what could be simpler than the one-liner above?

I know, I know. If you don’t already know the language, this might be clearer:

#!/usr/bin/env perl
@values = (
    0,  0,  0,  0,  2,
    4,  6,  8,  0,  1,
    3,  0,  5, 20,
);
print pack("C*", @values);

Although this works, too:

print chr for @values;

As does this:

print map { chr } @values;

Although for those who like everything all rigorous and careful and all, this might be more what you would see:

#!/usr/bin/env perl

use strict;
use warnings qw[ FATAL all ];
use autodie;

binmode(STDOUT);

my @octet_list = (
    0,  0,  0,  0,  2,
    4,  6,  8,  0,  1,
    3,  0,  5, 20,
);

my $binary = pack("C*", @octet_list);
print STDOUT $binary;

close(STDOUT); 
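
The question's Python one-liner writes characters to stdout and relies on the shell redirect. If the file is generated from Python 3 instead, a sketch that writes raw bytes in binary mode avoids any text-encoding concerns. The function name `write_octets` is hypothetical.

```python
def write_octets(path, octets):
    """Write a sequence of integers (0-255) to path as raw bytes."""
    # "wb" opens the file in binary mode, so no encoding or newline
    # translation is applied to the data.
    with open(path, "wb") as handle:
        handle.write(bytes(octets))
```

Calling `write_octets("mydata.bin", [0, 0, 0, 0, 2, 4, 6, 8, 0, 1, 3, 0, 5, 20])` should produce the same 14-byte file as in the question.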

TMTOWTDI

Perl supports more than one way to do things so that you can pick the one that you’re most comfortable with. If this were something I planned to check in as a school or work project, I would certainly select the longer, more careful versions, or at least put a comment in the shell script if I were using the one-liners.

You can find documentation for Perl on your own system. Just type

% man perl
% man perlrun
% man perlvar
% man perlfunc

etc. at your shell prompt. If you want pretty-ish versions on the web instead, get the manpages for perl, perlrun, perlvar, and perlfunc from http://perldoc.perl.org.
