How to find a specific byte in many bytes?


Problem description

I read a file using Java and printed its contents as a hex dump. The first and second lines look like this:

one: 31 30 30 31 30 30 30 31 31 30 30 31 30 31 31 31
two: 30 31 31 30 30 31 31 30 31 31 30 30 31 31 30 31

I want to print the data between the first "31 30 30 31" and the second "31 30 30 31". My ideal output is 31 30 30 31 30 30 30 31 31 30 30 31 30 31 31 31 30 31. But the actual output is wrong; I think my code cannot find 31 30 30 31 in data1. How can I fix this?

I use JDK 1.7 and my IDE is IDEA.

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.File;
public class TestDemo{

  public static void main(String[] args) {


        try {
            File file = new File("/0testData/1.bin");
            DataInputStream isr = new DataInputStream(new FileInputStream(file));

            int bytesPerLine = 16;

            int byteCount = 0;
            int data;
            while ((data = isr.read()) != -1) {
                if (byteCount == 0)
                    System.out.println();
                else if (byteCount % bytesPerLine == 0)
                    System.out.printf("\n",byteCount );
                else
                    System.out.print(" ");


                String data1 = String.format("%02X",data & 0xFF);
                System.out.printf(data1);


                byteCount += 1;
                if(data1.contains("31 30 30 31")) {
                    int i=data1.indexOf("31 30 30 31",12);

                    System.out.println("find it!");
                    String strEFG=data1.substring(i,i+53);
                    System.out.println("str="+strEFG);
                }else {
                    System.out.println("cannot find it");
                }

            }

        } catch (Exception e) {
            System.out.println("Exception: " + e);
        }

    }
}


My ideal output is 31 30 30 31 30 30 30 31 31 30 30 31 30 31 31 31 30 31. But the actual output is:

31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 31cannot find it

30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 31cannot find it

31cannot find it 31cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 30cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 31cannot find it

31cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it

30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 30cannot find it

Solution

I feel that your input data is a bit confusing. Nevertheless, this probably answers your question.

It doesn't give quite the same output that you are asking for, but I think you should be able to tweak it to turn the output on or off by using the flag "inPattern": if inPattern is true, print the data read from the file; if false, do not print it.

This is probably not the best form of coding as it is entirely static methods, but it does what you ask for.

The problem with your code (I think) is that data1 will be a 2-character string. It is impossible for it to contain an 11-character string ("31 30 30 31"). If you tried reversing the test (i.e. "31 30 30 31".contains(data1)), then it would only be matching a single byte, not the 4 bytes you are intending to match.
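To illustrate the point about string lengths, here is a minimal sketch (not the code from the question) that first collects the whole dump into one string and only then searches it. The class and method names (HexStringSearchSketch, toHex, fromFirstToSecond) are made up for this example, and the sample bytes are the first line from the question held in memory instead of being read from a file. It uses only calls available in JDK 1.7.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the names here are illustrative, not from the original code.
class HexStringSearchSketch {

    /** Formats each byte as two uppercase hex digits separated by single spaces. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 3);
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    /**
     * Returns the text from the first occurrence of pattern up to, but not
     * including, the next non-overlapping occurrence; null if the pattern
     * occurs fewer than two times.
     */
    static String fromFirstToSecond(String hex, String pattern) {
        int first = hex.indexOf(pattern);
        if (first < 0) {
            return null;
        }
        int second = hex.indexOf(pattern, first + pattern.length());
        if (second < 0) {
            return null;
        }
        return hex.substring(first, second).trim();
    }

    public static void main(String[] args) {
        // Assumption: the file holds the ASCII characters '1' (0x31) and '0' (0x30).
        byte[] sample = "1001000110010111".getBytes(StandardCharsets.US_ASCII);
        String hex = toHex(sample);

        // A single formatted byte is only two characters long, so this prints false.
        System.out.println(String.format("%02X", sample[0] & 0xFF).contains("31 30 30 31"));

        // The complete dump is long enough to contain the 11-character pattern.
        System.out.println(fromFirstToSecond(hex, "31 30 30 31"));
    }
}
```

For the sample line this prints false and then 31 30 30 31 30 30 30 31. That may not be exactly the range you want, so adjust the substring bounds as needed. Building one big string is only reasonable for small files; the program below works byte by byte instead.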

package hexdump;

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.LinkedList;

public class HexDumpWithFilter {
//    private static final int beginPattern [] = { 0x47, 0x0d, 0x0a, 0x1a };
    private static final int beginPattern [] = { 0x00, 0x83, 0x7d, 0x8a };
    private static final int endPattern [] = { 0x23, 0x01, 0x78, 0xa5 };
    private static LinkedList<Integer> bytesRead = new LinkedList<Integer>();

    public static void main(String[] args) {
        try {
            InputStream isr = new DataInputStream(new FileInputStream("C:\\Temp\\resistor.png"));
            int bytesPerLine = 16;
            int byteCount = 0;
            int data;
            boolean inPattern = false;
            while ((data = isr.read()) != -1) {
                // Capture the data just read into an input buffer.
                bytesRead.add(data);
                // If we have too much data in the input buffer to compare to our
                // pattern, peel off the first byte.
                // Note: This assumes that the begin pattern and end Pattern are the same lengths.
                if (bytesRead.size() > beginPattern.length) {
                    bytesRead.removeFirst();
                }

                // Output a byte count at the start of each new line of output.
                if (byteCount % bytesPerLine == 0)
                    System.out.printf("\n%04x:", byteCount);

                // Output the spacing - if we have found our pattern, then also output an asterisk
                System.out.printf(inPattern ? " *%02x" : "  %02x", data);

                // Finally check to see if we have found our pattern if we have enough bytes
                // in our bytesRead buffer.
                if (bytesRead.size() == beginPattern.length) {
                    if (!inPattern && checkPattern(beginPattern, bytesRead)) {
                        // We were not in a pattern and the begin pattern just matched.
                        inPattern = true;
                    } else if (inPattern && checkPattern(endPattern, bytesRead)) {
                        // We were in a pattern and the end pattern just matched.
                        // The "else" keeps both checks from firing on the same 4 bytes,
                        // which matters when endPattern and beginPattern are identical.
                        inPattern = false;
                    }
                }

                byteCount += 1;
            }
            System.out.println();
        } catch (Exception e) {
            System.out.println("Exception: " + e);
        }
    }

    /**
     * Function to check whether our input buffer read from the file matches
     * the supplied pattern.
     * @param pattern the pattern to look for in the buffer.
     * @param bytesRead the buffer of bytes read from the file.
     * @return true if pattern and bytesRead have the same content.
     */
    private static boolean checkPattern (int [] pattern, LinkedList<Integer> bytesRead) {
        int ptr = 0;
        boolean patternMatch = true;
        for (int br : bytesRead) {
            if (br != pattern[ptr++]) {
                patternMatch = false;
                break;
            }
        }
        return patternMatch;
    }
}

There is a small problem with this code in that it does not mark the beginning pattern, but does mark the ending pattern. Hopefully this is not a problem for you. If you need to correctly mark the beginning or not mark the ending, then there will be another level of complexity. Basically you would have to read ahead in the file and write the data out 4 bytes behind the data you have been reading. This could be achieved by printing the value that comes off of the buffer at the line which reads:

    bytesRead.removeFirst();

rather than printing the value read from the file (i.e. the value in the "data" variable).
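The delayed output described above could look roughly like this. Treat it as a sketch under assumptions rather than a drop-in replacement: the class DelayedMarkSketch, the method mark and the tail counter are invented for illustration, it works on a byte array instead of a stream to keep the example short, it assumes both patterns have the same length, and it marks both the begin and the end pattern.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: the names here are illustrative, not from the program above.
class DelayedMarkSketch {
    private final int[] begin;
    private final int[] end;
    private final Deque<Integer> window = new ArrayDeque<Integer>(); // read, not yet printed
    private final StringBuilder out = new StringBuilder();
    private boolean inPattern = false; // state applied to bytes as they leave the window
    private int tail = 0;              // end-pattern bytes still waiting in the window

    private DelayedMarkSketch(int[] begin, int[] end) {
        if (begin.length == 0 || begin.length != end.length) {
            throw new IllegalArgumentException("patterns must be non-empty and of equal length");
        }
        this.begin = begin;
        this.end = end;
    }

    /** Returns one token per byte: "*xx" for marked bytes, "xx" for the rest. */
    static String mark(byte[] data, int[] begin, int[] end) {
        DelayedMarkSketch s = new DelayedMarkSketch(begin, end);
        for (byte b : data) {
            s.accept(b & 0xFF); // in the program above this value would come from isr.read()
        }
        while (!s.window.isEmpty()) {
            s.emit(s.window.removeFirst()); // flush what is still buffered at end of input
        }
        return s.out.toString().trim();
    }

    private void accept(int value) {
        window.addLast(value);
        if (window.size() > begin.length) {
            emit(window.removeFirst()); // print the oldest byte, begin.length bytes late
        }
        if (window.size() == begin.length) {
            if (!inPattern && matches(begin)) {
                inPattern = true;  // the begin bytes are all still buffered, so they get marked
            } else if (inPattern && tail == 0 && matches(end)) {
                tail = end.length; // keep marking until the buffered end bytes are printed
            }
        }
    }

    private void emit(int value) {
        out.append(inPattern ? " *" : " ").append(String.format("%02x", value));
        if (tail > 0 && --tail == 0) {
            inPattern = false; // the last byte of the end pattern has just been printed
        }
    }

    private boolean matches(int[] pattern) {
        int i = 0;
        for (int b : window) {
            if (b != pattern[i++]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A few bytes taken from around the patterns in the PNG dump.
        byte[] sample = {0x00, 0x00, 0x00, (byte) 0x83, 0x7d, (byte) 0x8a, 0x3a, 0x00,
                         0x23, 0x01, 0x78, (byte) 0xa5, 0x3f, 0x76};
        System.out.println(mark(sample,
                new int[] {0x00, 0x83, 0x7d, 0x8a},
                new int[] {0x23, 0x01, 0x78, 0xa5}));
    }
}
```

For the sample bytes this prints 00 00 *00 *83 *7d *8a *3a *00 *23 *01 *78 *a5 3f 76, so the begin pattern is now marked as well. If you do not want the end pattern marked, set inPattern to false as soon as the end pattern matches instead of starting the tail counter.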

Following is an example of the data produced when run against a PNG file of an image of a resistor.

0000:  89  50  4e  47  0d  0a  1a  0a  00  00  00  0d  49  48  44  52
0010:  00  00  00  60  00  00  00  1b  08  06  00  00  00  83  7d  8a
0020: *3a *00 *00 *00 *09 *70 *48 *59 *73 *00 *00 *2e *23 *00 *00 *2e
0030: *23 *01 *78 *a5  3f  76  00  00  00  07  74  49  4d  45  07  e3
0040:  03  0e  17  1a  0f  c2  80  9c  d0  00  00  01  09  49  44  41
0050:  54  68  de  ed  9a  31  0b  82  40  18  86  cf  52  d4  a1  7e
0060:  45  4e  81  5b  a3  9b  10  ae  ae  4d  4d  61  7f  a1  21  1b
0070:  fa  0b  45  53  53  ab  ab  04  6e  42  4b  9b  d0  64  bf  a2
0080:  06  15  a9  6b  ef  14  82  ea  ec  e8  7d  c6  f7  0e  f1  be
0090:  e7  3b  0f  0e  25  4a  29  25  a0  31  5a  28  01  04  fc  35
00a0:  f2  73  e0  af  af  b5  93  fd  c9  8c  cd  36  cb  da  f9  ae
00b0:  ad  11  d3  50  84  2e  50  92  96  24  88  f2  ca  b1  41  7b
00c0:  cc  64  c7  db  b6  be  7e  5e  87  ef  0e  08  e3  82  64  85
00d0:  b8  47  4c  56  50  12  c6  85  b8  9f  20  1e  0b  10  bd  81
00e0:  64  1e  5b  38  49  cb  ca  31  e3  7c  67  b2  b4  c7  f6  c4
00f0:  62  da  65  b2  f9  ea  c2  64  a7  dd  90  c9  fa  a3  3d  0e
0100:  61  00  01  10  00  20  00  02  00  04  40  00  80  00  08  00
0110:  10  00  01  00  02  7e  82  af  5f  c6  99  86  42  5c  5b  7b
0120:  eb  19  be  f7  e2  8d  a4  77  f8  e8  bb  07  51  5e  7b  91
0130:  28  c4  0e  d0  55  89  38  96  2a  6c  77  3a  96  4a  74  55
0140:  12  57  00  8f  05  88  de  40  12  fe  8a  c0  21  0c  01  00
0150:  02  20  00  34  c3  03  f7  3f  46  9a  04  49  f8  9d  00  00
0160:  00  00  49  45  4e  44  ae  42  60  82

Note that some of the bytes have an asterisk in front of them. These are the bytes that follow the beginPattern, up to and including the endPattern.

Also note that I used both a beginPattern and an endPattern. You do not need to do this; I only did it to make it easier to find patterns in my resistor.png file for testing the pattern matching. If you want a single pattern (e.g. "0x31, 0x30, 0x30, 0x31") for both the start and the finish, you can use one variable for both, set both to the same value, or simply assign endPattern = beginPattern. With identical patterns, the begin and end checks must not both run on the same 4 bytes (for example by making the second check an else if), otherwise inPattern is switched on and immediately off again.
