Filter (search and replace) array of bytes in an InputStream
Question
I have an InputStream that reads an HTML file, and I have to get the bytes from that stream.
I have a string, "XYZ". I'd like to convert this string to bytes and check whether the byte sequence obtained from the InputStream contains a match for it. If it does, I have to replace the match with the byte sequence of some other string.
Could anyone help me with this? I have used regex to find and replace in strings, but I don't know how to find and replace in a byte stream.
Previously I used jsoup to parse the HTML and replace the string, but because of some UTF encoding problems the file appeared corrupted afterwards.
TL;DR: My question is:
Is there a way to find and replace a string, in byte form, in a raw InputStream in Java?
Recommended answer
I'm not sure you have chosen the best approach to solve your problem.
That said, I don't like to (and have as a policy not to) answer questions with "don't", so here goes...
From the documentation:
A FilterInputStream contains some other input stream, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.
It was a fun exercise to write it up. Here's a complete example for you:
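import java.io.*;
import java.util.*;

class ReplacingInputStream extends FilterInputStream {

    // Look-ahead: bytes read from the wrapped stream but not yet emitted.
    private final LinkedList<Integer> inQueue = new LinkedList<Integer>();
    // Bytes that are ready to be returned to the caller.
    private final LinkedList<Integer> outQueue = new LinkedList<Integer>();
    private final byte[] search, replacement;

    protected ReplacingInputStream(InputStream in,
                                   byte[] search,
                                   byte[] replacement) {
        super(in);
        // An empty pattern would match at every position and never advance.
        if (search.length == 0)
            throw new IllegalArgumentException("search must not be empty");
        this.search = search;
        this.replacement = replacement;
    }

    private boolean isMatchFound() {
        Iterator<Integer> inIter = inQueue.iterator();
        for (int i = 0; i < search.length; i++)
            // read() yields 0..255 while Java bytes are signed, so mask first.
            if (!inIter.hasNext() || (search[i] & 0xFF) != inIter.next())
                return false;
        return true;
    }

    private void readAhead() throws IOException {
        // Work up some look-ahead.
        while (inQueue.size() < search.length) {
            int next = super.read();
            inQueue.offer(next);
            if (next == -1)
                break;
        }
    }

    @Override
    public int read() throws IOException {
        // Loop (not a single check) so that an empty replacement, which
        // queues no output for a match, simply moves on to the next byte.
        while (outQueue.isEmpty()) {
            readAhead();
            if (isMatchFound()) {
                for (int i = 0; i < search.length; i++)
                    inQueue.remove();
                for (byte b : replacement)
                    outQueue.offer(b & 0xFF); // keep 0xFF distinct from EOF (-1)
            } else
                outQueue.add(inQueue.remove());
        }
        return outQueue.remove();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // FilterInputStream forwards bulk reads to the wrapped stream, which
        // would bypass the replacement, so route them through read().
        if (off < 0 || len < 0 || len > b.length - off)
            throw new IndexOutOfBoundsException();
        if (len == 0)
            return 0;
        int count = 0;
        while (count < len) {
            int c = read();
            if (c == -1)
                break;
            b[off + count++] = (byte) c;
        }
        return count == 0 ? -1 : count;
    }

    // TODO: skip(), available() and mark/reset still delegate to the wrapped stream.
}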
Example usage
class Test {
    public static void main(String[] args) throws Exception {

        byte[] bytes = "hello xyz world.".getBytes("UTF-8");
        ByteArrayInputStream bis = new ByteArrayInputStream(bytes);

        byte[] search = "xyz".getBytes("UTF-8");
        byte[] replacement = "abc".getBytes("UTF-8");

        InputStream ris = new ReplacingInputStream(bis, search, replacement);

        // Copy the filtered stream byte by byte into a buffer.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        int b;
        while (-1 != (b = ris.read()))
            bos.write(b);

        // Decode with the same charset that was used for encoding.
        System.out.println(new String(bos.toByteArray(), "UTF-8"));
    }
}
Given the bytes for the string "hello xyz world." it prints:
hello abc world.
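If the HTML file is small enough to hold in memory, the same search and replace can also be done on a plain byte array after reading the stream completely. The following is only a sketch of that alternative, not part of the answer above: the class name ByteReplace and its methods are hypothetical, and it assumes non-overlapping matches scanned left to right.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; buffers everything in memory.
class ByteReplace {

    // Returns a copy of data in which every non-overlapping occurrence of
    // search (scanning left to right) is replaced by replacement.
    static byte[] replaceAll(byte[] data, byte[] search, byte[] replacement) {
        if (search.length == 0)
            throw new IllegalArgumentException("search must not be empty");
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        int i = 0;
        while (i < data.length) {
            if (matchesAt(data, i, search)) {
                out.write(replacement, 0, replacement.length);
                i += search.length;
            } else {
                out.write(data[i]);
                i++;
            }
        }
        return out.toByteArray();
    }

    // True if search occurs in data starting exactly at index pos.
    private static boolean matchesAt(byte[] data, int pos, byte[] search) {
        if (pos + search.length > data.length)
            return false;
        for (int j = 0; j < search.length; j++)
            if (data[pos + j] != search[j])
                return false;
        return true;
    }

    // Reads the whole stream into memory, then applies the array version.
    static byte[] replaceAll(InputStream in, byte[] search, byte[] replacement)
            throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1)
            buf.write(chunk, 0, n);
        return replaceAll(buf.toByteArray(), search, replacement);
    }

    public static void main(String[] args) {
        byte[] result = replaceAll(
                "hello xyz world.".getBytes(StandardCharsets.UTF_8),
                "xyz".getBytes(StandardCharsets.UTF_8),
                "abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

Because it compares raw bytes directly, this variant has no signed/unsigned pitfalls, but it trades the streaming behaviour of a FilterInputStream for simplicity, so it is probably only appropriate when the input comfortably fits in memory.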