Is it possible to transfer files using Kafka?

Question

I have thousands of files generated each day that I want to stream using Kafka. When I try to read a file, each line is taken as a separate message.

I would like to know how to send each file's content as a single message to a Kafka topic, and how to have the consumer write each message from the topic to a separate file.

Recommended answer

You can write your own serializer/deserializer for handling files. For example:

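Producer props:

props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
// YOUR_FILE_SERIALIZER_URI is a placeholder for the fully qualified class name of your serializer
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, YOUR_FILE_SERIALIZER_URI);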

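Consumer props:

props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class.getName());
// YOUR_FILE_DESERIALIZER_URI is a placeholder for the fully qualified class name of your deserializer
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, YOUR_FILE_DESERIALIZER_URI);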
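The two placeholders above stand for the class names of the serializer and deserializer. As a minimal sketch, the same settings can be written with plain string keys (`key.serializer`, `value.serializer`, `key.deserializer`, `value.deserializer` are the string values behind the config constants). The package `com.example`, the class name `FileTransferProps`, and the method names are assumptions made for illustration and do not come from the original answer.

```java
import java.util.Properties;

// Hypothetical helper that builds the producer and consumer settings.
final class FileTransferProps {

    // Assumed location of the custom classes; adjust to your own package.
    static final String SERIALIZER_CLASS = "com.example.FileMapSerializer";
    static final String DESERIALIZER_CLASS = "com.example.MapDeserializer";

    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer", SERIALIZER_CLASS);
        return props;
    }

    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer", DESERIALIZER_CLASS);
        return props;
    }
}
```

The resulting `Properties` objects can be passed to the `KafkaProducer` and `KafkaConsumer` constructors.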

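Serializer

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Map;

import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;

// Serializes a Map (file name -> file content) with Java object serialization.
public class FileMapSerializer implements Serializer<Map<?, ?>> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // no configuration needed
    }

    @Override
    public byte[] serialize(String topic, Map<?, ?> data) {
        if (data == null) {
            return null;
        }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // try-with-resources closes (and flushes) the stream before the bytes are read
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(data);
        } catch (IOException e) {
            // fail loudly instead of silently sending a null payload
            throw new SerializationException("Could not serialize map for topic " + topic, e);
        }
        return bos.toByteArray();
    }

    @Override
    public void close() {
        // nothing to release
    }
}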

Deserializer

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.serialization.Deserializer;

// Restores the Map (file name -> file content) written by the serializer.
public class MapDeserializer implements Deserializer<Map<?, ?>> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // no configuration needed
    }

    @Override
    public Map<?, ?> deserialize(String topic, byte[] message) {
        if (message == null) {
            // a null payload would otherwise cause a NullPointerException below
            return new HashMap<String, String>();
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(message))) {
            Object o = in.readObject();
            if (o instanceof Map) {
                return (Map<?, ?>) o;
            }
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
        // fall back to an empty map when the payload is not a readable Map
        return new HashMap<String, String>();
    }

    @Override
    public void close() {
        // nothing to release
    }
}
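To check the round trip without a running broker, the same Java object serialization can be exercised on its own. This is only a sketch: the class `MapRoundTrip` and its method names are hypothetical, and it assumes the map holds file names as `String` keys and file contents as `byte[]` values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Map;

// Hypothetical helper mirroring what the serializer/deserializer pair does.
final class MapRoundTrip {

    // Turn the map into the byte array that would become the record value.
    static byte[] toBytes(Map<String, byte[]> map) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(map);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Rebuild the map from the bytes of a received record value.
    @SuppressWarnings("unchecked")
    static Map<String, byte[]> fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Map<String, byte[]>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `MapRoundTrip.fromBytes(MapRoundTrip.toBytes(map))` on a `HashMap` should return a map with the same file names and byte contents.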

Compose your messages in the following form (<TOPIC> is a placeholder for your topic name):

final ProducerRecord<String, Map> kafkaMessage = new ProducerRecord<String, Map>(<TOPIC>, Integer.toString(messageId++), messageMap);

messageMap contains the file name as the key and the file content as the value. The value can be any serializable object. Each message therefore carries a Map from file name to file content, and that map can hold a single entry or several entries.
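As a rough sketch of both ends, the helper below builds such a map from one file and, on the consumer side, writes every entry of a received map to its own file. The class `FileMessages` and its methods are hypothetical names, and the sketch assumes the file content is stored as a `byte[]`; it uses only the JDK and leaves the actual producer and consumer calls out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the original answer or of Kafka.
final class FileMessages {

    // Producer side: read the whole file into one map entry (file name -> bytes).
    static Map<String, byte[]> toMessageMap(Path file) {
        try {
            Map<String, byte[]> messageMap = new HashMap<>();
            messageMap.put(file.getFileName().toString(), Files.readAllBytes(file));
            return messageMap;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Consumer side: write each entry of a received map to a separate file.
    static void writeToDirectory(Map<String, byte[]> messageMap, Path targetDir) {
        try {
            Files.createDirectories(targetDir);
            for (Map.Entry<String, byte[]> entry : messageMap.entrySet()) {
                // Use only the last path segment of the key as the file name.
                Path name = Paths.get(entry.getKey()).getFileName();
                if (name == null) {
                    continue; // key had no usable file name
                }
                Files.write(targetDir.resolve(name), entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The map returned by `toMessageMap` would be passed as `messageMap` when building the record, and the consumer would call `writeToDirectory` on each deserialized value. Because a whole file travels in one record, its size has to stay within the producer's `max.request.size` and the broker's `message.max.bytes` limits.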
