Download Kafka and extract it:
tar xzf kafka_2.8.0-0.8.1.1.tgz
First, start the ZooKeeper service:
./bin/zookeeper-server-start.sh config/zookeeper.properties &
Then start a broker:
./bin/kafka-server-start.sh config/server.properties &
Start a producer:
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
Start a consumer:
./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
List all topics:
./bin/kafka-topics.sh --list --zookeeper localhost:2181
API:
// Here are examples of using the producer API - kafka.producer.Producer<T> -
// First, start a local instance of the zookeeper server
./bin/zookeeper-server-start.sh config/zookeeper.properties
// Next, start a kafka broker
./bin/kafka-server-start.sh config/server.properties
// Now, create the producer with all configuration defaults and use zookeeper based broker discovery.
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;
...
Properties props = new Properties();
props.put("zk.connect", "127.0.0.1:2181");
props.put("serializer.class", "kafka.serializer.StringEncoder");
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);
// Send a single message
// The message is sent to a randomly selected partition registered in ZK
ProducerData<String, String> data = new ProducerData<String, String>("test-topic", "test-message");
producer.send(data);
//--------------Send multiple messages to multiple topics in one request---------------------
List<String> messages = new java.util.ArrayList<String>();
messages.add("test-message1");
messages.add("test-message2");
ProducerData<String, String> data1 = new ProducerData<String, String>("test-topic1", messages);
ProducerData<String, String> data2 = new ProducerData<String, String>("test-topic2", messages);
List<ProducerData<String, String>> dataForMultipleTopics = new ArrayList<ProducerData<String, String>>();
dataForMultipleTopics.add(data1);
dataForMultipleTopics.add(data2);
producer.send(dataForMultipleTopics);
//------------Send a message with a partition key. Messages with the same key are sent to the same partition-------------
ProducerData<String, String> data = new ProducerData<String, String>("test-topic", "test-key", "test-message");
producer.send(data);
//-------------Use your custom partitioner--------------------
//If you are using zookeeper based broker discovery, kafka.producer.Producer<T> routes your data to a particular broker partition based on a kafka.producer.Partitioner<T>, specified through the partitioner.class config parameter. It defaults to kafka.producer.DefaultPartitioner. If you don't supply a partition key, then it sends each request to a random broker partition.
class MemberIdPartitioner extends Partitioner[MemberIdLocation] {
def partition(data: MemberIdLocation, numPartitions: Int): Int = {
// hashCode can be negative, so take the absolute value of the remainder
math.abs(data.location.hashCode % numPartitions)
}
}
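The same idea can be sketched in plain Java without any Kafka dependency. A Java `hashCode()` may be negative, so the key has to be mapped into the range 0 to numPartitions - 1. The class name `LocationPartitionerSketch` and its method are hypothetical and are not part of the Kafka API; this is only an illustration of hash-based partition selection.

```java
// Hypothetical stand-in for a key-based partitioner; not a Kafka class.
final class LocationPartitionerSketch {

    // Maps a key to a partition index in the range [0, numPartitions).
    static int partition(String location, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(location.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        for (String loc : new String[] {"us-west", "eu-central", "ap-south"}) {
            System.out.println(loc + " -> partition " + partition(loc, 4));
        }
    }
}
```

Because the result depends only on the key, messages with the same key always map to the same partition as long as the partition count does not change.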
// create the producer config to plug in the above partitioner
Properties props = new Properties();
props.put("zk.connect", "127.0.0.1:2181");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("partitioner.class", "xyz.MemberIdPartitioner");
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);
//-----------------Use custom Encoder-----------------------
// The producer takes in a required config parameter serializer.class that specifies an Encoder<T> to convert T to a Kafka Message. Default is the no-op kafka.serializer.DefaultEncoder. Here is an example of a custom Encoder -
class TrackingDataSerializer extends Encoder[TrackingData] {
// Say you want to use your own custom Avro encoding
val avroEncoder = new CustomAvroEncoder()
def toMessage(event: TrackingData): Message = {
new Message(avroEncoder.getBytes(event))
}
}
// If you want to use the above Encoder, pass it in to the "serializer.class" config parameter
Properties props = new Properties();
props.put("serializer.class", "xyz.TrackingDataSerializer");
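To make the encoder idea concrete without pulling in Kafka, here is a minimal Java sketch of "take an object, return the bytes that go on the wire". The interface `PayloadEncoderSketch`, the event type `TrackingEventSketch`, and the tab-separated format are all assumptions made for illustration; the real Encoder returns a Kafka Message, and a real tracking encoder would more likely use Avro, as the example above suggests.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical encoder for a hypothetical event type; not a Kafka class.
final class TrackingEventEncoderSketch implements PayloadEncoderSketch<TrackingEventSketch> {

    @Override
    public byte[] toBytes(TrackingEventSketch event) {
        // A tab-separated UTF-8 line stands in for a real format such as Avro.
        return (event.memberId + "\t" + event.page).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = new TrackingEventEncoderSketch()
                .toBytes(new TrackingEventSketch("m42", "/home"));
        System.out.println("encoded " + payload.length + " bytes");
    }
}

// Hypothetical interface mirroring the encoder idea: T in, bytes out.
interface PayloadEncoderSketch<T> {
    byte[] toBytes(T event);
}

// Hypothetical event type used only for this illustration.
final class TrackingEventSketch {
    final String memberId;
    final String page;

    TrackingEventSketch(String memberId, String page) {
        this.memberId = memberId;
        this.page = page;
    }
}
```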
// Using static list of brokers, instead of zookeeper based broker discovery
// Some applications would rather not depend on zookeeper. In that case, the config parameter broker.list can be used to specify the list of all brokers in your Kafka cluster, in the following format: broker_id1:host1:port1, broker_id2:host2:port2...
// you can stop the zookeeper instance as it is no longer required
./bin/zookeeper-server-stop.sh
// create the producer config object
Properties props = new Properties();
props.put("broker.list", "0:localhost:9092");
props.put("serializer.class", "kafka.serializer.StringEncoder");
ProducerConfig config = new ProducerConfig(props);
// send a message using default partitioner
Producer<String, String> producer = new Producer<String, String>(config);
List<String> messages = new java.util.ArrayList<String>();
messages.add("test-message");
ProducerData<String, String> data = new ProducerData<String, String>("test-topic", messages);
producer.send(data);
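The broker.list format described above (broker_id:host:port entries separated by commas) can be checked with a small stand-alone parser. This is only a sketch for validating a configuration string before handing it to the producer; `BrokerListSketch` and its nested `Broker` type are hypothetical names, not Kafka classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that parses a "broker_id:host:port, ..." string.
final class BrokerListSketch {

    // Hypothetical holder for one parsed entry.
    static final class Broker {
        final int id;
        final String host;
        final int port;

        Broker(int id, String host, int port) {
            this.id = id;
            this.host = host;
            this.port = port;
        }
    }

    // Splits the list on commas, then each entry on colons.
    static List<Broker> parse(String brokerList) {
        List<Broker> brokers = new ArrayList<>();
        for (String entry : brokerList.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("expected broker_id:host:port, got: " + entry);
            }
            brokers.add(new Broker(Integer.parseInt(parts[0]), parts[1], Integer.parseInt(parts[2])));
        }
        return brokers;
    }

    public static void main(String[] args) {
        for (Broker b : parse("0:localhost:9092, 1:localhost:9093")) {
            System.out.println("broker " + b.id + " at " + b.host + ":" + b.port);
        }
    }
}
```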
//-------------------Use the asynchronous producer along with GZIP compression--------------------
// This buffers writes in memory until either batch.size or queue.time is reached. After that, data is sent to the Kafka brokers.
Properties props = new Properties();
props.put("zk.connect", "127.0.0.1:2181");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("producer.type", "async");
props.put("compression.codec", "1");
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);
ProducerData<String, String> data = new ProducerData<String, String>("test-topic", "test-message");
producer.send(data);
// Finally, close the producer:
producer.close();
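The buffering rule of the asynchronous producer (flush when either batch.size or queue.time is reached) can be illustrated with a small in-memory buffer. This is a simplified sketch, not Kafka's implementation: the class `BatchBufferSketch` is hypothetical, the clock is passed in as a parameter to keep the example deterministic, and the time limit is only checked when a message is added, whereas a real asynchronous sender would also flush from a background thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical buffer that releases a batch on a size limit or an age limit.
final class BatchBufferSketch {
    private final int batchSize;
    private final long queueTimeMs;
    private final List<String> pending = new ArrayList<>();
    private long oldestMs;

    BatchBufferSketch(int batchSize, long queueTimeMs) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.queueTimeMs = queueTimeMs;
    }

    // Adds a message at time nowMs. Returns the batch to send,
    // or an empty list while the buffer is still filling.
    List<String> add(String message, long nowMs) {
        if (pending.isEmpty()) {
            oldestMs = nowMs; // remember when the current batch started
        }
        pending.add(message);
        boolean full = pending.size() >= batchSize;
        boolean expired = nowMs - oldestMs >= queueTimeMs;
        if (!full && !expired) {
            return Collections.emptyList();
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }

    public static void main(String[] args) {
        BatchBufferSketch buffer = new BatchBufferSketch(3, 1000);
        System.out.println(buffer.add("a", 0));
        System.out.println(buffer.add("b", 10));
        System.out.println(buffer.add("c", 20));
    }
}
```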
//----------------Log4j appender-------------------
// Data can also be produced to a Kafka server through a log4j appender, so that very little code is needed to send data to the Kafka server. Here is an example of how to use the Kafka Log4j appender. Start by defining the Kafka appender in your log4j.properties file.
// define the kafka log4j appender config parameters
log4j.appender.KAFKA=kafka.producer.KafkaLog4jAppender
// REQUIRED: set the hostname of the kafka server
log4j.appender.KAFKA.Host=localhost
// REQUIRED: set the port on which the Kafka server is listening for connections
log4j.appender.KAFKA.Port=9092
// REQUIRED: the topic under which the logger messages are to be posted
log4j.appender.KAFKA.Topic=test
// the serializer to be used to turn an object into a Kafka message. Defaults to kafka.producer.DefaultStringEncoder
log4j.appender.KAFKA.Serializer=kafka.test.AppenderStringSerializer
// do not set the above KAFKA appender as the root appender
log4j.rootLogger=INFO
// set the logger for your package to be the KAFKA appender
log4j.logger.your.test.package=INFO, KAFKA
// Data can be sent using a log4j appender as follows:
Logger logger = Logger.getLogger([your.test.class]);
logger.info("message from log4j appender");
//If your log4j appender fails to send messages, please verify that the correct log4j properties file is being used. You can add -Dlog4j.debug=true to your VM parameters to verify this.
//-------------------Consumer Code-----------------------------
// The consumer code is slightly more complex as it enables multithreaded consumption:
// specify some consumer properties
Properties props = new Properties();
props.put("zk.connect", "localhost:2181");
props.put("zk.connectiontimeout.ms", "1000000");
props.put("groupid", "test_group");
// Create the connection to the cluster
ConsumerConfig consumerConfig = new ConsumerConfig(props);
ConsumerConnector consumerConnector = Consumer.createJavaConsumerConnector(consumerConfig);
// create 4 partitions of the stream for topic "test", to allow 4 threads to consume
Map<String, List<KafkaStream<Message>>> topicMessageStreams =
consumerConnector.createMessageStreams(ImmutableMap.of("test", 4));
List<KafkaStream<Message>> streams = topicMessageStreams.get("test");
// create list of 4 threads to consume from each of the partitions
ExecutorService executor = Executors.newFixedThreadPool(4);
// consume the messages in the threads
for(final KafkaStream<Message> stream: streams) {
executor.submit(new Runnable() {
public void run() {
for(MessageAndMetadata msgAndMetadata: stream) {
// process message (msgAndMetadata.message())
}
}
});
}
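The threading pattern used above (one task per stream, all submitted to a fixed thread pool) can be tried out without a running cluster. In the sketch below, plain in-memory lists stand in for the Kafka streams, and the pool is shut down once every list has been drained. The class `StreamPoolSketch` and the counting logic are assumptions for illustration only; a real Kafka stream blocks while waiting for new messages instead of ending.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of one-thread-per-stream consumption; not a Kafka class.
final class StreamPoolSketch {

    // Consumes every stream on its own thread and returns the number of
    // messages processed. The list of streams must not be empty.
    static int consumeAll(List<List<String>> streams) {
        ExecutorService executor = Executors.newFixedThreadPool(streams.size());
        final AtomicInteger processed = new AtomicInteger();
        for (final List<String> stream : streams) {
            executor.submit(new Runnable() {
                public void run() {
                    for (String message : stream) {
                        // stand-in for real message processing
                        processed.incrementAndGet();
                    }
                }
            });
        }
        executor.shutdown(); // accept no new tasks, let submitted ones finish
        try {
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for consumers", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<List<String>> streams = Arrays.asList(
                Arrays.asList("m1", "m2"),
                Arrays.asList("m3"),
                Arrays.asList("m4", "m5", "m6"),
                Arrays.<String>asList());
        System.out.println("processed " + consumeAll(streams) + " messages");
    }
}
```

An atomic counter is used because several threads update it at the same time.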
//------------------------Hadoop Consumer---------------------
// Providing a horizontally scalable solution for aggregating and loading data into Hadoop was one of our basic use cases. To support this use case, we provide a Hadoop-based consumer which spawns off many map tasks to pull data from the Kafka cluster in parallel. This provides extremely fast pull-based Hadoop data load capabilities (we were able to fully saturate the network with only a handful of Kafka servers).
// Usage information on the hadoop consumer can be found here.
//---------------------- Simple Consumer----------------------------
// Kafka has a lower-level consumer api for reading message chunks directly from servers. Under most circumstances this should not be needed. But just in case, its usage is as follows:
import kafka.api.FetchRequest;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.javaapi.message.ByteBufferMessageSet;
import kafka.message.Message;
import kafka.message.MessageAndOffset;
import kafka.message.MessageSet;
import kafka.utils.Utils;
...
// create a consumer to connect to the kafka server running on localhost, port 9092, socket timeout of 10 secs, socket receive buffer of ~1MB
SimpleConsumer consumer = new SimpleConsumer("127.0.0.1", 9092, 10000, 1024000);
long offset = 0;
while (true) {
// create a fetch request for topic "test", partition 0, current offset, and fetch size of 1MB
FetchRequest fetchRequest = new FetchRequest("test", 0, offset, 1000000);
// get the message set from the consumer and print them out
ByteBufferMessageSet messages = consumer.fetch(fetchRequest);
for(MessageAndOffset msg : messages) {
System.out.println("consumed: " + Utils.toString(msg.message().payload(), "UTF-8"));
// advance the offset after consuming each message
offset = msg.offset();
}
}
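The fetch loop above relies on one rule: each returned message carries the offset to use for the next request, so the consumer itself tracks its position. The following sketch replays that loop against an in-memory list. It is a simplification under stated assumptions: `OffsetLoopSketch` and `Entry` are hypothetical names, the sketch counts messages rather than bytes, and it stops at the end of the log instead of polling forever.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for one partition's log; not a Kafka class.
final class OffsetLoopSketch {

    // Hypothetical pair: a payload plus the offset to use for the next fetch.
    static final class Entry {
        final String payload;
        final long nextOffset;

        Entry(String payload, long nextOffset) {
            this.payload = payload;
            this.nextOffset = nextOffset;
        }
    }

    private final List<String> log;

    OffsetLoopSketch(List<String> log) {
        this.log = log;
    }

    // Returns up to maxMessages entries starting at the given offset.
    List<Entry> fetch(long offset, int maxMessages) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must not be negative");
        }
        List<Entry> out = new ArrayList<>();
        for (long i = offset; i < log.size() && out.size() < maxMessages; i++) {
            out.add(new Entry(log.get((int) i), i + 1));
        }
        return out;
    }

    // Reads the whole log by fetching repeatedly from the last seen offset.
    static List<String> readAll(OffsetLoopSketch source, int maxMessages) {
        List<String> consumed = new ArrayList<>();
        long offset = 0;
        while (true) {
            List<Entry> batch = source.fetch(offset, maxMessages);
            if (batch.isEmpty()) {
                break; // end of the log reached
            }
            for (Entry entry : batch) {
                consumed.add(entry.payload);
                offset = entry.nextOffset; // advance after each message
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        OffsetLoopSketch source = new OffsetLoopSketch(Arrays.asList("a", "b", "c", "d", "e"));
        System.out.println(readAll(source, 2));
    }
}
```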
http://my.oschina.net/ielts0909/blog/100645