Kafka Streams, Broken Down (噛み砕いてKafka Streams) #kafkajp
Posted: 16-Apr-2017
Category: Technology
Transcript of Kafka Streams, Broken Down (噛み砕いてKafka Streams) #kafkajp
2016-12-15
1
1
Kafka Streams
@kokumutyoukan
Kafka, Storm, Cassandra, Elasticsearch
2
Kafka Streams
3
Word Count
Time, Window, Join
4
5
Kafka Streams is
a stream processing library included in Apache Kafka
released with 0.10.0 (May 2016)
6
Confluent
Kafka Java API
Kafka Streams
7
Kafka Java API
Java jar java
API
8
Consumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(topics);
Storm
9
Kafka Streams
Kafka Streams
Spark Streaming, Storm: at least once
OK
10
jarjava
11
KafkaJava API Kafka Streams
12
13
maven
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-streams</artifactId>
  <version>0.10.0.1</version>
</dependency>
API: high-level DSL and low-level API
14
@Test
public final void wordCount() {
    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> queryStream =
        builder.stream(stringSerde, stringSerde, "search-query-topic"); // source
    KStream<String, Long> wordCounts = queryStream
        .flatMapValues(value -> Arrays.asList(value.split("\\s+"))) // split each value into words
        .map((key, value) -> new KeyValue<>(value, value))         // use the word as the key
        .countByKey(stringSerde, "Counts")                         // KStream -> KTable
        .toStream();                                               // KTable -> KStream
    wordCounts.to(stringSerde, longSerde, "wordcount-output");      // sink
    KafkaStreams streams = new KafkaStreams(builder, props);        // props: Kafka Streams settings
    streams.start();                                                // start processing
}
15
// input
producer.send(new ProducerRecord<>("search-query-topic", ...));
producer.send(new ProducerRecord<>("search-query-topic", ...));
producer.send(new ProducerRecord<>("search-query-topic", ...));
consumer.subscribe(Arrays.asList("wordcount-output"));
while (true) {
    ConsumerRecords<String, Long> records = consumer.poll(100);
    for (ConsumerRecord<String, Long> record : records) {
        System.out.println("record = " + record.key() + ", " + record.value());
    }
}
// output
// record = ..., 1
// record = ..., 1
// record = ..., 1
// record = ..., 2
// record = ..., 1
// record = ..., 2
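The output above contains a running count for every word occurrence rather than one final total, because each update to the count table is turned back into a stream record. A minimal plain-Java sketch of that behavior follows; it does not use Kafka at all, and the class name `WordCountSketch` and the sample input are hypothetical. How updates are batched in a real application may depend on the Kafka Streams version and settings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the word count topology; no Kafka involved.
final class WordCountSketch {
    /** Splits each line on whitespace and emits one "word=count" update per occurrence. */
    static List<String> countUpdates(List<String> lines) {
        Map<String, Long> counts = new HashMap<>();   // plays the role of the count table
        List<String> updates = new ArrayList<>();     // plays the role of the output stream
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // skip blank input lines
                }
                long updated = counts.merge(word, 1L, Long::sum);
                updates.add(word + "=" + updated);
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        System.out.println(countUpdates(Arrays.asList("kafka streams", "kafka")));
    }
}
```

A word seen for the second time yields a second record with count 2, which matches the shape of the output listed above.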
Kafka Unit TestKafka FAQ
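KStream? KTable?
KStream
An abstraction of a record stream: each record is an independent event
Example: PV (page views)
KTable
An abstraction of a changelog stream: each record is an update to the value for its key
Holds State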
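The difference between the two views can be sketched in plain Java, without Kafka. The same sequence of key/value records is read once as a record stream (every record kept) and once as a changelog (latest value per key). The class name `StreamVsTableSketch` and the sample page-view data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of record-stream vs. changelog interpretation.
final class StreamVsTableSketch {
    /** Record-stream view: every record is an independent event, so all are kept. */
    static List<String> asRecordStream(List<String[]> records) {
        List<String> events = new ArrayList<>();
        for (String[] kv : records) {
            events.add(kv[0] + ":" + kv[1]);
        }
        return events;
    }

    /** Changelog view: a record overwrites the previous value for its key. */
    static Map<String, String> asChangelog(List<String[]> records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] kv : records) {
            table.put(kv[0], kv[1]);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> pageViews = Arrays.asList(
            new String[] {"alice", "/top"},
            new String[] {"bob", "/top"},
            new String[] {"alice", "/cart"});
        System.out.println(asRecordStream(pageViews)); // three events
        System.out.println(asChangelog(pageViews));    // two keys, latest value each
    }
}
```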
16
17
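Time
1. When the event happens (the tweet is posted)
2. When the API sends the message to Kafka
3. When Kafka receives the message
4. When Kafka Streams processes the message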
18
Tweet! -> Twitter API -> my BE server -> Kafka Streams
1
Kafka Streams timestamp.extractor
19
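Kafka Streams notions of time
event-time: when the event occurred; the timestamp is embedded in the Kafka message by the API (the producer)
  broker setting: log.message.timestamp.type=CreateTime
  Timestamps were added to the Message format in Kafka 0.10; a message from a 0.9 producer has timestamp -1
ingestion-time: when the message is stored in Kafka
  broker setting: log.message.timestamp.type=LogAppendTime
  the Kafka Broker sets the timestamp on the message
processing-time: when Kafka Streams processes the message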
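The three notions of time can be sketched as a choice among three clocks. This is a simplified plain-Java model, not Kafka code: the `Message` class here is hypothetical and carries both a producer-side and a broker-side timestamp so the choices can be compared side by side, whereas a real Kafka message carries a single timestamp whose meaning follows `log.message.timestamp.type`.

```java
// Hypothetical model of the three time semantics; not part of the Kafka API.
final class TimeSemanticsSketch {
    enum Semantics { EVENT_TIME, INGESTION_TIME, PROCESSING_TIME }

    /** Simplified message holding both candidate timestamps, for comparison only. */
    static final class Message {
        final long createTimeMs;    // set by the producer (CreateTime)
        final long logAppendTimeMs; // set by the broker (LogAppendTime)

        Message(long createTimeMs, long logAppendTimeMs) {
            this.createTimeMs = createTimeMs;
            this.logAppendTimeMs = logAppendTimeMs;
        }
    }

    /** Returns the timestamp a stream processor would work with under the given semantics. */
    static long timestampFor(Semantics semantics, Message message, long wallClockMs) {
        switch (semantics) {
            case EVENT_TIME:
                return message.createTimeMs;
            case INGESTION_TIME:
                return message.logAppendTimeMs;
            default:
                return wallClockMs; // processing-time: the clock of the processing node
        }
    }

    public static void main(String[] args) {
        Message m = new Message(1_000L, 1_500L);
        long now = System.currentTimeMillis();
        for (Semantics s : Semantics.values()) {
            System.out.println(s + " -> " + timestampFor(s, m, now));
        }
    }
}
```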
20
timestamp.extractor
21
Time timestamp.extractor
event-time
event-time (timestamp embedded in the message): ConsumerRecordTimestampExtractor
ingestion-time: ConsumerRecordTimestampExtractor
processing-time: WallclockTimestampExtractor
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.processor.WallclockTimestampExtractor;

Properties props = new Properties();
props.put(StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG,
    WallclockTimestampExtractor.class.getName());
22
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

// Custom TimestampExtractor implementation
public class MyEventTimeExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record) {
        // Read the timestamp embedded in the message value
        Foo myPojo = (Foo) record.value();
        if (myPojo != null) {
            return myPojo.getTimestampInMillis();
        } else {
            // When the value is null, fall back to the current wall-clock time
            return System.currentTimeMillis();
        }
    }
}
http://docs.confluent.io/3.0.0/streams/developer-guide.html#timestamp-extractor
Window
23
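Tumbling time window: e.g. PV per 5 minutes
Hopping time window: the same 5-minute window, advanced every 1 minute
KStream<String, String> viewsByUser = ...; // PV stream keyed by user ID
KTable<Windowed<String>, Long> userCounts =
    viewsByUser.countByKey(TimeWindows.of("WindowName", 5 * 60 * 1000L)); // tumbling
TimeWindows.of("WindowName", 5 * 60 * 1000L).advanceBy(60 * 1000L);      // hopping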
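To make the difference concrete, the following plain-Java sketch computes which windows a single record timestamp falls into, assuming windows are aligned to multiples of the advance interval and cover `[start, start + size)`. The class and method names are hypothetical; this is an illustration of the arithmetic, not the library's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical window-assignment arithmetic; assumes epoch-aligned windows.
final class WindowSketch {
    /**
     * Returns the start times (ms) of all windows of length sizeMs, advancing by
     * advanceMs, that contain timestampMs. With advanceMs == sizeMs the windows
     * are tumbling and exactly one start is returned.
     */
    static List<Long> windowStartsFor(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Earliest aligned start whose window still covers the timestamp
        long first = Math.max(0, timestampMs - sizeMs + advanceMs) / advanceMs * advanceMs;
        for (long start = first; start <= timestampMs; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long sevenMinutes = 7 * 60 * 1000L;
        long size = 5 * 60 * 1000L;
        System.out.println("tumbling: " + windowStartsFor(sevenMinutes, size, size));
        System.out.println("hopping:  " + windowStartsFor(sevenMinutes, size, 60 * 1000L));
    }
}
```

With a 5-minute window advanced every minute, one record contributes to five overlapping windows, so a hopping count produces five times as many window updates as the tumbling count.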
Join
24
Join
A KTable can be referenced like a KVS or an RDB table
KStream<String, String> voteRegionStream = ...("vote-topic");
KTable<String, String> partyTable = ...("party-topic");
KStream<String, String> voteParty = voteRegionStream.leftJoin(
    partyTable, (region, party) -> region + "," + party);
Stream record: k: Hillary, v: California
Table (k: candidate, v: party):
  Hillary: Democratic
  Trump: Republican
Join result: k: Hillary, v: California, Democratic
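The lookup performed by the stream-table left join can be sketched in plain Java, with an ordinary `Map` standing in for the KTable. The class `LeftJoinSketch` and its method are hypothetical; the joiner is the same `region + "," + party` expression used above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in for a stream-table left join; a Map plays the KTable.
final class LeftJoinSketch {
    /**
     * Joins one stream record with the table's current value for the same key.
     * As in a left join, the joiner still runs when the key is absent and then
     * receives null for the table side.
     */
    static <K, V, T, R> R leftJoin(K key, V streamValue, Map<K, T> table,
                                   BiFunction<V, T, R> joiner) {
        return joiner.apply(streamValue, table.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> partyTable = new HashMap<>();
        partyTable.put("Hillary", "Democratic");
        partyTable.put("Trump", "Republican");

        String joined = leftJoin("Hillary", "California", partyTable,
            (region, party) -> region + "," + party);
        System.out.println("k: Hillary v: " + joined);
    }
}
```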
25
Kafka Streams
Kafkasink
Kafka-4160 () Kafka Streams
consumer
1topic60
26
Kafka 0.10.1.0 client, server 0.10.0.1
Kafka Streams
Apps built with Kafka Streams 0.10.1 only work against Kafka clusters running 0.10.1+.
0.10.0.1Kafka
0.10.1
27
Exception: Use 0.10.1!
Kafka Streams
Kafka Streams
KafkaAPI
Kafka Streams API
1 Kafka
28
29
Kafka Streams
Kafka Streams
30
Appendix
31
Kafka Streams
32
consumer
KafkaJava
[Figure: Kafka (source topic, internal topic, sink topic), Kafka Streams, Kafka Connect]
Configuration
33
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.consumer.ConsumerConfig;

Properties settings = new Properties();
// The three required StreamsConfig settings
settings.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
settings.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "localhost:2181");
settings.put(ProducerConfig...., ...); // producer settings
settings.put(ConsumerConfig...., ...); // consumer settings
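application.id: identifier of the application; also used as the consumer group and as the prefix of internal topic names.
bootstrap.servers: Kafka host/port list.
zookeeper.connect: ZooKeeper host:port/chroot.
num.stream.threads: number of stream processing threads.
replication.factor: replication factor for internal topics.
state.dir: directory where the State Store is kept.
timestamp.extractor: timestamp extractor class.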
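For reference, the settings listed above can also be written with their plain string keys. The sketch below uses only `java.util.Properties`; the class name is hypothetical and every value is an illustrative assumption, not a recommendation.

```java
import java.util.Properties;

// Hypothetical helper; values are placeholders chosen for illustration.
final class StreamsSettingsSketch {
    static Properties exampleSettings() {
        Properties settings = new Properties();
        settings.put("application.id", "my-app");
        settings.put("bootstrap.servers", "localhost:9092");
        settings.put("zookeeper.connect", "localhost:2181");
        // Optional settings; the values below are assumptions for this sketch
        settings.put("num.stream.threads", "2");
        settings.put("replication.factor", "3");
        settings.put("state.dir", "/tmp/kafka-streams");
        settings.put("timestamp.extractor",
            "org.apache.kafka.streams.processor.WallclockTimestampExtractor");
        return settings;
    }

    public static void main(String[] args) {
        exampleSettings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```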
State is stored in Kafka, in a topic called the changelog topic
34
Node: Task (source part-1, changelog part-1)
Node: Task (source part-0, changelog part-0)
35
Node: Task (source part-1, changelog part-1)
Node: Task (source part-0, changelog part-0), Task
The changelog topic is a topic that Kafka Streams creates itself
The topic is created even when Kafka has auto.create.topics.enable=false
The topic is compacted: the latest value per key is kept
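Why a compacted changelog is enough to rebuild State can be sketched in plain Java. Here a list of key/value entries stands in for the changelog topic and a `Map` for the local State Store; all names are hypothetical and no Kafka API is used.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a List is the changelog topic, a Map is the state store.
final class ChangelogSketch {
    /** Applies one update to the local store and appends it to the changelog. */
    static void put(Map<String, Long> store, List<Map.Entry<String, Long>> changelog,
                    String key, long value) {
        store.put(key, value);
        changelog.add(new AbstractMap.SimpleEntry<>(key, Long.valueOf(value)));
    }

    /** Rebuilds a store by replaying the changelog; later entries overwrite earlier ones. */
    static Map<String, Long> restore(List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> store = new HashMap<>();
        for (Map.Entry<String, Long> entry : changelog) {
            store.put(entry.getKey(), entry.getValue());
        }
        return store;
    }

    /** Compaction: keeps only the last entry per key, in order of last occurrence. */
    static List<Map.Entry<String, Long>> compact(List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : changelog) {
            latest.remove(entry.getKey()); // move the key to the end on re-insert
            latest.put(entry.getKey(), entry.getValue());
        }
        List<Map.Entry<String, Long>> compacted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : latest.entrySet()) {
            compacted.add(new AbstractMap.SimpleEntry<>(entry.getKey(), entry.getValue()));
        }
        return compacted;
    }

    public static void main(String[] args) {
        Map<String, Long> store = new HashMap<>();
        List<Map.Entry<String, Long>> changelog = new ArrayList<>();
        put(store, changelog, "kafka", 1L);
        put(store, changelog, "streams", 1L);
        put(store, changelog, "kafka", 2L);
        System.out.println(restore(compact(changelog)).equals(store));
    }
}
```

Replaying either the full or the compacted changelog yields the same store, which is why a Task that moves to another Node can recover its State from the topic.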
36
full
37
@Test
public final void wordCount() {
    // A Serde is a Serializer/Deserializer pair; Kafka provides common ones via Serdes
    final Serde<String> stringSerde = Serdes.String();
    final Serde<Long> longSerde = Serdes.Long();

    KStreamBuilder builder = new KStreamBuilder();
    // Create a KStream. Argument 1: key Serde, 2: value Serde, 3: topic name
    KStream<String, String> queryStream =
        builder.stream(stringSerde, stringSerde, "search-query-topic");

    KStream<String, Long> wordCounts = queryStream
        // split each value into words
        .flatMapValues(value -> Arrays.asList(value.split("\\s+")))
        // replace the key with the word, so that key and value are both the word
        .map((key, value) -> new KeyValue<>(value, value))
        .countByKey(stringSerde, "Counts") // KStream -> KTable
        .toStream();                       // KTable -> KStream
    wordCounts.to(stringSerde, longSerde, "wordcount-output"); // sink

    // props: Properties holding the Kafka Streams client settings
    KafkaStreams streams = new KafkaStreams(builder, props);
    streams.start(); // start processing
}
full
38
KafkaJava API
?
Kafka Streams