Streaming Data with scalaz-stream
gary-coady
Contents
• Why do we want streaming APIs?
• Introduction to scalaz-stream
• Use case: Server-Sent Events implementation
Why do we want streaming APIs?
Information with indeterminate/unbounded size
• Lines from a text file
• Bytes from a binary file
• Chunks of data from a TCP connection
• TCP connections
• Data from Kinesis or SQS or SNS or Kafka or…
• Data from an API with paged implementation
“Dangerous” Choices
• scala.collection.Iterable: provides an iterator to step through items in sequence
• scala.collection.immutable.Stream: lazily evaluated, possibly infinite list of values
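One reason a lazily evaluated Stream can be risky for unbounded data is memoization: every cell that has been forced stays reachable for as long as something holds the head. The slide does not spell this out, so treat the following as an illustrative sketch in plain Scala. The object and method names are made up for the example, and on Scala 2.13+ Stream is deprecated in favour of LazyList, so it compiles with a warning.

```scala
object StreamMemoization {
  // Returns how many elements had been computed after the first and
  // after the second traversal of the same three-element prefix.
  def demo(): (Int, Int) = {
    var evaluated = 0
    val doubled = Stream.from(1).map { n => evaluated += 1; n * 2 }

    doubled.take(3).foreach(_ => ()) // forces the first three cells
    val afterFirst = evaluated

    doubled.take(3).foreach(_ => ()) // served from the cached cells
    (afterFirst, evaluated)
  }
}
```

Both counts are equal because the second traversal reuses the cached cells instead of recomputing them. That is convenient for small data and a memory hazard for an endless source whose head is still referenced.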
Do The Right Thing
• Safe setup and cleanup
• Constant memory usage
• Constant stack usage
• Refactor with confidence
• Composable
• Back-pressure
What is scalaz-stream?
• Creates co-data
• Safe resource management
• Referential transparency
• Controlled asynchronous effects
[Diagram of Process.await: user code, “waiting” for callback, callback, user code]
sealed trait Process[+F[_], +O]
• F: Effect
• O: Output
case class Halt(cause: Cause) extends Process[Nothing, Nothing]

case class Emit[+O](seq: Seq[O]) extends Process[Nothing, O]

case class Await[+F[_], A, +O](
  req: F[A],
  rcv: (EarlyCause \/ A) => Process[F, O]
) extends Process[F, O]
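To see how the three constructors fit together, here is a deliberately simplified sketch in plain Scala. It is not the scalaz-stream implementation: the effect type F is replaced by a plain thunk, Emit carries a hypothetical next continuation so that no separate append node is needed, and the Cause / EarlyCause handling is left out entirely.

```scala
object ToyProcess {
  sealed trait Process[+O]
  case object Halt extends Process[Nothing]
  // Toy assumption: `next` says what to do after the values are emitted.
  final case class Emit[+O](seq: Seq[O], next: () => Process[O]) extends Process[O]
  // Toy assumption: the "effect" is just a thunk producing an A.
  final case class Await[A, +O](req: () => A, rcv: A => Process[O]) extends Process[O] {
    def step: Process[O] = rcv(req()) // run the request, feed the result to rcv
  }

  // Interpreter: a tail-recursive loop, so stack usage stays constant.
  @annotation.tailrec
  def runLog[O](p: Process[O], acc: Vector[O] = Vector.empty): Vector[O] = p match {
    case Halt            => acc
    case Emit(seq, next) => runLog(next(), acc ++ seq)
    case a @ Await(_, _) => runLog(a.step, acc)
  }

  // A source: await "is there more?", then emit one value or halt.
  def fromIterator[A](it: Iterator[A]): Process[A] =
    Await[Boolean, A](
      () => it.hasNext,
      more => if (more) Emit(Seq(it.next()), () => fromIterator(it)) else Halt
    )

  def demo(): Vector[Int] = runLog(fromIterator(Iterator(1, 2, 3)))
}
```

The point of the shape is that a Process is only a description. Nothing is read from the iterator until an interpreter such as runLog walks the structure.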
Composition Options
Process1[I, O]
- Stateful transducer, converts I => O (with state)
- Combine with “pipe”
Channel[F[_], I, O]
- Takes I values, runs function I => F[O]
- Combine with “through” or “observe”
Sink[F[_], I]
- Takes I values, runs function I => F[Unit]
- Add with “to”
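The roles of the three shapes can be imitated with standard-library iterators. This is only an analogy, under the assumption that effects are plain side effects. Transducer, runningTotal, describe and collectInto are names invented for the sketch, and none of this is scalaz-stream API.

```scala
import scala.collection.mutable.ListBuffer

object CompositionSketch {
  // Stands in for Process1[I, O]: a transducer that may carry state between inputs.
  type Transducer[I, O] = Iterator[I] => Iterator[O]

  // Stateful: each output depends on everything seen so far.
  val runningTotal: Transducer[Int, Int] = in => in.scanLeft(0)(_ + _).drop(1)

  // Stands in for a Channel's I => F[O]: one function call per element.
  val describe: Int => String = n => s"total=$n"

  // Stands in for a Sink's I => F[Unit]: consumes values for their effect only.
  def collectInto(buffer: ListBuffer[String]): String => Unit =
    s => { buffer += s; () }

  def demo(): List[String] = {
    val out = ListBuffer.empty[String]
    runningTotal(Iterator(1, 2, 3)) // like "pipe"
      .map(describe)                // like "through"
      .foreach(collectInto(out))    // like "to"
    out.toList
  }
}
```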
Implementing Server-sent Events (SSE)
This specification defines an API for opening an HTTP connection for
receiving push notifications from a server in the form of DOM events.
Example streams

case class SSEEvent(eventName: Option[String], data: String)

data: This is the first message.

data: This is the second message, it
data: has two lines.

data: This is the third message.

event: add
data: 73857293

event: remove
data: 2153

event: add
data: 113411
We want this type:
Process[Task, SSEEvent]
“A potentially infinite stream of SSE event messages”
async.boundedQueue[A]
• Items added to queue are removed in same order
• Connect different asynchronous domains
• Methods:
  def enqueueOne(a: A): Task[Unit]
  def dequeue: Process[Task, A]
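The two properties listed for the queue (FIFO order, and a bound that stops a fast producer from running ahead of the consumer) can be seen with a standard-library queue. This is a stand-in for illustration, assuming java.util.concurrent.ArrayBlockingQueue is a fair analogy; it is not how async.boundedQueue is implemented.

```scala
import java.util.concurrent.ArrayBlockingQueue

object BoundedQueueSketch {
  def demo(): (Boolean, List[String]) = {
    val queue = new ArrayBlockingQueue[String](2) // room for two items

    queue.put("first")
    queue.put("second")

    // The queue is full: a non-blocking offer is refused (put would block instead).
    val acceptedThird = queue.offer("third")

    // Items come out in the order they went in.
    val drained = List(queue.take(), queue.take())
    (acceptedThird, drained)
  }
}
```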
HTTP Client Implementation
• Use AsyncHttpClient (the Java async-http-client library)
• Hook into the onBodyPartReceived callback
• Use async.boundedQueue to convert chunks into a stream
// Assumed imports: scalaz-stream 0.7.x, scodec-bits and async-http-client 1.x
import com.ning.http.client.{AsyncCompletionHandler, AsyncHttpClient, HttpResponseBodyPart}
import scalaz.concurrent.Task
import scalaz.stream.{Process, async}
import scodec.bits.ByteVector

def httpRequest(client: AsyncHttpClient, url: String): Process[Task, ByteVector] = {
  // Bounded queue: at most 10 chunks are buffered between the callback and the stream
  val contentQueue = async.boundedQueue[ByteVector](10)
  val req = client.prepareGet(url)
  req.execute(new AsyncCompletionHandler[Unit] {
    override def onBodyPartReceived(content: HttpResponseBodyPart) = {
      // .run blocks the callback thread until the chunk has been enqueued
      contentQueue.enqueueOne(ByteVector(content.getBodyByteBuffer)).run
      super.onBodyPartReceived(content)
    }
  })
  contentQueue.dequeue
}
How to terminate stream?
req.execute(new AsyncCompletionHandler[Unit] {
  ...
  override def onCompleted(r: Response): Unit = {
    logger.debug("Request completed")
    contentQueue.close.run
  }
  ...
})
How to terminate stream with errors?
req.execute(new AsyncCompletionHandler[Unit] {
  ...
  override def onThrowable(t: Throwable): Unit = {
    logger.debug("Request failed with error", t)
    contentQueue.fail(t).run
  }
  ...
})
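The three callbacks above (body part, completed, throwable) map onto three kinds of signal on the queue. The following is a library-free sketch of that idea, assuming a blocking standard-library queue in place of async.boundedQueue and an Iterator in place of Process. Signal, Chunk, Done, Failed and Bridge are hypothetical names for this illustration.

```scala
import java.util.concurrent.ArrayBlockingQueue
import scala.util.Try

object CallbackBridge {
  // What the callbacks can put on the queue.
  sealed trait Signal[+A]
  final case class Chunk[+A](value: A) extends Signal[A]
  case object Done extends Signal[Nothing]
  final case class Failed(error: Throwable) extends Signal[Nothing]

  final class Bridge[A](capacity: Int) {
    private val queue = new ArrayBlockingQueue[Signal[A]](capacity)

    // Producer side, called from callbacks; put blocks while the queue is full.
    def onBodyPart(a: A): Unit = queue.put(Chunk(a))
    def onCompleted(): Unit = queue.put(Done)
    def onThrowable(t: Throwable): Unit = queue.put(Failed(t))

    // Consumer side: ends normally on Done, rethrows the error on Failed.
    def dequeue: Iterator[A] = new Iterator[A] {
      private var pending: Option[A] = None
      private var finished = false

      def hasNext: Boolean = {
        if (pending.isEmpty && !finished) {
          queue.take() match {
            case Chunk(value) => pending = Some(value)
            case Done         => finished = true
            case Failed(e)    => finished = true; throw e
          }
        }
        pending.nonEmpty
      }

      def next(): A = {
        if (!hasNext) throw new NoSuchElementException("stream has ended")
        val value = pending.get
        pending = None
        value
      }
    }
  }

  def demo(): (List[String], Boolean) = {
    val ok = new Bridge[String](10)
    ok.onBodyPart("a"); ok.onBodyPart("b"); ok.onCompleted()

    val broken = new Bridge[String](10)
    broken.onBodyPart("a"); broken.onThrowable(new RuntimeException("boom"))

    (ok.dequeue.toList, Try(broken.dequeue.toList).isFailure)
  }
}
```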
Process[Task, ByteVector]
Process[Task, SSEEvent]
Process[Task, Underpants]
Step 1
Step 2
Step 3
• Split at line endings
• Convert ByteVector into UTF-8 Strings
• Partition by SSE “tag” (“data”, “id”, “event”, …)
• Emit accumulated SSE data when blank line found
• Split at line endings: ByteVector => Seq[ByteVector]
• Convert ByteVector into UTF-8 Strings: ByteVector => String
• Partition by SSE “tag” (“data”, “id”, “event”, …): String => SSEMessage
• Emit accumulated SSE data when blank line found: SSEMessage => SSEEvent
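The last two steps above can be sketched in plain Scala over an Iterator of lines, assuming the bytes have already been split and decoded. The slides do not show how SSEMessage is defined, so the Field / Blank shape below is a guess made for this sketch, as are parseLine, events and demo. The real pipeline stages would be Process1 values rather than iterator functions.

```scala
object SseParsing {
  final case class SSEEvent(eventName: Option[String], data: String)

  // Hypothetical shape for SSEMessage: a "field: value" line, or the blank separator.
  sealed trait SSEMessage
  final case class Field(name: String, value: String) extends SSEMessage
  case object Blank extends SSEMessage

  // String => SSEMessage. Comment lines (starting with ':') produce nothing.
  def parseLine(line: String): Option[SSEMessage] =
    if (line.isEmpty) Some(Blank)
    else if (line.startsWith(":")) None
    else line.indexOf(':') match {
      case -1 => Some(Field(line, ""))
      case i  => Some(Field(line.substring(0, i), line.substring(i + 1).stripPrefix(" ")))
    }

  // SSEMessage => SSEEvent: accumulate fields, emit when a blank line arrives.
  def events(lines: Iterator[String]): Iterator[SSEEvent] = {
    var eventName: Option[String] = None
    var data = Vector.empty[String]

    def step(message: SSEMessage): Option[SSEEvent] = message match {
      case Field("event", value) => eventName = Some(value); None
      case Field("data", value)  => data = data :+ value; None
      case Field(_, _)           => None // other fields are ignored in this sketch
      case Blank =>
        val event =
          if (data.isEmpty) None
          else Some(SSEEvent(eventName, data.mkString("\n")))
        eventName = None
        data = Vector.empty
        event
    }

    lines.flatMap(l => parseLine(l).iterator).flatMap(m => step(m).iterator)
  }

  def demo(): List[SSEEvent] =
    events(Iterator(
      "data: This is the second message, it",
      "data: has two lines.",
      "",
      "event: add",
      "data: 73857293",
      ""
    )).toList
}
```

Multiple data lines are joined with a newline, and the event name resets after each blank line, so an unnamed event is not mistaken for the previous named one.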
Handling Network Errors
• If a network error occurs:
• Sleep a while
• Set up the connection again and keep going
• Append the same Process definition again!
def sseStream: Process[Task, SSEEvent] = {
  httpRequest(client, url)
    .pipe(splitLines)
    .pipe(emitMessages)
    .pipe(emitEvents)
    .partialAttempt {
      case e: ConnectException => retryRequest
      case e: TimeoutException => retryRequest
    }
    .map(_.merge)
}

def retryRequest: Process[Task, SSEEvent] = {
  time.sleep(retryTime) ++ sseStream
}
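The same "sleep, reconnect, keep going" idea can be tried out without the library. The sketch below is an approximation over plain iterators, not the slide's code: resilient, connect, maxRetries and pauseMillis are hypothetical names, the retry count is bounded (the definition above retries indefinitely), and IOException stands in for the connection errors.

```scala
import java.io.IOException

object RetrySketch {
  // Wraps a connection factory; on IOException it pauses, reconnects and keeps going.
  def resilient[A](connect: () => Iterator[A], maxRetries: Int, pauseMillis: Long): Iterator[A] =
    new Iterator[A] {
      private var current = connect()
      private var retriesLeft = maxRetries
      private var buffered: Option[A] = None

      // Pulls one element into `buffered`, reconnecting when the source fails.
      @annotation.tailrec
      private def fill(): Unit =
        if (buffered.isEmpty) {
          val settled =
            try {
              if (current.hasNext) buffered = Some(current.next())
              true
            } catch {
              case _: IOException if retriesLeft > 0 =>
                retriesLeft -= 1
                Thread.sleep(pauseMillis)
                current = connect()
                false
            }
          if (!settled) fill()
        }

      def hasNext: Boolean = { fill(); buffered.nonEmpty }

      def next(): A = {
        if (!hasNext) throw new NoSuchElementException("stream has ended")
        val value = buffered.get
        buffered = None
        value
      }
    }

  // First connection yields 1, 2 and then fails; the second yields 3, 4.
  def demo(): List[Int] = {
    var connections = 0
    val connect = () => {
      connections += 1
      if (connections == 1)
        Iterator(1, 2) ++ Iterator.continually[Int](throw new IOException("connection reset"))
      else
        Iterator(3, 4)
    }
    resilient(connect, maxRetries = 3, pauseMillis = 0L).toList
  }
}
```

Elements delivered before the failure are kept, and the consumer sees one continuous sequence, which is the effect the appended Process definition achieves in the slide.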
Usage
sseStream(client, url) pipe jsonToString to io.stdOutLines
Questions?