
Amazon Kinesis Real-Time Streaming

Kinesis Streams is a real-time stream processing service on Amazon Web Services, offered as a managed service and optimized to run at massive scale. It is well suited to real-time reporting and analytics, and it integrates with other AWS services, which makes it highly scalable and a good basis for building reliable data processing systems. Records in a stream are retained for 24 hours by default (configurable); consuming applications can access the data until the retention period expires.

Concepts

Data Records - A data record is the unit of data stored in an Amazon Kinesis stream. Data records are composed of a sequence number, a partition key, and a data blob, which is an immutable sequence of bytes. Streams does not inspect, interpret, or change the data in the blob in any way. A data blob can be up to 1 MB. Each data record has a unique sequence number that is assigned by the stream.

Producers - Producers put records into Amazon Kinesis Streams. For example, a web server sending log data to a stream is a producer.

Consumers (Amazon Kinesis Streams Applications) - Consumers get records from Amazon Kinesis Streams and process them. They usually run on EC2 instances.

Shards (Partitions) - A stream is composed of one or more shards, each of which provides a fixed maximum capacity for reads and writes:
Max read rate: 5 transactions/second (2 MBps)
Max write rate: 1,000 records/second (1 MBps)
The number of shards can be changed according to utilization of the stream. The application that puts data into the stream must specify a partition key, which decides which shard each data record is written to.

Libraries and Connectors

Producer - KPL (Kinesis Producer Library), SDKs in different languages (Java, Python, Scala), and the Kinesis Agent provided by AWS, which is written in Java.

Consumer - KCL (Kinesis Client Library), SDKs in different languages (Java, Python, Scala).
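To make the shard limits and partition keys from the Concepts section more concrete, here is a small Python sketch. Kinesis takes the MD5 hash of the partition key as a 128-bit integer and writes the record to the shard whose hash key range contains it. The sketch assumes the shards split that hash space into equal, contiguous ranges, which is a simplification. The function names `shard_for_key` and `shards_needed` are made up for illustration and are not part of any AWS library.

```python
import hashlib
import math

HASH_SPACE = 2 ** 128  # an MD5 digest read as an integer lies in [0, 2**128)


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard a partition key would land on.

    Assumption: the stream's shards divide the hash space into
    shard_count equal, contiguous ranges.
    """
    if shard_count < 1:
        raise ValueError("a stream needs at least one shard")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


def shards_needed(write_mb_per_s: float, records_per_s: float,
                  read_mb_per_s: float, reads_per_s: float = 0) -> int:
    """Estimate the shard count from the per-shard limits listed above."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1),    # 1 MBps write per shard
        math.ceil(records_per_s / 1000),  # 1,000 records/second per shard
        math.ceil(read_mb_per_s / 2),     # 2 MBps read per shard
        math.ceil(reads_per_s / 5),       # 5 read transactions/second per shard
    )
```

For example, a workload writing 3.5 MBps as 2,500 records/second and reading 5 MBps is bounded by write throughput, so `shards_needed(3.5, 2500, 5)` gives 4. Because the shard is chosen from a hash of the key, records with the same partition key always go to the same shard, so a small number of distinct keys can leave some shards idle while others are saturated.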

The example flow for this post streams live Twitter data into Kinesis and then stores it in AWS S3 (Python and Scala). GitHub repo: https://github.com/ashwinaravind/KINESIS-TWITTER-STREAMING.git
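The code in the repository is not reproduced here. As a rough sketch of what the producer half of such a flow might look like in Python, the functions below serialize tweet-like dictionaries and send them with a `put_record` call that takes `StreamName`, `Data`, and `PartitionKey`, as the boto3 Kinesis client does. The `user_id` field, the helper names, and the stream name are assumptions for illustration only, and the S3 side is not covered.

```python
import json

MAX_BLOB_BYTES = 1024 * 1024  # the 1 MB data blob limit described above


def build_record(tweet: dict) -> dict:
    """Turn a tweet-like dict into the Data/PartitionKey pair for put_record.

    Assumption: each tweet carries a "user_id" field, used here as the
    partition key so one user's tweets land on the same shard.
    """
    data = json.dumps(tweet).encode("utf-8")
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError("data blob exceeds the 1 MB limit")
    return {"Data": data, "PartitionKey": str(tweet["user_id"])}


def send_tweets(client, stream_name: str, tweets) -> list:
    """Put each tweet on the stream and collect the assigned sequence numbers.

    `client` is any object with a boto3-style put_record method.
    """
    sequence_numbers = []
    for tweet in tweets:
        response = client.put_record(StreamName=stream_name, **build_record(tweet))
        sequence_numbers.append(response["SequenceNumber"])
    return sequence_numbers
```

With boto3 installed and AWS credentials configured, the client would typically be created with `boto3.client("kinesis")` and passed in as `client`. Taking the client as a parameter also makes the functions easy to exercise with a stub object that records the calls, without touching AWS.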
