Towards Data & Cloud #8: AWS Cloud Watch to Kinesis — Reference Architecture (Part 1)

Data Ingestion Framework, Tools & Technology, Reference Architectures

LAKSHMI VENKATESH
Mar 22, 2024

This multi-part series covers AWS Kinesis in detail.

Amazon Kinesis: A real-time data streaming service that can handle massive volumes of streaming data. It consists of four services:

  • Chapter 1 — Kinesis Data Streams: For building custom applications that process or analyze streaming data for specialized needs.
  • Chapter 2 — Kinesis Data Firehose: For loading streaming data into AWS data stores.
  • Chapter 3 — Kinesis Data Analytics: For processing and analyzing streaming data with SQL or Apache Flink.
  • Chapter 4 — Kinesis Video Streams: For low-latency video streaming at scale.
AWS Kinesis

If you want to route streams of source data to destinations such as S3 or DynamoDB, Kinesis can handle the ingestion and routing.

Chapter 1: Kinesis Data Streams

  • Section A: Input into Kinesis Data Streams
  • Section B: Output from Kinesis Data Streams
Overall Reference Architecture

Section A: Input into Kinesis Data Streams

AWS CloudWatch Logs to Kinesis Data Streams (KDS)

  • Use CloudWatch Logs subscription filter to stream log data directly to KDS.
  • Ensure that the IAM role associated with CloudWatch has permission to put records into the Kinesis stream.
  • Sample: aws logs put-subscription-filter --log-group-name "MyLogGroup" --filter-name "MyFilter" --filter-pattern "" --destination-arn "arn:aws:kinesis:region:account_id:stream/stream_name" --role-arn "arn:aws:iam::account_id:role/role_name"
  • Where used? Ideal for real-time monitoring and analysis of log data.
  • Example: Reference Architecture — flow of logs within AWS services.
AWS CloudWatch Logs -> AWS Kinesis Reference Architecture
  1. AWS CloudWatch Logs: This is the source of the log data. CloudWatch Logs can collect logs from various AWS services, EC2 instances, and on-premises servers.
  2. Subscription Filter: This is a feature within CloudWatch that allows you to filter log data in real-time, based on specified patterns, and deliver the filtered logs to a designated destination.
  3. IAM Role: This is an AWS Identity and Access Management (IAM) role that provides the necessary permissions for the CloudWatch Logs to interact with other services, such as Amazon Kinesis Data Streams.
  4. Log Destination: This typically represents the target to which the filtered logs will be delivered. It could be another AWS service such as a Kinesis Data Stream, Lambda function, or another CloudWatch Logs log group in a different account.
  5. Amazon Kinesis Data Streams: This is a scalable and durable real-time data streaming service. In this context, it acts as a destination for the logs that pass through the subscription filter. Once the logs are in Kinesis Data Streams, they can be processed in real-time or sent to other services for further analysis or storage.
  6. AWS Lambda: Lambda is an event-driven, serverless computing platform provided by AWS. It can be triggered by Kinesis Data Streams to process the incoming log data in real-time.
  7. AWS CloudWatch Logs: This service could also be a destination for the logs, especially for log aggregation purposes. If needed, logs from multiple sources can be aggregated into a single CloudWatch Logs group for centralized monitoring and analysis.
  8. S3 (Simple Storage Service): S3 is an object storage service that offers industry-leading scalability, data availability, security, and performance. Logs from CloudWatch can be further aggregated and stored in S3 buckets for long-term retention, compliance, or further processing with other tools like Amazon Athena for query and analysis.

To use Amazon CloudWatch Logs with Amazon Kinesis Data Streams (KDS) for real-time monitoring and analysis of log data, follow these steps. This guide assumes you already have a CloudWatch Log Group and a KDS stream created.

Step 1: Create an IAM Role and Policy

First, we need an IAM role that CloudWatch Logs can assume to write to your KDS stream. This role needs permissions to put records into the Kinesis stream.

  1. Create a Policy: Go to the IAM console, create a new policy that allows the kinesis:PutRecord and kinesis:PutRecords actions on your KDS stream. A sample policy statement:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:PutRecord",
        "kinesis:PutRecords"
      ],
      "Resource": "arn:aws:kinesis:<region>:<account_id>:stream/<stream_name>"
    }
  ]
}

2. Attach Policy to a Role: Create a new IAM role and attach the policy you just created. Make sure the role's trust relationship allows the CloudWatch Logs service to assume it.
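For reference, the trust relationship is itself a JSON policy document. A minimal sketch, assuming the standard CloudWatch Logs service principal; the aws:SourceArn condition (scoped to your account's log groups) is an optional but recommended guard against the confused-deputy problem:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "logs.amazonaws.com" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringLike": {
          "aws:SourceArn": "arn:aws:logs:<region>:<account_id>:*"
        }
      }
    }
  ]
}
```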

Step 2: Create a Subscription Filter in CloudWatch Logs

A subscription filter defines which log events get delivered to your KDS stream and what format they are delivered in.

  1. Open the CloudWatch Console: Navigate to the CloudWatch service in the AWS Management Console.
  2. Select Log Groups: Go to Logs → Log groups and select the log group you want to stream from.
  3. Create Subscription Filter: Click on the “Actions” dropdown and select “Stream to Amazon Kinesis Data Stream”. Follow the wizard to select your KDS stream, set the filter pattern (if you want to filter logs), and choose the IAM role you created earlier.

Alternatively, we can use the AWS CLI:

aws logs put-subscription-filter \
--log-group-name "MyLogGroup" \
--filter-name "MyFilter" \
--filter-pattern "" \
--destination-arn "arn:aws:kinesis:region:account_id:stream/stream_name" \
--role-arn "arn:aws:iam::account_id:role/role_name"

Replace "MyLogGroup", "MyFilter", "region", "account_id", "stream_name", and "role_name" with your specific details. The --role-arn must reference the IAM role created in Step 1; it is required when the destination is a Kinesis stream. An empty --filter-pattern "" delivers every log event; a pattern such as "ERROR" would deliver only events whose message contains that term.

Step 3: Monitor and Use the Streamed Data

Once the subscription filter is in place, log events that match the filter pattern (if any) will start flowing into your KDS stream. We can then process or analyze these logs in real-time using AWS services or custom applications that consume data from KDS.
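One detail consumers must handle: records delivered by a CloudWatch Logs subscription filter arrive gzip-compressed, and Kinesis consumers typically see them base64-encoded. Below is a minimal Python sketch of the decoding step a Lambda function or custom consumer would perform. The sample payload is hypothetical, but its shape (messageType, logGroup, logEvents, and so on) follows the JSON structure CloudWatch Logs emits:

```python
import base64
import gzip
import json

def decode_cwl_record(data_b64: str) -> dict:
    """Decode one Kinesis record produced by a CloudWatch Logs
    subscription filter: base64-decode, gunzip, then parse JSON."""
    raw = base64.b64decode(data_b64)
    return json.loads(gzip.decompress(raw))

# --- Hypothetical sample payload, shaped like what CloudWatch Logs emits ---
sample = {
    "messageType": "DATA_MESSAGE",
    "owner": "111122223333",
    "logGroup": "MyLogGroup",
    "logStream": "MyStream",
    "subscriptionFilters": ["MyFilter"],
    "logEvents": [
        {"id": "1", "timestamp": 1711065600000, "message": "ERROR something failed"},
    ],
}
# Simulate what a consumer would receive from the stream.
encoded = base64.b64encode(gzip.compress(json.dumps(sample).encode())).decode()

payload = decode_cwl_record(encoded)
if payload["messageType"] == "DATA_MESSAGE":  # skip CONTROL_MESSAGE health checks
    for event in payload["logEvents"]:
        print(event["message"])
```

CloudWatch Logs also periodically sends CONTROL_MESSAGE records to verify the destination is reachable, which is why the messageType check matters in real consumers.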

Additional Tips:

  • Testing and Validation: Initially, set a broad filter pattern (or none) to ensure data flows as expected, then gradually refine it based on the specific log data you need.
  • Monitoring: Utilize CloudWatch metrics and Kinesis monitoring tools to keep an eye on the data flow and performance, making adjustments as necessary.
  • Security: Regularly review the IAM policies and roles for the principle of least privilege to ensure only necessary permissions are granted.

By following these steps, we can effectively use AWS CloudWatch Logs with Kinesis Data Streams for real-time log monitoring and analysis, enhancing our ability to react to application or infrastructure issues promptly.

LAKSHMI VENKATESH

I learn by writing about data, AI, cloud, and technology. All views expressed here are my own and do not represent the views of the firm I work for.