Implementing AWS FireLens with Grafana Loki in AWS ECS

Amazon Elastic Container Service (ECS) is a powerful container orchestration service that simplifies the deployment, management, and scaling of containerized applications. One of its key features is its deep integration with the AWS platform, providing a secure and efficient solution for running container workloads. In this blog post, we’ll explore how to enhance your ECS-based applications’ logging capabilities by using AWS FireLens and Grafana Loki for advanced log routing and storage.

Why Choose Grafana Loki Over AWS CloudWatch?

While AWS CloudWatch is the default choice for monitoring and logging in the AWS ecosystem, its log collection and analytics can feel limiting, especially for complex, multi-container applications. Here’s why Grafana Loki is a compelling alternative:

  • Flexible Log Storage: Grafana Loki is a powerful log aggregation system that offers flexibility in log storage, allowing you to choose a storage backend that suits your specific needs.

  • Cost-Effective: Loki indexes only a small set of labels for each log stream rather than the full text of every line. This keeps the index small and can make log storage more cost-effective than traditional solutions like Amazon CloudWatch.

  • Querying and Visualization: Loki integrates seamlessly with Grafana, providing advanced query and visualization capabilities, making it easier to analyze and troubleshoot log data.

  • Highly Scalable: Loki can be scaled horizontally to handle massive log volumes, ensuring it can grow with your application’s demands.

Grafana provides an official guide for integrating ECS with Loki, but it covers only the basic setup. In this article, I will demonstrate my own approach to the integration.

Implementing FireLens in AWS ECS

What is FireLens?

FireLens is an Amazon ECS container log router that gives you flexible control over the logs of your containerized applications. It runs Fluent Bit or Fluentd alongside your application and lets you route logs to destinations such as Grafana Loki, Elasticsearch, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, or other custom solutions.

The container definitions below, taken from an ECS task definition, use AWS FireLens to route logs to Grafana Loki. The first container is the Fluent Bit log router and the second is the application. The key components and configurations are highlighted afterwards.

[
        {
            "name": "log_router",
            "image": "grafana/fluent-bit-plugin-loki:2.9.1",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "environment": [],
            "mountPoints": [],
            "volumesFrom": [],
            "user": "0",
            "readonlyRootFilesystem": false,
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "log_router_log_group",
                    "awslogs-region": "us-west-1",
                    "awslogs-stream-prefix": "firelens"
                },
                "secretOptions": []
            },
            "firelensConfiguration": {
                "type": "fluentbit",
                "options": {
                    "enable-ecs-log-metadata": "true"
                }
            }
        },
        {
            "name": "application",
            "image": "application_image:tag",
            "cpu": 512,
            "memory": 1024,
            "portMappings": [
                {
                    "name": "application-3000-tcp",
                    "containerPort": 3000,
                    "hostPort": 3000,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "ENVIRONMENT",
                    "value": "dev"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "readonlyRootFilesystem": false,
            "logConfiguration": {
                "logDriver": "awsfirelens",
                "options": {
                    "LabelKeys": "container_name,ecs_task_definition,source,ecs_cluster",
                    "Labels": "{job=\"firelens-logs\",environment=\"dev\"}",
                    "LineFormat": "key_value",
                    "Name": "grafana-loki",
                    "RemoveKeys": "container_id,ecs_task_arn",
                    "Url": "https://<grafana_loki_url>/loki/api/v1/push"
                },
                "secretOptions": []
            }
        }
]

Key Highlights

Here are the important parts of the ECS task definition:

  • name: The name of the container within the task.

  • image: The container image to use.

  • cpu: The CPU allocation for the container, in CPU units (1024 units equal one vCPU).

  • memory: The hard memory limit for the container, in MiB.

  • portMappings: Defines port mappings for the container.

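  • logConfiguration: Configures the log driver and its options. The log router uses awslogs so that its own logs reach CloudWatch, while the application uses awsfirelens with the grafana-loki output options.

  • firelensConfiguration: Marks the container as the FireLens log router, here of type fluentbit. With enable-ecs-log-metadata set to true, records are enriched with ECS fields such as ecs_cluster, ecs_task_arn, and ecs_task_definition, which the LabelKeys and RemoveKeys options refer to.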

Now, let’s proceed to the Terraform code for implementing this task definition in your AWS ECS environment.

Terraform Code for FireLens Implementation

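Let’s dive into the Terraform code used to implement FireLens in AWS ECS. It sets up a CloudWatch log group, defines the FireLens log router and application containers, combines them into a Fargate task definition, and runs that task definition as an ECS service: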

resource "aws_cloudwatch_log_group" "cloudwatch_lg" {
  name = "${var.project_name}-${var.service}-${var.env}"
}

module "fluentbit_definition" {
  source          = "cloudposse/ecs-container-definition/aws"
  version         = "0.58.1"
  container_image = "grafana/fluent-bit-plugin-loki:2.9.1"
  container_name  = "${var.project_name}-log-router-${var.service}-ct"
  firelens_configuration = {
    type = "fluentbit"
    options = {
      enable-ecs-log-metadata = "true"
    }
  }
  log_configuration = {
    logDriver = "awslogs"
    options = {
      awslogs-group         = "${var.project_name}-${var.service}-${var.env}"
      awslogs-region        = var.aws_region
      awslogs-stream-prefix = "firelens"
    }
    secretOptions = []
  }
}

module "app_definition" {
  source          = "cloudposse/ecs-container-definition/aws"
  version         = "0.58.1"
  container_image = "${var.service_ecr_url}:${var.service_image_tag}"
  container_name  = "${var.project_name}-${var.service}-ct"
  container_cpu    = var.container_cpu
  container_memory = var.container_memory
  port_mappings = [{
    containerPort = var.port
    hostPort      = var.port
    protocol      = "tcp"
  }]

  log_configuration = {
    logDriver = "awsfirelens"
    options = {
      Name       = "grafana-loki"
      Url        = "https://${var.grafana_username}:${var.grafana_password}@${var.grafana_host}/loki/api/v1/push"
      Labels     = "{job=\"firelens-${var.project_name}-${var.service}\",environment=\"${var.env}\"}"
      RemoveKeys = "container_id,ecs_task_arn"
      LabelKeys  = "container_name,ecs_task_definition,source,ecs_cluster"
      LineFormat = "key_value"
    }
    secretOptions = []
  }
}

module "taskdefination" {
  source          = "cn-terraform/ecs-fargate-task-definition/aws"
  name_prefix      = "${var.project_name}-${var.service}-td-${var.env}"
  container_cpu    = var.container_cpu
  container_memory = var.container_memory

  containers = [
      module.fluentbit_definition.json_map_object,
      module.app_definition.json_map_object
  ]
}

resource "aws_ecs_service" "service" {
  name                               = "${var.project_name}-${var.service}-${var.env}"
  cluster                            = var.cluster
  deployment_maximum_percent         = 200
  deployment_minimum_healthy_percent = 100
  desired_count                      = 1
  enable_ecs_managed_tags            = false
  health_check_grace_period_seconds  = 0
  launch_type                        = "FARGATE"
  force_new_deployment               = true
  load_balancer {
    target_group_arn = var.target_group_arn
    container_name   = var.container_name
    container_port   = var.container_port
  }
  network_configuration {
    security_groups  = var.security_groups
    subnets          = var.vpc_pub_subnet_ids
    assign_public_ip = true
  }
  platform_version = "1.4.0"
  propagate_tags   = "SERVICE"
  task_definition  = module.taskdefination.aws_ecs_task_definition_td_arn
}

Let’s break down the key components of this Terraform code:

  1. aws_cloudwatch_log_group Resource: This resource creates a CloudWatch log group named after your project, service, and environment. The Fluent Bit log router writes its own logs to this group through the awslogs driver, which helps when debugging log delivery. The application logs themselves go to Loki.

  2. fluentbit_definition Module: This module configures the Fluent Bit container with FireLens for log routing. It uses the grafana/fluent-bit-plugin-loki:2.9.1 image for this purpose. Key configurations include:

  • container_name: Sets the name of the Fluent Bit container.

  • firelens_configuration: Defines FireLens as the log router and enables ECS log metadata.

  • log_configuration: Configures the log driver as awslogs, specifying the CloudWatch log group created earlier, the AWS region, and a stream prefix for log entries.

  3. app_definition Module: This module defines your application's container: its image, CPU and memory allocations, port mappings, and log routing. Notable settings include:

  • container_name: Sets the name of your application's container.

  • log_configuration: Configures the log driver as awsfirelens, routing logs to Grafana Loki. It includes the Loki push URL, the static labels, the record keys to promote to labels or remove, and the line format.

  • environment: Not set in the snippet above, but the module accepts an environment list if your container needs variables such as ENVIRONMENT.

  4. taskdefination Module: This module creates an ECS Fargate task definition that combines the Fluent Bit and application containers defined in the previous modules. It also sets the CPU and memory allocations for the task.

  5. aws_ecs_service Resource: This resource creates an ECS service that manages the task deployment. It keeps the desired number of tasks running on Fargate, registers them with the load balancer target group, and places them in the given subnets and security groups. Notable settings include the desired count, load balancing configuration, network setup, and task definition.
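One detail in the service resource is worth checking. ECS expects load_balancer.container_name to match the name of a container in the task definition, and container_port to match one of its port mappings. The code above reads these from var.container_name and var.container_port, while the application container is named from var.project_name and var.service. A minimal sketch, assuming you would rather derive both from one place (the local name app_container_name is hypothetical and not part of the original code):

```hcl
# Hypothetical local: a single source of truth for the application container name.
locals {
  app_container_name = "${var.project_name}-${var.service}-ct"
}

# In module "app_definition", the name would then be set as:
#   container_name = local.app_container_name

# In resource "aws_ecs_service" "service", the load balancer block would reuse it:
#   load_balancer {
#     target_group_arn = var.target_group_arn
#     container_name   = local.app_container_name
#     container_port   = var.port
#   }
```

This keeps the service and the container definition from drifting apart when the naming scheme changes.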

With this Terraform code, you have set up a robust logging solution for your AWS ECS containers, routing logs to Grafana Loki for advanced analysis and visualization.
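The Url option in app_definition embeds the Loki username and password, so they end up in plain text in the registered task definition. ECS supports secretOptions for the awsfirelens log driver, so one possible alternative is to keep the full push URL in AWS Secrets Manager and inject it at task start. The sketch below is untested with the Loki plugin and rests on assumptions: the variable loki_url_secret_arn and the local app_log_configuration are hypothetical names, and the task execution role would need permission to read the secret.

```hcl
# Hypothetical variable: ARN of a Secrets Manager secret whose value is the
# full push URL, for example https://user:password@host/loki/api/v1/push
variable "loki_url_secret_arn" {
  type        = string
  description = "ARN of the secret that holds the Loki push URL"
}

locals {
  # Same options as in app_definition, except that Url moves to secretOptions.
  app_log_configuration = {
    logDriver = "awsfirelens"
    options = {
      Name       = "grafana-loki"
      Labels     = "{job=\"firelens-${var.project_name}-${var.service}\",environment=\"${var.env}\"}"
      RemoveKeys = "container_id,ecs_task_arn"
      LabelKeys  = "container_name,ecs_task_definition,source,ecs_cluster"
      LineFormat = "key_value"
    }
    # Assumption: ECS resolves the secret and passes it to the output as "Url".
    secretOptions = [
      {
        name      = "Url"
        valueFrom = var.loki_url_secret_arn
      }
    ]
  }
}

# In module "app_definition":
#   log_configuration = local.app_log_configuration
```

With this layout, grafana_username and grafana_password no longer need to be Terraform variables, because the credentials live only in the secret.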

Using Grafana Loki with FireLens

Now that you have FireLens set up with Grafana Loki, you can start enjoying the benefits of Grafana’s powerful log management and visualization capabilities. Here’s how the integration works:

  1. Containers in your ECS tasks generate log data.

  2. The awsfirelens log driver hands these logs to the Fluent Bit log router container, which is configured to send them to Grafana Loki.

  3. Grafana Loki stores the log data efficiently, making it available for querying and analysis.

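You can use LogQL, Loki’s query language, together with Grafana’s visualization features to gain insights from your log data, helping you troubleshoot issues and monitor the health of your application. For example, with the labels from the task definition above, the query {job="firelens-logs", environment="dev"} |= "error" returns the log lines from the dev environment that contain "error".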

Conclusion

In this blog post, we’ve explored the implementation of AWS FireLens with Grafana Loki in AWS ECS, enhancing your containerized applications’ logging capabilities. By choosing Grafana Loki over AWS CloudWatch, you can enjoy greater flexibility, cost-effectiveness, and advanced log management features.

The Terraform code above should help you get started quickly with FireLens and Grafana Loki, so that you can focus on gaining insights from your log data rather than on the setup.

With this powerful logging solution in place, you can effectively monitor, troubleshoot, and optimize your containerized applications, ensuring they run smoothly in an AWS ECS environment.

Also published on Medium