Microservices Log Aggregation

I recently rolled out ELK for log aggregation across a microservices architecture.  This post discusses the details of this implementation without regurgitating the documentation available at elastic.co.

All code is in private repositories.  I may create open source packages at a later date that demonstrate these techniques; however, this post is focused on the model rather than code implementation details.

Context:

  • Mesos cluster with typically 5+ slaves
  • Inter-service communications over HTTP
  • Services fall into two categories:
    • Controllers: service requests from user agents and orchestrate operations across n backend services, including authentication
    • Backend Services: single-domain services that depend only on their own data stores and caches
  • Controllers are all NodeJS
  • Backend services are NodeJS and Go
  • All services are deployed as Docker containers, with at least two instances per service type running on distinct slaves

Log Generation

The more structure there is to log data, the more useful it is.  Logstash exists for the purpose of adding structure to unstructured log data.  In this case I opted not to use Logstash and instead generate structured log output at source, as this meant one less piece of infrastructure to manage.

Log Structure:

{
  "ServiceKey" : "key to identify the service type e.g. FORUM_CONTROLLER",
  "ComponentKey" : "key to identify a component e.g. AUTH_MIDDLEWARE",
  "RequestKey" : "Guid to identify the original user agent request",
  "HostAddress" : "address of the machine generating the log event",
  "TimeStampUTC" : "timestamp for event",
  "Title" : "Log event title",
  "Additional" : "additional info, may be text or an object",
  "Level" : "DEBUG|INFO|ERROR"
}

There are NPM and Go packages used for logging that produce the above and can be configured for pretty-printed or single-line JSON output.  These packages expose an API consumed directly from the application code.  Log output generated by external packages, such as superagent, is redirected through the logging package and is therefore also structured.
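As an illustration only (the actual packages are private; the config lookup, field handling, and module layout here are assumptions), a minimal NodeJS logger in this style might look like the following:

// logger.js - illustrative sketch, not the actual private NPM package
const os = require('os');

// The ServiceKey comes from the entry point application's configuration
// (a hypothetical config module; see the discussion below)
const { ServiceKey } = require('./config');

const LoggerFromConfig = (componentKey) => {
  const write = (level) => (title, additional, requestKey) => {
    // One JSON object per line ("JSON single line" mode) so the harvester
    // needs no parsing rules; undefined fields are dropped by stringify
    process.stdout.write(JSON.stringify({
      ServiceKey,
      ComponentKey: componentKey,
      RequestKey: requestKey, // supplied per request; see Request Tracking
      HostAddress: os.hostname(),
      TimeStampUTC: new Date().toISOString(),
      Title: title,
      Additional: additional,
      Level: level
    }) + '\n');
  };
  return { debug: write('DEBUG'), info: write('INFO'), error: write('ERROR') };
};

module.exports = { LoggerFromConfig };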

The ComponentKey field is worthy of further discussion: there are various packages that are consumed in multiple services, and these packages in turn use the logging package.  The signature for instantiating a logger is as follows:

const logger = logging.LoggerFromConfig({component key})

The configuration of the entry point application contains the ServiceKey.  In this way the log output from the package will include both the entry point's ServiceKey and the package's ComponentKey, enabling a distinction between, for example, auth-middleware output in the forum-controller vs the news-controller.

The Title field may simply contain a single line of log output, as in the case where external package log events are redirected to the logger package.  However, where the logging API is called directly, the Title is used as a unique key identifying the log event type, forming a hard link from a log event to a specific line of code.
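To make that concrete, a hypothetical shared package might use the logger as follows (the package and key names are illustrative):

// Inside a shared auth-middleware package consumed by several controllers.
// The ServiceKey (e.g. FORUM_CONTROLLER) comes from the entry point's
// configuration; the package only ever supplies its own ComponentKey.
const logging = require('logging'); // hypothetical private package
const logger = logging.LoggerFromConfig('AUTH_MIDDLEWARE');

const checkToken = (token) => {
  if (token.expiresAt < Date.now()) {
    // The Title doubles as a unique event key: searching the codebase for
    // TOKEN_EXPIRED leads straight to this line
    logger.info('TOKEN_EXPIRED', { userId: token.userId });
    return false;
  }
  return true;
};

module.exports = { checkToken };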

Request Tracking

A single request from a user agent to a controller may be the seed for many inter-service requests.  Being able to trace the entire interaction within the scope of that original request across all services is invaluable.  This is achieved via the following:

Controllers handling user agent requests will (a minimal middleware sketch follows the list):

  • Generate a guid for the request; the RequestKey
  • Create a log event including the RequestKey and supplemental request information, such as user agent details, user id, session id etc.
  • Append the RequestKey to all log events generated within the context of the request
  • Append a RequestKey header to all outbound HTTP requests made within the context of the request
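A sketch of such a middleware in Express (the wiring and names are assumptions, not the actual private code):

// request-tracking.js - controller side
const { v4: uuidv4 } = require('uuid');           // npm 'uuid' package
const { LoggerFromConfig } = require('./logger'); // logger sketch from earlier

const logger = LoggerFromConfig('REQUEST_TRACKING_MIDDLEWARE');

const requestTracking = (req, res, next) => {
  // Seed the RequestKey for everything done on behalf of this request
  req.requestKey = uuidv4();
  logger.info('INBOUND_REQUEST', {
    UserAgent: req.headers['user-agent'],
    Path: req.path
  }, req.requestKey);
  next();
};

module.exports = requestTracking;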

Backend Services handling requests will (again, a sketch follows the list):

  • Look for a RequestKey header on the request; if one is not present, generate a new one. This is less than ideal, but it at least allows the request to be traced from that point on.
  • Append the RequestKey to all log events generated within the context of the request
  • Append a RequestKey header to all outbound HTTP requests made within the context of the request
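And the backend-service counterpart, plus outbound propagation (the literal header name RequestKey is an assumption):

// request-tracking.js - backend service side
const { v4: uuidv4 } = require('uuid');
const request = require('superagent');

const requestTracking = (req, res, next) => {
  // Accept the inbound RequestKey, or mint one so tracing works from here on
  req.requestKey = req.headers['requestkey'] || uuidv4(); // Node lowercases header names
  next();
};

// Forward the key on every outbound inter-service call
const getWithTracking = (url, requestKey) =>
  request.get(url).set('RequestKey', requestKey);

module.exports = { requestTracking, getWithTracking };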

Log Harvest

The sole purpose of Filebeat is to harvest log files and publish their contents either to Logstash or directly to Elasticsearch; the latter in this scenario, as logs are structured at source.

Filebeat primarily needs three things:

  • A search pattern with which to find log files
  • A file in which to store markers recording how far through each file it has published (the registry)
  • A destination to publish to, in this case an Elasticsearch cluster (a configuration sketch follows the list)
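A minimal filebeat.yml covering those three concerns might look like this (Filebeat 5.x-style keys; the paths, hosts, and suffix are illustrative, and the *.structoutv1 pattern is explained below):

# filebeat.yml - minimal sketch
filebeat.prospectors:
  - input_type: log
    # Search pattern: per-container stdout symlinks in the Mesos sandboxes
    paths:
      - /var/lib/mesos/slaves/*/frameworks/*/executors/*/runs/latest/*.structoutv1
    symlinks: true
    # Each line is already a complete JSON event (structured at source)
    json.keys_under_root: true

# Marker (registry) file recording how far each file has been published
filebeat.registry_file: /var/lib/filebeat/registry

# Publish directly to Elasticsearch; no Logstash in between
output.elasticsearch:
  hosts: ["https://elasticsearch.internal:9200"]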

As all services run in Docker containers, I wanted to do the same for Filebeat rather than get into the messy business of altering the slaves:

  • Create a Filebeat Docker image
  • Run a single instance of that image on each slave
  • Map the Filebeat container to a directory on the Mesos slave that contains the stdout files generated for every container running on that slave
  • Configure Filebeat with a log harvest file pattern that finds the stdout files
  • Map a directory in the Filebeat container to the host and configure Filebeat to store its marker (registry) file there (a run command sketch follows)
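Concretely, the per-slave Filebeat container might be launched along these lines (image name and host paths are assumptions):

# One Filebeat container per slave: mount the Mesos sandboxes read-only and
# a host directory for the registry file so state survives container restarts
docker run -d \
  --name filebeat \
  -v /var/lib/mesos/slaves:/var/lib/mesos/slaves:ro \
  -v /var/lib/filebeat:/var/lib/filebeat \
  my-registry.internal/filebeat:latest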

As services are migrated to structured log output over time, there will be a mix of structured and unstructured log output on the slaves.  To distinguish between the two:

  • Configure Filebeat to harvest files with a specific suffix only, e.g. *.structoutv1
  • Create symlinks with the above suffix pointing to the stdout file at service bootstrap time (a sketch follows)
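The bootstrap step amounts to a single symlink (illustrative; $MESOS_SANDBOX is set by Mesos inside the container):

# Run once at service bootstrap; a relative target keeps the link valid
# from both the container's and the host's view of the sandbox directory
ln -s stdout "$MESOS_SANDBOX/stdout.structoutv1"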

Summary

The result is that all log output is now in one place, Elasticsearch, where it can be effectively mined and have alerts configured against it.  All log output for a single user agent request, across the entire network of services, can be returned via a single query.
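For example, every event for one user agent request comes back with a single query against the RequestKey field (the index pattern, the key value, and a date mapping for TimeStampUTC are assumptions):

GET /filebeat-*/_search
{
  "query": {
    "match": { "RequestKey": "0f8fad5b-d9cb-469f-a165-70867728950e" }
  },
  "sort": [
    { "TimeStampUTC": { "order": "asc" } }
  ]
}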

A Kibana instance is also installed for visualising log data (by virtue of the AWS Elasticsearch Service); however, this doesn't get particularly interesting until entity-centric indexes are built, which will enable some interesting aggregates, such as response times, to be displayed.
