
Enterprise Amazon EventBridge Schema Validation

Adding Amazon EventBridge event schema validation using OpenAPI 3.0 data types, patterns (regexes), min and max values and more, with code examples using TypeScript and the AWS CDK

Introduction 👋

This article covers why I believe we should enhance the basic Amazon EventBridge schemas generated through schema discovery for true ‘publisher’ and ‘consumer’ schema validation using OpenAPI 3.0, allowing us to take advantage of regexes (patterns), min and max integer values, enums and more. We will also discuss potentially using a shared production EDA account to make our versioned schemas searchable for all consumers.


So what are we building to show the validation working in practice?

We are going to build a very basic example of producing and consuming events across domain teams (which in an enterprise would in reality be cross-account), with a shared central event bus and event schemas.

Example of what we are building
  1. Customers create orders through Amazon API Gateway
  2. The Create Order Lambda function validates the order against the event schema before publishing the event to Amazon EventBridge (OrderCreated event)
  3. The event is published to the shared central event bus (in reality for this demo it is in the same AWS account)
  4. A rule targets the Generate Invoice Lambda function.
  5. The Generate Invoice Lambda function, as the consumer, validates the event before generating the invoice (a minimal CDK sketch of this wiring follows the list).
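To make this wiring concrete, below is a minimal CDK sketch of the shared bus, the rule and the consumer target (the bus name, handler path and construct ids are illustrative assumptions, not the repo's exact code):

import * as cdk from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';
import { Construct } from 'constructs';

export class OrdersStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // the shared central event bus (in the same account for this demo)
    const sharedEventBus = new events.EventBus(this, 'SharedEventBus', {
      eventBusName: 'shared-event-bus', // illustrative name
    });

    // the consumer: generate invoice lambda function
    const generateInvoiceLambda = new NodejsFunction(this, 'GenerateInvoiceLambda', {
      entry: 'src/handlers/generate-invoice/generate-invoice.ts', // illustrative path
    });

    // rule on the shared bus targeting the generate invoice lambda for OrderCreated events
    new events.Rule(this, 'OrderCreatedRule', {
      eventBus: sharedEventBus,
      eventPattern: {
        source: ['orders.acme'],
        detailType: ['OrderCreated'],
      },
      targets: [new targets.LambdaFunction(generateInvoiceLambda)],
    });
  }
}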

The code repo can be found here:

https://github.com/leegilmorecode/serverless-event-bridge-validation

⚠️ Note: The code repo is a basic example to show the high-level method in practice, and is not production ready. For example, I have tried to keep all of the code in the handlers (one place) to make it easier to scroll through, rather than splitting it up into separate functions.

ℹ️ Info: Instead of creating a full monorepo or sharing packages through NPM/Yarn, we are faking this with TypeScript path aliases for the demo.

Where should we validate events? 💭

That is an interesting debate, as we could feasibly validate in the producer, the consumer, or both. David Boyne asked this very question recently, which drew some interesting answers:



The overall consensus was that we should validate both producing and consuming, but as Michael Walmsley indicated, this could cause you headaches at scale if not implemented correctly from a process perspective too:

“Validating everywhere at enormous scale could actually be a cloud cost burden you can negate in this simple way. I think a lot of scale problems in cloud end up being process or human problems rather than technical ones and can be handled better.” — Michael Walmsley

Why should we validate both?

The diagram below shows why we should do this from a producer perspective specifically:

Producers not being ‘good event citizens’ and not validating events before sending them

As you can see from above, if the producer is not a ‘good event citizen’ they may create duff events that are then distributed across many consumers.

This then causes 1..n consumers to each validate the events, fail that validation, and be unable to process them (and what then?).


The diagram below shows why we should do this from a consumer perspective specifically:

As you can see from the diagram above, in an enterprise with many distributed services and accounts, we don’t actually know where the event originated from (other than the central event bus it is routed from).


This opens us up to unbounded security issues with under-posting, over-posting, event injection and so on, across many, many service boundaries. This then gets extrapolated out across further AWS services if not validated correctly (for example DynamoDB, SQS, SNS etc.).


In an ideal world, in my opinion, we would validate the events at both the producer and consumer sides with the same versioned events, as shown below:

As the versioned event schemas are stored in a centralised AWS account in production, they are easy to find and pull into consuming repos from a developer experience perspective.

The ‘Multi Account, Single Bus’ pattern can be viewed below, which shows the approach we take with the Serverless Architecture Layers pattern and a central enterprise event bus:

The link below goes into this in more detail

Event Schema production account and developer experience

In the ‘Multi Account, Single Bus’ pattern, turning on schema discovery for the individual domain accounts in pre-prod means that the schemas will only be discovered there, and will not be available to other domains and consumers in other AWS accounts. It is also not best practice to leave schema discovery on in production: you will only get anaemic schemas, and it is costly.

This is shown below, where the developer experience is poor and individual domain teams need to contact each other for schema information:

Development teams only have access to their own discovered schemas

⚠️ Note: Generally, you only enable schema discovery in your development environments (AWS Free Tier includes 5 million ingested events). Schemas of any new events you create are automatically added to your registry to use when developing your application. If you need to audit all of the events going through your event bus, you can enable discovery on your production event bus, and pay $0.10 per million events ingested for any usage outside of the Free Tier.
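As a rough CDK sketch (assuming the L1 CfnDiscoverer construct, plus an illustrative stage check and bus reference), schema discovery could be limited to non-production stages like so:

import * as eventschemas from 'aws-cdk-lib/aws-eventschemas';

// hypothetical: only enable schema discovery outside of production
if (stage !== 'prod') {
  new eventschemas.CfnDiscoverer(this, 'SharedBusDiscoverer', {
    sourceArn: sharedEventBus.eventBusArn,
    description: 'Discover schemas for events published to the shared bus',
  });
}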

What we ideally want from a developer experience perspective in an enterprise is that all production versioned event schemas are accessible and findable in one particular place, and in my opinion the central production EDA account is probably that place:

Development teams having access to all other domains versioned schemas in the central EDA account

What are the problems we could face with this approach?

If we use highly validated events rather than the ‘anaemic events’ discovered using Amazon EventBridge schema discovery, we could see the following potential issues when validating at both the producer and consumer sides:

❌ The events should always be backwards compatible, and should never break for consumers.

❌ If we validate at a very granular level we could increase the chance of backwards compatibility issues if we are not careful (especially with the use of regexes and max properties set on objects). This of course needs a lot of test effort around it.

These issues are no different from those we have with versioned APIs across an enterprise, and in my opinion this approach creates no more coupling than it would with anaemic events.

How do we mitigate these potential issues?

To mitigate these issues we can do the following:

✔️ When there is a breaking change to an event that we can’t prevent, it becomes a new event (this includes breaking changes to patterns too). Again, this is no different to APIs.

✔️ The use of Jest snapshots and tests for consumer breaking changes, i.e. ensuring that your events have no breaking changes before publishing them.

✔️ If we add new fields they should ideally be nullable (optional) where possible.

✔️ In our event metadata we could add the version of the schema that was used to validate the event before it was published, i.e. metadata.schemaVersion, which will help consumers when debugging issues, as sketched below.
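As a small sketch of what that could look like on a published event (the field names here are illustrative, not a fixed standard):

// hypothetical OrderCreated event showing the schema version in the metadata
const orderCreatedEvent = {
  metadata: {
    schemaVersion: '2.0.0', // the schema version the producer validated against
    source: 'orders.acme',
    eventDateTime: new Date().toISOString(),
  },
  data: {
    orderId: 'f3b9...',
    quantity: 10,
  },
};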

👇 Before we go any further — please connect with me on LinkedIn for future blog posts and Serverless news https://www.linkedin.com/in/lee-james-gilmore/

What are Event Schemas and why do we need them? 💭

OK, so let’s first go through what an EventBridge schema is, and how we can enhance this with OpenAPI 3.0.

ℹ️ Note: EventBridge supports both OpenAPI 3 and JSON Schema Draft 4 formats.

EventBridge Schemas and the Registry

A schema defines the structure of events that are sent to EventBridge. EventBridge provides schemas for all events that are generated by AWS services. You can also create or upload custom schemas or infer schemas directly from events on an event bus.

Once you have a schema for an event, you can download code bindings for popular programming languages and speed up development. You can work with code bindings for schemas and manage schemas from the EventBridge console, by using the API, or directly in your IDE by using the AWS toolkits.

What validation do we get with OpenAPI 3.0?

Here are some of the main ones, although this is not every type of validation that OpenAPI 3.0 makes available to us:

✔️ Data Types — String, Number, Integer, Boolean etc. We get this automatically from the Amazon EventBridge schema discovery which infers this based on events being raised on the bus.

✔️ Patterns — This allows us to use regex validation per property, which allows us to perform very specific and powerful validation for our models.

✔️ Enums — This allows us to check that the values provided for a property are from a defined list of applicable values, for example for orderStatus this could be ‘Created’ or ‘Cancelled’.

✔️ Min/Max — We can ensure that integer values are within a set range.
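Pulled together, a hedged sketch of property definitions using these keywords (written in the same object style used later for the quantity property, with illustrative patterns and values) might look like this:

// illustrative property definitions combining data types, patterns, enums and min/max
properties: {
  orderId: { type: 'string', pattern: '^[a-f0-9-]{36}$' }, // pattern (regex)
  orderStatus: { type: 'string', enum: ['Created', 'Cancelled'] }, // enum
  quantity: { type: 'number', minimum: 1, maximum: 20 }, // min/max range
},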

Let’s look at what we get as standard through Amazon EventBridge schema discovery vs what we can build out ourselves

The anaemic schema below is based on the automatic schema discovery for our ‘OrderCreated’ event in development, which is simply inferred:

The schema below is version two, where we have added actual validation to the event body ourselves, i.e. over and above the inferred anaemic schema in version one:

If we look at line 94, for example, we ensure that the addressLine4 property complies with a given orders address regex.

As you can see, we are importing the order-specific regexes so that they are defined in one place, as shown below (this allows us to reuse the regexes across many entities and services, not just Amazon EventBridge):
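A minimal sketch of what such a shared regexes module might look like (the file path and patterns below are illustrative assumptions, not the repo's exact values):

// packages/regexes/orders.ts (hypothetical): order-specific regexes defined in one place
export const addressLineRegex = "^[a-zA-Z0-9 ,.'-]{1,100}$"; // address lines, e.g. addressLine4
export const postCodeRegex = '^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$'; // simple UK postcode check
export const orderIdRegex = '^[a-f0-9-]{36}$'; // uuid style order ids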

These regexes can therefore be used in other validation such as API Gateway, as well as tested through Jest

Another example is the use of enums on line 54 for orderStatus. We ensure that the orderStatus is one of two valid options.

We validate this using a simple custom package we have created using AJV, as shown below (this would typically be pushed to AWS CodeArtifact or equivalent for consumers to install through NPM or Yarn):
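A hedged sketch of what such a validation package could look like (assuming the ajv npm package; not the repo's exact implementation):

// packages/validation/index.ts (hypothetical): shared AJV based validation
import Ajv from 'ajv';

const ajv = new Ajv({ allErrors: true });

export function validate(schema: object, payload: unknown): void {
  const validateFn = ajv.compile(schema);
  if (!validateFn(payload)) {
    // surface every validation error back to the caller
    throw new Error(JSON.stringify(validateFn.errors));
  }
}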

This then allows us to import the relevant schemas and validate them before publishing the events, as shown in this sample from the create-order.ts handler file:

As you can see, we generate an idempotency key using UUID version 5, which, given the same namespace and a stringified payload, always returns the same deterministic UUID. This then allows consumers to pull in the same orders namespace to validate if required (or they could simply use the key as-is on their side as a lookup of events already actioned).

We then create the event and validate it before publishing it to the central event bus, which we can see in lines 22–26 of the handler.
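For illustration only, a pared-down version of such a handler might look like the sketch below (the uuid namespace, bus name and response shape are assumptions rather than the repo's exact code):

// src/handlers/create-order/create-order.ts (illustrative sketch)
import { EventBridgeClient, PutEventsCommand } from '@aws-sdk/client-eventbridge';
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { v5 as uuidv5 } from 'uuid';
import { schema } from '@schemas/orders/order-created/orders.acme@OrderCreated-v2';
import { validate } from '@packages/validation';

const client = new EventBridgeClient({});
const ORDERS_NAMESPACE = '1b671a64-40d5-491e-99b0-da01ff1f3341'; // hypothetical namespace uuid

export const handler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  const order = JSON.parse(event.body as string);

  // deterministic idempotency key: the same namespace and stringified payload always produce the same uuid
  const idempotencyKey = uuidv5(JSON.stringify(order), ORDERS_NAMESPACE);

  const orderCreatedEvent = {
    metadata: { schemaVersion: '2.0.0', idempotencyKey },
    data: order,
  };

  // validate against the versioned schema before publishing to the shared bus
  validate(schema, orderCreatedEvent);

  await client.send(
    new PutEventsCommand({
      Entries: [
        {
          EventBusName: 'shared-event-bus',
          Source: 'orders.acme',
          DetailType: 'OrderCreated',
          Detail: JSON.stringify(orderCreatedEvent),
        },
      ],
    }),
  );

  return { statusCode: 201, body: JSON.stringify(orderCreatedEvent) };
};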

From a consumer point of view, i.e. our create-invoice function (which would typically reside in another AWS account), we can perform the same validation using the same imported schemas, which have been pulled from the registry in the central event bus account.

Testing the validation with v2 of the schema

OK, so let’s test the version 2.0.0 of the OrderCreated schema to see this in action.

ℹ️ Note: You can use the Postman file found in the code repo if you would like to test yourself after deploying.

We can see that in our version 2.0.0 schema we have validation to ensure that the quantity property is between 1 and 20.

quantity: { type: 'number', minimum: 1, maximum: 20 },

We can see below a valid call of 10 and then an invalid one of 99 (where the second call fails on the event validation — returning to the user the reasons for failure):

Example of our validation in motion

Our TypeScript path alias setup

Just a quick note on the path alias setup to allow us to run this example as if shared packages had been created and published to NPM/Yarn. We can see below that we have the shared modules in the following folders:

our folder setup for shared packages

This then allows us in our tsconfig.json files to create path aliases to these ‘modules’, as shown below:

our tsconfig.json setup for path aliases
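As a rough illustration (the relative folder paths below are assumptions based on the aliases used in the imports), the relevant part of such a tsconfig.json could look like this:

{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@schemas/*": ["../../schemas/*"],
      "@packages/*": ["../../packages/*"]
    }
  }
}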

And this allows us to import the files in our handlers using the following import paths for example:

import { schema } from '@schemas/orders/order-created/orders.acme@OrderCreated-v2';
import { validate } from '@packages/validation';

Creating and Updating schemas in our pipelines 🏗️

OK, so we now have our schemas defined in our code base; however, we want to make them available to other domains (consumers) in our business by uploading them to the production schema registry in one AWS account.

Using the AWS CLI & AWS SDK

We can use the following AWS CLI calls in our pipelines to populate the schemas in our central registry:

We could also look to use Custom Resources, perhaps using the AWS SDK, as an alternative:
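As a hedged sketch of the SDK route (assuming the @aws-sdk/client-schemas package, with an illustrative registry name), a pipeline step could look like this:

// hypothetical pipeline step pushing a versioned schema to the central registry
import { SchemasClient, CreateSchemaCommand } from '@aws-sdk/client-schemas';
import { schema } from '@schemas/orders/order-created/orders.acme@OrderCreated-v2';

const client = new SchemasClient({});

export async function publishOrderCreatedSchema(): Promise<void> {
  await client.send(
    new CreateSchemaCommand({
      RegistryName: 'central-eda-registry', // illustrative central registry name
      SchemaName: 'orders.acme@OrderCreated',
      Type: 'OpenApi3',
      Content: JSON.stringify(schema),
      Description: 'OrderCreated event schema (v2)',
    }),
  );
}

There is also an UpdateSchemaCommand in the same client for pushing subsequent versions of an existing schema.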

Reusable validation 👮

This approach to our schemas and validation also allows us to perform the same steps with SQS, SNS, API Gateway and more, whilst reusing the same regexes and schemas. Using our validation package this becomes very easy.
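As a small hedged example (assuming the OrderCreated payload itself is what lands on the queue), the same schema and validate function could be reused in an SQS consumer:

// hypothetical SQS consumer reusing the same schema and validation package
import { SQSEvent } from 'aws-lambda';
import { schema } from '@schemas/orders/order-created/orders.acme@OrderCreated-v2';
import { validate } from '@packages/validation';

export const handler = async (event: SQSEvent): Promise<void> => {
  for (const record of event.Records) {
    const orderCreatedEvent = JSON.parse(record.body);
    // the same versioned schema is used whether the event arrives from a topic, queue or bus
    validate(schema, orderCreatedEvent);
    // ...process the validated order
  }
};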

This is shown with our approach to Guaranteed ordering of events with EventBridge which is detailed in the following article:

As you can see from the diagram below and the attached article, we could use the same event schema validation for SNS and SQS too if we validate at the OrderCreated level of the payload, which means consumers can be validated in the same way regardless of whether the event comes from a topic, queue or bus:

https://leejamesgilmore.medium.com/guaranteed-event-ordering-when-using-amazon-eventbridge-as-your-enterprise-service-bus-ca7a2b62afea

You can also couple this with internationalisation in global enterprise organisations using the approach found in this previous article, allowing you to change the schemas, regexes and so on at build time depending on the locale:

Summary

I hope you found this useful as a very basic example of how we can move from anaemic EventBridge schemas to more powerful schemas when it comes to validation, as well as why we should validate events when both producing and consuming.

Wrapping up 👋

Please go and subscribe on my YouTube channel for similar content!

I would love to connect with you also on any of the following:

https://www.linkedin.com/in/lee-james-gilmore/
https://twitter.com/LeeJamesGilmore

If you enjoyed the posts please follow my profile Lee James Gilmore for further posts/series, and don’t forget to connect and say Hi 👋

Please also use the ‘clap’ feature at the bottom of the post if you enjoyed it! (You can clap more than once!!)

About me

Hi, I’m Lee, an AWS Community Builder, Blogger, AWS certified cloud architect and Enterprise Serverless Architect based in the UK; currently working for City Electrical Factors (UK) & City Electric Supply (US), having worked primarily in full-stack JavaScript on AWS for the past 6 years.

I consider myself a serverless advocate with a love of all things AWS, innovation, software architecture and technology.

*** The information provided are my own personal views and I accept no responsibility on the use of the information. ***

You may also be interested in the following:
