A Technical Infrastructure Overview

I recently built a project prototype on AWS. I was looking for three things: 1) it runs on servers that I don't have to maintain, 2) it's cheap to keep running, and 3) it can scale up if the project gains traction. This post focuses on the design rather than on the app itself.

The project uses a standard client-server model. The client is the web site, a web page running in a browser. The server runs in AWS as a set of Lambda functions. These behave like small servers that spin up when they receive a request, spin down when there is no traffic, and spin up multiple instances if there's lots of traffic. The database behind the Lambda functions is serverless RDS MySQL, which also spins up and down with traffic.

The benefit of using AWS over self hosting is that it lets you rent compute resources in proportion to the current traffic, rather than owning hardware with a static maximum capacity. When there's no traffic, costs are a small fixed amount. The fixed costs include things like buying the domain name and running an internet gateway to allow the Lambdas to issue external API calls. The variable costs increase with usage. Presumably, if the system is being used, there is a corresponding increase in revenue that covers the variable costs. When traffic increases, the necessary resources spin up and scale to support the current load. The primary downside to this approach is that it takes time to spin up the Lambdas and the database for the first user after a period of no traffic.

The client is a React app hosted in AWS S3, which talks to the Lambda functions via an API. React's primary virtue is that you can turn your frontend into building blocks, like Legos, and then compose them to build layers of sophistication. For example, you could build a Loader Button component, which shows a spinner while work executes. And you could build a Modal component, an overlay that displays info on top of a page. The modularity of React lets you put the same Loader Button into the Modal as well as into input forms throughout the site. Stylistic and functional changes made to the button then apply in the modal and in the forms alike.

The React client is hosted from AWS S3. S3 is a cloud storage system that can be made publicly accessible. It is similar in concept to a computer file system, but the files can be accessed over the internet and the storage capacity is "infinite". When you go to the app's homepage, you are requesting to download some HTML/JavaScript/CSS files into your browser. These resources get downloaded from S3. AWS provides all sorts of nice infrastructure to make the user's web experience feel top tier: creating and installing a certificate so that your web site has HTTPS, distributing your S3 client code to a "content delivery network" (CDN) so that page loads are zippy, and customizing HTTP error behavior so you don't see ugly white Server Failure pages when things go wrong.

The React client talks to AWS Lambda functions via an API. The API is the glue that maps individual buttons on the web site to actions taken on the server. When you click a given button on one of the pages, that translates to an API call that submits information to a Lambda that was built to handle that specific context. Any action on the web site that stores data for later access issues an API call to a corresponding Lambda function. Most API calls require that a user is logged in, which is handled by tagging the call with an authorization header that uniquely identifies the user. The mapping of API calls to Lambda functions is defined by another piece of AWS infrastructure called API Gateway.

Lambda functions are built to handle exactly one kind of interaction, though they can be used in multiple contexts. For example, there is an Update Project function that allows admins to set internal notes. That same function gets called by users when they modify their own project state. The small, single purpose "server function" idea is a relatively new software engineering paradigm. In the past, it was common to have large servers that handled every API call on the sites they supported - "monoliths", and more recently it's been common to bundle groups of semi related API calls together - "microservices". Each of those approaches led to its own set of code complexity and scalability issues. Lambdas, by contrast, are very small, easy to create, dedicated to a specific use, and scale up and down with traffic.
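
Here is a rough sketch of how one function can serve both contexts. The field names and the permission rules (only admins set internal notes, only the owner changes state) are my assumptions for the example, not the prototype's real schema. An in-memory struct stands in for the database row.

```go
package main

import (
	"errors"
	"fmt"
)

// Project is a hypothetical shape for the stored resource.
type Project struct {
	ID            string
	OwnerID       string
	State         string
	InternalNotes string
}

// UpdateRequest carries only the fields the caller wants to change.
// A nil pointer means "leave this field alone".
type UpdateRequest struct {
	CallerID      string
	CallerIsAdmin bool
	State         *string
	InternalNotes *string
}

var ErrForbidden = errors.New("forbidden")

// UpdateProject does one kind of work, updating a project, for both admins
// and regular users. All permission checks run before any field is changed.
func UpdateProject(p Project, req UpdateRequest) (Project, error) {
	if req.InternalNotes != nil && !req.CallerIsAdmin {
		return p, ErrForbidden
	}
	if req.State != nil && req.CallerID != p.OwnerID {
		return p, ErrForbidden
	}
	if req.InternalNotes != nil {
		p.InternalNotes = *req.InternalNotes
	}
	if req.State != nil {
		p.State = *req.State
	}
	return p, nil
}

func main() {
	p := Project{ID: "p1", OwnerID: "u1", State: "draft"}
	notes, state := "follow up next week", "active"

	// Admin context: set internal notes.
	p, err := UpdateProject(p, UpdateRequest{CallerID: "admin", CallerIsAdmin: true, InternalNotes: &notes})
	fmt.Println(p.InternalNotes, err)

	// User context: the owner modifies their own project state.
	p, err = UpdateProject(p, UpdateRequest{CallerID: "u1", State: &state})
	fmt.Println(p.State, err)
}
```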

The Lambdas in this prototype were written in Go. Go is a good choice here because it compiles into small executables that contain all dependencies. A Go executable is uploaded and associated with a single Lambda function, and executing that Lambda means running that Go executable. Updating the behavior of a Lambda function means changing the Go code and reuploading the executable. A critical idea of these APIs is that all information necessary to perform a task is either directly included in the input or can be gathered from external resources using information provided in the input - this property is known as being "stateless". Another core idea is that Lambdas can call each other, which allows them to be used modularly, conceptually similar to React components. The Lambdas I have written mostly create/read/update/delete (CRUD) the "database resources", but some have relatively sophisticated behavior, built by composing the more basic Lambdas into higher level ideas. For example, imagine a function "Get User's Primary Project" - this would work by looking up the user's primary project ID, and then invoking "Get Project" using that ID.
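
The "Get User's Primary Project" composition could look roughly like the sketch below. To keep it runnable, plain function calls stand in for one Lambda invoking another, and an in-memory Store stands in for MySQL. All type and function names are hypothetical. Note that each function receives everything it needs through its arguments and keeps nothing between calls, which is the stateless idea.

```go
package main

import (
	"errors"
	"fmt"
)

// Project is a hypothetical resource shape.
type Project struct {
	ID   string
	Name string
}

// Store stands in for the database layer.
type Store struct {
	primaryProject map[string]string  // user ID -> primary project ID
	projects       map[string]Project // project ID -> project
}

var ErrNotFound = errors.New("not found")

// GetProject is the basic "read" building block.
func GetProject(s *Store, projectID string) (Project, error) {
	p, ok := s.projects[projectID]
	if !ok {
		return Project{}, ErrNotFound
	}
	return p, nil
}

// GetUsersPrimaryProject composes the basic block into a higher level idea:
// look up the user's primary project ID, then delegate to GetProject.
func GetUsersPrimaryProject(s *Store, userID string) (Project, error) {
	id, ok := s.primaryProject[userID]
	if !ok {
		return Project{}, ErrNotFound
	}
	return GetProject(s, id)
}

func main() {
	s := &Store{
		primaryProject: map[string]string{"u1": "p7"},
		projects:       map[string]Project{"p7": {ID: "p7", Name: "Prototype"}},
	}
	p, err := GetUsersPrimaryProject(s, "u1")
	fmt.Println(p.Name, err)
}
```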

As an aside, Lambdas invoking each other can be dangerous if a recursive cycle is created. I once had a use case for a Lambda to call itself, i.e., it was recursive, and I failed to code a base case. When I invoked the function, it resulted in an explosion of recursive Lambda calls. On a single machine, unbounded recursive function calls eventually exhaust the memory and/or stack depth and crash the program. In a cloud environment like this, each invocation is an independent request, so there's no difference between a recursive API call and a legitimate one, and memory/stack depth is effectively infinite. I chewed through almost a million Lambda invocations in minutes, and if I hadn't noticed, I might have been looking at a ridiculously large AWS bill.
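
One way to guard against this, sketched below, is to carry a depth counter in the request payload and refuse to go past a limit, in addition to coding the base case. This is not what the prototype did at the time. The Event fields and the limit of 5 are assumptions for the example, and a plain function call stands in for the Lambda invoking itself.

```go
package main

import (
	"errors"
	"fmt"
)

// maxDepth is an arbitrary safety limit chosen for this sketch.
const maxDepth = 5

// Event is a hypothetical payload. Depth travels with every invocation,
// so each call can tell how far down the chain it is.
type Event struct {
	Remaining int // units of work left to do
	Depth     int // how many invocations led to this one
}

var ErrTooDeep = errors.New("maximum invocation depth exceeded")

// handle processes one unit of work and re-invokes itself for the rest.
// calls counts invocations so the effect of the guard is visible.
func handle(e Event, calls *int) error {
	*calls++
	if e.Depth >= maxDepth {
		return ErrTooDeep // safety net if the base case is never reached
	}
	if e.Remaining <= 0 {
		return nil // base case: nothing left to do
	}
	return handle(Event{Remaining: e.Remaining - 1, Depth: e.Depth + 1}, calls)
}

func main() {
	calls := 0
	err := handle(Event{Remaining: 3}, &calls)
	fmt.Println(calls, err)

	calls = 0
	err = handle(Event{Remaining: 1000}, &calls)
	fmt.Println(calls, err)
}
```

With the guard in place, a missing or unreachable base case costs a handful of invocations instead of an unbounded number.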

Resources are what live at the database layer. This prototype uses MySQL, which is relational. The "relational" part means that, for example, the User and Project resources both have an account ID column, and matching values in those columns relate the rows to each other. That is how the system knows that a Project belongs to a given User. The CRUD operations that Lambdas perform are interactions with these database resources.
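
As a small illustration of the shared account ID column, the sketch below models two tables as Go slices and finds a user's projects by matching account IDs. In the real system this matching would be a SQL query against MySQL. The struct fields other than the account ID are made up for the example.

```go
package main

import "fmt"

// User and Project mirror two hypothetical tables that share an
// account ID column.
type User struct {
	AccountID int
	Name      string
}

type Project struct {
	ID        int
	AccountID int
	Title     string
}

// ProjectsForUser returns the projects whose account ID matches the user's,
// which is the relationship a SQL join or WHERE clause would express.
func ProjectsForUser(u User, projects []Project) []Project {
	var owned []Project
	for _, p := range projects {
		if p.AccountID == u.AccountID {
			owned = append(owned, p)
		}
	}
	return owned
}

func main() {
	alice := User{AccountID: 1, Name: "Alice"}
	projects := []Project{
		{ID: 10, AccountID: 1, Title: "Garden"},
		{ID: 11, AccountID: 2, Title: "Garage"},
		{ID: 12, AccountID: 1, Title: "Kitchen"},
	}
	for _, p := range ProjectsForUser(alice, projects) {
		fmt.Println(p.ID, p.Title)
	}
}
```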

Ultimately, all data is accessed via React buttons that issue API calls to Lambdas, which interact with resources stored in MySQL. These systems form a clear set of layers. The React client is concerned with the look and feel of the site, and naively issues API calls. A Lambda function doesn't care why an API call is being made; it just handles it. If a given Lambda function receives an influx of requests, AWS scales just that one up so that there are sufficient compute resources to handle the traffic, while leaving less trafficked Lambda functions scaled down. Data is persisted across Lambda functions via MySQL, and the functions know which database resources to interact with because of information provided in the request.

This prototype largely achieves its goals. Beyond the small fixed costs, it only consumes resources, and therefore money, when it's being accessed. Otherwise it sits dormant, available but spun down, awaiting traffic. Other ideas I explored in this prototype include defining the Lambda infrastructure with code using a project called Serverless, deploying changes using GitHub Actions, creating a notion of user types via AWS Cognito, and building real time chat on AWS IoT.