Author: Michael Hunter
License: MIT
Infra Generator is a proof-of-concept library designed to streamline the creation and orchestration of distributed systems. Through its versatile architecture, developers can efficiently encapsulate components, leverage dependency injection, and maintain type-safety. The library provides tools to programmatically define infrastructure configuration, server logic, and client logic, while automatically generating necessary Docker and Docker Compose files to seamlessly deploy multiple architectural components.
Please note that this library is a proof-of-concept and is not intended for production use. It is provided as-is, and the author makes no warranties regarding its functionality, completeness, or reliability. Use it at your own risk. The author shall not be responsible for any damages or loss resulting from the use of this library.
Install docker and docker-compose.
npm install @coder-mike/poc-infra-generator
The following is an example that creates an Express API server called the "customer-server" backed by a postgres database containing a table of customers, and an example client application that sends a record to the database through the API and reads it back again.
The following is aspirational, since the library is a WIP:
import { rootId, Store, ApiServer, ID, run, Worker } from '@coder-mike/poc-infra-generator';
interface Customer {
id: string;
name: string;
}
interface CustomerServer {
postCustomer(customer: Customer): Promise<void>;
getCustomer(id: string): Promise<Customer>;
}
const id = rootId('my-app');
// Create the server (which will create its own database)
const server = createCustomerServer(id`customer-server`);
// Create the client, with injected reference to server
createExampleClient(id`example-client`, server);
// Run the current persona
run();
function createCustomerServer(id: ID): CustomerServer {
// Create a store for customers (backed by postgres)
const db = new Store(id`db`);
// Create an Express API server
const server = new ApiServer(id`api`);
// Endpoint to post a customer to the database
const postCustomer = server.defineEndpoint(
'/api/customer',
async (customer: Customer) => {
await db.set(customer.id, customer);
},
{ method: 'POST' }
);
// Endpoint to get a customer from the database
const getCustomer = server.defineEndpoint(
'/api/customer',
async (id: string) => {
return db.get(id);
},
{ method: 'GET' }
);
return {
postCustomer,
getCustomer,
}
}
function createExampleClient(id: ID, server: CustomerServer) {
// The client will just be a docker container that runs at deployment time
new Worker(id, async () => {
// Save customer to the database via the API server
await server.postCustomer({ id: '1', name: 'John Doe' });
// Load customer from the database via the API server
const customer = await server.getCustomer('1');
console.log(`Loaded customer: ${JSON.stringify(customer)}`);
})
}
To run this example:
# 1. Build the example. This also generates the docker files and docker-compose file
npm run example:build
# 2. Make sure Docker Desktop is running
# 3. Run the whole distributed system (client, server, and database)
docker-compose -f build/docker-compose.yml up
Or to run the whole example in-process instead of using docker:
# Run everything in-process and in-memory. This is useful for debugging.
npm run example:start:in-process
Note: it's recommended to also include a `.dockerignore` file in your project to prevent the `node_modules` directory from being copied into the docker containers. This will make the docker images smaller and faster to build.
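For example, a minimal `.dockerignore` might look like this (the `build` entry assumes your generated output goes into the `build` directory, as in this example):

```
node_modules
build
```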
There are a few key points to highlight in this example before I explain it:

- This example is a single script (representative of a single application of many files) but contains code that executes in 3 different places: the client, the server, and build-time configuration of the infra.
- The `db` in `createCustomerServer` is fully encapsulated -- it's a local variable that's not accessible to other parts of the system (e.g. the client). The pattern proposed in this POC makes encapsulation of infra components possible in a way that's a lot harder with traditional infra patterns (e.g. writing Terraform scripts).
- The `server` in `createExampleClient` is passed in as a parameter. This is an example of dependency injection at the infra level.
- The client's connection to the server here is encapsulated in the functions returned from `server.defineEndpoint` (`postCustomer` and `getCustomer`). These functions handle the details of how to connect to the server and send the HTTP request. In a real-world example, these could also encapsulate authorization and encryption details.
- The client and server here have a strongly-typed connection between them, without doing any type-casts.
- Infra components such as `ApiServer`, `Store`, and `Worker` have a dual implementation: they can either run in-process or set up the docker infrastructure to run themselves. The ability to run in-process makes debugging easier since you can just breakpoint anywhere in the code and step across component boundaries, such as stepping from the client into the calls to the server.
- The example script is run at build time. So functions like `createCustomerServer` and `createExampleClient` are run at build time and in turn run `new Store`, `new ApiServer`, and `new Worker`, which register the relevant pieces that ultimately lead to the generation of the docker and docker-compose files.
- The example script is copied into each docker container, so it can be run again at runtime in each environment (in this case, the client and server environments).
- When running in each environment, the script takes on a different persona. In the client environment, the script behaves as the client. In the server environment, the script behaves as the server. For example, in the client environment the client's worker callback is invoked, but in the server environment it is not, even though the worker is instantiated in both.
- The `id` function is used to generate a deterministic ID for each component -- an ID which is the same for each component at build time and in each runtime environment. This ID is used for all the wiring under the hood, such as the naming of environment variables, docker services, etc. IDs are also functions which can be called using tagged-template syntax to create child IDs, such as the `customer-server` child ID of the `my-app` root ID, and the `db` child ID of the `customer-server` ID. So the full ID of the database is `my-app.customer-server.db`. If the client also wanted a database, it might have its own `db` ID, but the full ID would be `my-app.example-client.db` to distinguish it from the server. A minimal sketch of how such an ID function could work is shown below.
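To illustrate the ID mechanics, here is a minimal sketch of how a tagged-template ID function could be implemented. This is not the library's actual implementation; it just assumes an ID is a dot-joined string of name segments:

```ts
// Sketch only: assumes an ID is a callable that appends dot-separated name
// segments, so id`db` under `my-app.customer-server` yields
// "my-app.customer-server.db".
type ID = ((strings: TemplateStringsArray, ...values: string[]) => ID) & {
  value: string;
};

function makeId(value: string): ID {
  const fn = (strings: TemplateStringsArray, ...values: string[]): ID => {
    // Interleave the literal parts and interpolated values of the template
    const segment = strings.reduce((acc, s, i) => acc + s + (values[i] ?? ''), '');
    return makeId(`${value}.${segment}`);
  };
  return Object.assign(fn, { value });
}

const root = makeId('my-app');        // "my-app"
const server = root`customer-server`; // "my-app.customer-server"
const db = server`db`;                // "my-app.customer-server.db"
console.log(db.value);
```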
Execution in each environment (e.g. server and client) is split into two phases: startup and runtime. The startup phase should behave identically in each environment to instantiate the component tree of the app with exactly the same set of IDs. The startup phase should be deterministic and perform no I/O or random operations, so that the state of the process is identical at the end of each run of the startup phase in each different environment.
After startup, the "runtime" phase of execution can begin, where the app behaves differently in each environment (e.g. behaving as a particular client or server).
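As a sketch (using the components from the example above), the split looks like this: everything that runs before `run()` is the startup phase, and environment-specific work lives inside runtime callbacks.

```ts
import { rootId, Store, Worker, run } from '@coder-mike/poc-infra-generator';

const id = rootId('my-app');

// Startup phase: instantiate the same component tree in every environment.
// No I/O, no randomness -- just deterministic wiring.
const store = new Store(id`store`);

new Worker(id`writer`, async () => {
  // Runtime phase: this only executes in the environment that runs this
  // worker, so I/O is fine here.
  await store.set('started-at', new Date().toISOString());
});

// Dispatch to the persona for the current environment
run();
```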
The concept of a persona is a way of describing the environment in which the application is running. For example, the application might be running in a client environment, a server environment, or a build-time environment. The application can behave differently in each environment, but the same code is used in each environment. The running persona is typically determined by the environment variables that are set in the environment. For example, the `PERSONA` environment variable might be set to `my-app.example-client` or `my-app.customer-server.api`. The `PERSONA` environment variable is set by the `docker-compose.yml` file, which in turn is generated by the application script.

The library reifies the persona idea in the `Persona` class, which is constructed with a callback to be executed at runtime when that persona is active. The `run` function in the library then inspects the `PERSONA` environment variable and determines which `Persona` to run.
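For illustration, a persona might be defined roughly like this (the exact `Persona` constructor arguments are an assumption based on the description above):

```ts
// Assumed shape: a Persona takes an ID and a callback to run when active.
// `id` is an ID in scope, as in the earlier examples.
new Persona(id`example-client`, async () => {
  // Only runs when PERSONA resolves to this persona (e.g.
  // PERSONA=my-app.example-client in the generated docker-compose.yml)
  console.log('Running as the example client');
});

// Inspects the PERSONA environment variable and invokes the matching persona
run();
```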
The build-time persona is a special built-in persona. Like the other personas, it executes the same startup sequence that instantiates the application component tree. But it then specializes its behavior to perform build-time actions:
- Generating docker and docker-compose files based on the application component tree.
- Reading and writing persistent build-time data stores for things like secrets.
This library is based on the idea behind Microvium snapshotting. Snapshotting in Microvium runs the application at compile time, and then a snapshot of that application is taken and stored in a binary format. The snapshot is then loaded at runtime and the application is resumed from the snapshot. The same snapshot can be distributed to multiple target environments, such as a server and a client, and they take with them the entire state of the application.
Through the snapshotting mechanism, you do not need to manually ensure that the startup phase is deterministic across environments.
This library provides a weaker form of the Microvium idea. Rather than deploying a snapshot of the application, the application is re-run in each environment (e.g. client, server, and build-time). The identity of objects is preserved across environments by using the deterministic ID generator. The IDs of each component can be used to identify the same component in other environments, allowing components to behave in a cohesive, distributed manner.
This is weaker than the snapshotting paradigm used in Microvium for two reasons:

- Microvium snapshots are guaranteed to be identical, whereas applications based on this library are relying on the developer to make sure that the application tree is the same each time the startup phase is executed in each environment.
- This library relies on always manually defining the IDs to associate components in each environment, whereas in Microvium, all objects have implicit identity, and non-deterministic processes like random number generation can be used to generate unique IDs, which propagate naturally with the snapshot. In this library, if you used an RNG at startup it would just generate different numbers in each environment.
The current POC also has a number of limitations:

- It's manual work to keep the startup sequence the same in every environment, including ID generation.
- There is no compile-time mechanism to enforce that you call things at the right time -- e.g. to stop you from calling `onDeploy` at runtime, or to indicate that you should only use `onDeploy` at build time. This is mitigated a bit by calling `assertStartupTime`, `assertRuntime`, etc. at the start of each function, as a check but also as an indication to the reader of the intended location of the execution, but this is not enforced by the compiler.
- The POC version of this library uses docker-compose as an IaC foundation, but this is not suitable for production purposes. A future version could use Terraform or some equivalent.
- There is no resource cleanup process implemented if a new deployment doesn't contain a persistent resource that a previous one did.
- "Secrets" such as port numbers and passwords are currently given to every container, even if the container doesn't need them. A future version could be more selective.
- Similarly, this POC assumes that everything is accessible to everything on the network, which is not suitable for a production environment. A future version could be more restrictive. This could be as simple as having a `dependsOn` clause in each service to declare the injected dependencies, which could then be used to auto-generate the network restrictions and environment variables.
- Indexer functions in the Store are assumed to be immutable. If you change the implementation of an indexer function, you need to change the ID of the indexer (e.g. by appending a version number to the ID).
- Because indexes are not cleaned up, orphaned indexes in the database will cause foreign key constraint errors when you try to delete data that's referenced by those indexes. You need to manually clean up the index tables (e.g. delete them with pgAdmin).
- Postgres instances launched with docker-compose have their password configuration embedded into the volume after the first use, so if you change the password (or you delete the `passwords.json` file in the build output), you need to manually delete the volume.
A key-value store for JSON values, with support for indexing. The store is implemented as a Postgres database when running on docker-compose and implemented as an in-memory store when running in-process.
The store supports the following operations:
- `store.get`: Get a value by key.
- `store.set`: Set a value by key.
- `store.has`: Check if a key exists.
- `store.del`: Delete a value by key.
- `store.modify`: Atomic read-modify-write operation.
- `store.allKeys`: Get all keys in the store.
- `new store.Index`: Define a new index on the store.
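A rough usage sketch of the basic key-value operations (the exact method signatures here are assumptions inferred from the list above):

```ts
// `id` is an ID in scope, as in the other examples.
const store = new Store(id`messages`);

// ... later, at runtime:
await store.set('msg1', { text: 'hi', from: 'Alice', to: 'Bob' });
const msg = await store.get('msg1');    // the stored value, or undefined
const exists = await store.has('msg1'); // true

// Atomic read-modify-write: assumed to take a callback from the current
// value to the new value
await store.modify('msg1', (current) => ({ ...current, text: 'hi there' }));

console.log(await store.allKeys());     // ['msg1']
await store.del('msg1');
```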
Indexing example:
interface Message {
text: string;
from: string;
to: string;
}
// At startup
assertStartupTime();
const store = new Store<Message>(id`store`);
// Let's say we want to index by the `from` field
const fromIndex = new store.Index(id`index`, (value) => [{
indexKey: value.from,
// Inline the `to` value in the index for quick access (optional)
inlineValue: value.to,
}]);
// ...
// Later, at runtime
assertRuntime();
// Get all messages from Alice
const fromAlice = await fromIndex.get('Alice');
console.log(fromAlice.map(m => `message from ${m.indexKey} to ${m.inlineValue}`));
Under the hood, indexes work by creating a separate table in the database for each index and keeping it in sync with the main table.
Register a new CLI command with `new CliCommand(id, name, entrypointCallback)`.

When running in-process, the application will enter an interactive REPL loop where it accepts commands from the user and dispatches them to the corresponding `entrypointCallback` by name.

When running in docker-compose, all the CliCommands are generated as independent shell scripts with access to the shared secrets (passwords, host names, and port numbers) necessary to connect to other components of the system (e.g. stores and API servers).

You can invoke a `CliCommand` with `command.run()`. If running in-process, this will parse the arguments and call the callback directly. If running in docker-compose, this will execute the generated shell script.
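A hedged sketch of what that might look like (the arguments passed to the callback, and whether `run()` itself takes arguments, are assumptions):

```ts
// At startup: register the command
assertStartupTime();
const greet = new CliCommand(id`greet`, 'greet', async () => {
  console.log('Hello from the greet command');
});

// At runtime (e.g. from a worker): invoke it. In-process this calls the
// callback directly; under docker-compose it runs the generated shell script.
new Worker(id`demo`, async () => {
  await greet.run();
});
```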
An HTTP server to expose an API.
Call `apiServer.defineEndpoint` to register a new handler for a particular endpoint route. You can optionally specify which HTTP method to use (e.g. `GET` or `POST`).

The return value from `defineEndpoint` is a function that you can call to make a request to the endpoint from a client. If running in-process, this will call the handler directly. If running in docker-compose, this will make an HTTP request to the generated API server using axios.
If running in-process, there is no actual HTTP server. Instead, the API server is implemented as a set of functions that you can call directly. This is useful for testing and debugging. If running in docker-compose, the API server is implemented as an express.js HTTP server in its own docker container.
`const worker = new Worker(id, callback, opts)`
A worker represents a long-running background process. If the system is in in-process mode, the worker is executed as a direct call to the callback at startup. If the system is in docker-compose mode, the worker is instantiated as a separate docker container and the callback is called when the docker container starts.
For example, the `Worker` component is the basis of the `ApiServer` component, where the `callback` starts an express.js server.
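As an illustration of that relationship (not the actual library source), an HTTP-serving component could be built on `Worker` along these lines:

```ts
import express from 'express';

// Hypothetical helper: a worker whose callback starts an express server.
function createHttpWorker(id: ID, port: number) {
  return new Worker(id, async () => {
    const app = express();
    app.get('/health', (_req, res) => res.send('ok'));
    app.listen(port, () => console.log(`Listening on port ${port}`));
  });
}
```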
All personas share the same startup epoch, which should run deterministically to establish the tree of components.
Use one of the following assertions at the beginning of each function that you want to run in a particular epoch. The assertion serves as a check but also as a statement of intention to the reader.
- `assertStartupTime` - In the startup phase of any persona (including the build-time persona).
- `assertNotStartup` - Opposite of `assertStartupTime`.
- `assertRuntime` - Running in a non-build persona and not in the startup phase.
- `assertBuildTime` - Running in the build-time persona, either at startup or not.
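For example, a component factory might use the assertions like this:

```ts
function createThing(id: ID) {
  assertStartupTime(); // components are constructed during the startup phase
  const store = new Store(id`store`);

  return new CliCommand(id`dump`, 'dump', async () => {
    assertRuntime(); // the command body only runs at runtime, never at startup
    console.log(await store.allKeys());
  });
}
```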
- `onBuild` - Register a callback to be executed at build time, such as to generate a file.
- `BuildTimeFile` - A file that is only written at build time, to keep information persistent across multiple builds.
- `BuildTimeStore` - A simple build-time key-value store built on `BuildTimeFile`.
- `gitIgnorePath` - Add a path to the `.gitignore` file in the `build` directory.
- `DockerService` - Add a docker service to the `docker-compose.yml` file.
- `DockerVolume` - Add a docker volume to the `docker-compose.yml` file.
- `DockerFile` - Add a docker file to the `build` directory.
- `Secret` - Add a piece of configuration information to pass from build time to runtime. The `DockerService` and `CliCommand` components have built-in knowledge of the secrets and so can integrate them into the docker-compose file or `.env` file respectively so that they can be rehydrated at runtime.
- `Password` - Create a `Secret` that's randomly generated once and then persisted in the `passwords.json` file.
- `Persona` - Define a persona that can be run at runtime.
- `Port` - Allocate a new port number. This is persisted in the `ports.json` compile-time file so that it's consistent across builds. The port numbers start at 35000.
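A loose sketch of how a few of these might be used together (the constructor and call signatures shown here are assumptions; only the component names come from the list above):

```ts
assertStartupTime();

// A port that stays stable across builds (persisted in ports.json)
const apiPort = new Port(id`api-port`);

// A secret generated once and persisted in passwords.json
const dbPassword = new Password(id`db-password`);

// A build-time action, e.g. to emit an additional generated file
onBuild(() => {
  console.log('Generating extra build artifacts...');
});
```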