Architecting Out Loud
kubernetes · distributed-systems · devops · blockchain

Init Containers and Sidecars for Distributed State: Genesis Generation at Scale

Feb 22, 2026 · 9 min read

You have ten nodes. They all need to agree on a shared starting state before any of them can do anything useful. One node creates it. The rest need to wait, fetch it, and configure themselves. Welcome to every distributed system's least favorite problem.


The Bootstrap Problem

Distributed systems don't just start. They bootstrap. And bootstrapping has ordering constraints that most orchestration tools ignore.

Consider any system with a leader-follower initialization pattern. A database cluster where the primary must initialize the schema before replicas connect. A message broker where the controller must register the cluster before brokers join. A blockchain network where one validator must create the genesis file before others can participate.

The requirements are always the same:

  • One node must go first and produce shared state
  • Other nodes must wait until that state is available
  • Each node must configure itself using the shared state
  • The main process can only start after all initialization is complete

Kubernetes init containers solve this natively. They run sequentially, each completing before the next starts, and all of them finishing before the main container launches. No custom orchestration needed.

The ordering guarantee is the feature. Init containers don't run concurrently. They can't. That constraint is exactly what you want for distributed initialization.
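As a minimal sketch of that guarantee (names and images here are illustrative, not Starship's actual spec), Kubernetes will not start `step-two` until `step-one` exits successfully, and the main container waits for both:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ordered-bootstrap
spec:
  initContainers:
    - name: step-one            # runs first, to completion
      image: busybox
      command: ["sh", "-c", "echo hello > /shared/state"]
      volumeMounts:
        - { name: shared, mountPath: /shared }
    - name: step-two            # starts only after step-one succeeds
      image: busybox
      command: ["sh", "-c", "test -f /shared/state"]
      volumeMounts:
        - { name: shared, mountPath: /shared }
  containers:
    - name: main                # starts only after all init containers succeed
      image: busybox
      command: ["sh", "-c", "cat /shared/state && sleep infinity"]
      volumeMounts:
        - { name: shared, mountPath: /shared }
  volumes:
    - name: shared
      emptyDir: {}
```

If `step-one` fails, the kubelet restarts it according to the pod's restart policy; `step-two` never runs out of order.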


The Genesis Sequence

In Starship, the bootstrap validator (the first node in the network) runs a sequence of init containers before the main process starts. Two containers are always present. The others are injected based on what the chain's config enables.

flowchart TD
    A["1. init-build-images (optional)<br/>Build chain binary from source"] --> B["2. init-genesis<br/>Create genesis.json, generate node keys"]
    B --> C["3. init-config<br/>Update config.toml, apply patches"]
    C --> D["4. init-faucet (optional)<br/>Copy faucet binary if enabled"]
    D --> F["Main containers start<br/>(validator + exposer sidecar)"]

Each step depends on the output of the previous one:

| Init Container | What It Does | When It Runs |
| --- | --- | --- |
| init-build-images | Builds the chain binary from source, sets up cosmovisor with upgrade paths | Only when the chain config specifies a build step (custom binary, cosmovisor upgrades) |
| init-genesis | Creates genesis.json, generates node ID and consensus keys, adds accounts and balances | Always |
| init-config | Updates config.toml and app.toml, applies genesis patches via jq merge | Always |
| init-faucet | Copies the faucet binary into position | Only when faucet.enabled: true in the chain config |

The two required containers (init-genesis and init-config) do the heavy lifting. init-build-images only appears when a chain needs a binary built from source rather than pulled from a pre-built image. init-faucet only appears when the chain config enables a faucet. The system composes the init container list dynamically based on what's actually needed.
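Put together, the genesis pod's init container list looks roughly like this. The container names match the table above; the images and script paths are simplified placeholders, since the real spec is generated from the chain config:

```yaml
initContainers:
  # - name: init-build-images   # injected only when the chain config has a build step
  #   ...
  - name: init-genesis          # always present: creates genesis.json and node keys
    image: ghcr.io/example/osmosis:latest     # placeholder image
    command: ["bash", "-e", "/scripts/create-genesis.sh"]   # hypothetical script name
  - name: init-config           # always present: patches config.toml / app.toml
    image: ghcr.io/example/osmosis:latest
    command: ["bash", "-e", "/scripts/update-config.sh"]    # hypothetical script name
  # - name: init-faucet         # injected only when faucet.enabled: true
  #   ...
```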

All of these containers share a single writable volume. Each one reads from and writes to it, building up the node's state incrementally.


The Shared Volume

The mechanism behind all this file sharing is a Kubernetes emptyDir volume. It's an ephemeral, pod-scoped directory that exists for the lifetime of the pod. Every container in the pod (init containers, main container, sidecars) can mount it and see the same files.

The pod defines three volume types: one writable emptyDir for state, and ConfigMap volumes for read-only inputs.

volumes:
  - name: node
    emptyDir: {}              # writable state, shared across all containers
  - name: addresses
    configMap:
      name: keys              # mnemonic phrases and account keys
  - name: scripts
    configMap:
      name: setup-scripts-osmosis-1  # init shell scripts
  # - name: faucet            # second emptyDir, only when faucet.enabled: true
  #   emptyDir: {}

Each init container mounts the same volumes. The emptyDir goes to the chain's home directory. The ConfigMaps go to fixed paths.

# init-genesis container
volumeMounts:
  - mountPath: /root/.osmosisd   # emptyDir: writable state
    name: node
  - mountPath: /configs           # ConfigMap: key material
    name: addresses
  - mountPath: /scripts           # ConfigMap: setup scripts
    name: scripts

The init containers read scripts from /scripts and key material from /configs, then write their output to the emptyDir volume. Each container picks up where the previous one left off.

flowchart TD
    subgraph emptyDir["/root/.osmosisd (emptyDir volume)"]
        direction TB
        G["config/genesis.json"]
        NK["config/node_key.json"]
        CT["config/config.toml"]
        AT["config/app.toml"]
        NID["config/node_id.json"]
    end

    IG["init-genesis"] -->|writes| G
    IG -->|writes| NK
    IC["init-config"] -->|reads genesis, writes| CT
    IC -->|writes| AT
    IC -->|writes| NID
    EXP["exposer sidecar"] -->|reads| G
    EXP -->|reads| NK
    EXP -->|reads| NID

| Container | Reads from emptyDir | Writes to emptyDir |
| --- | --- | --- |
| init-genesis | (nothing yet) | config/genesis.json, config/node_key.json |
| init-config | config/genesis.json | config/config.toml, config/app.toml, config/node_id.json |
| exposer sidecar | config/genesis.json, config/node_id.json, config/node_key.json | (nothing, read-only) |
| validator (main) | everything | data directory |

emptyDir is ephemeral and pod-scoped. No PVCs, no external storage, no cross-pod lock contention. The volume survives container restarts within the pod, but if the pod is deleted or rescheduled, it's gone. For initialization state that gets regenerated on startup, that's exactly right.


Validators Joining

The bootstrap node creates the network. Additional validators need to join it. Their init sequence is different:

flowchart TD
    A["1. init-build-images (optional)<br/>Build chain binary"] --> B["2. wait-for-chains<br/>Poll genesis exposer until ready"]
    B --> C["3. init-validator<br/>Recover keys, fetch genesis.json"]
    C --> D["4. init-config<br/>Set genesis node as persistent peer"]
    D --> F["Main container starts<br/>(validator daemon)"]
    F --> G["postStart hook<br/>Request tokens, submit create-validator tx"]

The critical difference is step 2: wait-for-chains. This init container polls the genesis node's exposer sidecar in a loop, waiting until the node ID is available. It's a distributed readiness gate implemented as a blocking init container.
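A hedged sketch of such a gate (the service name and response check are assumptions based on the article's description, not Starship's exact implementation):

```yaml
initContainers:
  - name: wait-for-chains
    image: busybox
    command:
      - sh
      - -c
      # Block until the genesis pod's exposer answers on /node_id.
      # "osmosis-1-genesis" is a hypothetical service name.
      - |
        until wget -qO- http://osmosis-1-genesis:8081/node_id | grep -q node_id; do
          echo "genesis exposer not ready, retrying..."
          sleep 2
        done
```

The init container never exits successfully until the dependency responds, so everything after it in the pod is gated on the genesis node being ready.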

The validator joining process works like this:

  • wait-for-chains polls the genesis pod's exposer service until it responds with a valid node ID
  • init-validator recovers the validator's private key from a ConfigMap (using the pod's hostname as a key index) and downloads genesis.json from the genesis node's exposer
  • init-config fetches the genesis node's ID and configures it as a persistent_peer in config.toml
  • After the main container starts, a postStart lifecycle hook requests tokens from the faucet and submits a create-validator transaction

The pod hostname determines the key index. Pod validator-1 gets key 1, validator-2 gets key 2. No external coordination service needed.
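One way to sketch that convention (the key file layout, service name, and image are all hypothetical; the real recovery logic lives in the setup-scripts ConfigMap):

```yaml
initContainers:
  - name: init-validator
    image: ghcr.io/example/osmosis:latest   # placeholder image
    command:
      - bash
      - -c
      - |
        # Derive the key index from the ordinal in the pod's hostname,
        # e.g. validator-1 -> 1, and recover that mnemonic from /configs.
        INDEX=${HOSTNAME##*-}
        jq -r ".keys[$INDEX].mnemonic" /configs/keys.json \
          | osmosisd keys add validator --recover --keyring-backend test
        # Fetch the shared genesis from the genesis node's exposer.
        wget -qO /root/.osmosisd/config/genesis.json \
          http://osmosis-1-genesis:8081/genesis
```

Because StatefulSet pod names are stable and ordinal, the mapping from pod to key is deterministic across restarts.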


The Exposer Sidecar

Here's the piece that ties it all together. Init containers handle sequencing within a single pod, but they can't share state across pods. That's where the exposer sidecar comes in.

The genesis StatefulSet doesn't just run the blockchain node. It also runs an exposer sidecar container on port 8081. This is a lightweight HTTP service that serves the genesis file and node metadata to any pod that needs it.

flowchart LR
    subgraph genesis-pod["Genesis Pod"]
        direction TB
        INIT["Init Containers<br/>(sequential)"]
        VAL["Validator<br/>(main container)"]
        EXP["Exposer<br/>(sidecar, port 8081)"]
    end

    subgraph validator-pod["Validator Pod"]
        direction TB
        WAIT["wait-for-chains<br/>(polls exposer)"]
        VINIT["init-validator<br/>(fetches genesis)"]
    end

    EXP -->|"GET /node_id"| WAIT
    EXP -->|"GET /genesis"| VINIT

The exposer mounts the same emptyDir volume as the init containers. Environment variables tell it where each file lives:

| Environment Variable | Path on Shared Volume |
| --- | --- |
| EXPOSER_GENESIS_FILE | {home}/config/genesis.json |
| EXPOSER_NODE_ID_FILE | {home}/config/node_id.json |
| EXPOSER_NODE_KEY_FILE | {home}/config/node_key.json |
| EXPOSER_PRIV_VAL_FILE | {home}/config/priv_validator_key.json |
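Wired into the pod spec, the sidecar might look like this (the image name and home directory are illustrative):

```yaml
containers:
  - name: exposer                 # sidecar alongside the validator container
    image: ghcr.io/example/exposer:latest   # placeholder image
    ports:
      - containerPort: 8081
    env:
      - name: EXPOSER_GENESIS_FILE
        value: /root/.osmosisd/config/genesis.json
      - name: EXPOSER_NODE_ID_FILE
        value: /root/.osmosisd/config/node_id.json
      - name: EXPOSER_NODE_KEY_FILE
        value: /root/.osmosisd/config/node_key.json
    volumeMounts:
      - name: node                # same emptyDir the init containers wrote to
        mountPath: /root/.osmosisd
```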

Once the init containers finish and the sidecar starts, all that state becomes available over HTTP. The API surface is deliberately small:

| Endpoint | Returns | Used By |
| --- | --- | --- |
| GET /node_id | The genesis node's peer ID | wait-for-chains (readiness gate) |
| GET /genesis | The full genesis.json file | init-validator (genesis download) |
| GET /keys | Validator public keys | Other validators joining the network |

This is the coordination layer between pods, and it works without any external infrastructure. No shared persistent volumes across pods, no S3 buckets, no distributed file system. Just an HTTP endpoint backed by a local volume.

The exposer turns a single pod's local state into a service that other pods can depend on. Init containers sequence the writes. The sidecar serves the reads.

The pattern avoids a common anti-pattern: trying to share files between pods via persistent volumes or external storage. Persistent volumes create lock contention and ordering headaches. External storage adds latency and a dependency on infrastructure that might not be available during bootstrap. HTTP is simpler. The genesis pod serves the data, validators fetch it. If the exposer isn't ready yet, the wait-for-chains init container just keeps polling.


Ordering Guarantees

Three mechanisms enforce correct ordering across the system:

Sequential init containers. Kubernetes guarantees that init containers run one at a time, in order. If init-genesis fails, init-config never starts. If init-config fails, the main containers never start. This is not eventual consistency. It's strict sequential execution.

Idempotency checks. The init-genesis container checks whether genesis.json already exists on the volume. If it does, the container exits immediately. This makes pod restarts safe: a pod that crashes and restarts won't regenerate a different genesis file; it reuses the existing one.
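A minimal version of that guard (the command body is a sketch under assumed paths, not the actual Starship script):

```yaml
initContainers:
  - name: init-genesis
    image: ghcr.io/example/osmosis:latest   # placeholder image
    command:
      - bash
      - -c
      - |
        # Idempotency guard: a restarted pod reuses the existing genesis
        # instead of generating a new, conflicting one.
        if [ -f /root/.osmosisd/config/genesis.json ]; then
          echo "genesis.json already exists, skipping"
          exit 0
        fi
        osmosisd init osmosis-genesis --chain-id osmosis-1
```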

Polling readiness gates. The wait-for-chains init container in validator pods polls the genesis node's exposer endpoint in a loop. It doesn't proceed until it gets a valid response. No timeout assumptions, no sleep-and-hope. Just poll until the dependency is actually ready.

These three together give you something that's surprisingly hard to get in distributed systems: deterministic initialization ordering across multiple pods without a central coordinator.


Beyond Blockchain

The blockchain genesis use case is specific, but the pattern is general. Any system with leader-follower initialization can use this approach:

  • Database clusters: Primary initializes the schema and replication slots. Replicas poll a readiness endpoint, fetch the base backup, configure themselves as standbys.
  • Message brokers: The controller node registers the cluster. Brokers poll until the controller is ready, then join with the correct cluster ID.
  • ML training clusters: The coordinator node sets up the training job. Workers poll until the job parameters are available, then join the distributed training run.
  • Service mesh bootstraps: The control plane must be healthy before data plane proxies can fetch their configuration.
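As one concrete transfer of the pattern, a Postgres standby's init sequence could be sketched like this (service names and the replication user are illustrative):

```yaml
initContainers:
  - name: wait-for-primary        # readiness gate, same role as wait-for-chains
    image: postgres:16
    command:
      - sh
      - -c
      # Poll the primary until it accepts connections.
      - until pg_isready -h pg-primary -p 5432; do sleep 2; done
  - name: fetch-base-backup       # fetch shared state, same role as init-validator
    image: postgres:16
    command:
      - sh
      - -c
      # -R writes the standby configuration so the main container
      # starts in replica mode against the primary.
      - pg_basebackup -h pg-primary -U replicator -D /var/lib/postgresql/data -R
    volumeMounts:
      - { name: data, mountPath: /var/lib/postgresql/data }
```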

The ingredients are the same every time: ordered init containers for single-pod sequencing, an exposer sidecar for cross-pod state sharing, and a polling init container as a distributed readiness gate.

