Architecting Out Loud
Cross-Repo Integration Testing · Part 1 of 2
architecture · systems · devops · monorepo

134 Packages, 3 Monorepos, Zero Guarantees

Jan 15, 2026 · 6 min read

Every CI pipeline is green. Every repo passes its tests. And staging is broken. Again.


The Setup

We run three monorepos. Together they contain 134 packages (77 + 51 + 6), and each repo ships independently.

flowchart LR
    subgraph backend["backend (77 packages)"]
        direction TB
        B1["GraphQL server & API"]
        B2["Database tooling"]
        B3["Job workers & uploads"]
        B4["CLI"]
    end

    subgraph database["database (51 packages)"]
        direction TB
        D1["Schema framework & migrations"]
        D2["21 reusable SQL modules"]
        D3["Seed & test tooling"]
    end

    subgraph dashboard["dashboard (6 packages)"]
        direction TB
        U1["Next.js 15 admin app"]
        U2["Shared component library"]
        U3["Type-safe GraphQL data layer"]
    end

| Repo | Packages | What It Does |
| --- | --- | --- |
| backend | 77 | GraphQL server, database tooling, job workers, file uploads, CLI (public) |
| database | 51 | Schema framework, migrations, 21 reusable SQL modules, seed tooling (private, pgpm workspace) |
| dashboard | 6 | Next.js 15 admin app, shared component library, type-safe GraphQL data layer (private) |

The split isn't arbitrary. One repo is public open-source, two are private. The database repo is a pgpm workspace with its own package structure. Consolidating to one repo isn't practical, so we need to make multi-repo actually work.

Each repo has its own CI. Each repo's CI passes. Each repo merges independently.

And then things break.


What "Broken" Actually Looks Like

It's never a clean error. It's a slow realization.

  • Demo falls apart. Staging stays broken for days without anyone noticing. Then someone demos a feature and the data grid is empty or forms don't save.
  • Screenshot Slack thread. The frontend team reports a staging bug, and you spend an hour syncing environments just to reproduce it: different backend version, different database state.
  • Fix breaks something else. A backend change fixes the API, but the dashboard's type-safe queries were generated against the old schema. There's no way to know until someone clicks through the app.
  • Version drift. The frontend team runs an older backend locally. "It works on my machine", except the machine is an entire stack of mismatched versions.
  • kubectl debugging session. You're tailing logs across three services trying to figure out if it's a schema mismatch, type drift, or stale codegen. You don't know where to start.

The common thread: if no one actively looked at staging, no one knew it was broken.

The real cost wasn't the bugs themselves (those were usually small). It was the time spent figuring out where the bug was, which versions were involved, and whether the fix actually worked across all three repos.


The Multi-Monorepo Problem

This isn't a monorepo problem. In a monorepo, everything is in one repo, and a single CI run can catch cross-package breakage. This isn't a microservices problem either. Microservices typically communicate over versioned APIs with explicit contracts.

This is a multi-monorepo problem. Three repositories with deep, implicit dependencies. No API contracts between them, just shared assumptions about database schemas, GraphQL types, and generated code.

The failure is always in the seams between repos.

Here's how the cascade works:

flowchart LR
    A[DB: column type changes] --> B[Backend: codegen runs]
    B --> C[Dashboard: queries stale]
    C --> D[UI: renders garbage]

  • A SQL module changes a column type
  • The backend's GraphQL codegen produces different TypeScript types
  • The dashboard's type-safe queries (generated from those types) are now wrong
  • The data grid renders garbage, and optimistic updates write incorrect data

Each repo's tests pass because each repo tests against its own snapshot of the world.
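To make the "own snapshot" point concrete, here is a toy model in Python. Everything in it is hypothetical: the `Repo` class, the `orders.total` column, and the idea of representing a schema as a flat dictionary are illustrative assumptions, not how our repos are actually structured. It only shows why per-repo CI can be green while the combination is broken.

```python
from dataclasses import dataclass

# Toy model (assumption): a schema is just "table.column" -> SQL type.
Schema = dict[str, str]


@dataclass
class Repo:
    name: str
    expected: Schema      # schema the repo's generated code assumes
    test_fixture: Schema  # schema the repo's own CI runs against

    def ci_passes(self) -> bool:
        """Per-repo CI: generated code is checked against the repo's own fixture."""
        return self.expected == self.test_fixture


def drifted_columns(live: Schema, repo: Repo) -> list[str]:
    """Columns whose live type differs from what the repo's code assumes."""
    return sorted(col for col, typ in live.items() if repo.expected.get(col) != typ)


# A SQL module changes orders.total from integer to numeric (hypothetical column).
old_schema = {"orders.id": "uuid", "orders.total": "integer"}
live_schema = {"orders.id": "uuid", "orders.total": "numeric"}

# The backend has regenerated its types; the dashboard still holds the old snapshot.
backend = Repo("backend", expected=live_schema, test_fixture=live_schema)
dashboard = Repo("dashboard", expected=old_schema, test_fixture=old_schema)

print(backend.ci_passes(), dashboard.ci_passes())  # True True
print(drifted_columns(live_schema, dashboard))     # ['orders.total']
```

Both repos report green because each one compares itself to its own fixture. The mismatch only shows up when the dashboard's assumptions are compared against what another repo actually ships.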

Manual smoke-testing doesn't scale. You can't manually verify 134 packages across 3 repos every time someone merges a PR. You need automated cross-repo integration testing. But where does it live?


The Integration Hub

The answer is a fourth repo, a dedicated integration-hub, that exists solely to test the three repos together.

It contains the three monorepos as git submodules. It has its own test suite: end-to-end tests that exercise the full stack with database, backend, and dashboard running together. It doesn't own any application code. Its only job is to verify that a specific combination of the three repos actually works.

flowchart TB
    subgraph hub["integration-hub repo"]
        direction TB
        SM_BE["git submodule: backend"]
        SM_DB["git submodule: database"]
        SM_UI["git submodule: dashboard"]
        E2E["E2E test suite"]
    end

    BE["backend repo<br/>77 packages"] --> SM_BE
    DB["database repo<br/>51 packages"] --> SM_DB
    UI["dashboard repo<br/>6 packages"] --> SM_UI

    SM_BE --> E2E
    SM_DB --> E2E
    SM_UI --> E2E

    E2E -->|pass| MERGE["Advance submodule pointers on main"]
    E2E -->|fail| ALERT["Alert: combination is broken"]

The submodule pointers on the hub's main branch always point to commits that have been tested together and passed. If you check out main, you get a known-good combination of all three repos.
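As a rough sketch of how those pins could be read programmatically, the Python below parses the output of `git submodule status`, which prints one line per submodule: a one-character status flag, the commit SHA, the path, and an optional description in parentheses. The submodule paths `backend`, `database`, and `dashboard` are assumed from the diagram, and the `SubmodulePin` type and both function names are made up for this example.

```python
import subprocess
from typing import NamedTuple


class SubmodulePin(NamedTuple):
    path: str
    sha: str
    in_sync: bool  # False when git prefixes the line with '-', '+' or 'U'


def parse_submodule_status(output: str) -> list[SubmodulePin]:
    """Parse `git submodule status` output into (path, sha, in_sync) records.

    Assumes submodule paths contain no spaces.
    """
    pins = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # First character is the status flag: ' ' means the checked-out
        # commit matches the SHA recorded in the superproject.
        flag, rest = line[0], line[1:]
        sha, path = rest.split()[:2]
        pins.append(SubmodulePin(path=path, sha=sha, in_sync=(flag == " ")))
    return pins


def read_pins(hub_dir: str) -> list[SubmodulePin]:
    """Run git in a checkout of the hub repo and return its submodule pins."""
    result = subprocess.run(
        ["git", "submodule", "status"],
        cwd=hub_dir, check=True, capture_output=True, text=True,
    )
    return parse_submodule_status(result.stdout)


# Sample output with placeholder SHAs, so the parser can be tried without a repo.
sample = (
    " " + "a" * 40 + " backend (heads/main)\n"
    "+" + "b" * 40 + " database (heads/main)\n"
    "-" + "c" * 40 + " dashboard\n"
)
for pin in parse_submodule_status(sample):
    print(pin.path, pin.sha[:7], pin.in_sync)
```

On a clean checkout of the hub's main branch, every pin would be expected to report `in_sync=True`. A `+` flag means the working copy has moved away from the tested commit, and `-` means the submodule has not been initialized.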


The Known-Good Stack Pointer

The core concept is the known-good stack pointer, a triple of commit SHAs that represents a tested, working combination of the entire stack:

backend:   a1b2c3d
database:  e4f5g6h
dashboard: i7j8k9l

graph LR
    subgraph "Known-Good Stack Pointer"
        direction TB
        BE_SHA["backend @ a1b2c3d"]
        DB_SHA["database @ e4f5g6h"]
        UI_SHA["dashboard @ i7j8k9l"]
    end

    BE_SHA --- TESTED["E2E Tested Together"]
    DB_SHA --- TESTED
    UI_SHA --- TESTED
    TESTED -->|"All pass"| SAFE["Safe to Deploy"]

This triple is what gets deployed. Not "latest main from each repo" (that's untested). The stack pointer is the last combination that passed integration tests. It might lag behind the individual repos by a few commits. That's fine. That's the point.

When a new commit lands on any of the three repos, the integration hub proposes a new candidate triple: the current known-good versions of the other two repos, plus the new commit. If E2E passes, the stack pointer advances. If it fails, the pointer stays where it is, and someone gets alerted.
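The advance rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our actual automation: `StackPointer`, `propose_candidate`, `try_advance`, and the `run_e2e` callback are hypothetical names, and the SHA `0f1e2d3` is a made-up placeholder for a new backend commit.

```python
from dataclasses import dataclass, replace
from typing import Callable

REPOS = ("backend", "database", "dashboard")


@dataclass(frozen=True)
class StackPointer:
    """A triple of commit SHAs that was (or will be) tested together."""
    backend: str
    database: str
    dashboard: str


def propose_candidate(known_good: StackPointer, repo: str, new_sha: str) -> StackPointer:
    """Keep the known-good SHAs of the other two repos and swap in the new commit."""
    if repo not in REPOS:
        raise ValueError(f"unknown repo: {repo}")
    return replace(known_good, **{repo: new_sha})


def try_advance(
    known_good: StackPointer,
    repo: str,
    new_sha: str,
    run_e2e: Callable[[StackPointer], bool],
) -> tuple[StackPointer, bool]:
    """Return (pointer, advanced). The pointer only moves if E2E passes."""
    candidate = propose_candidate(known_good, repo, new_sha)
    if run_e2e(candidate):
        return candidate, True
    return known_good, False  # pointer stays put; the caller should alert


known_good = StackPointer(backend="a1b2c3d", database="e4f5g6h", dashboard="i7j8k9l")

# A new backend commit passes E2E: the pointer advances.
pointer, advanced = try_advance(known_good, "backend", "0f1e2d3", run_e2e=lambda c: True)
print(pointer, advanced)

# The same commit failing E2E: the pointer is unchanged.
pointer, advanced = try_advance(known_good, "backend", "0f1e2d3", run_e2e=lambda c: False)
print(pointer == known_good, advanced)  # True False
```

Making the pointer immutable (`frozen=True`) means a failed candidate can never partially overwrite the known-good triple; advancing always produces a new value.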


What's Next

This is the concept. A dedicated integration repo. Git submodules as version pins. A known-good stack pointer that only advances when tests pass.

We're not claiming this is the final answer, but it gave us something we didn't have before: confidence. Confidence that when the pointer advances, the stack actually works. Confidence that when it doesn't, we find out immediately, not during a demo.

In Part 2, we'll cover how to automate this entirely with GitHub Actions: rolling PRs, cross-repo dispatch, automated merging, and failure handling.


Resources