Design Infrastructure for AI Teams

How I built the design infrastructure that lets Drova’s entire team ship on-brand, at speed, without me in the loop

The artefact of this work isn’t a screen, a deck, or a redesign. It’s a piece of design infrastructure: a system that puts Drova’s design rules into the working context of every AI session, every prototype, every PR, so anyone in the company can produce output that honours the brand, on the first pass, without a design review cycle.

I didn’t design eight marketing assets in a single session. I designed the system that designed them.

The problem

Drova ships fast. Six product modules, a Sheila AI layer woven through everything, a marketing surface that has to feel like the same company, and a four-person marketing headcount expected to produce cross-channel output weekly.

As Head of Design, I couldn’t be in every prototype, every PR, every Figma file, every Slack thread where a teammate was about to invent a hex.

Two incidents made the cost legible.

A first cut of RunSafe’s inline Sheila pattern came back with ten distinct design drifts: invented hexes, off-stack typography, ALL CAPS column headers, filled MUI icons where the system spec says outlined. Not because the rules didn’t exist. Because the rules lived in Confluence, which nobody opens mid-session when they’re building something in Claude. A teammate prototyping a new flow couldn’t get the typescale right for the same reason. The rules weren’t wrong. They just weren’t accessible at the moment of work.

The real problem wasn’t a documentation gap. It was an infrastructure gap. And infrastructure is where I went.

My approach

I had three options. Write more documentation (same problem, new location). Add more design review gates (doesn’t scale to the volume of AI-driven output Drova was now producing). Or embed the rules into the working context, at the moment of work.

The third option is the only one that scales with AI-era velocity. When an engineer asks Claude to build a modal, Claude already knows the spacing system, the type stack, the casing conventions, the forbidden hex values. The rules become enforced by construction, not policed by review.

The delivery mechanism was Claude Code’s plugin system: installable knowledge packs that any Drova teammate can pull into any repo, any session.

What I built

The drova-design plugin

I structured the plugin as a cascade, borrowing the mental model from CSS.

Universal foundations sit at the base: brand fonts, brand colours, logo rules, photography, illustration. Two context-specific skill files inherit from them, one for product UI and one for marketing. Three specialist layers sit on top: a UX pattern library, a prototype preflight that forces explicit token citation before Claude produces any component, and a design research methodology covering intake through synthesis.
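The cascade can be pictured as layered lookups, where the most specific layer answers first and anything it doesn't define falls through to the foundations. This is a minimal sketch only: the dictionary keys, the layer variables, and the `resolve` helper are hypothetical illustrations, not the plugin's actual file format. The font names, the outlined-icon rule, and the product typescale values are the only parts taken from this write-up.

```python
from collections import ChainMap

# Hypothetical layer contents. Only the values themselves come from the text;
# the key names are invented for illustration.
foundations = {"brand_fonts": ("Reckless Neue", "NeutrifPro")}
product_ui = {"typescale_px": (12, 16, 20, 24, 32, 40, 80)}
ux_patterns = {"icon_style": "outlined"}


def resolve(key, *layers):
    """Return the value for key from the first (most specific) layer that defines it."""
    return ChainMap(*layers)[key]


# Most specific layer first, universal foundations last.
product_context = (ux_patterns, product_ui, foundations)

brand_fonts = resolve("brand_fonts", *product_context)    # inherited from foundations
icon_style = resolve("icon_style", *product_context)      # defined by the specialist layer
```

The point of the shape is that a question about the brand font never has to touch the product or pattern layers to get an answer.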

Five decisions shaped the architecture.

Universal foundations in their own file. Small, always-loaded, never bloated with context-specific rules. When someone asks what the Drova brand font is, they shouldn’t trigger a 5,000-line NDS context load to find it.

Product and marketing typescales kept separate. Product UI runs 12/16/20/24/32/40/80. Marketing runs larger. The temptation to merge them and add qualifiers creates sprawl. Separate files with the route built into the skill description work cleanly.

Loose coupling over duplication. The plugin points at other teams’ work by name rather than duplicating content. When Drova’s voice infrastructure changes, my design rules don’t need updating: they just point.

Designed to degrade gracefully. The interface repo lives on Bitbucket, gated by infrastructure access. The plugin carries a 3,114-line snapshot bundled inside, so if the token source is unreachable it falls back to that. It works offline, in any repo, without infrastructure access.

Explicit scope boundaries. Every SKILL.md has a “what this does not cover” section. Restraint is a feature. A plugin that tries to answer everything answers nothing well.
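The graceful-degradation decision above follows a familiar pattern: try the live source, and fall back to the bundled copy if it can't be reached. A minimal sketch, assuming the live tokens arrive through some fetch function and the snapshot is a text file on disk. `load_tokens`, `fetch_remote`, and `snapshot_path` are hypothetical names, and the real plugin may express this as instructions in a skill file rather than as code.

```python
from pathlib import Path
from typing import Callable


def load_tokens(fetch_remote: Callable[[], str], snapshot_path: Path) -> tuple[str, str]:
    """Return (source, text): live tokens if reachable, else the bundled snapshot.

    fetch_remote is a hypothetical callable that raises OSError (for example
    ConnectionError or TimeoutError) when the token source cannot be reached.
    """
    try:
        return "remote", fetch_remote()
    except OSError:
        # No infrastructure access: use the copy shipped inside the plugin.
        return "snapshot", snapshot_path.read_text(encoding="utf-8")
```

Returning the source label alongside the text lets a session state which copy of the tokens it is working from.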

Sibling plugins

I also shaped two sibling plugins that form the voice and copy half of Drova’s brand infrastructure: drova-voice-tone (Sheila’s three-field suggestion structure, personalised outputs by job title and industry, report narrative, a voice and tone self-audit linter) and drova-product-copy (generation and review of every product copy surface, from button labels to empty-state heroes).

Together, these three plugins are Drova’s design, voice, and AI personality infrastructure: accessible to every team, every session, every prototype.

Proving the system under load

The test pass

In May 2026 I ran the plugin against Drova’s AI GTM Automation Engine, a seven-layer system that takes a campaign brief and produces every asset across every channel. The question I was testing: could the engine produce production-quality marketing assets, on-brand, across every channel, at speed?

The first pass failed. Not visually: the output would have passed a casual review. It failed on-system. The AI had picked font sizes by eye from inline legacy template values rather than from the canonical typescale.md. The option set it surfaced was too narrow, based on which templates it had just read in the folder rather than on the full campaign model. Direct mail didn’t appear, even though I’d shipped three PostGrid templates the week prior. I called it.

The second-pass brief was unambiguous: use every available content-type skill, declare a body anchor on every asset, derive every h-level from the canonical typescale, audit against it before declaring done. The result was eight production assets across five channels in a single session.

Asset	Channel	Output
Stat-hook LinkedIn tile	linkedin-organic	1080×1080 PNG
Pull-quote LinkedIn tile	linkedin-organic	1080×1080 PNG
Braze resource email header	email-marketing	1960×1084 PNG
Sales-enablement one-pager	sales-enablement	A4 PDF
Industry-edition report (3pp)	industry-edition-report	A4 PDF
PostGrid Tier-1 ABM letter	direct-mail	A4 PDF
PostGrid postcard front	direct-mail	888×600 PNG
PostGrid postcard back	direct-mail	888×600 PNG

Every one on the marketing typescale. Every Reckless Neue use restricted to its four documented reserved appearances.
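A per-element typescale audit of the kind the second-pass brief demanded can be sketched as a membership check of every declared size against the canonical scale. The marketing scale's values aren't listed here, so this sketch takes the scale as a parameter and uses the product UI steps quoted earlier as stand-in data. `audit_sizes` and the sample element names are hypothetical.

```python
# Product UI steps quoted earlier, used here only as stand-in data.
PRODUCT_SCALE_PX = (12, 16, 20, 24, 32, 40, 80)


def audit_sizes(declared: dict[str, int], scale: tuple[int, ...]) -> dict[str, bool]:
    """Map each element name to True (size is on the scale) or False (off the scale)."""
    allowed = set(scale)
    return {element: size in allowed for element, size in declared.items()}


# Hypothetical asset: the body size was picked by eye and is not a scale step.
tile = {"h1": 80, "h3": 24, "body": 18}
verdicts = audit_sizes(tile, PRODUCT_SCALE_PX)
# verdicts == {"h1": True, "h3": True, "body": False}
```

An asset passes only when every verdict is true, which turns "looks right" into a check that can be run before declaring done.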

What the test surfaced

The most valuable output wasn’t the assets. It was the gaps.

The SOLAR hex in my plugin (#E9EAE1) didn’t match the SOLAR hex used across every production template (#F8F7F0). The AI had followed the templates, which is the right behaviour. The drift was mine, an inconsistency I’d been carrying and hadn’t acted on. The test didn’t test the AI. It tested me.

I flagged it publicly in the team Slack update, along with the fact that three of eight assets used existing templates and five were authored inline, with the reason per asset. For one of the five, the reason was that the AI had substituted a fabricated phone number for a merge-variable placeholder. The number fell in an Ofcom drama-use range so it was harmless, but the fabrication was the mistake. I named it. Honesty in that moment is leadership. Letting AI’s quiet overreach propagate through team communication is how design standards degrade.

Deepening the system

Vocabulary as architecture

The same session surfaced a structural question in the design skill layer. The graphic-design SKILL.md I’d been working from was 350 lines covering social tiles, multi-page reports, PDF pipelines, font fidelity, and PostGrid rendering, all combined. The AI’s first instinct was to reorganise it by render pipeline: social tiles, PDF collateral, PDF reports.

I caught the miss. Marketers don’t think in pipelines. They think in content types: brochures, flyers, one-pagers, direct mail. The vocabulary chosen at the architecture layer constrains every downstream interaction with the system. The workforce became eight content-type skills named in the language of the system’s actual consumer: social-media, email-marketing, brochures, flyers, one-pagers, sales-enablement, reports, direct-mail.

Grounding AI in the design system

While drafting PostGrid templates for the new workforce, I asked what the AI had used for the greys in its first version. Several values were invented: not from the documented TEXT ramp, not from the GREY palette. A Dawn radial gradient on the postcard: improvised. The CTA pill: uppercase with letter-spacing, which contradicts the drova.com Button component spec.

This is the question that defines AI-leveraged design work. Not whether the AI can produce a template (it can) but whether the AI is grounded in your design system or making things up. The answer is always: making things up, unless you force grounding.

I replaced the invented greys with TEXT MEDIUM #3F4E66 and GREY 200 #E5E1E1. Dropped the Dawn gradient (no documented print surface treatment). Rebuilt the CTA pill to match the actual Button component: sentence case, no letter-spacing, no arrow.
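Forcing grounding can start with a check as small as this: collect every hex in a template and report any that the documented palette doesn't contain. A minimal sketch, assuming a palette made up of only the two values named above; the full TEXT ramp and GREY palette aren't reproduced here. `find_invented_hexes` is a hypothetical name, not the plugin's actual drift grep, and the pattern only matches six-digit hexes.

```python
import re

# Six-digit hex colours only; three- and eight-digit forms are out of scope here.
HEX_RE = re.compile(r"#[0-9A-Fa-f]{6}\b")

# Only the two values named in the text; a real palette would be longer.
ALLOWED = {"#3F4E66": "TEXT MEDIUM", "#E5E1E1": "GREY 200"}


def find_invented_hexes(source: str, allowed: dict[str, str] = ALLOWED) -> list[str]:
    """Return the sorted, upper-cased hexes in source that are not documented tokens."""
    found = {match.group(0).upper() for match in HEX_RE.finditer(source)}
    return sorted(found - set(allowed))


# Hypothetical template fragment with one undocumented grey.
css = "color: #3f4e66; border: 1px solid #e5e1e1; background: #9aa3b0;"
drift = find_invented_hexes(css)
# drift == ["#9AA3B0"]
```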

I also added a foundation rule that hadn’t existed: h1, h2, and h3 never carry a terminal full stop. One line in drova-design-foundations, propagating universally across product UI, marketing, print, and social on next plugin install. A single editorial decision that reaches every customer-facing surface Drova ships, encoded once, inherited everywhere.
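A rule this small is also easy to check mechanically. The sketch below assumes headings have already been extracted as (level, text) pairs; that input shape and the function name are hypothetical, and the plugin itself states the rule in prose rather than running code. As written, the check also flags a heading that ends in an ellipsis typed as three full stops.

```python
def headings_with_full_stop(headings: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Return the h1, h2 and h3 headings whose text ends in a full stop."""
    return [
        (level, text)
        for level, text in headings
        if level in (1, 2, 3) and text.rstrip().endswith(".")
    ]


# Hypothetical page outline: only the h2 breaks the rule; h4 is outside its scope.
outline = [
    (1, "Safer sites, fewer surprises"),
    (2, "What changes on day one."),
    (4, "Footnote heading."),
]
violations = headings_with_full_stop(outline)
# violations == [(2, "What changes on day one.")]
```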

Typography as calibrated craft

PostGrid doesn’t support custom fonts. Drova’s web-safe fallback mapping (Reckless Neue to Times New Roman, NeutrifPro to Poppins) had existed as policy but had never been calibrated level by level against the typescale.

I built a contact sheet: brand-baseline next to web-safe-fallback at every level, identical starting values, a long-body specimen for paragraph rhythm. I worked through it level by level.

h2 in Times New Roman is scaled up 1.094× to match the ascender/descender envelope, so it matches visible mass rather than x-height. h3 through h6 in Poppins are scaled down 0.917× because Poppins reads visually larger at the same nominal size. Body leading diverges intentionally: brand stays at 1.125 because NeutrifPro feels more decisive there; the fallback goes to 1.2, which suits Poppins’s looser default feel.

These weren’t aesthetic preferences. Each value was calibrated against the contact sheet and locked into the typescale itself, not left as per-template overrides. The contact sheet stays in the repository as the canonical visual reference. Future changes must re-validate against it.
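Locking the calibration into the typescale means a fallback size is always derived from the brand size by a fixed factor, never set by hand per template. A minimal sketch using the factors and leading values quoted above. The function names are hypothetical, the brand sizes in the usage lines are illustrative rather than Drova's actual marketing scale, and treating any unlisted level as unscaled is an assumption.

```python
# Factors and leading values quoted in the text.
FALLBACK_FACTOR = {"h2": 1.094, "h3": 0.917, "h4": 0.917, "h5": 0.917, "h6": 0.917}
BODY_LEADING = {"brand": 1.125, "fallback": 1.2}


def fallback_size(level: str, brand_size_px: float) -> float:
    """Scale a brand font size to its web-safe fallback size, rounded to 2 decimals.

    Levels without a documented factor are left unscaled (an assumption).
    """
    return round(brand_size_px * FALLBACK_FACTOR.get(level, 1.0), 2)


def body_line_height(font_size_px: float, variant: str) -> float:
    """Line height in px for body copy; variant is 'brand' or 'fallback'."""
    return round(font_size_px * BODY_LEADING[variant], 2)


# Illustrative brand sizes only.
h2_fallback = fallback_size("h2", 32)           # 32 * 1.094 = 35.008, rounds to 35.01
h3_fallback = fallback_size("h3", 24)           # 24 * 0.917 = 22.008, rounds to 22.01
brand_leading = body_line_height(16, "brand")   # 16 * 1.125 = 18.0
```

Keeping the factors in one table is what makes re-validation against the contact sheet a single-place change.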

The outcome

The plugin system works. Product, marketing, engineering, and brand all consume it. Every prototype now passes a design drift grep on the first pass. Every Sheila surface uses canonical token names. The cost of finding a design drift moved from the day before a release to the moment the code is written.

The design system is now AI-native. The plugin treats Claude as a first-class consumer. Every SKILL.md is written so an AI agent reads it correctly: trigger phrases, decision trees, explicit scope boundaries. This is design language designed for machine readers.

Eight production assets across five channels in a single session. Every one on the marketing typescale, every Reckless Neue use restricted to its four reserved appearances, every Earth-band closing quote following print-PDF fidelity rule eleven exactly. Full per-asset typescale audit with pass/fail verdict on every element, for every asset.

Eight stacked PRs across two repositories. Approximately 1,300 lines of substantive change: new foundation rule, calibrated typescale, eight-skill graphic-design workforce, plugin v1.5.0.

Ten documented gaps ranked P0/P1/P2. Each one a system improvement I now know to make. Each gap raised the ceiling for the next cycle.

What I learned

The most leveraged thing a senior designer can do in 2026 isn’t to design more screens. It’s to design the system that ensures every screen, built by anyone, with any tool, at any speed, honours the brand.

This project also clarified what the pushback moment looks like in AI-era design work. The AI produced v1 output that was “good enough.” I said: no, the rails exist for a reason. That correction, and the discipline to make it every time rather than letting “good enough” pass, is what keeps a design system from quietly degrading under AI-driven volume. It’s not a technical skill. It’s a design leadership one.

The arc of the role I’m building at Drova: less time on the critical path for every asset, more time authoring the conditions under which every asset arrives already right.

Chantelle Staples

