NuMart is NULogic's serverless, headless commerce accelerator — a battle-tested GraphQL BFF layer orchestrating a mesh of internal microservices and 20+ external SaaS platforms. Production-proven on a major North American apparel retailer. Cloud-agnostic runtime (AWS Lambda today, GCP Cloud Run in flight). This portal is a live walkthrough of the platform, the code, and how we deliver.
These are the accelerator assets inside the NuMart platform — reusable on every engagement. Clients do not pay to rebuild this.
Microservices: 8 (GraphQL + 7 domain services)
Lambda Resolvers: 104 (auto-discovered at deploy)
GraphQL Schemas: 28 (unified BFF schema)
Integration Modules: 28 (cart, payments, search, fraud…)
External SaaS: 20+ (Braintree, Bloomreach, SFMC…)
Storefronts Served: 3 (multi-brand, multi-locale)
Cache Hit Target: 85% (Redis ElastiCache)
Cold-start Defense: 0ms (warmup pings every 5 min)
High-Level Architecture
A thin GraphQL gateway over a mesh of domain services.
Each GraphQL field is an independently deployed Lambda. Internal microservices expose REST APIs; the @numart-gcp/lib-domain/request package standardises every inter-service call. External SaaS is reached via NAT gateway.
Every internal call goes through domainRequest.<service>.<method>(). Authentication, retries, channel context, and error shape are uniform across the mesh.
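A minimal sketch of how a uniform domainRequest.<service>.<method>() surface could be assembled. The real @numart-gcp/lib-domain/request API is not shown in this portal, so the Transport type, the header names, the retry count, and the result shape below are all hypothetical.

```typescript
// Hypothetical sketch: NOT the actual @numart-gcp/lib-domain/request API.
type CallContext = { token: string; channel: string };
type DomainCall = {
  service: string;
  method: string;
  headers: Record<string, string>;
  body: unknown;
};
type Transport = (call: DomainCall) => Promise<unknown>;
type DomainResult =
  | { ok: true; data: unknown }
  | { ok: false; error: { service: string; method: string; message: string } };
type ServiceClient = Record<string, (body?: unknown) => Promise<DomainResult>>;

function createDomainRequest(
  transport: Transport,
  ctx: CallContext,
  maxAttempts = 3, // assumption: the real retry policy is not documented here
): Record<string, ServiceClient> {
  const invoke = async (service: string, method: string, body: unknown): Promise<DomainResult> => {
    const call: DomainCall = {
      service,
      method,
      // Auth and channel context are attached in one place for every call.
      headers: { authorization: `Bearer ${ctx.token}`, "x-channel": ctx.channel },
      body,
    };
    let lastMessage = "unknown error";
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return { ok: true, data: await transport(call) };
      } catch (err) {
        lastMessage = err instanceof Error ? err.message : String(err);
      }
    }
    // Every failure leaves the mesh in the same shape.
    return { ok: false, error: { service, method, message: lastMessage } };
  };

  // Nested proxies turn property access into (service, method) pairs.
  return new Proxy({} as Record<string, ServiceClient>, {
    get: (_t, service) =>
      new Proxy({} as ServiceClient, {
        get: (_s, method) => (body?: unknown) => invoke(String(service), String(method), body),
      }),
  });
}
```

With this shape, a call such as domainRequest.identity.getAccount({ id: 1 }) needs no per-service client code. The trade-off is that method names are not checked at compile time unless a typed service map is layered on top.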
🔐
identity
User authentication, Cognito token management, session refresh, account CRUD.
Storefront configuration, feature flags per brand & region, locale and currency bindings.
REST
🧩
components · pim
CMS-driven UI component data and long-tail product metadata.
REST
External SaaS Layer
20+ third-party systems, fully pre-wired.
Every vendor below has production-ready integration code with retry policies, credential resolution via Secrets Manager/SSM, and a consistent error envelope.
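To illustrate what a consistent error envelope can look like, here is a small sketch that maps whatever a vendor client throws into one shape. The envelope fields and the retryable rule are assumptions for illustration; the portal does not publish the real envelope.

```typescript
// Hypothetical envelope shape; the real NuMart envelope is not shown in this portal.
interface ErrorEnvelope {
  vendor: string;
  code: string;
  message: string;
  retryable: boolean;
}

// Normalise an arbitrary thrown value (Error, HTTP error object, string) into one envelope.
function toErrorEnvelope(vendor: string, err: unknown): ErrorEnvelope {
  const e = (typeof err === "object" && err !== null ? err : {}) as {
    status?: unknown;
    code?: unknown;
    message?: unknown;
  };
  const status = typeof e.status === "number" ? e.status : undefined;
  const code =
    typeof e.code === "string" ? e.code : status !== undefined ? `HTTP_${status}` : "UNKNOWN";
  const message = typeof e.message === "string" ? e.message : String(err);
  // Assumption: throttling (429) and 5xx responses are retryable; everything else is not.
  const retryable = status === 429 || (status !== undefined && status >= 500);
  return { vendor, code, message, retryable };
}
```

Callers can then branch on retryable without knowing which vendor failed or how its SDK reports errors.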
💳
Braintree · PayPal
Credit-card auth, Venmo, PayPal checkout, tokenization. Pre-auth via custom gateway.
REST · OAuth2
🎁
SVS Gift Cards
Gift-card balance, virtual card issuance, pre-auth via SOAP envelope.
Real-time tax calculation, BNPL checkout, alternative payment rails.
REST
Build vs Buy Analysis
What the platform gives you vs what we build per client.
Scope clarity on Day 1. The left column is already in the box — wired, tested, and running in production. The right column is the work that is genuinely unique to the client.
✓ Out of the box — NuMart gives you this
GraphQL schema for the entire commerce journey (PDP, PLP, cart, checkout, account, orders)
Activities: MAO / OMS order-dispatch wiring, warmup tuning, load test at 3× peak traffic, cache-hit-rate tuning, observability stack.
Deliverables: Load-test report, runbooks, alert catalogue.
Risks: OMS partner SLAs around peak throughput.
Phase 5 · 2 weeks
Cutover & Hypercare
Activities: Canary cutover per channel (usually outlet first), PSP vault swap, DNS switch. 24×7 war room for 2 weeks.
Deliverables: Go-live, post-cutover metrics, handover doc.
Risks: Black Friday proximity; cutover windows are short.
Risk Register
What goes wrong — and what we do about it.
These are the specific failure modes we've hit on comparable headless-commerce migrations. Every mitigation is something we can point to in code today.
Critical
PSP vault migration during cutover
Moving tokenized cards to a new PSP without breaking saved wallets is the single highest-risk step on go-live day.
Mitigation: Dual-write + shadow auth for 14 days pre-cutover. Canary by brand. Vault diff report nightly.
Owner: Payments Tech Lead
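A nightly vault diff can be as simple as comparing the two vaults record by record. The sketch below assumes a hypothetical record shape keyed by a walletItemId; the real vault export format is not described here.

```typescript
// Hypothetical record shape for one saved payment method.
interface VaultRecord {
  walletItemId: string;
  last4: string;
  expiry: string;
}

interface VaultDiff {
  missing: string[]; // present in the legacy vault, absent from the new PSP
  mismatched: string[]; // present in both, but card details differ
}

// Compare the legacy vault against the migrated vault, keyed by wallet item.
function diffVaults(legacy: VaultRecord[], migrated: VaultRecord[]): VaultDiff {
  const byId = new Map(migrated.map((r) => [r.walletItemId, r] as const));
  const diff: VaultDiff = { missing: [], mismatched: [] };
  for (const old of legacy) {
    const next = byId.get(old.walletItemId);
    if (!next) {
      diff.missing.push(old.walletItemId);
    } else if (next.last4 !== old.last4 || next.expiry !== old.expiry) {
      diff.mismatched.push(old.walletItemId);
    }
  }
  return diff;
}
```

An empty diff on both lists is the signal that saved wallets should survive the swap; anything else is worth resolving before the canary widens.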
Critical
Cold-start regressions on Black Friday
Lambda concurrency spikes cause 3-second cold starts that cascade into cart abandonment.
Mitigation: Warmup plugin with 25ms hot loop; provisioned concurrency on top-15 resolvers during peak; forced concurrency floor.
Owner: Platform SRE
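One way to read the warmup mitigation above: a wrapper that short-circuits warmup pings before they reach resolver logic. This sketch assumes the warmup event carries a marker field (the field name and value here are hypothetical) and interprets the 25ms hot loop as a brief hold on the container.

```typescript
// Assumption: warmup invocations carry a marker field; name and value are hypothetical.
type WarmupEvent = { source: "warmup" };
const HOT_LOOP_MS = 25;

function isWarmup(event: unknown): event is WarmupEvent {
  return (
    typeof event === "object" &&
    event !== null &&
    (event as { source?: unknown }).source === "warmup"
  );
}

// Wrap a resolver so warmup pings return early and never run business logic.
function withWarmup<E, R>(handler: (event: E) => Promise<R>, holdMs = HOT_LOOP_MS) {
  return async (event: E | WarmupEvent): Promise<R | "warmed"> => {
    if (isWarmup(event)) {
      // Holding the container briefly lets concurrent pings land on separate containers.
      await new Promise<void>((resolve) => setTimeout(resolve, holdMs));
      return "warmed";
    }
    return handler(event as E);
  };
}
```

Keeping the check in a wrapper means individual resolvers stay unaware of warmup traffic.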
High
Inventory oversell during flash events
Stale inventory cache + concurrent add-to-cart results in overselling constrained SKUs.
Mitigation: Short 2-min TTL on availability; pre-reserve at add-to-cart for low-stock SKUs; MAO callback to invalidate cache.
Owner: Inventory SME
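The pre-reserve idea can be sketched with an in-memory stand-in for the availability store. The low-stock threshold and the class below are hypothetical; a production version would need an atomic operation in the backing store rather than a local Map.

```typescript
// Assumption: SKUs at or below this on-hand count are pre-reserved at add-to-cart.
const LOW_STOCK_THRESHOLD = 5;

// Hypothetical in-memory stand-in for the availability store.
class Availability {
  private onHand = new Map<string, number>();
  private reserved = new Map<string, number>();

  setOnHand(sku: string, qty: number): void {
    this.onHand.set(sku, qty);
  }

  // Units not yet held by any cart.
  free(sku: string): number {
    return (this.onHand.get(sku) ?? 0) - (this.reserved.get(sku) ?? 0);
  }

  // Returns true when the add-to-cart may proceed.
  addToCart(sku: string, qty: number): boolean {
    if (this.free(sku) < qty) return false; // not enough unreserved stock
    if ((this.onHand.get(sku) ?? 0) <= LOW_STOCK_THRESHOLD) {
      // Low-stock SKU: hold the units now so a second cart cannot claim them.
      this.reserved.set(sku, (this.reserved.get(sku) ?? 0) + qty);
    }
    return true;
  }
}
```

High-stock SKUs skip the reservation entirely, which keeps the hot path cheap for the bulk of the catalogue.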
High
Secrets leakage in source maps
Bundled webpack source maps can expose API keys to CloudWatch or client-side error tools.
Mitigation: nosources-source-map devtool in webpack; secrets loaded at cold start only; IAM policy bars console GetSecretValue from dev accounts.
Owner: Security Engineer
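For reference, the webpack side of this mitigation is a single setting. The fragment below is a sketch: only the devtool value comes from the text above, and the mode value is an assumption for illustration.

```typescript
// webpack.config.ts (fragment). Export this object from the real config file.
// Only `devtool` reflects the mitigation above; `mode` is an illustrative assumption.
const config = {
  mode: "production",
  // Emits file/line/column mappings for stack traces without embedding source contents.
  devtool: "nosources-source-map",
};
```

Stack traces stay readable in error tooling, while the original source text is left out of the emitted maps.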
High
MAO order-sync lag
SQS backlog on MAO delivery delays order-status updates — customers see "pending" for hours.
Mitigation: Per-region FIFO queues, DLQ with auto-replay, synthetic order monitor on 5-min cron.
Owner: Order Domain Lead
Medium
SFMC journey breakage on email-template drift
Marketing team updates templates; server-side event payload no longer matches schema.
Mitigation: Contract tests between NuMart and SFMC on every marketing template change; alert to #commerce-marketing.
Owner: Marketing Tech Lead
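A contract test of this kind boils down to checking an event payload against the fields a template expects. The sketch below uses a hypothetical contract format (field name mapped to a primitive type); the real NuMart and SFMC contract format is not described here.

```typescript
// Hypothetical contract: field name -> expected primitive type.
type FieldType = "string" | "number" | "boolean";
type Contract = Record<string, FieldType>;

// Returns a list of violations; an empty list means the payload honours the contract.
function checkContract(contract: Contract, payload: Record<string, unknown>): string[] {
  const violations: string[] = [];
  for (const field of Object.keys(contract)) {
    const expected = contract[field];
    if (!(field in payload)) {
      violations.push(`missing: ${field}`);
    } else if (typeof payload[field] !== expected) {
      violations.push(`type: ${field} should be ${expected}`);
    }
  }
  return violations;
}
```

Running a check like this in CI on every template change turns silent journey breakage into a failed build.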
Medium
Forter false-positive spike
Promotional events trigger fraud model retraining; false-positive rate spikes for 48h.
Mitigation: Pre-event Forter tuning session; fall-back to manual review queue sized for 3× normal volume.
Owner: Fraud Ops
Medium
Bloomreach ranking regression
New product taxonomy confuses ranking; conversion drops on critical categories.
Mitigation: Pre-launch A/B on top-20 categories; roll-back path retains the prior ranking JSON for 30 days.
Owner: Merchandising Tech
War Room Stories
Real incidents. Real lessons. Already in the code.
We don't pretend nothing has gone wrong. Here's a sample of what did — and the fixes that are now standard in NuMart.
The JWT Refresh Death Spiral
— Platform Lead, post-mortem 2023
On a peak Saturday, an Apollo Link retry policy interacted badly with the Nucleus GraphQL token-refresh endpoint. Each expired token produced two refresh calls; each failed refresh produced three retries. 180k resolver invocations fanned out into 1.1M identity calls in under 4 minutes. The identity service throttled, and every downstream cart call returned a 401. Checkout gross sat at zero for 11 minutes. We learned that client-side exponential backoff is not enough: you need a single-flight lock on refresh.
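A single-flight lock means that while one refresh is in progress, every other caller waits on the same promise instead of starting its own. The sketch below shows the idea in isolation; the Refresh type is hypothetical, and jittered backoff is left out for brevity.

```typescript
// Hypothetical refresh function type; the real identity call is not shown in this portal.
type Refresh = () => Promise<string>;

// Wrap a refresh function so concurrent callers share one in-flight request.
function singleFlight(refresh: Refresh): Refresh {
  let inFlight: Promise<string> | null = null;
  return () => {
    if (inFlight === null) {
      // The first caller starts the refresh.
      const started = refresh();
      inFlight = started;
      // Clear the slot once it settles (success or failure) so a later expiry can refresh again.
      const clear = () => {
        if (inFlight === started) inFlight = null;
      };
      started.then(clear, clear);
    }
    // Every caller arriving in the meantime gets the same promise.
    return inFlight;
  };
}
```

With this in place, a burst of expired tokens produces one identity call per refresh window rather than one per request.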
Baked into NuMart: apollo-link-error now uses a lock-gated single-flight refresh with jittered backoff. We also added a circuit breaker on identity with a 500ms half-open interval. This pattern is live in app/lib/apollo-client.js.
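For readers unfamiliar with the circuit-breaker pattern, here is a minimal sketch. It is not the code in app/lib/apollo-client.js: the failure threshold is an assumption, and the 500ms value is read here as the wait before a single probe request is allowed through.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private failureThreshold: number;
  private halfOpenAfterMs: number;
  private now: () => number;

  constructor(
    failureThreshold = 5, // assumption: not specified in the text
    halfOpenAfterMs = 500,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.halfOpenAfterMs = halfOpenAfterMs;
    this.now = now;
  }

  // Call before each request; false means fail fast without calling the service.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.halfOpenAfterMs) {
      this.state = "half-open"; // let exactly one probe through
      return true;
    }
    return this.state === "closed";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures++;
    // A failed probe, or too many consecutive failures, (re)opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}
```

The clock is injected so the breaker can be exercised in tests without real waiting.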
The Inventory Cache That Went Negative
— Inventory SME, post-mortem 2022
A 15-minute cache TTL on store-availability data collided with a flash sale on a style that had four units across the US. Because every resolver read from cache while MAO updates were still in flight, we sold 47 of that style. Thirty-seven customer service apology calls later, we learned: TTL-based invalidation is wrong for low-stock SKUs. Event-driven invalidation is the only correct answer.
Baked into NuMart: MAO pushes availability deltas to an SQS fan-out that invalidates Redis keys per SKU. Inventory TTL dropped to 2 minutes as a safety net. The SQL_INV_FLAG feature flag lets you fall back to SQL during MAO outages.
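The combination of event-driven invalidation with a short TTL as backstop can be sketched with an in-memory stand-in for Redis. The delta shape and class below are hypothetical; only the 2-minute TTL comes from the text above.

```typescript
// Hypothetical shape of an availability delta arriving from the fan-out.
interface AvailabilityDelta {
  sku: string;
}

// In-memory stand-in for Redis; a sketch, not the production cache.
class AvailabilityCache {
  private entries = new Map<string, { qty: number; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs = 2 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(sku: string, qty: number): void {
    this.entries.set(sku, { qty, expiresAt: this.now() + this.ttlMs });
  }

  // Returns undefined on a miss, which tells the caller to read from the source of truth.
  get(sku: string): number | undefined {
    const entry = this.entries.get(sku);
    if (!entry || this.now() >= entry.expiresAt) {
      this.entries.delete(sku); // the TTL is only the safety net
      return undefined;
    }
    return entry.qty;
  }

  // Primary invalidation path: drop the key as soon as a delta arrives.
  onDelta(delta: AvailabilityDelta): void {
    this.entries.delete(delta.sku);
  }
}
```

Deleting the key, rather than writing the new quantity from the event, avoids trusting out-of-order deltas; the next read repopulates from source.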
The Clean Black Friday
— SRE lead, 2024
2024 Black Friday: 11.4× baseline traffic, peak TPS of 2,340 on the cart-item-add Lambda. Zero P1 incidents. The combination of warmup pings at 25ms intervals, provisioned concurrency on the top 15 resolvers, and Redis pre-warming of the homepage PDP set meant cold starts never bit. Every engineer in the war room slept before 2am on Friday. That's the bar we run the platform at.
Baked into NuMart: The Black-Friday playbook (warmup schedules, provisioned-concurrency thresholds, cache pre-warm scripts) is now a standard capability in the platform. Clients inherit it.