
Findings surfaced during documentation pass (2026-05-28)

While writing the reference docs in this directory, several real bugs and gaps were surfaced. Listed here so they aren't lost. These are not opinions or design suggestions — each is a concrete code-grounded finding with a file:line cite. They should be triaged into GitHub issues on QuantaTradeAI/platform.

Triage state (2026-05-29)

All 10 findings closed. Platform-side scope merged for every item; one cross-repo follow-up tracked separately (platform#30 for the exchange-core Java change to honour LIMIT_POST_ONLY at fill time — F7's plumbing landed in platform#29 so the wire layer is ready).

| Finding | Issue | PR | State | Severity |
|---|---|---|---|---|
| F1 | platform#7 | platform#18 | ✅ merged | 🚨 high |
| F2 | platform#8 | platform#23 | ✅ merged | 🚨 high |
| F3 | platform#9 | platform#22 | ✅ merged | 🚨 high |
| F4 | platform#10 | platform#26 | ✅ merged | 🟠 medium |
| F5 | platform#11 | platform#21 | ✅ merged | 🟠 medium |
| F6 | platform#12 | platform#24 | ✅ merged | 🟠 medium |
| F7 | platform#13 | platform#29 | ✅ merged (platform-side); engine: #30 | 🟠 medium |
| F8 | platform#14 | platform#27 | ✅ merged | 🟡 low |
| F9 | platform#15 | platform#28 | ✅ merged | 🟡 low |
| F10 | platform#16 | platform#25 | ✅ merged | 🟡 low |

F1 follow-ups opened: while fixing F1, two sibling broken WebSocket channels were surfaced — marketdata.ticker.* (no publisher, platform#19) and marketdata.orderbook.* (no publisher, platform#20). Same root cause; separate fixes.

🚨 High — likely user-visible

F1. WebSocket trade fan-out is broken on main

✅ Merged: platform#18 (ws-gateway now subscribes to trades.executed.*, strips private fields, drops maker side). The same breakage in the ticker and orderbook channels is tracked separately: #19, #20.

Where: ws-gateway subscribes to marketdata.trade.*; no service publishes on that subject. order-router publishes trade events on trades.executed.<SYM>.

Effect: Real-time trade messages do not reach connected WebSocket clients today.

See: 08-event-bus.md §Inventory, 05-services-reference.md §ws-gateway.

Fix sketch: Either change ws-gateway to subscribe to trades.executed.*, or have order-router (or a new market-data fan-out service) republish on marketdata.trade.*. The choice depends on whether marketdata.* is supposed to be the public-stream namespace (transformed trade shape) or just a synonym for trades.executed.*.
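A minimal sketch of the direction taken in platform#18, under stated assumptions: the gateway matches trades.executed.<SYM> subjects and forwards only a public projection of each event. The field names, `PublicTrade`, `subjectMatches`, and `toPublicTrade` are all hypothetical, and "drops maker side" is interpreted here as discarding the maker-side copy of a trade so that clients see each fill once. The real handler may differ.

```typescript
// Hypothetical internal event on trades.executed.<SYM>; field names are assumptions.
interface ExecutedTrade {
  tradeId: string;
  symbol: string;
  price: string;
  quantity: string;
  side: "maker" | "taker";
  userId: string; // private: must never reach public clients
  orderId: string; // private
  executedAt: string;
}

// Hypothetical public projection sent to WebSocket clients.
interface PublicTrade {
  tradeId: string;
  symbol: string;
  price: string;
  quantity: string;
  executedAt: string;
}

// Token-wise subject match where `*` stands for exactly one token, as in NATS.
// The multi-token `>` wildcard is not handled in this sketch.
function subjectMatches(pattern: string, subject: string): boolean {
  const p = pattern.split(".");
  const s = subject.split(".");
  return p.length === s.length && p.every((tok, i) => tok === "*" || tok === s[i]);
}

// Returns the public view of a trade, or null when the event should not be forwarded.
function toPublicTrade(event: ExecutedTrade): PublicTrade | null {
  if (event.side === "maker") return null; // assumption: forward the taker copy only
  const { tradeId, symbol, price, quantity, executedAt } = event;
  return { tradeId, symbol, price, quantity, executedAt };
}
```

The subject check also shows the original failure mode: a subscription on marketdata.trade.* never matches a message published on trades.executed.BTCUSD, so nothing is delivered and nothing errors.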

F2. No server-side admin RBAC enforcement

✅ Merged: platform#23. Adds UserTypeGuard that reads the USER_TYPE_KEY metadata, plus an @AdminOnly(...types) composite decorator that pairs AuthGuard + UserTypeGuard + RequireUserType in one annotation. 11 unit tests cover every branch including fail-closed when request.user is missing. No controllers gated today (api-gateway has no admin REST routes); the defence makes the next admin route safe by default.

Where: api-gateway/src/.../decorators.ts:56 defines RequireUserType('admin') but no controller uses it anywhere. Every other admin/operator gate is client-side only.

Effect: A user-tier JWT can hit admin-flavoured endpoints if any exist. Currently no admin-flavoured REST endpoints are mounted in api-gateway (the admin-panel goes direct to the engine via gRPC-web), but the gap means any future admin endpoint added without explicit thought will inherit zero auth.

See: 06-admin-panel.md §RBAC posture.
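The fail-closed behaviour described for UserTypeGuard can be illustrated without the framework. The sketch below is only the decision function such a guard might delegate to; `RequestLike` and `isUserTypeAllowed` are hypothetical names, and treating a route with no required types as "nothing to enforce" is an assumption about the design, not something the source confirms.

```typescript
// Minimal stand-in for the request object a guard inspects (assumed shape).
interface RequestLike {
  user?: { id: string; userType?: string };
}

// Decides whether a request may proceed given the user types a route requires.
function isUserTypeAllowed(
  required: readonly string[] | undefined,
  request: RequestLike,
): boolean {
  // Assumption: a route without required-type metadata is not gated by this check.
  if (required === undefined || required.length === 0) return true;

  // Fail closed: no authenticated user, or no type on the user, means deny.
  const userType = request.user?.userType;
  if (userType === undefined) return false;

  return required.some((t) => t === userType);
}
```

The important branch is the middle one: when request.user is missing, the function denies instead of falling through, which is the case the merged tests are said to cover.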

F3. timeInForce persisted but ignored at engine call site

✅ Merged: platform#22 (mapOrderType now takes (type, timeInForce?); extracted to engine-types.ts with 6 unit tests).

Where: services/order-router/src/main.ts:1027. When calling the matching-engine Place RPC, only order.type drives the engine OrderType mapping. order.timeInForce is persisted in the orders table but never sent to the engine.

Effect: A user requesting ioc or fok silently gets a gtc order. Risk-checker accepts the field, the DB stores it, the engine never sees it.

See: 02-trading-system.md §3 Time-in-force.
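To make the shape of the fix concrete, here is a rough sketch of a mapper with the (type, timeInForce?) signature mentioned above. The platform-side unions are simplified, and the engine-side values (`LIMIT_GTC`, `LIMIT_IOC`, `LIMIT_FOK`, `MARKET`) are placeholders invented for illustration; the real wire values live in the gRPC proto and engine-types.ts and may look quite different.

```typescript
type OrderType = "limit" | "market"; // simplified platform-side types
type TimeInForce = "gtc" | "ioc" | "fok";

// Placeholder engine values, NOT the real proto enum.
type EngineOrderType = "LIMIT_GTC" | "LIMIT_IOC" | "LIMIT_FOK" | "MARKET";

const LIMIT_BY_TIF: Record<TimeInForce, EngineOrderType> = {
  gtc: "LIMIT_GTC",
  ioc: "LIMIT_IOC",
  fok: "LIMIT_FOK",
};

// Maps a platform order to an engine order type. timeInForce defaults to gtc
// so callers that omit it keep the previous behaviour.
function mapOrderType(type: OrderType, timeInForce: TimeInForce = "gtc"): EngineOrderType {
  // Assumption: time-in-force is ignored for market orders in this sketch.
  if (type === "market") return "MARKET";
  return LIMIT_BY_TIF[timeInForce];
}
```

With a one-argument mapper, mapOrderType("limit") is the only call the router can make, so an ioc request is indistinguishable from gtc by the time it reaches the engine. Passing the second argument is what carries the user's intent across.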

🟠 Medium — silent but consequential

F4. Fee tier user volume is hardcoded mock data

✅ Merged: platform#26. New getUserVolume30d(userId) sums price × quantity across Trade rows in the 30d window via Trade(userId, createdAt desc) index; cached in Redis at fees:userVolume30d:<userId> for 5 min. 8 unit tests including the Starter / Bronze / VIP tier transitions.

Where: services/api-gateway/src/fees/fees.service.ts:37-45: userVolume30d = 25000 is hardcoded.

Effect: Every user is on the same fee tier regardless of actual volume; maker rebates / VIP discounts are inactive.

See: 02-trading-system.md §Fee tiers.
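The replacement computation (sum of price × quantity over a 30-day window, cached for 5 minutes under fees:userVolume30d:<userId>) can be sketched in memory as follows. This is not the merged code: `TradeRow`, `VolumeCache`, and `sumVolume30d` are hypothetical, an array stands in for the Trade table, a plain object stands in for Redis, and `number` is used for brevity where production code would presumably use a decimal type.

```typescript
// Assumed minimal projection of a Trade row.
interface TradeRow {
  userId: string;
  price: number;
  quantity: number;
  createdAt: Date;
}

const WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // 30-day window
const CACHE_TTL_MS = 5 * 60 * 1000; // 5-minute cache lifetime

const cacheKey = (userId: string): string => `fees:userVolume30d:${userId}`;

// Sums price * quantity for one user's trades inside the window ending at `now`.
function sumVolume30d(trades: readonly TradeRow[], userId: string, now: Date): number {
  const cutoff = now.getTime() - WINDOW_MS;
  let total = 0;
  for (const t of trades) {
    if (t.userId === userId && t.createdAt.getTime() >= cutoff) {
      total += t.price * t.quantity;
    }
  }
  return total;
}

// In-memory stand-in for the Redis cache, with explicit expiry timestamps.
class VolumeCache {
  private entries: Record<string, { value: number; expiresAt: number }> = {};

  get(userId: string, nowMs: number): number | undefined {
    const hit = this.entries[cacheKey(userId)];
    if (hit === undefined || hit.expiresAt <= nowMs) return undefined;
    return hit.value;
  }

  set(userId: string, value: number, nowMs: number): void {
    this.entries[cacheKey(userId)] = { value, expiresAt: nowMs + CACHE_TTL_MS };
  }
}

// Cache-first lookup: recompute only on a miss or after the entry has expired.
function getUserVolume30d(
  trades: readonly TradeRow[],
  cache: VolumeCache,
  userId: string,
  now: Date,
): number {
  const nowMs = now.getTime();
  const cached = cache.get(userId, nowMs);
  if (cached !== undefined) return cached;
  const volume = sumVolume30d(trades, userId, now);
  cache.set(userId, volume, nowMs);
  return volume;
}
```

One consequence of the cache worth keeping in mind: a user who crosses a tier boundary can stay on the old tier for up to the TTL, since new trades are not visible until the cached value expires.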

F5. settleTrade has no idempotency guard

✅ Merged: platform#21 (findFirst short-circuit on referenceType='trade' + referenceId=trade.id at the top of the $transaction callback in settleTrade).

Where: services/ledger-service/src/ledger.ts:327-433 does not check for an existing ledger_entries row with the same transactionId before settling. Compare with credit (ledger.ts:84-91) and debit which both short-circuit on hit.

Effect: A duplicate trades.executed NATS delivery would re-settle the trade: double credits, double debits, double fees. The existing test (ledger.test.ts:848) asserts idempotency, but only by mocking ledgerEntry.findFirst to return an existing row; the production code never calls findFirst in settleTrade, so the mock returning data has no effect on real behaviour.

See: 03-ledger-accounting.md §4.2.

Fix sketch: Add a findFirst({ where: { referenceType: 'trade', referenceId: trade.id } }) short-circuit at the top of the $transaction callback in settleTrade.
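The short-circuit can be shown with an in-memory stand-in for the ledger. Everything below is a simplified assumption (`InMemoryLedger`, the `Trade` and `LedgerEntry` shapes, and the absence of fees); it only demonstrates why looking up referenceType='trade' + referenceId before writing makes a redelivered event harmless.

```typescript
interface LedgerEntry {
  referenceType: string;
  referenceId: string;
  userId: string;
  amount: number;
}

interface Trade {
  id: string;
  buyerId: string;
  sellerId: string;
  notional: number;
}

// In-memory stand-in for the ledger_entries table; not the Prisma client.
class InMemoryLedger {
  readonly entries: LedgerEntry[] = [];

  findFirst(referenceType: string, referenceId: string): LedgerEntry | undefined {
    for (const e of this.entries) {
      if (e.referenceType === referenceType && e.referenceId === referenceId) return e;
    }
    return undefined;
  }

  settleTrade(trade: Trade): "settled" | "duplicate" {
    // Idempotency guard: if this trade already produced entries, do nothing.
    if (this.findFirst("trade", trade.id) !== undefined) return "duplicate";

    const ref = { referenceType: "trade", referenceId: trade.id };
    this.entries.push(
      { ...ref, userId: trade.buyerId, amount: -trade.notional }, // debit buyer
      { ...ref, userId: trade.sellerId, amount: trade.notional }, // credit seller
    );
    return "settled";
  }
}
```

In the real service the lookup sits inside the $transaction callback, so the check and the writes commit together. Running the check outside the transaction would leave a window in which two concurrent deliveries both see no existing row.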

F6. Transaction isolation level requested in tests but not in production code

✅ Merged: platform#24. All 5 $transaction call sites now pass { isolationLevel: 'Serializable', maxWait: 5000, timeout: 10000 }. Wrapped in withSerializableRetry() — Postgres 40001 / Prisma P2034 retry with 50/100/200/400/800 ms exponential backoff + jitter. 7 new wrapper tests + the 2 existing isolation tests now pass against real code instead of failing silently.

Where: services/ledger-service/src/ledger.test.ts:1098-1133 asserts Serializable isolation level on $transaction calls; production code at ledger.ts:93,174,240,289,342 passes no options.

Effect: credit / debit / lock / unlock / settleTrade run at Postgres default (READ COMMITTED), not Serializable. Under load this opens phantom-read and lost-update windows for balance writes that the test suite claims are eliminated.

See: 03-ledger-accounting.md §6 Concurrency posture.

Fix sketch: Pass { isolationLevel: 'Serializable' } as the second argument to every db.$transaction(...) call in ledger.ts. The test already specifies the expected shape.
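Serializable transactions can abort with a serialization failure and are expected to be retried by the caller, which is why the merged fix pairs the isolation level with a retry wrapper. The following is a sketch of what a wrapper like withSerializableRetry() might look like, not the merged implementation: the error-shape check, the injectable `sleep` parameter, and the jitter of up to 50% of the base delay are all assumptions. Only the 50/100/200/400/800 ms schedule and the P2034 / 40001 codes come from the description above.

```typescript
const BACKOFF_MS = [50, 100, 200, 400, 800];

// Treats Prisma P2034 and Postgres SQLSTATE 40001 as retryable.
// Assumes the error object exposes the code on a `code` property.
function isSerializationFailure(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return code === "P2034" || code === "40001";
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `run`, retrying on serialization failures with exponential backoff plus jitter.
// After the schedule is exhausted, or on any other error, the error is rethrown.
async function withSerializableRetry<T>(
  run: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      if (!isSerializationFailure(err) || attempt >= BACKOFF_MS.length) throw err;
      const base = BACKOFF_MS[attempt];
      const jitter = Math.random() * base * 0.5; // assumption: up to +50%
      await sleep(base + jitter);
    }
  }
}
```

A call site would then wrap the whole transaction, for example withSerializableRetry(() => db.$transaction(fn, { isolationLevel: 'Serializable', maxWait: 5000, timeout: 10000 })), so that each retry re-reads state instead of replaying stale values.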

F7. LIMIT_POST_ONLY order type not exposed

✅ Platform-side merged: platform#29. Added ORDER_TYPE_LIMIT_POST_ONLY = 5 to the gRPC proto, 'limit_post_only' to @quantatrade/types, mapper case in engine-types.ts emitting 'LIMIT_POST_ONLY', and risk-checker treats it as a flavour of 'limit'. 5 new tests. Engine-side follow-up: platform#30 — verify QuantaTradeAI/exchange-core (Java) honours the new wire value at fill time. Until that lands, traders can submit post-only orders but the engine may still cross them.

Where: exchange-core2 (the Java matching engine fork) supports LIMIT_POST_ONLY, but the platform's gRPC proto only enumerates LIMIT / MARKET / STOP_LIMIT / STOP_MARKET. Maker-only orders cannot be placed via the platform.

Effect: Market-maker rebate flow (per M1 amendment) cannot be exercised. The engine supports it; the wire layer doesn't.

See: 02-trading-system.md §Market-maker support.
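For readers unfamiliar with the term, the behaviour the engine-side follow-up needs to verify is the usual post-only rule: an order that would match immediately on arrival must not execute as a taker. The sketch below is a generic illustration of that rule, not exchange-core code. `TopOfBook`, `wouldCross`, and `handlePostOnly` are hypothetical, and rejecting the order (as opposed to repricing it) is an assumed policy.

```typescript
type Side = "buy" | "sell";

// Best resting prices on each side; undefined when that side of the book is empty.
interface TopOfBook {
  bestBid?: number;
  bestAsk?: number;
}

// True when a limit order at `price` would match immediately against the book.
function wouldCross(side: Side, price: number, book: TopOfBook): boolean {
  if (side === "buy") return book.bestAsk !== undefined && price >= book.bestAsk;
  return book.bestBid !== undefined && price <= book.bestBid;
}

// Assumed policy: a post-only order either rests on the book or is rejected outright.
function handlePostOnly(side: Side, price: number, book: TopOfBook): "rest" | "reject" {
  return wouldCross(side, price, book) ? "reject" : "rest";
}
```

If the engine treats the new wire value as a plain limit order, the `wouldCross` case executes instead of being rejected, and the trader pays taker fees on an order that was meant to earn a maker rebate.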

🟡 Low — symmetry and dead code

F8. ledger.debit, ledger.lock, ledger.unlock RPCs have no callers

✅ Merged: platform#27. Extracted registerLedgerHandlers(subscriber, publisher, ledger) to services/ledger-service/src/handlers.ts with narrow NatsSubscriber / NatsPublisher interfaces, then wrote 9 tests covering the wiring contract — subject names, queue group, argument forwarding, and the balances.updated.<userId> publishes from credit/debit. Dead RPCs can no longer drift silently.

Where: services/ledger-service/src/main.ts registers handlers; no service in the platform monorepo calls them.

Effect: Dead code; future callers must verify the handlers still work. The handlers themselves use the same patterns as credit, so risk is low.

See: 05-services-reference.md §ledger-service, 08-event-bus.md §Inventory.
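The value of extracting the registration behind narrow interfaces is that the wiring becomes testable with plain fakes, without a NATS connection. A reduced sketch of that idea follows, showing only the debit handler. The interface signatures, `DebitRequest`, `LedgerLike`, and the queue group name are hypothetical; only the ledger.debit and balances.updated.<userId> subjects come from this document.

```typescript
type Handler = (payload: unknown) => Promise<unknown>;

// Narrow views of the messaging client: just what the registration code needs.
interface NatsSubscriber {
  subscribe(subject: string, queueGroup: string, handler: Handler): void;
}
interface NatsPublisher {
  publish(subject: string, payload: unknown): void;
}

interface DebitRequest {
  userId: string;
  asset: string;
  amount: number;
}
interface LedgerLike {
  debit(req: DebitRequest): Promise<{ balance: number }>;
}

const QUEUE_GROUP = "ledger-service"; // hypothetical queue group name

// Wires the debit RPC: forward the request to the ledger, publish the new
// balance on balances.updated.<userId>, and return the result as the reply.
function registerLedgerHandlers(
  subscriber: NatsSubscriber,
  publisher: NatsPublisher,
  ledger: LedgerLike,
): void {
  subscriber.subscribe("ledger.debit", QUEUE_GROUP, async (payload) => {
    const req = payload as DebitRequest; // sketch only: no payload validation
    const result = await ledger.debit(req);
    publisher.publish(`balances.updated.${req.userId}`, result);
    return result;
  });
}
```

A test can pass object literals that record their calls for all three parameters, then assert on subject names, queue group, forwarded arguments, and published subjects, which is the wiring contract the merged tests are described as covering.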

F9. packages/metrics is a stub

✅ Merged: platform#28. packages/metrics now wraps real prom-client@^15.1.0 with API surface preserved 1:1 — every existing metrics.foo.inc(...) call site across api-gateway, order-router, ws-gateway, ledger-service now actually emits. ledger-service gains a /metrics HTTP endpoint and per-RPC instrumentation (ledger_service_rpc_calls_total, _rpc_duration_seconds, _rpc_failures_total). 11 unit tests on the package + 3 new handler-instrumentation tests.

Where: packages/metrics/src/index.ts:4. The package re-exports nothing functional; services that import it get no actual metrics.

Effect: No service-level Prometheus metrics emit today, despite the architectural intent.

See: 09-deployment-and-ops.md §Observability.

F10. IdempotencyKey Postgres table has no writer

✅ Merged: platform#25. Triage confirmed Redis-only intent (grep of db.idempotencyKey / prisma.idempotencyKey returned zero hits across the monorepo). Dropped via migrations/20260529000000_drop_idempotency_keys_table + model removed from schema.prisma.

Where: packages/db/prisma/schema.prisma defines the table; api-gateway's idempotency guard uses Redis. No code path writes to the IdempotencyKey table.

Effect: Likely intended as a Redis-failure fallback; today it's dormant. Safe; just confirm Redis-only is the desired design.

See: 07-data-model.md §Idempotency model.

How to use this list

These should become individual GitHub issues on QuantaTradeAI/platform (or QuantaTradeAI/admin-panel for F2 if scoped there). F1 and F3 are user-visible misbehaviour and should be triaged first. F5 and F6 are correctness bugs in the audit-trail layer — important for ISO and for the M1 ledger acceptance gate, which is currently 🟢 in the milestone doc on the strength of a verified deposit replay; the trade-settle path was never exercised with replay-load.

After triage, link each issue back to the relevant reference doc section. Update this file when an issue is opened so a reader knows the fix is tracked, not lost.