Historical Backfill with Live WebSocket Streams
A shared lifecycle for exchange data that is available through both historical REST (or other read) endpoints and realtime WebSockets. Candle-close pipelines are one implementation; the same pattern also applies to trades, order books, funding snapshots, mark/index series, open interest, and private read-plus-stream data where supported.
Exchange-neutral reference for readiness gates, buffering, replay, finality, and reconnect behavior. Implementation pages for Binance, Bybit, and future exchanges add package-specific subscription helpers, subscription acknowledgement payloads, finality fields, sequence rules, and shutdown APIs.
Default Scope
- Runtime: Node.js LTS
- Recommended language: TypeScript
- Package: selected exchange SDK
- Data families: candles, trades, order books, funding, mark/index, open interest, or private read-plus-stream state
- Default execution boundary: data pipeline only; no order placement
- Credentials: public endpoints only unless the selected data family requires read-only account keys
Resources
Start with the task-specific resources below, then use SDK and exchange docs to verify exact method names, request fields, topics, and product rules.
Machine-readable recipe JSON
Historical Backfill with Live WebSocket Streams Conformance Pack
Markdown snapshot
AI Pattern Library
AI prompt generator
Task-focused LLM index
SDK catalog
Agent skill
Candle-Close Pipeline with Binance APIs & WebSockets
Candle-Close Pipeline with Bybit APIs & WebSockets
Implementation Steps
Follow these in order; use the linked artifacts only where they clarify the current step.
1. Define data identity and authority
Before subscribing or backfilling, define the data family, product scope, symbols, stream names, historical endpoint, primary timestamp, sequence or update ID if present, and local replay key.
- Do not mix exchange product families or symbols in one unscoped store key.
- Record whether REST history, WebSocket updates, or a combination of both is authoritative for each field.
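The identity definition in this step can be sketched as a small set of types plus a scoped key builder. Every name here (`StreamIdentity`, `storeKey`, the `Authority` labels, the sample field names) is hypothetical and chosen for illustration; none of it comes from an exchange SDK.

```typescript
// Hypothetical identity types; field names are illustrative, not SDK types.
type DataFamily = "candle" | "trade" | "orderbook" | "funding";

interface StreamIdentity {
  exchange: string;      // which exchange the data comes from
  productFamily: string; // product scope, e.g. spot vs. a derivatives category
  dataFamily: DataFamily;
  symbol: string;
  interval?: string;     // only meaningful for candles
}

// Which source is trusted for a given field.
type Authority = "rest" | "websocket" | "both";

// Example authority record for a candle; the choices are assumptions.
const candleAuthority: Record<string, Authority> = {
  open: "both",
  close: "both",
  isFinal: "websocket",
};

// Build a local replay key that always carries the full scope, so two
// product families or symbols can never collide in one store.
function storeKey(id: StreamIdentity, primaryTimestampMs: number): string {
  return [
    id.exchange,
    id.productFamily,
    id.dataFamily,
    id.symbol,
    id.interval ?? "-",
    String(primaryTimestampMs),
  ].join("|");
}
```

The key is only ever built from structured fields and used for lookup; it is never parsed back into its parts.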
2. Subscribe before backfill where possible
Open the WebSocket and send the subscription request before starting REST backfill when the exchange supports it. This narrows the window where live updates could be missed.
- Treat socket open, subscription request sent, and subscription acknowledgement as separate states.
- Do not assume subscribe() or an open socket means subscription acknowledgement.
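One way to keep the three states separate is a small tracker that only reports acknowledgement after an explicit acknowledgement event. This is a minimal sketch; `SubscriptionTracker` and its method names are hypothetical, and the real acknowledgement signal is package-specific.

```typescript
type SubscriptionState = "closed" | "open" | "requested" | "acknowledged";

// Hypothetical tracker: wire each method to the matching SDK or socket event.
class SubscriptionTracker {
  private state: SubscriptionState = "closed";

  onSocketOpen(): void {
    this.state = "open";
  }

  // Called after the subscribe request is sent; this is NOT an acknowledgement.
  onSubscribeSent(): void {
    if (this.state === "open") this.state = "requested";
  }

  // Called only when the exchange or SDK confirms the subscription.
  onAcknowledged(): void {
    if (this.state === "requested") this.state = "acknowledged";
  }

  onSocketClose(): void {
    this.state = "closed";
  }

  current(): SubscriptionState {
    return this.state;
  }

  isAcknowledged(): boolean {
    return this.state === "acknowledged";
  }
}
```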
3. Buffer live events while hydrating history
After subscription acknowledgement, buffer raw live events with local receive timestamps while REST history or scoped hydration is running. Do not apply correctness-sensitive side effects from buffered events yet.
- Keep raw payloads or selected-field summaries sufficient for replay diagnostics.
- Normalize timestamps, product scope, symbol, stream identity, sequence/update IDs, and finality fields before state application.
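The buffering step might look like the sketch below: events are stored with a local receive timestamp while hydration runs, and nothing is applied until the buffer is drained. `LiveBuffer` and its methods are hypothetical names, and the payload type is left generic because the raw shape depends on the exchange.

```typescript
interface BufferedEvent<T> {
  receivedAtMs: number; // local receive time, not the exchange timestamp
  raw: T;               // raw payload kept for replay diagnostics
}

// Hypothetical buffer used between subscription acknowledgement and replay.
class LiveBuffer<T> {
  private events: BufferedEvent<T>[] = [];
  private hydrating = true;

  // Returns true if the event was buffered, or false if hydration has
  // finished and the caller should apply the event through the live path.
  offer(raw: T, nowMs: number = Date.now()): boolean {
    if (!this.hydrating) return false;
    this.events.push({ receivedAtMs: nowMs, raw });
    return true;
  }

  // Hands over everything buffered so far, in arrival order, and stops
  // buffering. Call this once backfill has completed.
  drain(): BufferedEvent<T>[] {
    const out = this.events;
    this.events = [];
    this.hydrating = false;
    return out;
  }

  size(): number {
    return this.events.length;
  }
}
```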
4. Backfill into one normalized store
Load historical rows into a local store through the same normalization boundary used by live events. Use structured keys and exchange filters or metadata where relevant.
- Deduplicate historical rows before replaying live buffered events.
- Avoid ad hoc string parsing when SDK types or structured payloads expose the fields directly.
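As an illustration of a single normalization boundary, the sketch below converts a REST row and a WebSocket event into one candle shape and upserts both into the same keyed store, so loading the same row twice leaves one record. The input shapes (`RestCandleRow`, `WsCandleEvent`) are invented for this example; real payloads differ per exchange and should come from SDK types.

```typescript
interface Scope {
  productFamily: string;
  symbol: string;
  interval: string;
}

interface Candle extends Scope {
  startMs: number;
  open: number;
  high: number;
  low: number;
  close: number;
  volume: number;
}

// Hypothetical REST row with string-encoded numbers.
interface RestCandleRow {
  start: string; open: string; high: string; low: string; close: string; volume: string;
}

// Hypothetical WebSocket event with short numeric fields.
interface WsCandleEvent {
  t: number; o: number; h: number; l: number; c: number; v: number;
}

function normalizeRest(scope: Scope, row: RestCandleRow): Candle {
  return {
    ...scope,
    startMs: Number(row.start),
    open: Number(row.open),
    high: Number(row.high),
    low: Number(row.low),
    close: Number(row.close),
    volume: Number(row.volume),
  };
}

function normalizeWs(scope: Scope, ev: WsCandleEvent): Candle {
  return { ...scope, startMs: ev.t, open: ev.o, high: ev.h, low: ev.l, close: ev.c, volume: ev.v };
}

class CandleStore {
  private rows = new Map<string, Candle>();

  private static key(s: Scope, startMs: number): string {
    return [s.productFamily, s.symbol, s.interval, String(startMs)].join("|");
  }

  // Upsert by structured key: a repeated row replaces, never duplicates.
  upsert(c: Candle): void {
    this.rows.set(CandleStore.key(c, c.startMs), c);
  }

  get(s: Scope, startMs: number): Candle | undefined {
    return this.rows.get(CandleStore.key(s, startMs));
  }

  size(): number {
    return this.rows.size;
  }
}
```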
5. Replay, then enable live processing
Drain buffered events in deterministic order, skip stale or duplicate records, and only then mark the pipeline live-ready for downstream workflows.
- Do not run strategy, indicator, signal-generation, optional external-alert, order-intent, or account-decision workflows until subscription acknowledgement, backfill, replay, and readiness are all complete.
- Use the same state transition path for replayed buffered events and normal live events.
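A minimal replay sketch, assuming each event carries a monotonically increasing `seq` value (a hypothetical field; some data families order by timestamp or update ID instead). Buffered events are sorted, stale and duplicate records are skipped, and the same `apply` callback is intended to serve normal live events afterwards.

```typescript
interface SequencedEvent {
  seq: number;          // hypothetical ordering field
  receivedAtMs: number; // local receive time, used only as a tie-breaker
}

// Replays buffered events in deterministic order and returns the last
// applied sequence. The caller marks the pipeline live-ready only after
// this function returns.
function replay<T extends SequencedEvent>(
  buffered: T[],
  lastAppliedSeq: number,
  apply: (event: T) => void,
): number {
  const ordered = [...buffered].sort(
    (a, b) => a.seq - b.seq || a.receivedAtMs - b.receivedAtMs,
  );
  let last = lastAppliedSeq;
  for (const event of ordered) {
    if (event.seq <= last) continue; // stale or duplicate: already covered
    apply(event);
    last = event.seq;
  }
  return last;
}
```

Sorting a copy keeps the original buffer intact for diagnostics if the replay has to be inspected later.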
6. Respect finality and sequence rules
Some streams have final/closed/terminal fields. Others have snapshots, deltas, sequence IDs, checksums, or replacement views. Downstream workflows must use the data family’s real correctness boundary.
- For candles, only run downstream logic after the exchange or SDK marks the candle final or closed.
- For order books, pause downstream logic on sequence gaps, checksum failures, stale data, or impossible crossed books until resync succeeds.
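For the order-book case, a guard along these lines could pause downstream logic on a gap until a resync succeeds. It assumes contiguous sequence IDs (next equals last plus one), which is an assumption for this sketch only; real exchanges define their own sequence and checksum rules. `SequenceGuard` and `isCrossed` are hypothetical names.

```typescript
type BookStatus = "live" | "needs-resync";

class SequenceGuard {
  private lastSeq: number | null = null;
  private status: BookStatus = "live";

  // Returns true if the update may be applied to the book.
  accept(seq: number): boolean {
    if (this.status === "needs-resync") return false;
    if (this.lastSeq !== null) {
      if (seq <= this.lastSeq) return false; // stale or duplicate: ignore
      if (seq !== this.lastSeq + 1) {
        this.status = "needs-resync";        // gap: stop until resynced
        return false;
      }
    }
    this.lastSeq = seq;
    return true;
  }

  // Call after a fresh snapshot has been loaded and normalized.
  resync(snapshotSeq: number): void {
    this.lastSeq = snapshotSeq;
    this.status = "live";
  }

  canRunDownstream(): boolean {
    return this.status === "live";
  }
}

// A book whose best bid is at or above its best ask is treated as invalid.
function isCrossed(bestBid: number, bestAsk: number): boolean {
  return bestBid >= bestAsk;
}
```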
7. Resync after reconnect
A WebSocket reconnect restores transport, not application correctness. Pause downstream correctness-sensitive workflows, keep buffering where possible, confirm subscriptions, run scoped REST resync, replay buffered updates, and then re-enable live processing.
- Log reconnecting, reconnected, resync started, resync complete, buffered replay complete, and live-ready transitions.
- Reconnects must not duplicate state transitions or run the same final event workflow twice.
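The reconnect sequence can be sketched as an ordered set of phases in which downstream work is allowed only in the final one. The phase names mirror the transitions listed above; the `ReconnectLifecycle` class, its log format, and the rule that a new disconnect restarts recovery are assumptions made for illustration. Subscription confirmation is expected to happen before `resync-started` is entered.

```typescript
type Phase =
  | "live-ready"
  | "reconnecting"
  | "reconnected"
  | "resync-started"
  | "resync-complete"
  | "replay-complete";

// The only phase each phase may advance to during a normal recovery.
const NEXT: Record<Phase, Phase> = {
  "live-ready": "reconnecting",
  "reconnecting": "reconnected",
  "reconnected": "resync-started",
  "resync-started": "resync-complete",
  "resync-complete": "replay-complete",
  "replay-complete": "live-ready",
};

class ReconnectLifecycle {
  private phase: Phase = "live-ready";
  readonly log: string[] = [];

  // Returns false, and logs nothing, for out-of-order or repeated transitions.
  advance(to: Phase, nowMs: number = Date.now()): boolean {
    // A fresh disconnect during recovery restarts the sequence.
    const restart = to === "reconnecting" && this.phase !== "reconnecting";
    if (to !== NEXT[this.phase] && !restart) return false;
    this.log.push(`${nowMs} ${this.phase} -> ${to}`);
    this.phase = to;
    return true;
  }

  // Correctness-sensitive workflows run only when fully recovered.
  downstreamAllowed(): boolean {
    return this.phase === "live-ready";
  }
}
```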
8. Validate the full lifecycle
Review the data chain from config scope through SDK/API inspection, acknowledgement, backfill, replay, readiness, downstream side effects, and reconnect recovery.
- Use the linked Historical Backfill with Live WebSocket Streams Conformance Pack, or equivalent local replay cases, to cover startup gating, duplicate records, out-of-order events, finality or sequence boundaries, reconnect resync, and sample-symbol/config handling.
- List unsupported behaviors as non-claims instead of implying they are covered.
- Do not mark the pipeline complete until three consecutive full data-lifecycle review passes produce no code, tests, fixtures, or documentation changes.
Readiness gates
| State | Source | Required before workflow |
|---|---|---|
| Transport open | WebSocket open event | No |
| Subscription request sent | SDK subscribe call or WebSocket command send | No |
| Subscription acknowledged | Package-specific subscription acknowledgement event | Yes |
| Historical backfill complete | REST/read endpoint records normalized into the store | Yes |
| Buffered replay complete | Local replay of buffered WebSocket events | Yes |
| Live processing enabled | Local readiness flag after reconciliation | Yes |
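The table above can be expressed as a single check that reports which required gates are still missing. This is a sketch with hypothetical names (`ReadinessGates`, `missingGates`, `workflowAllowed`); the two gates marked "No" are tracked here for diagnostics only and do not by themselves permit a workflow.

```typescript
interface ReadinessGates {
  transportOpen: boolean;
  subscriptionRequestSent: boolean;
  subscriptionAcknowledged: boolean;
  backfillComplete: boolean;
  bufferedReplayComplete: boolean;
  liveProcessingEnabled: boolean;
}

// Gates marked "Yes" in the "Required before workflow" column.
const REQUIRED_GATES: (keyof ReadinessGates)[] = [
  "subscriptionAcknowledged",
  "backfillComplete",
  "bufferedReplayComplete",
  "liveProcessingEnabled",
];

// Lists the required gates that are not yet satisfied.
function missingGates(gates: ReadinessGates): (keyof ReadinessGates)[] {
  return REQUIRED_GATES.filter((name) => !gates[name]);
}

function workflowAllowed(gates: ReadinessGates): boolean {
  return missingGates(gates).length === 0;
}
```

Returning the list of missing gates, rather than only a boolean, makes startup logs show exactly which state is blocking downstream workflows.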
Data-Family Overlay: Candles, Klines, OHLCV
Apply this overlay to candle-close systems on any exchange or product family before using exchange-specific request examples.
- Verify the selected product family or category, REST candle/history method, public candle stream, subscription acknowledgement payload, final/closed field, reconnect hook, and shutdown method from the installed SDK or current docs.
- Treat examples as request-shape references only; do not copy sample symbols, intervals, categories, or product families as runtime defaults.
- Normalize REST rows and WebSocket updates into one candle shape keyed by product family, symbol, interval, and candle start time.
- Open candles may update local state, but they must not trigger strategy, indicator, signal-generation, optional external-alert, order-intent, or account-decision workflows.
- Run downstream logic only when the exchange or SDK marks the candle final or closed.
- Deduplicate final candles so reconnect, replay, or repeated final updates cannot run the same workflow twice.
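The final-candle rules above might be enforced with a gate that returns true at most once per closed candle. The `isFinal` flag stands in for whatever final or closed field the exchange or SDK exposes; `CandleUpdate` and `CandleCloseGate` are hypothetical names.

```typescript
interface CandleUpdate {
  productFamily: string;
  symbol: string;
  interval: string;
  startMs: number;
  isFinal: boolean; // stand-in for the exchange-specific final/closed field
}

class CandleCloseGate {
  // Keys of final candles that already triggered a workflow. This set grows
  // without bound in the sketch; a long-running process would prune old keys.
  private processed = new Set<string>();

  // True exactly once per final candle; false for open candles and for
  // repeated final updates after reconnect or replay.
  shouldRunWorkflow(c: CandleUpdate): boolean {
    if (!c.isFinal) return false;
    const key = [c.productFamily, c.symbol, c.interval, String(c.startMs)].join("|");
    if (this.processed.has(key)) return false;
    this.processed.add(key);
    return true;
  }
}
```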
Implementation pages
These pages specialize the core lifecycle for concrete exchange SDK surfaces.
- Candle-Close Pipeline with Binance APIs & WebSockets adds spot kline helpers, formatted kline fields, Binance subscription acknowledgement, and SDK shutdown behavior for Spot workflows.
- Candle-Close Pipeline with Bybit APIs & WebSockets adds RestClientV5.getKline, subscribeV5 kline topics, response acknowledgement, WSKlineV5.confirm, and closeAll(true) for the documented category.
- Future trade, order-book, funding, or private data guides should link back here and add only the data-family and exchange-specific semantics.
Core invariants
- Transport readiness is not application readiness.
- Subscription acknowledgement, historical backfill, buffered replay, and live enablement are separate states.
- Historical rows and live events should feed one normalized store boundary.
- Downstream side effects wait for the data family’s real finality, sequence, or resync boundary.
- Reconnect handling must restore application correctness before live workflows resume.
- Completion requires fixtures or replay cases for each lifecycle claim.