Data catalog.
Datasets we originate, license, and develop. Backtest-ready for quant. Training- and eval-ready for AI. Delivered via API, S3, webhook, or WebSocket.
All datasets
Financial Social Intelligence
Exclusive retail investor sentiment from the largest social finance platform.
Sports Intelligence
The only sports data source that powers real-time AI tool calls for the largest language models.
Venture Intelligence
Private company technographics mapped to public market tickers. See what startups adopt before the market prices it in.
Federal IT Contract Intelligence
Ticker-mapped US federal government spending on enterprise software. See which vendors win before the Street does.
Prediction Market Intelligence
Cross-platform prediction market events mapped to affected securities with real-time probability streams.
GPU Spot Pricing Intelligence
Hourly GPU rental prices across every major cloud, normalized to canonical SKUs. A leading indicator of AI capex before it reaches earnings.
GPU Inference & Token-Demand Intelligence
Real-time inference economics across hosted-model providers - token prices, throughput, and latency as a demand-side read on AI compute.
Collaboration Workflow Intelligence
Cross-tool operational data from collaboration systems. Entity-resolved into a unified schema for AI training and workflow evals.
Developer Workflow Intelligence
Engineering coordination data across source control, issue tracking, and data platforms. Unified SDLC schema for coding agents and SWE evals.
Codebase Intelligence
Full-history private codebases - commits, pull requests, reviews, and linked issues - packaged as training data for coding agents and SWE evals.
Company Archive Intelligence
Complete operating histories of real companies - code, business data, communications, documents, and databases - as a training corpus for frontier models.
Robot Teleoperation
Success-labeled robot manipulation episodes collected via human teleoperation across diverse tasks and embodiments.
Human POV & Motion Capture
First-person human video paired with full-body 3D motion capture - the human-demonstration layer for embodied pretraining.
Multi-Modal Sensor Streams
High-frequency proprioception, inertial, and audio streams with sub-millisecond synchronization across embodiments.
Start with a trial.
Every dataset includes 90-day restricted access for evaluation. No commitment required.