One Big Table (OBT)
One Big Table (OBT) is a pragmatic alternative to star schemas for teams that prioritize speed and simplicity over normalized design. Rather than maintaining separate fact and dimension tables, OBT flattens everything into a single wide table per analytical use case — one row per event, with all relevant dimension attributes inlined. No joins at query time; columnar warehouses scan only the columns you need.
TalkingSchema's AI copilot generates OBT schemas from your star schema or OLTP source — ideal for dbt-centric workflows, internal dashboards, and small teams that don't need the full Kimball toolkit.
OBT vs Star Schema: The Trade-Off
| Dimension | Star Schema | One Big Table |
|---|---|---|
| Table structure | Fact + dimension tables | Single wide table per use case |
| Query joins | Multiple joins | None (or minimal) |
| Data redundancy | Low (dimensions normalized) | High (attributes repeated per row) |
| Governance | Strong (conformed dimensions) | Weaker (definitions can drift) |
| SCD handling | Explicit (effective_from, is_current) | Awkward (requires full refresh or complex logic) |
| ETL/ELT complexity | Higher | Lower (one table to build) |
| BI tool setup | Semantic layer for joins | Direct table connection |
| Best for | Enterprise, multi-use-case | Small teams, narrow use cases, dbt marts |
When to Use OBT
- Small teams — Fewer tables to maintain, faster iteration
- Narrow use cases — One report or domain per OBT (e.g.,
sales_obt,inventory_obt) - dbt-centric workflows — OBT as a serving layer built from staging or dimensional models
- BI tools that prefer flat tables — Some tools simplify when there are no joins
- Prototyping — Get to insights quickly; refactor to star schema later if needed
Avoid OBT when you need conformed dimensions across multiple fact tables, strict SCD Type 2 history, or strong governance across many analysts.
Example: GSSC Sales OBT
Flattening the GSSC star schema into a single sales analytics table:
-- ============================================================
-- GSSC Analytics: One Big Table for Sales Reporting
-- Grain: one row per sales order line item
-- All dimension attributes inlined — no joins at query time
-- ============================================================
CREATE TABLE obt_sales (
-- Degenerate / identifiers
sales_order_bk UUID NOT NULL,
line_item_bk UUID NOT NULL,
order_date DATE NOT NULL,
-- Date attributes (from dim_date)
order_year SMALLINT NOT NULL,
order_quarter SMALLINT NOT NULL,
order_month SMALLINT NOT NULL,
order_month_name VARCHAR(10) NOT NULL,
-- Customer attributes (from dim_customer)
customer_id UUID NOT NULL,
customer_name VARCHAR(200) NOT NULL,
customer_country VARCHAR(100) NOT NULL,
customer_tier VARCHAR(20) NOT NULL,
-- Product attributes (from dim_product)
product_id UUID NOT NULL,
product_sku VARCHAR(50) NOT NULL,
product_name VARCHAR(200) NOT NULL,
product_category VARCHAR(100) NOT NULL,
carbon_score_band VARCHAR(20),
-- Warehouse attributes (from dim_warehouse)
warehouse_name VARCHAR(200) NOT NULL,
warehouse_city VARCHAR(100) NOT NULL,
warehouse_country VARCHAR(100) NOT NULL,
-- Supplier attributes (from dim_supplier)
supplier_name VARCHAR(200) NOT NULL,
supplier_carbon_tier VARCHAR(1) NOT NULL,
-- Measures
quantity INTEGER NOT NULL,
unit_price_usd DECIMAL(12, 4) NOT NULL,
line_revenue_usd DECIMAL(14, 4) NOT NULL,
line_carbon_kg DECIMAL(10, 3)
);
-- Partition by order_date for efficient date-range queries
-- (BigQuery: PARTITION BY order_date, Snowflake: CLUSTER BY order_date)
CREATE INDEX idx_obt_sales_order_date ON obt_sales(order_date);
CREATE INDEX idx_obt_sales_customer ON obt_sales(customer_id);
CREATE INDEX idx_obt_sales_product ON obt_sales(product_category);
Querying the OBT
No joins — filter and aggregate directly:
-- Revenue by product category and month
SELECT
product_category,
order_month_name,
order_year,
SUM(line_revenue_usd) AS total_revenue
FROM obt_sales
WHERE order_year = 2025
GROUP BY product_category, order_month_name, order_year
ORDER BY order_year, order_month, product_category;
Prompts for TalkingSchema
Generate OBT from star schema
Using the current GSSC star schema, generate a One Big Table for sales analytics.
Requirements:
- Table name: obt_sales
- Grain: one row per sales order line item (same as fact_sales)
- Inline all attributes from: dim_date, dim_customer, dim_product,
dim_warehouse, dim_supplier
- Include all measures from fact_sales: quantity, unit_price_usd,
line_revenue_usd, line_carbon_kg
- Use descriptive column names (e.g., customer_name, product_category)
- Add indexes on order_date, customer_id, product_category
Generate dbt model to build OBT
Write a dbt model that builds obt_sales from the star schema.
Source tables: fact_sales, dim_date, dim_customer, dim_product,
dim_warehouse, dim_supplier.
Join logic:
- fact_sales to each dimension on the surrogate key
- Select the dimension attributes to inline (no SCD filtering — use current only)
- Include incremental logic using order_date_key as the partition key
Frequently Asked Questions
Should OBT replace my star schema?
Usually not. OBT works best as a serving layer or data mart built on top of a star schema (or raw/staging layer). The star schema provides governance and conformed dimensions; the OBT simplifies consumption for specific use cases. Building OBT directly from raw sources is possible but loses the benefits of dimensional modeling.
How do I handle dimension changes in OBT?
OBT does not handle SCD elegantly. Options: (1) Current state only — rebuild the OBT periodically with the latest dimension attributes; historical reports show current attributes, not point-in-time. (2) Snapshot columns — add effective_from / effective_to to the OBT if you need history; this widens the table further. (3) Keep star schema for historical — use OBT only for current-state dashboards.
Does OBT work with BigQuery and Snowflake?
Yes. Columnar storage means scanning a wide table is efficient — only the columns in your SELECT are read. Partition by date (e.g., order_date) and cluster by common filter columns (e.g., customer_id, product_category) for best performance. OBT is well-suited to cloud warehouse economics where storage is cheap and compute is billed per query.