If you’re choosing between Fabric Warehouse and Fabric Lakehouse, you’re not really choosing a storage format—you’re choosing the default way your team will build, transform, secure, and serve data.
Here’s a decision framework that works in the real world, even when the honest answer is “we’ll use both.”
Step 1: Start with the consumer, not the technology
Ask: What’s the primary outcome for the next 4–8 weeks?
- A single source of truth for reporting (finance/ops KPIs) with fast BI delivery → bias toward Warehouse
- A scalable engineering foundation (ingestion, transformations, experimentation, mixed data types) → bias toward Lakehouse
- Both: engineering foundation + governed reporting layer → plan for Lakehouse → Warehouse (the most common pattern)
Why this matters: the “right” choice is the one that reduces friction for the team delivering value now, while keeping you safe from rework later.
Step 2: Match the tool to the team’s working style (skills win)
Fabric gives you multiple experiences; the fastest path is usually the one your builders already know.
Choose Warehouse-first if:
- Your BI/data team is SQL-first
- Your transformations are mostly ELT in T-SQL
- Your priority is dimensional modeling, governed metrics, and stable reporting
Choose Lakehouse-first if:
- Your team is Spark/notebook-first (Python/Scala), or you already operate like a data engineering team
- You need to do heavier transformations, complex pipelines, or data science workflows
- You’re handling semi-structured/unstructured data (JSON, logs, files) as a first-class citizen
Quick litmus test: If the people building your pipelines live in notebooks and think in dataframes, Lakehouse will feel natural. If they live in SQL and think in star schemas, Warehouse will feel natural.
Step 3: Decide based on the shape and messiness of your data
Not all data behaves the same.
- Mostly structured (ERP tables, finance, inventory, master data) → Warehouse is usually the fastest route to reliable reporting
- Mixed types (IoT telemetry, machine logs, JSON exports, files, images, event streams) → Lakehouse is typically the better landing zone
- High-volume event data that needs shaping before it’s “report-ready” → Lakehouse for engineering, then promote curated outputs into Warehouse
A practical rule: If you need a place where raw + curated + experimental can coexist cleanly, Lakehouse is the better “workbench.” If you need a place that’s curated-by-default for business consumption, Warehouse is your “serving counter.”
Step 4: Pick the “serving layer” deliberately (this prevents dashboard chaos)
Most teams get burned because they don’t define where “truth” lives.
Ask: Where will business users and BI models get their data from?
- If your goal is consistent KPIs, fewer dashboards, fewer conflicting numbers, you want a clear serving layer.
- In many organizations, Warehouse is the simplest and cleanest serving layer because it naturally fits SQL-based modeling and BI consumption patterns.
Even if you engineer everything in a Lakehouse, you can still choose to serve curated, governed tables (and dimensional models) through a Warehouse so reporting becomes predictable.
Step 5: Stress-test with constraints (these often decide it)
Now run your situation through these constraint checks:
A) Do you need multi-table transactional behavior or highly relational modeling as a core requirement?
If yes, that often pushes you toward Warehouse for the curated serving layer.
B) Do you need rapid time-to-value for BI with minimal platform fiddling?
If yes, bias toward Warehouse-first (especially for “reporting stabilization” projects).
C) Do you need advanced engineering workflows (notebooks, complex transformations, feature engineering)?
If yes, bias toward Lakehouse-first.
D) Do you expect lots of ad-hoc exploration, landing messy data, and iterating fast?
If yes, Lakehouse is typically the safer sandbox.
E) Do you need a strong separation of responsibilities?
- Data engineering owns raw/curated pipelines → Lakehouse
- BI team owns semantic definitions + reporting layer → Warehouse
This separation is a huge accelerator for teams that currently have “everyone changing everything.”
Step 6: The most common answer: “Lakehouse for engineering, Warehouse for serving”
If you’re stuck, this default architecture is hard to regret:
- Land raw data (including files and semi-structured) in the Lakehouse
- Transform and curate into clean Delta tables in the Lakehouse
- Promote a governed subset into the Warehouse as the reporting/serving layer
- Build Power BI semantic models on top of the serving layer so KPIs are standardized
This gives you:
- Engineering flexibility upstream
- BI stability downstream
- A clear place where “truth” is defined and protected
A 2-minute decision cheat sheet
| Choose this | When your priority/need is… |
| --- | --- |
| Fabric Warehouse | Fast, reliable BI outcomes (stable reporting quickly); SQL-first development (T-SQL-centric workflows); dimensional modeling + standardized KPIs (a clear "one version of truth"); a curated serving layer with strong governance habits |
| Fabric Lakehouse | Mixed data types (files + tables; semi/unstructured like JSON/logs); Spark/notebooks + heavier engineering work (data engineering + DS workflows); experimentation, feature engineering, advanced transforms; a scalable medallion-style foundation (raw → curated layers) |
| Both (Lakehouse → Warehouse) | Engineering + business-ready reporting (flexibility upstream, stability downstream); clear ownership boundaries (engineering builds/curates; BI serves/standardizes); messy data → curated truth without rework (promote governed subsets to serve) |
What a Fabric Lakehouse is (and what it’s best at)
A Fabric Lakehouse is the place in Microsoft Fabric designed for data engineering-style work: landing data (including messy data), transforming it at scale, and working in a way that’s natural for teams who use notebooks, Spark, and files—while still supporting tables through Delta.
At a practical level, think of the Lakehouse as your workbench:
- It’s where you can keep raw + curated + experimental data close together.
- It’s where you can iterate quickly as you learn what the data really looks like.
- It’s where you can build repeatable pipelines that turn “whatever we get from source systems” into something usable.
What makes it a “lakehouse” in Fabric terms
In classic data architecture, a “data lake” often meant “a place to dump files,” and a “data warehouse” meant “a structured SQL system for analytics.” A lakehouse aims to blend the two: files + tables, engineering + analytics, flexibility + structure.
In Fabric, the Lakehouse gives you:
- A home for files (raw extracts, JSON, logs, parquet/csv, etc.)
- A home for Delta tables (table format that supports reliable reads/writes and scalable analytics)
- A strong development experience for Spark notebooks and jobs (where many teams do heavy transformations)
What the Fabric Lakehouse is best at
1) Landing “real world” data (including messy and semi-structured)
Manufacturing, operations, and modern apps rarely hand you perfectly modeled relational tables. You often get:
- JSON exports from systems
- log-like event streams
- machine/IoT telemetry
- “flat” files from partners or plants
- inconsistent schemas between sites
The Lakehouse is ideal as a landing zone where you can keep the raw data as-is, then progressively standardize it.
2) Heavy transformations and engineering workflows
When your transformations go beyond a few SQL statements—think complex parsing, windowing, sessionization, deduplication, enrichment, or joining event streams—the Lakehouse is usually the smoother choice.
It’s also where teams typically implement:
- medallion-style layering (raw → cleaned → curated)
- reusable transformation logic (so you don’t re-implement the same business rules in every report)
- data quality checks (e.g., “reject rows with invalid part numbers” or “flag missing work center”)
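A data quality rule like "reject rows with invalid part numbers" can be sketched as a simple accept/reject pass over curated rows. This is an illustrative plain-Python sketch, not Fabric-specific code; the column names and the `PN-####` part-number format are assumptions for the example.

```python
# Illustrative data-quality pass for a curated layer: hard rule rejects
# rows with an invalid part number, soft rule flags a missing work center.
# Field names and the PN-#### format are hypothetical.
import re

PART_NUMBER_PATTERN = re.compile(r"^PN-\d{4}$")

def apply_quality_rules(rows):
    """Split rows into accepted and rejected; flag soft issues on accepted rows."""
    accepted, rejected = [], []
    for row in rows:
        if not PART_NUMBER_PATTERN.match(row.get("part_number", "")):
            rejected.append({**row, "reason": "invalid part number"})
            continue
        # Soft rule: keep the row, but flag it for follow-up.
        accepted.append({**row, "missing_work_center": row.get("work_center") is None})
    return accepted, rejected

rows = [
    {"part_number": "PN-1001", "work_center": "WC-7", "qty": 40},
    {"part_number": "PN-1002", "work_center": None, "qty": 12},
    {"part_number": "BAD-99", "work_center": "WC-3", "qty": 5},
]
accepted, rejected = apply_quality_rules(rows)
```

The point of centralizing rules like this in the Lakehouse is that every downstream table inherits the same definition of "valid," instead of each report re-filtering differently.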
3) Advanced analytics and data science readiness
If you want to support:
- experimentation and feature engineering
- model training workflows
- notebook-driven exploration
…a Lakehouse-first setup removes friction, because it’s already aligned with those working styles.
4) Keeping flexibility without losing structure
A common misconception is: “Lakehouse = chaos.” It doesn’t have to be.
Used well, the Lakehouse becomes a structured engineering space:
- Raw files remain available for traceability and reprocessing
- Curated Delta tables become the stable, reusable backbone for downstream use
- You can promote only the “trusted” outputs to the layer you use for business reporting
How teams typically use a Lakehouse in Fabric (a simple mental model)
Most successful implementations treat the Lakehouse as the place to do three jobs:
- Ingest: bring data in from ERP/MES/quality/IoT/apps/files, and keep a copy in a raw format.
- Curate: clean it (types, null handling, standardization), reconcile keys (parts, work centers, plants), and create trustworthy tables.
- Prepare for serving: create "consumption-ready" tables that downstream layers (often a Warehouse + semantic model) can rely on without rework.
When a Lakehouse is the wrong first move
A Lakehouse can still be the best foundation, but it’s not always the best starting point.
If your immediate goal is to stabilize reporting and you have:
- a SQL-heavy BI team,
- mostly structured data,
- a tight timeline to consolidate KPIs,
…starting with a Warehouse serving layer can be faster—while still using a Lakehouse upstream if needed.
Bottom line
Choose a Fabric Lakehouse when you need a place that’s optimized for:
- engineering-heavy transformations
- mixed and messy data
- notebook/Spark workflows
- building a scalable foundation that can evolve as new sources and use cases appear
And if your end goal is consistent KPIs and fewer conflicting dashboards, the Lakehouse is often the upstream engine—with a governed serving layer (commonly Warehouse) downstream.
What a Fabric Warehouse is (and what it’s best at)
A Fabric Warehouse is the Fabric experience built for SQL-first analytics and BI delivery. If the Lakehouse is your engineering workbench, the Warehouse is your curated serving counter—the place you put trusted, business-ready data so reporting is fast, consistent, and maintainable.
In practice, teams use a Warehouse to:
- build and maintain clean, governed analytic tables
- model dimensions and facts (star schemas) for reporting
- standardize KPIs so “the number” means the same thing everywhere
- support BI users with predictable performance and a familiar SQL workflow
What makes it a “warehouse” in Fabric terms
A data warehouse isn’t just “data in tables.” It’s a commitment to structure, governance, and stability.
In Fabric, the Warehouse is designed around:
- a T-SQL-centric development experience
- curated tables intended for analytics consumption
- patterns that align naturally with Power BI semantic models and enterprise reporting
So rather than being the place where you experiment and reshape raw data endlessly, it’s typically where you publish the datasets you’re ready to stand behind.
What the Fabric Warehouse is best at
1) A governed “single source of truth” for reporting
If your organization has:
- multiple dashboards saying different things
- duplicated logic across reports
- KPI definitions that vary by department (“OEE” is never just one thing)
…a Warehouse is often the best anchor for standardization.
You can centralize:
- the canonical tables used for reporting
- KPI logic and dimensional structures
- consistent naming, grain, and business rules
The outcome is less “dashboard sprawl” and more trustworthy metrics.
2) SQL-first productivity (especially for BI-heavy teams)
For teams that live in SQL—BI developers, analytics engineers, data analysts—the Warehouse typically reduces friction:
- fewer context switches into notebooks
- cleaner handoff between “data model” and “report model”
- easier collaboration using established SQL conventions
If your near-term plan is “deliver 10 critical reports correctly and fast,” the Warehouse experience is purpose-built for that.
3) Dimensional modeling and BI-friendly structures
Warehouses naturally align with:
- star schemas (facts + dimensions)
- conformed dimensions (e.g., one product hierarchy used everywhere)
- slowly changing dimensions (where appropriate)
- stable grains (e.g., “production events” vs “daily production summary”)
These structures make Power BI models simpler, measures easier to validate, and performance easier to manage.
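The "stable grain plus dimension lookup" idea can be shown in miniature. This is a plain-Python sketch of the join pattern, not Warehouse code; the table and column names (`dim_line`, `fact_daily_production`) are invented for illustration.

```python
# Minimal star-schema illustration: a fact table at a stable grain
# (one row per line per day) joined to a dimension by key.
# Names are invented for the example.
dim_line = {
    "L1": {"line_name": "Assembly 1", "plant": "Plant A"},
    "L2": {"line_name": "Assembly 2", "plant": "Plant A"},
}

fact_daily_production = [
    {"date": "2024-05-01", "line_id": "L1", "units": 480, "scrap": 12},
    {"date": "2024-05-01", "line_id": "L2", "units": 505, "scrap": 9},
]

def enrich(fact_rows, dim):
    """Inner-join style lookup: attach dimension attributes to each fact row."""
    return [
        {**row, **dim[row["line_id"]]}
        for row in fact_rows
        if row["line_id"] in dim
    ]

report_rows = enrich(fact_daily_production, dim_line)
```

Because the grain is fixed and the dimension is shared, every report built on `report_rows` slices the same numbers the same way.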
4) When you need relational/transactional-style guarantees in the analytics layer
Some workloads are easiest when you can rely on stronger relational behavior—for example:
- maintaining multiple related tables where consistency across them matters
- updating curated structures in controlled ways
If that’s central to your use case, a Warehouse serving layer can be the safer choice than trying to do everything in a more free-form engineering space.
How teams typically use a Warehouse in Fabric (a simple mental model)
A clean implementation usually treats the Warehouse as the last mile:
- Receive curated data: from upstream engineering work (often a Lakehouse), bring in validated, standardized tables.
- Model for analytics: build star schemas, business-friendly naming, and consistent grains.
- Serve Power BI: put semantic models and dashboards on top of this layer so business users hit stable, governed data—not raw extracts.
When a Warehouse is the wrong first move
Warehouse-first can be great for BI stabilization, but it’s not ideal if:
- you’re dealing with lots of files and semi-structured data that needs heavy shaping
- you expect frequent schema shifts and exploratory work
- you’re building advanced engineering/ML workflows where notebooks are the primary interface
In those cases, a Lakehouse-first approach upstream will usually save time and reduce rework—then you can still use the Warehouse as the serving layer once the data is trustworthy.
Bottom line
Choose a Fabric Warehouse when you need:
- a BI-ready serving layer
- SQL-first development and fast reporting delivery
- dimensional modeling and standardized KPIs
- stable, governed datasets that reduce “multiple versions of truth”
And if your data is messy upstream, the Warehouse still fits perfectly as the destination: curate in Lakehouse, serve in Warehouse.
Side-by-side comparison: Lakehouse vs Warehouse in Fabric
Most teams don’t fail because they picked the “wrong” option—they fail because they picked one option for every layer. This comparison focuses on what you gain and what you trade off when you choose each as your default build experience in Fabric.
Quick comparison table (high-level)
| Decision factor | Fabric Lakehouse (best when…) | Fabric Warehouse (best when…) |
| --- | --- | --- |
| Primary development style | You want Spark/notebooks and engineering-first workflows | You want T-SQL and BI-first workflows |
| Data types | You need files + tables (semi/unstructured included) | You’re primarily serving curated, structured tables |
| Best “role” in architecture | Landing + transformation + curation (workbench) | Serving + modeling + KPI standardization (counter) |
| Time-to-value for BI | Great once curated, but can drift if used as the reporting layer | Typically fastest path to stable reporting outputs |
| Dimensional modeling | Possible, but often not the most natural “center of gravity” | A natural fit for facts/dimensions + conformed dimensions |
| Iteration & experimentation | Excellent for exploration and shifting schemas | Best when structures are defined and stable |
| Team fit | Data engineers / DS teams | BI devs / analytics engineers / SQL-heavy teams |
1) Interface & workflow: how your builders actually work
Lakehouse feels best when:
- Your team thinks in pipelines + notebooks
- Transformations are complex (parsing, enrichment, heavy joins, advanced logic)
- You want a place where raw and intermediate artifacts can live without forcing “final form” too early
Warehouse feels best when:
- Your team wants to move quickly with SQL
- You need a straightforward path to curated analytics tables
- Your developers spend most of their time in semantic models and reports and want the data layer to match that
Practical takeaway: choose the experience that removes the most friction for the people building and maintaining the system—not just the people consuming dashboards.
2) Data reality: “messy first” vs “curated first”
Lakehouse is built for messy reality
- You can land data as files, keep raw history, and still build tables
- It’s forgiving when schemas change or sources behave inconsistently
- It supports the engineering habit of “capture now, model once we understand”
Warehouse is built for curated reality
- It shines when data is already trustworthy—or when you are committed to making it trustworthy before it’s used
- It encourages clean structures and consistent grains
- It’s ideal for publishing datasets you want the business to rely on
Practical takeaway: if your sources are volatile, start upstream in Lakehouse. If your goal is stable KPIs, serve from Warehouse.
3) Transformations: where business logic should live
This is where “dashboard chaos” is born: business logic duplicated across reports.
Lakehouse excels for transformation-heavy logic
- Great for building canonical cleaned/curated tables
- Strong when transformations look like “engineering work”
- Better when your logic includes complex parsing or multi-step processing
Warehouse excels for analytics modeling logic
- Great for structuring curated data into facts and dimensions
- Great for KPI-ready tables designed for consumption
- Makes it easier to enforce consistent naming and grains that BI teams depend on
Practical takeaway: Use Lakehouse to standardize and cleanse. Use Warehouse to model and serve.
4) BI consumption: what Power BI teams feel day-to-day
If BI is the main workload, you’ll care about:
- predictable refreshes
- consistent KPI definitions
- reusable datasets across departments
- fewer duplicated models and measures
A Warehouse serving layer typically supports those goals better because it pushes you toward a deliberate curated layer and dimensional structures.
A Lakehouse-only approach can absolutely work, but it tends to require more discipline to prevent:
- many intermediate tables becoming “production”
- multiple teams consuming different versions of “curated”
- logic creeping into reports instead of living centrally
Practical takeaway: when you’re trying to reduce dashboard sprawl, Warehouse is often the simplest forcing function.
5) Governance & ownership: who is responsible for what?
A durable Fabric setup often has two owners:
- Data engineering owns ingestion + curation
- BI/analytics owns serving tables + semantic models + KPI definitions
Lakehouse supports engineering ownership
- Raw ingestion, transformations, data quality, standardization
- Experimentation without breaking the reporting layer
Warehouse supports analytics ownership
- Publishing “certified” tables
- Controlling changes to KPIs and grains
- Supporting a predictable contract to downstream consumers
Practical takeaway: if you want clean handoffs and less chaos, Lakehouse + Warehouse is the cleanest ownership boundary.
6) CI/CD and change management: how you avoid breaking reports
Regardless of tool choice, change management is where systems get expensive.
Lakehouse change dynamics
- Faster iteration, more schema drift upstream
- Great for evolving pipelines—but that flexibility can surprise BI consumers if Lakehouse becomes the serving layer
Warehouse change dynamics
- Encourages stable schemas and controlled changes
- Easier to treat as an “interface contract” for Power BI and downstream users
Practical takeaway:
Keep fast-changing stuff upstream (Lakehouse). Keep stable contracts downstream (Warehouse).
7) Cost & operations: what usually drives effort (and spend)
Costs in Fabric aren’t just compute—they’re also people time: debugging refresh failures, reconciling KPI disputes, and maintaining redundant logic.
Lakehouse can reduce ops pain when
- it prevents constant re-ingestion and reprocessing by keeping raw history
- engineering pipelines are centralized and reusable
Warehouse can reduce ops pain when
- it reduces BI complexity with cleaner models
- it prevents metric drift by forcing a curated contract
Practical takeaway: the cheapest design is often the one that minimizes rework and KPI disputes—not the one with the fewest components.
The default recommendation that works for most teams
If you don’t have a strong reason to go “all-in” on one, use the pattern that fits how organizations actually operate:
- Lakehouse for landing + transforming + curating (engineering)
- Warehouse for serving + dimensional modeling + KPI standardization (BI)
That combination gives you the best chance to move fast early and avoid rework when adoption grows.
The “use both” architecture: Lakehouse for engineering, Warehouse for serving
If you’re trying to move fast and avoid rework, this is the most reliable Fabric pattern:
- Use a Lakehouse to ingest, clean, and curate data (engineering work).
- Use a Warehouse to publish governed, BI-ready structures (serving work).
It sounds like “more pieces,” but it usually reduces complexity because each layer has a clear job—and your BI consumers stop depending on upstream tables that are constantly changing.
Why “both” is often the best answer
Teams typically pick a single option and then hit one of these walls:
- Lakehouse-only wall: engineers iterate quickly, but BI teams struggle with shifting schemas, “intermediate” tables becoming production, and KPI logic leaking into reports.
- Warehouse-only wall: BI is stable, but ingestion and complex transformations become painful—especially with semi-structured data, files, or engineering-heavy workloads.
Using both lets you:
- keep upstream flexible without breaking downstream consumers
- enforce a “contract” for reporting (stable tables, stable grains)
- separate responsibilities cleanly between engineering and analytics teams
The reference pattern (simple and durable)
Think in three layers. The names can vary, but the responsibilities should not:
- Raw / Landing (Lakehouse)
- Curated / Conformed (Lakehouse)
- Serving / Semantic-ready (Warehouse)
Layer 1: Raw / Landing (Lakehouse)
This is where you land data as it arrives, with minimal assumptions.
What goes here:
- full extracts from ERP/MES systems
- IoT telemetry files or streaming landings
- CSV/JSON partner feeds
- logs and event data
How to treat it:
- keep it immutable (append-only when possible)
- store enough metadata to trace where it came from and when
- don’t over-model—your future self will thank you
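The "immutable, traceable, minimally modeled" habit for the landing layer can be sketched as a small wrapper that records lineage metadata alongside the untouched payload. This is an illustrative sketch only, not a Fabric API; the record shape and field names are assumptions.

```python
# Sketch of the raw-landing habit: keep the payload as-is, append-only,
# with enough metadata to trace where it came from and when.
# Record shape and field names are invented for the example.
import hashlib
import json
from datetime import datetime, timezone

def land_raw(payload: bytes, source_system: str) -> dict:
    """Build an append-only landing record around the raw bytes."""
    return {
        "source_system": source_system,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "content_sha256": hashlib.sha256(payload).hexdigest(),
        "raw": payload.decode("utf-8"),  # stored untouched; no modeling yet
    }

record = land_raw(b'{"order": 123, "qty": 7}', source_system="erp")
parsed = json.loads(record["raw"])  # the original payload survives intact
```

The hash and timestamp cost almost nothing now, and make reprocessing and "where did this row come from?" questions answerable later.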
Layer 2: Curated / Conformed (Lakehouse)
This is where engineering turns “raw” into “reliable.”
What happens here:
- type casting, null handling, deduplication
- key reconciliation (part IDs, plant codes, work centers)
- standardization across sites/systems
- data quality rules (reject/flag anomalies)
- creation of reusable “golden” tables that represent business concepts
This layer is your engineering asset: it’s reusable and scalable, but it can still evolve.
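The curation steps above (casting, null handling, deduplication, key reconciliation) can be sketched in a few lines. This is plain Python for illustration, not Spark; the field names and the plant-code mapping are assumptions for the example.

```python
# Illustrative curation pass: cast types, handle nulls, normalize plant
# codes across sites, and deduplicate on a business key.
# Field names and PLANT_CODE_MAP are invented for the sketch.
PLANT_CODE_MAP = {"PLT-A": "A", "PlantA": "A", "B": "B"}

def curate(raw_rows):
    seen = set()
    curated = []
    for row in raw_rows:
        key = (row["order_id"], row["line_no"])
        if key in seen:  # dedupe on the business key
            continue
        seen.add(key)
        curated.append({
            "order_id": str(row["order_id"]),
            "line_no": int(row["line_no"]),
            "qty": int(row["qty"] or 0),  # null handling with an explicit default
            "plant": PLANT_CODE_MAP.get(row["plant"], row["plant"]),
        })
    return curated

raw = [
    {"order_id": 1, "line_no": "1", "qty": "10", "plant": "PLT-A"},
    {"order_id": 1, "line_no": "1", "qty": "10", "plant": "PLT-A"},  # duplicate
    {"order_id": 2, "line_no": "1", "qty": None, "plant": "PlantA"},
]
curated = curate(raw)
```

The same logic written once here replaces the five slightly different versions that otherwise accumulate inside individual reports.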
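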
Layer 3: Serving / BI-ready (Warehouse)
This is what the business should actually use.
What goes here:
- dimensional models (facts + dimensions)
- summary tables designed for reporting grains (daily production, weekly scrap, downtime by line)
- certified KPI tables (the definitions you want everyone to agree on)
This layer is your contract:
- it changes less often
- it’s controlled
- it’s built to make Power BI semantic models simpler and more consistent
How data moves between them (the “promotion” mindset)
Instead of letting everyone query whatever looks convenient, treat movement from Lakehouse → Warehouse as a promotion:
A dataset is ready to promote when:
- it has a clear owner
- the grain is defined (“one row per production order per day”)
- quality checks pass
- KPI logic is documented (even briefly)
- downstream reports won’t break next week because someone “improved the pipeline”
This single discipline eliminates a huge percentage of “why did the number change?” firefights.
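The promotion checklist above can even be automated as a simple gate. This is a sketch under assumptions: the check names mirror the bullets, but the dataset-metadata shape is invented for the example.

```python
# The Lakehouse -> Warehouse promotion checklist as a simple gate.
# The metadata fields mirror the checklist bullets; the shape is invented.
REQUIRED = ("owner", "grain", "quality_checks_passed", "kpi_logic_documented")

def ready_to_promote(dataset):
    """Return (ok, failed_checks) for a candidate dataset."""
    failures = [field for field in REQUIRED if not dataset.get(field)]
    return (not failures, failures)

candidate = {
    "name": "fact_daily_production",
    "owner": "bi-team",
    "grain": "one row per production order per day",
    "quality_checks_passed": True,
    "kpi_logic_documented": False,
}
ok, failures = ready_to_promote(candidate)
```

Even a checklist this small, enforced consistently, keeps half-finished tables from quietly becoming "production."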
What to build first (so you don’t boil the ocean)
A practical implementation sequence that works well:
- Pick 1–2 high-value reporting use cases
- e.g., production output + scrap, downtime/OEE, on-time delivery, inventory accuracy
- Land the required sources into Lakehouse (raw)
- Create curated tables in Lakehouse
- only what you need for those use cases
- Publish a small serving model in Warehouse
- one fact table + key dimensions, or a clean summary table
- Build one semantic model in Power BI
- prove that the same model can power multiple dashboards
The goal is not “complete platform”—it’s one trusted pipeline that you can repeat.
Ownership model that keeps teams sane
This architecture also makes responsibility obvious:
- Data engineering owns (Lakehouse): ingestion, raw retention, transformations, conformance, quality checks
- Analytics/BI owns (Warehouse): dimensional models, KPI tables, semantic models, reporting contracts
- Business owners own: KPI definitions and acceptance criteria (what “scrap rate” includes/excludes)
When ownership is fuzzy, every dashboard becomes a separate data product. When ownership is clear, dashboards become views of the same truth.
The payoff: faster delivery, fewer dashboards, fewer arguments
When you use Lakehouse for engineering and Warehouse for serving:
- engineering can move quickly without breaking downstream consumers
- BI can build consistent models faster
- your organization gets closer to one version of the truth—and you stop multiplying dashboards just to reconcile numbers
This is the architecture that scales best when adoption grows, especially in environments with multiple plants, multiple source systems, and multiple teams touching the data.
Lakehouse vs. Warehouse vs. both: 3 manufacturing-ready examples
The easiest way to choose between Lakehouse, Warehouse, or both is to walk through the kinds of data manufacturing teams actually deal with: ERP structure, MES events, quality records, maintenance logs, and sometimes high-volume telemetry. Below are three common scenarios with a clear recommendation for where each piece belongs in Fabric—and why.
Example 1: ERP-driven finance + inventory reporting (structured, KPI-sensitive)
Typical sources
- ERP (e.g., orders, shipments, invoices, inventory movements, BOMs, item master)
- Reference/master data (plants, work centers, product hierarchies, customers/suppliers)
What the data looks like
- Mostly structured tables
- Stable schemas (compared to telemetry/logs)
- High pressure for “one number” (e.g., inventory value, margin, on-time delivery)
Recommended Fabric pattern: Both (Lakehouse → Warehouse)
- Lakehouse (engineering):
- Land raw ERP extracts (snapshot or incremental)
- Standardize keys (item IDs, plant codes), handle late-arriving updates
- Create conformed, reusable curated tables (e.g., “cleaned inventory movements”)
- Warehouse (serving):
- Publish facts/dimensions for reporting (e.g., FactInventoryMovement, DimItem, DimPlant, DimDate)
- Create certified KPI tables (e.g., “Inventory turns” definitions)
- Make it the default source for Power BI models
Why this works: ERP reporting becomes messy when KPI logic is duplicated across reports (“inventory on hand” vs “available” vs “valuated”). A Warehouse serving layer is the simplest way to enforce consistent grains + consistent definitions.
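A certified KPI table means the calculation lives in exactly one place. As a sketch, here is the standard inventory-turns formula (annual cost of goods sold divided by average inventory value) defined once; the input numbers are made up for illustration.

```python
# One certified KPI definition, reused everywhere, instead of each
# report re-deriving it. Standard formula; the figures are illustrative.
def inventory_turns(cogs, avg_inventory_value):
    """Certified definition: annual COGS / average inventory value."""
    if avg_inventory_value <= 0:
        raise ValueError("average inventory value must be positive")
    return cogs / avg_inventory_value

turns = inventory_turns(cogs=1_200_000, avg_inventory_value=300_000)  # 4.0
```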
Example 2: MES production + downtime + OEE (event-heavy, needs shaping)
Typical sources
- MES events (start/stop, scrap/rework, production counts, changeovers)
- Line sensors or PLC-derived events (sometimes)
- Operator reason codes (often messy and inconsistent)
What the data looks like
- Event streams and time-based records
- Lots of joins to master data (line, shift, product, work order)
- Data quality issues (missing reason codes, duplicate events, clock drift)
Recommended Fabric pattern: Both (Lakehouse → Warehouse)
- Lakehouse (engineering):
- Land raw events (keep them immutable for traceability)
- Clean and reconcile: dedupe events, standardize timestamps, align shifts
- Build curated “production intervals” and “downtime intervals” tables
- Derive base metrics (runtime, planned vs unplanned downtime, scrap counts)
- Warehouse (serving):
- Publish BI-ready reporting tables at stable grains:
- daily/shift summaries by line
- downtime by reason category
- OEE component summaries (Availability/Performance/Quality)
- Lock KPI definitions so OEE doesn’t change by dashboard
Why this works: OEE is a classic “dashboard chaos” metric: every team calculates it slightly differently. Lakehouse is ideal for the heavy lifting (interval creation, event shaping). Warehouse is ideal for the certified, consistent reporting layer.
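Locking the OEE definition means computing it once, from curated interval and count tables, using the standard decomposition OEE = Availability × Performance × Quality. The sketch below uses the well-known formulas; the input values are illustrative.

```python
# Standard OEE decomposition, computed in one certified place so the
# number does not vary by dashboard. Inputs are illustrative.
def oee(planned_minutes, run_minutes, ideal_cycle_min, total_count, good_count):
    availability = run_minutes / planned_minutes          # uptime vs plan
    performance = (ideal_cycle_min * total_count) / run_minutes  # speed vs ideal
    quality = good_count / total_count                    # good units ratio
    return availability * performance * quality

score = oee(
    planned_minutes=480, run_minutes=400,
    ideal_cycle_min=0.5, total_count=700, good_count=665,
)
```

When every dashboard calls this one definition, "why is OEE different on your report?" stops being a weekly meeting.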
Example 3: Maintenance + reliability + IoT telemetry (semi-structured + high volume)
Typical sources
- CMMS/EAM (work orders, preventive maintenance schedules, asset registry)
- Condition monitoring systems (vibration, temperature, pressure)
- IoT telemetry files/streams (often JSON, parquet, or vendor-specific formats)
What the data looks like
- Mixed: structured work orders + semi-structured telemetry
- Very large volumes (telemetry can dwarf everything else)
- Evolving schemas (new sensors, new fields, firmware changes)
Recommended Fabric pattern: Lakehouse-first, Warehouse for curated consumption
- Lakehouse (engineering):
- Land telemetry as files and/or Delta tables
- Normalize sensor schemas, enrich with asset registry (asset IDs, location)
- Create curated feature tables (e.g., rolling averages, anomaly flags)
- Keep raw history for reprocessing when models improve
- Warehouse (serving):
- Publish only what most BI consumers need:
- daily asset health summary
- maintenance KPIs (MTBF, MTTR, PM compliance)
- anomalies by asset/line/plant
- Avoid pushing raw telemetry into BI models unless the use case truly requires it
Why this works: Telemetry is where Warehouse-only approaches often struggle. Lakehouse gives you a scalable way to manage volume and evolving schemas. Warehouse gives business users stable, digestible outputs without forcing them to swim in raw sensor data.
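The maintenance KPIs served from the Warehouse are good candidates for single certified definitions too. The sketch below uses the standard formulas (MTBF = operating time / failures, MTTR = repair time / repairs); the figures and field choices are illustrative.

```python
# Certified maintenance KPIs with one definition each.
# Standard formulas; the input values are made up for the sketch.
def mtbf(operating_hours, failure_count):
    """Mean time between failures = operating hours / number of failures."""
    return operating_hours / failure_count

def mttr(total_repair_hours, repair_count):
    """Mean time to repair = repair hours / number of repairs."""
    return total_repair_hours / repair_count

asset_mtbf = mtbf(operating_hours=720, failure_count=4)   # 180.0 hours
asset_mttr = mttr(total_repair_hours=10, repair_count=4)  # 2.5 hours
```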
What these examples have in common
Across ERP, MES, and maintenance/IoT, the winning pattern is consistent:
- Lakehouse handles the reality of messy, high-volume, changing data and the engineering required to make it reliable.
- Warehouse provides the governed, BI-ready layer where KPI definitions are standardized and reused.
- The combination reduces rework and makes it easier to move from “we have data” to “we trust the number.”
Next step: choose your target architecture (and avoid dashboard chaos)
If you take one thing from the Warehouse vs. Lakehouse debate, let it be this: the goal isn’t picking the “right Fabric object.” The goal is designing a flow where data becomes trustworthy—and stays that way as new sources, new sites, and new dashboards appear.
1) Pick your default pattern (most teams should start here)
For most manufacturing organizations, the safest target architecture is:
- Lakehouse = engineering (landing, cleaning, standardizing, curating)
- Warehouse = serving (certified tables, dimensional models, KPI-ready outputs)
This creates a clear contract: BI uses the Warehouse, and the Lakehouse can evolve upstream without breaking the business.
2) Decide what gets “certified” (this is how you kill dashboard chaos)
Dashboard chaos happens when:
- multiple teams model the same metric differently,
- logic lives in reports instead of centrally,
- and nobody can answer “which dataset is the official one?”
So define—and enforce—three rules:
- Certified tables live in the Warehouse
- Each KPI has one owner and one definition
- Semantic models are built on the serving layer, not raw tables
When those rules are in place, you don’t need 40 dashboards to reconcile numbers—you need one set of metrics that everyone trusts.
3) Start small: one domain, one model, one source of truth
You don’t need a “full platform rollout” to get value. A strong first milestone looks like this:
- Choose one business domain (e.g., inventory, downtime/OEE, quality, maintenance)
- Ingest only the needed sources into the Lakehouse
- Build a curated dataset and publish a serving model in the Warehouse
- Create one Power BI semantic model that powers multiple dashboards
That’s the moment you move from “report building” to “analytics as a product.”
4) If you want a fast way to choose, use this shortlist
- If your pain is conflicting metrics and too many dashboards → Warehouse serving layer is non-negotiable.
- If your pain is messy data, many sources, and heavy transformations → Lakehouse upstream is your foundation.
- If you have both pains (most teams do) → use both, with a clear promotion path from curated to certified.
5) A practical next step you can execute this week
Create a one-page “target architecture brief” with:
- your top 2–3 KPIs (and who owns them)
- the systems of record (ERP/MES/CMMS/IoT)
- the intended layers (Lakehouse raw/curated, Warehouse serving)
- the first “certified” tables you’ll publish for BI
Do that, and you’ve already solved the hardest part: aligning the organization around one version of the truth—instead of letting every dashboard invent its own.
If you’d like a second opinion before you build, a lightweight way to de-risk the decision is a short architecture/KPI alignment session: map your top reporting pain points to a Lakehouse→Warehouse target design, identify the first certified tables, and flag the usual gotchas (ownership, grain, KPI definitions) before they turn into dashboard sprawl.
