Text-to-SQL without a semantic layer leads to hallucinations, inaccurate queries, and failed deployments. Here's why governed semantic layers matter.
You've seen the demos. A user types a natural language question into a chat interface, and within seconds, the system returns a perfectly formatted SQL query that answers their question. The LLM magic feels real, the promise feels inevitable, and your team is already imagining the productivity gains.
Then you deploy it to production.
Within days, you're drowning in hallucinated metrics, incorrect aggregations, and queries that run for minutes before timing out. Your finance team is getting revenue numbers that don't match the general ledger. Your product team is seeing user counts that contradict your warehouse. The executive dashboard breaks because the system confused "monthly active users" with "monthly user signups." Your data team is now spending more time validating AI-generated queries than they would have spent answering questions manually.
This isn't a failure of the LLM. It's a failure of architecture.
Text-to-SQL systems without a semantic layer are like asking someone to navigate a foreign city without a map, street signs, or a shared language. The LLM is incredibly powerful, but it's operating in a vacuum—without explicit definitions of what your metrics mean, how your tables relate to each other, or what the business rules are. The result is a system that feels intelligent but produces garbage.
A semantic layer fixes this. It's the bridge between natural language and your data warehouse. It's the Rosetta Stone that lets an LLM understand not just the structure of your database, but the meaning behind it.
Let's walk through why text-to-SQL without a semantic layer fails, what a semantic layer actually does, and how to build one that works in production.
Large language models are pattern-matching machines trained on vast amounts of text. They're phenomenally good at predicting what comes next in a sequence. But they have no inherent understanding of your business, your data model, or what your metrics actually mean.
When you ask an LLM to write SQL against your database, you're asking it to:
Without explicit guidance, the LLM will make assumptions. It will hallucinate table names that don't exist. It will guess at join logic. It will invent column names. It will apply the wrong aggregation function because the training data had a similar-sounding example.
Consider a simple question: "What was our revenue last month?"
Without a semantic layer, the LLM has to figure out:
transactions, orders, sales, invoices, or something else?amount column, or do you need to sum unit_price * quantity?The LLM might guess correctly 60% of the time. The other 40%, it will produce plausible-sounding SQL that runs without errors but returns completely wrong numbers. This is worse than an error—errors force you to investigate. Wrong numbers silently corrupt your decision-making.
You might think you can solve this by providing the LLM with better documentation. You document your schema, write out your business logic, include example queries, and feed it all to the prompt.
This works until it doesn't.
Documentation is unstructured. It's prose. It's ambiguous. When you write "revenue is the sum of all completed orders," you're creating a semantic artifact that a human can interpret in multiple ways. Is an order "completed" when it ships, when the customer receives it, or when payment clears? Does "all" include refunded orders? What about test orders?
An LLM reading this documentation will make a guess. If the guess is wrong, there's no systematic way to correct it. You can't debug documentation the way you can debug code.
Moreover, documentation rots. Your data model evolves, your business logic changes, and your documentation falls out of sync. The LLM is now operating on stale information.
A semantic layer is executable documentation. It's code that defines your metrics, dimensions, and relationships in a machine-readable format that the LLM can actually use.
A semantic layer is a structured, governed definition of your business metrics and data relationships. It sits between your raw database and your applications (including text-to-SQL systems). Instead of having an LLM guess at your data model, the semantic layer explicitly defines:
Metrics: Calculated values like revenue, customer count, or churn rate. A metric has a clear definition, a calculation method, and documented business logic.
Dimensions: Attributes that you filter or group by, like date, customer segment, or product category. Dimensions have standardized names and clearly defined values.
Relationships: How tables connect to each other. Instead of the LLM guessing at join logic, the semantic layer explicitly defines the foreign key relationships.
Filters and Business Rules: Constraints that should apply to certain queries. For example, "revenue should exclude test orders" or "active users are those who logged in within the last 30 days."
When an LLM has access to a semantic layer, it doesn't have to figure out the data model. It can focus on understanding the user's intent and translating that intent into a query that uses the semantic layer's pre-defined metrics and dimensions.
This is the difference between asking someone to navigate a city with a map versus without one. The map doesn't eliminate the need for navigation, but it makes the task vastly more tractable.
Let's walk through some concrete examples of how text-to-SQL fails without a semantic layer, and how a semantic layer prevents these failures.
Your CFO asks: "What's our gross margin by product line?"
Without a semantic layer, the LLM searches your database schema and finds a column called margin. It generates a query that selects product_line, margin and groups by product line.
The query runs successfully. It returns numbers. The CFO is satisfied.
Six weeks later, your accounting team discovers that the "gross margin" numbers you've been reporting are actually net margin, and they're off by 15% because the LLM picked the wrong column. The entire forecast is now wrong.
With a semantic layer, you've explicitly defined "gross_margin" as a metric: (revenue - cogs) / revenue. The LLM can't confuse it with net margin because the semantic layer has already codified the correct calculation. If the CFO asks for gross margin, the LLM generates a query that uses the defined metric. If the LLM tries to invent its own margin calculation, the semantic layer's type system catches the error.
Your product team asks: "How many customers did we acquire last month, broken down by acquisition channel?"
Without a semantic layer, the LLM has to figure out how to join the customers table with the acquisition_channels table. It guesses at the join key. Maybe it assumes customer_id exists in both tables. Maybe it assumes there's a channel_id foreign key.
The query runs. It returns numbers. But the join is wrong. Some customers have multiple acquisition channels, and the query is double-counting them. Other customers are missing because the join is too restrictive. The numbers are plausible, but they're wrong.
With a semantic layer, the relationship between customers and acquisition channels is explicitly defined. The LLM doesn't guess at join logic—it uses the pre-defined relationship. If there are edge cases (like customers with multiple channels), the semantic layer has already decided how to handle them. The query is correct by design.
Your operations team asks: "What's our average order value by region?"
Without a semantic layer, the LLM generates a query that calculates AVG(order_value) grouped by region.
But here's the problem: average order value should be calculated as SUM(revenue) / COUNT(distinct orders), not AVG(order_value). If you have orders of varying sizes, the two calculations give different results. The LLM picked the simpler aggregation without understanding the business meaning.
With a semantic layer, "average order value" is a defined metric with a specific calculation. The LLM doesn't get to choose the aggregation function—it uses the one that's defined in the semantic layer. The query is correct by definition.
A semantic layer transforms text-to-SQL from a risky guessing game into a reliable, governed process. Here's how:
Instead of having the LLM invent metrics, you define them once in the semantic layer. Each metric has:
When the LLM encounters a user question about a metric, it doesn't generate SQL from scratch. It looks up the metric in the semantic layer and uses the pre-defined calculation. This eliminates entire categories of errors.
Dimensions are the attributes you filter and group by. By defining them in the semantic layer, you ensure consistency:
The LLM can reference these standardized dimensions without ambiguity. When a user asks for "revenue by region," the LLM knows exactly which region dimension to use.
Instead of the LLM guessing at join logic, the semantic layer explicitly defines how tables relate:
The LLM uses these relationships to construct correct joins. It doesn't have to understand the intricacies of your data model—the semantic layer handles that.
Business rules that should apply to every query are baked into the semantic layer:
These rules are applied automatically, consistently, across all queries. The LLM doesn't have to remember them or apply them correctly—they're enforced by the semantic layer.
So how do you actually build a semantic layer that works with text-to-SQL? There are several approaches, each with trade-offs.
dbt's semantic layer approach provides a structured way to define metrics and dimensions in YAML. You define your metrics once, and they're available to any tool that integrates with dbt—including text-to-SQL systems.
The advantages:
The disadvantages:
Specialized semantic layer tools like Wren AI provide a dedicated platform for defining and governing your semantic model. These tools often include:
The advantages:
The disadvantages:
Many teams use dbt as the foundation (defining metrics and dimensions in dbt) and layer additional governance on top. For example, you might:
This approach gives you the best of both worlds: the version control and structure of dbt, plus custom governance tailored to your needs.
Once you have a semantic layer in place, here's how text-to-SQL actually works:
User Input: A user asks a natural language question: "What was our revenue last month by product category?"
Intent Recognition: The LLM parses the question and identifies:
Semantic Layer Lookup: The system looks up these elements in the semantic layer:
Query Generation: The LLM generates SQL using the semantic layer's definitions:
SELECT
products.category,
SUM(orders.amount) as revenue
FROM orders
JOIN products ON orders.product_id = products.id
WHERE orders.status = 'completed'
AND DATE_TRUNC('month', orders.created_date) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
GROUP BY products.category
ORDER BY revenue DESCValidation: The system validates the query against the semantic layer:
Execution: If validation passes, the query is executed against the warehouse.
Result Formatting: The results are formatted and returned to the user.
The key difference: the LLM isn't inventing the metric definition or the join logic. It's using pre-defined, governed definitions from the semantic layer. This dramatically reduces the surface area for errors.
Once you have a semantic layer in place, you can go further. Snowflake's research on agentic semantic model improvement shows how to use LLMs to continuously improve your semantic model.
The idea: when a text-to-SQL query returns unexpected results or when a user corrects an AI-generated query, that feedback can be used to improve the semantic layer itself. An agentic system can:
This creates a virtuous cycle: as your semantic layer improves, your text-to-SQL accuracy improves. As your text-to-SQL accuracy improves, you get better feedback to improve the semantic layer further.
You don't need to boil the ocean. Here's a pragmatic approach to implementing text-to-SQL with a semantic layer:
Start with the metrics that matter most:
Define these metrics explicitly in your semantic layer. Don't try to cover everything—focus on the 80/20 rule.
For each metric, identify the dimensions that matter:
Add these to the semantic layer and test the relationships.
Once you have a solid semantic layer foundation, implement text-to-SQL:
As usage grows, add governance:
Building and maintaining a semantic layer is non-trivial. It requires:
This is where managed platforms like D23 come in. D23 is built on Apache Superset with integrated semantic layer governance, AI-powered text-to-SQL, and expert data consulting.
Instead of building and maintaining these components yourself, D23 provides:
For data and engineering leaders at scale-ups and mid-market companies, this eliminates the operational burden of building and maintaining these systems yourself. You get the benefits of a semantic layer and text-to-SQL without the infrastructure overhead.
Even with a semantic layer, text-to-SQL projects can fail. Here are the most common pitfalls:
If your semantic layer doesn't cover the metrics and dimensions that users actually care about, they'll work around it. The LLM will revert to generating raw SQL, and you're back to the original problem.
Solution: Start with your most important metrics and expand iteratively. Don't try to cover everything at once.
If your semantic layer definitions drift out of sync with your actual data model, the LLM will generate incorrect queries.
Solution: Treat your semantic layer as code. Version control it, test it, and maintain it as part of your data infrastructure.
If your semantic layer definitions are cryptic or poorly documented, even humans can't understand them. The LLM certainly won't.
Solution: Document every metric and dimension. Explain the business logic, the calculation method, and the intended use cases.
If you don't test your text-to-SQL system, you won't catch errors until they corrupt your decision-making.
Solution: Implement automated testing. Validate that text-to-SQL queries return expected results. Monitor query accuracy in production.
The LLM is a tool, not a replacement for data governance. It can't fix a broken semantic layer.
Solution: Invest in your semantic layer. Make it accurate, complete, and well-governed. The LLM will then work reliably.
Implementing a semantic layer requires upfront investment. Why is it worth it?
Without a semantic layer, every new metric requires custom SQL, testing, and validation. With a semantic layer, new metrics are defined once and reused everywhere. This dramatically reduces the time it takes to build new dashboards and reports.
Metrics defined in a semantic layer are consistent across all tools and use cases. No more discrepancies between the executive dashboard and the data warehouse. No more debates about which number is "right."
With a semantic layer, your data team spends less time answering ad-hoc questions and more time building infrastructure. Text-to-SQL handles the routine questions; your team focuses on the complex ones.
When anyone can ask questions in natural language and get accurate answers immediately, decisions happen faster. No more waiting for the data team to write a query. No more wrong numbers corrupting your analysis.
Without a semantic layer, self-service BI leads to chaos—everyone writing their own SQL, generating their own metrics, and arriving at different answers. With a semantic layer, self-service BI is governed and reliable. Users can explore data without breaking things.
Text-to-SQL is genuinely powerful. It can democratize data access and accelerate decision-making. But only if it's built on a solid foundation.
That foundation is a semantic layer.
Without a semantic layer, text-to-SQL is a risky experiment that will fail spectacularly. With one, it's a reliable system that scales.
If you're considering a text-to-SQL project, don't start with the LLM. Start with your semantic layer. Define your metrics. Govern your dimensions. Explicit your relationships. Build your semantic layer first, and the text-to-SQL system will work.
For organizations evaluating managed alternatives to Looker, Tableau, and Power BI—or for teams embedding self-serve analytics into their products—D23 provides a managed Apache Superset platform with semantic layer governance built in. You get text-to-SQL that actually works, self-serve BI that's reliable, and embedded analytics that scale, all without building the infrastructure yourself.
The future of analytics isn't about smarter LLMs. It's about better semantic layers. Get that right, and everything else falls into place.