Complete guide to migrating from Looker on AWS to Apache Superset. Learn architecture, data mapping, and cost savings without vendor lock-in.
Moving from Looker to Apache Superset represents a significant shift in how your organization approaches business intelligence. Unlike a simple platform upgrade, this migration involves rethinking your BI architecture, reconnecting data sources, and redefining user workflows. For teams running Looker on AWS, the good news is that Superset's architecture aligns well with cloud-native deployments, and the underlying data warehouse connections remain largely unchanged.
The primary motivation for this migration typically centers on three factors: cost reduction, operational independence, and architectural flexibility. Looker's licensing model charges per user seat and adds platform fees that grow with your organization. Apache Superset, by contrast, is open source under the Apache License 2.0: there are no per-seat license fees, and you control deployment and scaling entirely. When you host Superset on AWS using the same warehouse as your current Looker instance, you're essentially replacing the visualization and querying layer while keeping your data infrastructure intact.
This guide walks through the complete migration process, from assessment through cutover, with practical guidance specific to AWS deployments. We'll cover data mapping, dashboard recreation, user onboarding, and how to leverage D23's managed Apache Superset platform to eliminate operational burden if you prefer not to manage infrastructure directly.
Before moving a single dashboard, you need a complete inventory of your Looker environment. This assessment determines migration complexity, timeline, and resource requirements.
Start by cataloging everything in your Looker instance:
This inventory serves two purposes: it quantifies the work ahead, and it identifies which assets you'll need to recreate versus which you can retire. Many organizations discover that 20-30% of their Looker dashboards see minimal traffic and can be archived rather than migrated.
Your data warehouse connection is the foundation of both Looker and Superset. Verify that Apache Superset supports your warehouse. Superset connects through SQLAlchemy dialects and Python database drivers, and its documentation lists supported connections for PostgreSQL, MySQL, Snowflake, BigQuery, Redshift, Athena, and dozens of others. Most of these drivers are separate packages that you install into your Superset image. If you're running Looker against Redshift, PostgreSQL, or Snowflake on AWS, you're in excellent shape: all three have mature, widely used drivers.
Test connectivity from a test Superset instance to your warehouse using the same credentials and network configuration. This validates that your AWS security groups, VPC routing, and IAM policies will work with Superset before you commit to the migration.
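Before testing credentials, it can help to confirm plain network reachability from the host or container where Superset will run. The following is a minimal sketch using only the Python standard library; the function name `can_reach` is illustrative and not part of Superset.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


# Example call (hypothetical endpoint; 5439 is the Redshift default port,
# PostgreSQL typically listens on 5432):
# can_reach("example-cluster.abc123.us-east-1.redshift.amazonaws.com", 5439)
```

A `False` result usually points at security groups, VPC routing, or DNS rather than credentials, which narrows down where to look before involving the database driver.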
LookML is Looker's semantic modeling language. It defines how raw tables map to business dimensions, measures, and explores. Superset doesn't have a direct LookML equivalent, but it offers multiple paths to replicate this functionality:
For most migrations, a combination of database views and Superset datasets provides the best balance of maintainability and performance. You're not rewriting LookML one-to-one; you're translating business logic into SQL views and Superset dataset definitions.
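As a rough illustration of that translation, the sketch below turns a simplified, hand-written list of dimension definitions into a `CREATE VIEW` statement, while measures are kept as aggregate expressions to be entered as Superset dataset metrics. The `Field` class, the view name, and the sample definitions are hypothetical; this does not parse real LookML files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str  # column alias exposed to Superset
    sql: str   # SQL expression, similar to a LookML `sql:` parameter


def build_view_sql(view_name: str, source: str, dimensions: list[Field]) -> str:
    """Render dimensions as the select list of a database view."""
    select_list = ",\n".join(f"    {d.sql} AS {d.name}" for d in dimensions)
    return f"CREATE VIEW {view_name} AS\nSELECT\n{select_list}\nFROM {source};"


# Hypothetical definitions transcribed by hand from a LookML view.
dimensions = [
    Field("order_id", "o.order_id"),
    Field("order_month", "DATE_TRUNC('month', o.order_date)"),
    Field("total_amount", "o.total_amount"),
]

# Measures are not baked into the view; they become dataset metrics in Superset.
metrics = {
    "total_revenue": "SUM(total_amount)",
    "order_count": "COUNT(DISTINCT order_id)",
}

view_sql = build_view_sql("analytics.orders_base", "orders o", dimensions)
print(view_sql)
```

Keeping aggregates out of the view lets Superset apply them at query time against whatever grouping a chart requests.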
Superset's architecture differs from Looker's, particularly in how it separates the metadata database from the application server. Understanding this design helps you plan a deployment that scales with your needs.
A production Superset deployment on AWS consists of:
This architecture is more distributed than Looker's, but it's also more flexible. You can scale application servers independently from query processing, and you can deploy Superset across multiple availability zones for high availability.
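To make the pieces concrete, here is a minimal `superset_config.py` sketch, assuming the usual components of a metadata database, a Redis cache, and Celery workers. The environment variable names and default values are placeholders chosen for this example; the configuration keys themselves (`SQLALCHEMY_DATABASE_URI`, `CACHE_CONFIG`, `DATA_CACHE_CONFIG`, `CELERY_CONFIG`) are standard Superset settings.

```python
import os

# Metadata database, for example RDS PostgreSQL (placeholder default).
SQLALCHEMY_DATABASE_URI = os.environ.get(
    "SUPERSET_METADATA_DB_URI",
    "postgresql+psycopg2://superset:superset@localhost:5432/superset",
)

# Must be a long random value in any real deployment.
SECRET_KEY = os.environ.get("SUPERSET_SECRET_KEY", "change-me-before-deploying")

# Redis endpoint, for example ElastiCache (placeholder default).
REDIS_URL = os.environ.get("SUPERSET_REDIS_URL", "redis://localhost:6379")

# Cache for Superset metadata.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_meta_",
    "CACHE_REDIS_URL": f"{REDIS_URL}/0",
}

# Separate cache for chart query results.
DATA_CACHE_CONFIG = {
    **CACHE_CONFIG,
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": f"{REDIS_URL}/1",
}


class CeleryConfig:
    """Celery settings for asynchronous queries run by worker containers."""

    broker_url = f"{REDIS_URL}/2"
    result_backend = f"{REDIS_URL}/3"
    imports = ("superset.sql_lab",)


CELERY_CONFIG = CeleryConfig
```

Reading endpoints from the environment keeps the same image usable across staging and production task definitions.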
You have three main options for running Superset on AWS:
Self-managed on ECS/EKS: You build and maintain Docker images, manage RDS instances, configure Celery workers, and handle scaling policies. This gives you complete control but requires DevOps expertise. The official guide for deploying Superset on AWS ECS with Terraform provides a solid starting point, including custom image building and ECR integration.
Self-managed on EC2: You run Superset directly on EC2 instances with systemd or similar process managers. This is simpler than containerized deployment but less cloud-native and harder to scale horizontally.
Managed Superset platform: Services like D23 handle infrastructure, scaling, security patching, and backups, letting your team focus on data work rather than platform operations. This is particularly valuable if your team lacks DevOps bandwidth or wants to avoid the operational overhead of managing a BI platform.
For this migration guide, we'll assume a self-managed ECS deployment, as it represents the most common choice for teams with existing AWS infrastructure. However, the data migration and dashboard recreation steps apply regardless of deployment model.
When deploying Superset on AWS, ensure:
The transition from Looker's semantic layer to Superset's dataset model requires careful planning. This is where you decide how much of your LookML logic to preserve versus simplify.
In Looker, an "explore" is a starting point for data analysis, combining a base view with related dimensions and measures. In Superset, the equivalent is a dataset—a SQL query (or table reference) plus a set of defined columns with metadata like data type, aggregation options, and formatting.
For each Looker explore you're migrating:
For example, if Looker has an explore called "orders" based on a view that joins orders, customers, and products tables, you'd create a Superset dataset with a SQL query like:
```sql
SELECT
    o.order_id,
    o.order_date,
    o.total_amount,
    c.customer_name,
    c.customer_segment,
    p.product_category,
    p.product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
```

Then, in Superset, you'd define columns for each field, set appropriate data types, and configure which columns are filterable, groupable, or aggregatable.
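When a dataset is built from joins like the one above, it is worth checking that the joins neither duplicate nor drop rows, since either would skew totals relative to Looker. The sketch below runs that check against an in-memory SQLite database with a few made-up rows; the sample data and helper names are hypothetical, and in practice you would run the same two counts against your warehouse.

```python
import sqlite3

DATASET_SQL = """
SELECT
    o.order_id, o.order_date, o.total_amount,
    c.customer_name, c.customer_segment,
    p.product_category, p.product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
"""


def build_sample_db() -> sqlite3.Connection:
    """Create tiny stand-in tables so the check can run anywhere."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT, customer_segment TEXT);
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_category TEXT, product_name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, total_amount REAL,
                             customer_id INTEGER, product_id INTEGER);
        INSERT INTO customers VALUES (1, 'Acme', 'Enterprise'), (2, 'Globex', 'SMB');
        INSERT INTO products VALUES (10, 'Hardware', 'Widget'), (11, 'Software', 'Gadget');
        INSERT INTO orders VALUES
            (100, '2024-01-05', 250.0, 1, 10),
            (101, '2024-01-09', 90.0, 2, 11),
            (102, '2024-02-01', 40.0, 1, 11);
        """
    )
    return conn


def row_counts(conn: sqlite3.Connection) -> tuple[int, int]:
    """Return (rows in the base table, rows produced by the dataset query)."""
    base = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    joined = conn.execute(f"SELECT COUNT(*) FROM ({DATASET_SQL}) AS dataset").fetchone()[0]
    return base, joined


base_rows, joined_rows = row_counts(build_sample_db())
print(f"orders: {base_rows}, dataset rows: {joined_rows}")
```

If the two counts differ, a join key is either non-unique (fan-out) or missing matches (dropped rows), and the dataset should be fixed before any charts are built on it.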
If your LookML includes complex derived tables or frequently-reused calculations, create database views in your warehouse rather than embedding them in every Superset query. This approach:
For instance, if Looker has a derived table that calculates monthly cohort retention, create a materialized view in Redshift or PostgreSQL:
```sql
CREATE MATERIALIZED VIEW cohort_retention AS
SELECT
    DATE_TRUNC('month', first_order_date) AS cohort_month,
    DATE_TRUNC('month', order_date) AS order_month,
    COUNT(DISTINCT customer_id) AS customers
FROM customer_orders
GROUP BY 1, 2;
```

Then reference this view directly in your Superset dataset, keeping the definition simple and maintainable.
Looker's row-level security (RLS) capabilities let you restrict data access based on user attributes. Superset supports RLS through a combination of:
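When migrating RLS rules from Looker, map each rule to a Superset RLS configuration. For example, if Looker restricts regional managers to see only their region's data, create a Superset RLS rule that looks up the current user's region:
Clause: region IN (SELECT region FROM user_regions WHERE user_id = {{ current_user_id() }})
This assumes a mapping table in your warehouse (here called user_regions, a hypothetical name) that links each Superset user ID to a region, and that Jinja templating is enabled through the ENABLE_TEMPLATE_PROCESSING feature flag. The current_user_id() macro returns the user's numeric ID, so the lookup table is what ties a user to a region. With both in place, the rule automatically filters data for each user.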
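Where a lookup by user is not available, another common pattern is one static rule per regional role. The sketch below builds such rule definitions as plain dictionaries. The field names follow the shape of Superset's row-level security REST resource as I understand it (`name`, `filter_type`, `tables`, `roles`, `group_key`, `clause`); treat them as an assumption to verify against your version's API documentation. The IDs are hypothetical.

```python
def region_rls_rule(region: str, table_ids: list[int], role_id: int) -> dict:
    """Build one static RLS rule definition restricting a role to a region."""
    # Double any single quotes so the region value is a valid SQL literal.
    escaped = region.replace("'", "''")
    return {
        "name": f"region_{region.lower().replace(' ', '_')}",
        "filter_type": "Regular",
        "tables": table_ids,    # IDs of the Superset datasets to filter
        "roles": [role_id],     # ID of the Superset role the rule applies to
        "group_key": "region",  # rules sharing a group key are combined with OR
        "clause": f"region = '{escaped}'",
    }


# Hypothetical dataset ID 12 and role IDs 7 and 8.
rules = [
    region_rls_rule("EMEA", [12], 7),
    region_rls_rule("North America", [12], 8),
]
for rule in rules:
    print(rule["name"], "->", rule["clause"])
```

Generating rules from a list keeps naming consistent and makes it easy to review every clause before anything is created in Superset.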
Dashboard recreation is the most time-intensive part of migration. However, the process is straightforward once you understand Superset's chart types and configuration options.
Not all dashboards are worth migrating. Evaluate each dashboard based on:
Create a migration backlog, prioritizing high-usage, critical dashboards. Plan to recreate 60-70% of your Looker dashboards; the remaining 30-40% are often lightly-used or outdated and can be archived.
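A backlog like this can be drafted mechanically from usage data. The sketch below is one possible approach, assuming you have exported per-dashboard view counts and flagged business-critical dashboards by hand; the field names, the 90-day window, and the view threshold are all illustrative choices, not values from Looker or Superset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DashboardUsage:
    title: str
    views_last_90_days: int
    business_critical: bool


def build_backlog(dashboards: list[DashboardUsage], min_views: int = 10):
    """Split dashboards into an ordered migration list and an archive list."""
    migrate, archive = [], []
    for dashboard in dashboards:
        keep = dashboard.business_critical or dashboard.views_last_90_days >= min_views
        (migrate if keep else archive).append(dashboard)
    # Critical dashboards first, then by descending usage.
    migrate.sort(key=lambda d: (not d.business_critical, -d.views_last_90_days))
    return migrate, archive


# Hypothetical inventory rows.
inventory = [
    DashboardUsage("Weekly Sales", 420, False),
    DashboardUsage("Board KPIs", 35, True),
    DashboardUsage("2021 Campaign Recap", 2, False),
]
to_migrate, to_archive = build_backlog(inventory)
print([d.title for d in to_migrate], [d.title for d in to_archive])
```

The output is only a starting point; dashboard owners should still confirm anything placed on the archive list.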
Superset offers a rich set of visualization types, though not all map directly to Looker's visualizations. Here's a quick mapping:
| Looker Visualization | Superset Equivalent |
|---|---|
| Table | Table |
| Number/Single Value | Big Number |
| Bar Chart | Bar Chart |
| Line Chart | Line Chart |
| Scatter Plot | Scatter Plot |
| Map | Map (Deck.gl) |
| Funnel | Funnel Chart |
| Pivot Table | Pivot Table |
| Gauge | Gauge Chart |
| Custom Visualization | Custom Plugin (requires development) |
Most standard charts translate directly. Custom Looker visualizations require either finding a Superset equivalent or developing a custom Superset plugin.
For each dashboard you're migrating:
This process typically takes 30-60 minutes per moderately complex dashboard. Simple dashboards with 3-5 charts may take 15 minutes; complex dashboards with 15+ charts and intricate filtering may take 2+ hours.
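These per-dashboard figures can be rolled up into a rough effort range for the whole backlog. The sketch below encodes the estimates above as chart-count buckets; the bucket boundaries and the 180-minute upper bound for complex dashboards are assumptions made for the example, since the figures above only give "2+ hours" for that case.

```python
def estimate_minutes(chart_count: int) -> tuple[int, int]:
    """Return a (low, high) estimate in minutes for recreating one dashboard."""
    if chart_count <= 5:
        return (15, 15)    # simple dashboard
    if chart_count >= 15:
        return (120, 180)  # complex dashboard; upper bound is an assumption
    return (30, 60)        # moderately complex dashboard


def estimate_backlog(chart_counts: list[int]) -> tuple[int, int]:
    """Sum the low and high estimates across all dashboards."""
    ranges = [estimate_minutes(count) for count in chart_counts]
    return (sum(low for low, _ in ranges), sum(high for _, high in ranges))


# Hypothetical backlog: three dashboards with 4, 8, and 20 charts.
low, high = estimate_backlog([4, 8, 20])
print(f"Estimated effort: {low / 60:.1f} to {high / 60:.1f} hours")
```

Chart count is a crude proxy: intricate filtering or custom visualizations can push a small dashboard into the complex bucket.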
While recreating dashboards, take advantage of Superset capabilities that may exceed Looker's functionality:
Metadata includes user accounts, permissions, dashboard definitions, and dataset configurations. Superset stores all metadata in its backend database (RDS PostgreSQL or MySQL).
Before migrating, back up your Superset metadata database following official best practices. This is critical before any migration or upgrade:
Use pg_dump or mysqldump:

```bash
pg_dump -h your-rds-endpoint.us-east-1.rds.amazonaws.com -U superset_user superset_db > superset_backup.sql
```

For a fresh Superset deployment, you won't import Looker's metadata directly. Instead, you'll recreate dashboards in Superset using the recreation workflow above. However, if you're upgrading an existing Superset instance, use the official upgrade documentation to migrate metadata safely.
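If backups are taken regularly, the pg_dump call can be wrapped so that each file gets a timestamped name. The sketch below only builds the argument list and leaves execution to the caller; the function name and file naming scheme are illustrative, while the options used (`--host`, `--username`, `--dbname`, `--format`, `--file`) are standard pg_dump flags.

```python
from datetime import datetime, timezone
from typing import Optional


def pg_dump_command(host: str, user: str, database: str,
                    now: Optional[datetime] = None) -> list[str]:
    """Build a pg_dump argument list that writes a timestamped custom-format dump."""
    now = now or datetime.now(timezone.utc)
    outfile = f"{database}_{now:%Y%m%dT%H%M%SZ}.dump"
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--dbname", database,
        "--format", "custom",  # compressed archive, restorable with pg_restore
        "--file", outfile,
    ]


command = pg_dump_command(
    "your-rds-endpoint.us-east-1.rds.amazonaws.com", "superset_user", "superset_db"
)
print(" ".join(command))
# To run it: subprocess.run(command, check=True), with the password supplied
# through the PGPASSWORD environment variable or a ~/.pgpass file.
```

Passing the arguments as a list rather than a shell string avoids quoting problems if a database or host name contains unusual characters.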
Superset uses role-based access control (RBAC) with predefined roles:
Map your Looker user roles to Superset roles:
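For user provisioning, integrate Superset with your identity provider (Okta, Azure AD, Google Workspace). Superset's authentication layer, Flask-AppBuilder, supports OAuth and LDAP through configuration; SAML may need a custom security manager or an authenticating proxy, depending on your Superset version. Either way, users log in with existing credentials, and roles can be assigned automatically based on group membership.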
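Group-based role assignment comes down to a lookup table from identity provider groups to Superset roles. The sketch below shows that logic in plain Python. The dictionary has the same shape as Flask-AppBuilder's `AUTH_ROLES_MAPPING` setting (group name to list of role names), which is typically used together with `AUTH_ROLES_SYNC_AT_LOGIN` and `AUTH_USER_REGISTRATION_ROLE`. The group names are hypothetical; Admin, Alpha, and Gamma are built-in Superset roles.

```python
# Same shape as Flask-AppBuilder's AUTH_ROLES_MAPPING setting.
ROLES_MAPPING = {
    "bi-admins": ["Admin"],
    "analytics-engineers": ["Alpha"],
    "business-users": ["Gamma"],
}

# Role given to users whose groups match nothing in the mapping.
DEFAULT_ROLE = "Gamma"


def roles_for_groups(groups: list[str]) -> list[str]:
    """Resolve identity provider groups to a sorted, de-duplicated role list."""
    roles = {role for group in groups for role in ROLES_MAPPING.get(group, [])}
    return sorted(roles) if roles else [DEFAULT_ROLE]


print(roles_for_groups(["bi-admins", "business-users"]))
print(roles_for_groups(["contractors"]))
```

Writing the mapping down like this during planning also gives you a reviewable record of which Looker roles were folded into which Superset roles.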
The transition from Looker to Superset requires careful planning to minimize disruption and maintain data access during the switch.
Run both systems in parallel for 2-4 weeks before decommissioning Looker. This allows:
During this period, clearly communicate which dashboards are available in Superset and which remain in Looker. Update any documentation, bookmarks, or embedded links to point to Superset.
Before full cutover, validate that:
Create a test plan document and have a representative from each user group sign off on validation before proceeding to cutover.
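Part of that sign-off can be automated by comparing headline numbers exported from both tools. The sketch below is a minimal parity check, assuming you have collected the same named metrics from a Looker dashboard and its Superset counterpart; the metric names, values, and tolerance are illustrative.

```python
import math


def compare_metrics(looker: dict[str, float], superset: dict[str, float],
                    rel_tol: float = 1e-6) -> list[str]:
    """Return a description of each metric that is missing or does not match."""
    problems = []
    for name in sorted(set(looker) | set(superset)):
        if name not in superset:
            problems.append(f"{name}: missing in Superset")
        elif name not in looker:
            problems.append(f"{name}: missing in Looker")
        elif not math.isclose(looker[name], superset[name], rel_tol=rel_tol, abs_tol=1e-9):
            problems.append(f"{name}: Looker={looker[name]} Superset={superset[name]}")
    return problems


# Hypothetical values copied from the two dashboards for the same date range.
looker_values = {"revenue": 1000.0, "orders": 42}
superset_values = {"revenue": 1000.0, "orders": 41}
print(compare_metrics(looker_values, superset_values))
```

An empty list means the sampled metrics agree within tolerance; any entry is a concrete item to resolve before cutover.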
Once you're confident in Superset, schedule Looker decommissioning:
Keep Looker available for admins for 1-2 weeks post-cutover as a safety net, in case you need to reference old dashboards or troubleshoot issues.
One of the primary benefits of migrating to Superset is cost reduction. However, achieving those savings requires thoughtful optimization.
Superset's performance depends on your warehouse's performance. Optimize queries by:
For a mid-market organization with 100 users, here's a typical cost breakdown:
Looker on AWS:
Superset on AWS:
For teams without in-house DevOps expertise, D23's managed Superset service typically costs $5,000-15,000/month depending on scale, which still represents 50-70% savings versus Looker while eliminating operational overhead.
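To check a savings claim against your own numbers, the calculation is a simple percentage difference. The sketch below uses placeholder monthly figures that are purely hypothetical; substitute your actual Looker invoice and your projected Superset costs, including engineering time if you self-manage.

```python
def savings_percent(current_monthly: float, new_monthly: float) -> float:
    """Percentage saved when moving from the current cost to the new cost."""
    if current_monthly <= 0:
        raise ValueError("current_monthly must be positive")
    return (current_monthly - new_monthly) / current_monthly * 100


# Hypothetical monthly costs in USD, for illustration only.
looker_monthly = 30_000
superset_monthly = 10_000

pct = savings_percent(looker_monthly, superset_monthly)
annual = (looker_monthly - superset_monthly) * 12
print(f"Savings: {pct:.1f}% per month, ${annual:,} per year")
```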
Unlike Looker, Superset's open-source nature and modern architecture make it easier to integrate AI and advanced analytics capabilities.
D23 integrates AI-powered text-to-SQL capabilities that let users ask questions in plain English and automatically generate SQL queries. This feature:
Text-to-SQL works best when your datasets are well-documented with clear column names and descriptions. During migration, invest time in creating meaningful dataset metadata.
Superset's comprehensive REST API enables:
If you were embedding Looker dashboards in your product, Superset's API provides equivalent (and often superior) flexibility.
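For embedded dashboards, the backend of your product typically requests a short-lived guest token from Superset and hands it to the embedding frontend. The sketch below builds that request with the standard library without sending it. It assumes the embedded dashboards feature is enabled and that you already hold an access token from Superset's login endpoint; the base URL, dashboard UUID, and username are placeholders, and depending on your configuration a CSRF token header may also be required.

```python
import json
import urllib.request


def guest_token_request(base_url: str, access_token: str, dashboard_uuid: str,
                        username: str, rls_clauses: tuple[str, ...] = ()) -> urllib.request.Request:
    """Build (but do not send) a POST request for an embedded-dashboard guest token."""
    payload = {
        "user": {"username": username, "first_name": "", "last_name": ""},
        "resources": [{"type": "dashboard", "id": dashboard_uuid}],
        # Optional row-level filters applied to this guest session.
        "rls": [{"clause": clause} for clause in rls_clauses],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/security/guest_token/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration.
request = guest_token_request(
    "https://superset.example.com/",
    "ACCESS_TOKEN",
    "00000000-0000-0000-0000-000000000000",
    "embedded_viewer",
    rls_clauses=("region = 'EMEA'",),
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Keeping token creation on the server side means the Superset credentials never reach the browser.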
Superset can be integrated with Model Context Protocol (MCP) servers, enabling:
A successful migration requires more than technical preparation; your users need to understand and embrace Superset.
Develop documentation covering:
Provide both written guides and video walkthroughs. Record screen captures showing common workflows.
Hold live training sessions for different user groups:
Schedule sessions at times convenient for different time zones and departments. Record sessions for asynchronous viewing.
Set up clear channels for users to ask questions and report issues:
Respond quickly to issues during the parallel running period to build confidence in Superset.
After cutover, your focus shifts to maintaining and optimizing Superset.
Set up monitoring for:
Use CloudWatch for AWS-native monitoring and consider tools like Datadog or New Relic for comprehensive observability.
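A simple availability probe is a useful complement to those tools. The sketch below polls Superset's `/health` endpoint and classifies the result; the latency threshold and the status labels are assumptions for this example, and how you publish the result (a CloudWatch custom metric, an alert, a log line) is left open.

```python
import http.client
import time
import urllib.request
from typing import Optional


def classify_health(status_code: Optional[int], latency_ms: float,
                    max_latency_ms: float = 2000.0) -> str:
    """Map a probe result to 'ok', 'slow', or 'down'."""
    if status_code != 200:
        return "down"
    return "ok" if latency_ms <= max_latency_ms else "slow"


def probe(base_url: str, timeout_seconds: float = 5.0) -> str:
    """Request the /health endpoint once and classify the outcome."""
    url = f"{base_url.rstrip('/')}/health"
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            status_code = response.status
    except (OSError, http.client.HTTPException):
        # Network errors, timeouts, and non-2xx responses all count as down.
        status_code = None
    latency_ms = (time.monotonic() - started) * 1000
    return classify_health(status_code, latency_ms)


# Example call (hypothetical URL):
# probe("https://superset.example.com")
```

Running the probe from outside the VPC as well as inside helps distinguish an application outage from a load balancer or DNS problem.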
Migrating from Looker on AWS to Apache Superset is a significant undertaking, but it's entirely achievable with proper planning and execution. The migration path is clear: assess your current setup, design your Superset architecture, recreate dashboards, validate thoroughly, and cut over systematically.
The benefits are substantial. You'll reduce BI platform costs by 50-70%, eliminate vendor lock-in, gain architectural flexibility, and access modern capabilities like AI-powered text-to-SQL and embedded analytics. Your data warehouse connection remains stable throughout—you're replacing the visualization layer, not your data infrastructure.
For organizations seeking to minimize operational burden, D23's managed Apache Superset platform provides a middle ground: all the benefits of Superset with the operational simplicity of a managed service. Whether you choose self-managed or managed deployment, the core migration process remains the same.
Start with your assessment, build your migration backlog, and tackle dashboards in priority order. Involve your users early, train thoroughly, and run parallel systems long enough to build confidence. With this approach, you'll successfully transition to Superset while maintaining data access and user satisfaction throughout the process.
The migration is an opportunity to reassess your BI strategy, eliminate unused dashboards, and establish better data governance practices. Use this moment to build a more efficient, flexible, and cost-effective analytics platform for your organization.