
Apache Superset | 18 Apr 2026

Migrating from Looker on AWS to Apache Superset

Complete guide to migrating from Looker on AWS to Apache Superset. Learn architecture, data mapping, and cost savings without vendor lock-in.

D23 Team

15 minutes read

Understanding the Migration Landscape

Moving from Looker to Apache Superset represents a significant shift in how your organization approaches business intelligence. Unlike a simple platform upgrade, this migration involves rethinking your BI architecture, reconnecting data sources, and redefining user workflows. For teams running Looker on AWS, the good news is that Superset's architecture aligns well with cloud-native deployments, and the underlying data warehouse connections remain largely unchanged.

The primary motivation for this migration typically centers on three factors: cost reduction, operational independence, and architectural flexibility. Looker's licensing model charges per user seat and enforces platform overhead that scales with your organization. Apache Superset, by contrast, operates on an open-source model where you control deployment, scaling, and licensing entirely. When you host Superset on AWS using the same warehouse as your current Looker instance, you're essentially replacing the visualization and querying layer while keeping your data infrastructure intact.

This guide walks through the complete migration process, from assessment through cutover, with practical guidance specific to AWS deployments. We'll cover data mapping, dashboard recreation, user onboarding, and how to leverage D23's managed Apache Superset platform to eliminate operational burden if you prefer not to manage infrastructure directly.

Assessing Your Current Looker Setup

Before moving a single dashboard, you need a complete inventory of your Looker environment. This assessment determines migration complexity, timeline, and resource requirements.

Documenting Your Looker Instance

Start by cataloging everything in your Looker instance:

Total dashboard count and average complexity (number of tiles, filters, drill-down paths)
Explore definitions and their underlying views and dimensions
Scheduled reports and alerts that run on automation
User counts broken down by role (viewers, explorers, developers)
Custom code including derived tables, liquid parameters, and custom fields
Access controls and row-level security (RLS) rules
Connected data sources including databases, APIs, and external systems
Custom visualizations and any Looker marketplace extensions

This inventory serves two purposes: it quantifies the work ahead, and it identifies which assets you'll need to recreate versus which you can retire. Many organizations discover that 20-30% of their Looker dashboards see minimal traffic and can be archived rather than migrated.

Evaluating Data Warehouse Compatibility

Your data warehouse connection is the foundation of both Looker and Superset. Verify that Apache Superset supports your warehouse. Superset connects through SQLAlchemy dialects and works with PostgreSQL, MySQL, Snowflake, BigQuery, Redshift, Athena, and dozens of others, though most drivers are separate Python packages that you install into your Superset image. If you're running Looker against Redshift, PostgreSQL, or Snowflake on AWS, you're in excellent shape: all three have battle-tested drivers.

Test connectivity from a test Superset instance to your warehouse using the same credentials and network configuration. This validates that your AWS security groups, VPC routing, and IAM policies will work with Superset before you commit to the migration.
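A quick pre-flight check can be scripted before pointing a test instance at the warehouse. The sketch below uses only the Python standard library to build a SQLAlchemy-style connection URI of the kind Superset accepts and to check whether the warehouse port is reachable from the current host. The function names are illustrative, and a scheme such as `redshift+psycopg2` assumes the matching dialect package is installed in your Superset image.

```python
import socket
from urllib.parse import quote_plus


def build_sqlalchemy_uri(scheme, user, password, host, port, database):
    """Build a SQLAlchemy URI, URL-encoding credentials that may contain symbols."""
    return (
        f"{scheme}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
```

Running `port_is_reachable` from inside the subnet where Superset will live exercises the same security groups and routing that the real deployment depends on. A successful TCP connection does not prove the credentials work, so follow it with a real test query from Superset's database connection form.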

Understanding Your LookML Layer

LookML is Looker's semantic modeling language. It defines how raw tables map to business dimensions, measures, and explores. Superset doesn't have a direct LookML equivalent, but it offers multiple paths to replicate this functionality:

Direct SQL in Superset charts: Write SQL queries directly, including CTEs and complex joins
Database views: Create materialized or standard views in your warehouse that encapsulate LookML logic
Semantic layer integration: Use Cube as an open-source semantic layer between Superset and your warehouse, which provides a middle ground between raw SQL and full LookML
Superset's native dataset abstraction: Build Superset datasets that function similarly to LookML explores

For most migrations, a combination of database views and Superset datasets provides the best balance of maintainability and performance. You're not rewriting LookML one-to-one; you're translating business logic into SQL views and Superset dataset definitions.
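To make the translation concrete, the following sketch renders a warehouse view from a hand-written summary of a LookML view's dimensions and measures. It is not a LookML parser: the dictionary layout and the function name `view_sql_from_model` are assumptions made for this example, and identifiers are not escaped, so it should only be fed trusted input.

```python
def view_sql_from_model(view_name, source_table, dimensions, measures):
    """Render a CREATE VIEW statement from a simplified model description.

    dimensions: mapping of output column name -> SQL expression
    measures:   mapping of output column name -> aggregate SQL expression
    """
    select_items = [f"  {expr} AS {name}" for name, expr in dimensions.items()]
    select_items += [f"  {expr} AS {name}" for name, expr in measures.items()]
    lines = [
        f"CREATE VIEW {view_name} AS",
        "SELECT",
        ",\n".join(select_items),
        f"FROM {source_table}",
    ]
    if dimensions and measures:
        # Group by the dimension expressions rather than their aliases,
        # which is accepted by most warehouses.
        lines.append("GROUP BY " + ", ".join(dimensions.values()))
    return "\n".join(lines) + ";"


print(view_sql_from_model(
    "orders_daily",
    "orders",
    {"order_day": "DATE_TRUNC('day', order_date)", "status": "status"},
    {"order_count": "COUNT(*)", "revenue": "SUM(total_amount)"},
))
```

The generated view can then be registered as a Superset dataset, with further metrics defined on the dataset itself where that is more convenient than baking them into SQL.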

Designing Your Superset Architecture on AWS

Superset's architecture differs from Looker's, particularly in how it separates the metadata database from the application server. Understanding this design helps you plan a deployment that scales with your needs.

Core Architecture Components

A production Superset deployment on AWS consists of:

Metadata database: A PostgreSQL or MySQL instance (typically RDS) that stores dashboard definitions, user accounts, dataset configurations, and saved queries (cached query results normally live in Redis or another cache backend, not here)
Application servers: Stateless Superset containers running on ECS, EKS, or similar, handling the web UI and API requests
Asynchronous task queue: Celery workers (with Redis backend) that execute long-running queries in the background
Data warehouse connection: Your existing Redshift, RDS PostgreSQL, Snowflake, or Athena instance
Object storage: S3 for storing exported reports, cached query results, and backup metadata

This architecture is more distributed than Looker's, but it's also more flexible. You can scale application servers independently from query processing, and you can deploy Superset across multiple availability zones for high availability.
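In a self-managed deployment, the metadata database and the task queue are wired together in `superset_config.py`. The fragment below is a minimal sketch rather than a complete production configuration: the hostnames and the environment variable names `SUPERSET_METADATA_URI` and `SUPERSET_REDIS_URL` are placeholders invented for this example, while `SQLALCHEMY_DATABASE_URI` and `CELERY_CONFIG` are the setting names Superset reads.

```python
# superset_config.py (fragment)
import os

# Metadata database, for example an RDS PostgreSQL instance.
# In production, inject this value from a secret store instead of hard-coding it.
SQLALCHEMY_DATABASE_URI = os.environ.get(
    "SUPERSET_METADATA_URI",
    "postgresql+psycopg2://superset:change-me@metadata-db.example.internal:5432/superset",
)

REDIS_URL = os.environ.get("SUPERSET_REDIS_URL", "redis://redis.example.internal:6379")


class CeleryConfig:
    """Celery settings so SQL Lab queries can run on background workers."""

    broker_url = f"{REDIS_URL}/0"        # queue of pending tasks
    result_backend = f"{REDIS_URL}/1"    # task state and results
    imports = ("superset.sql_lab",)      # task module the workers must load
    worker_prefetch_multiplier = 1       # one task at a time per worker process
    task_acks_late = True                # re-queue a task if its worker dies


CELERY_CONFIG = CeleryConfig
```

Because the application containers hold no state of their own, every ECS task can load the same file and be scaled out or replaced freely.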

Choosing a Deployment Model

You have three main options for running Superset on AWS:

Self-managed on ECS/EKS: You build and maintain Docker images, manage RDS instances, configure Celery workers, and handle scaling policies. This gives you complete control but requires DevOps expertise. The official guide for deploying Superset on AWS ECS with Terraform provides a solid starting point, including custom image building and ECR integration.

Self-managed on EC2: You run Superset directly on EC2 instances with systemd or similar process managers. This is simpler than containerized deployment but less cloud-native and harder to scale horizontally.

Managed Superset platform: Services like D23 handle infrastructure, scaling, security patching, and backups, letting your team focus on data work rather than platform operations. This is particularly valuable if your team lacks DevOps bandwidth or wants to avoid the operational overhead of managing a BI platform.

For this migration guide, we'll assume a self-managed ECS deployment, as it represents the most common choice for teams with existing AWS infrastructure. However, the data migration and dashboard recreation steps apply regardless of deployment model.

Networking and Security Considerations

When deploying Superset on AWS, ensure:

VPC placement: Run Superset in the same VPC as your data warehouse (or use VPC peering) to minimize latency and avoid data exfiltration concerns
Security groups: Allow inbound HTTPS traffic on port 443 from your users' networks and outbound database access on your warehouse's port (typically 5432 for PostgreSQL, 3306 for MySQL, 5439 for Redshift)
IAM roles: Grant Superset's ECS task role only the permissions it needs, such as reading its credentials from Secrets Manager and, if you use IAM database authentication, connecting to the RDS metadata database and your data warehouse
Encryption in transit: Use TLS for all connections, including database connections and internal service communication
Secrets management: Store database credentials in AWS Secrets Manager, not in environment variables or config files
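As an illustration of the secrets management point, the sketch below reads database credentials from a JSON secret and assembles the metadata connection URI. The client is passed in so the logic can be exercised without AWS access; in a real deployment it would be `boto3.client("secretsmanager")`. The field names (`username`, `password`, `host`, `port`, `dbname`) follow the layout Secrets Manager uses for RDS credentials, but treat them as an assumption and check them against your own secret.

```python
import json
from urllib.parse import quote_plus


def load_db_credentials(secrets_client, secret_id):
    """Fetch a JSON secret and return it as a dict.

    secrets_client is expected to behave like boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


def metadata_uri_from_secret(secret):
    """Assemble a PostgreSQL SQLAlchemy URI from the secret's fields."""
    return (
        f"postgresql+psycopg2://{quote_plus(secret['username'])}:"
        f"{quote_plus(secret['password'])}"
        f"@{secret['host']}:{secret['port']}/{secret['dbname']}"
    )
```

In `superset_config.py` this could be used as `SQLALCHEMY_DATABASE_URI = metadata_uri_from_secret(load_db_credentials(boto3.client("secretsmanager"), "superset/metadata-db"))`, where the secret name is hypothetical and the ECS task role needs `secretsmanager:GetSecretValue` on it.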

Preparing Your Data Sources and Datasets

The transition from Looker's semantic layer to Superset's dataset model requires careful planning. This is where you decide how much of your LookML logic to preserve versus simplify.

Mapping Looker Explores to Superset Datasets

In Looker, an "explore" is a starting point for data analysis, combining a base view with related dimensions and measures. In Superset, the equivalent is a dataset—a SQL query (or table reference) plus a set of defined columns with metadata like data type, aggregation options, and formatting.

For each Looker explore you're migrating:

Identify the base view: This is typically a table or materialized view in your warehouse
List all dimensions and measures: Document their names, data types, and any custom SQL or formatting
Note relationships and joins: Looker explores often combine multiple views; you'll need to express these as SQL joins in your Superset dataset query
Check for derived tables and custom fields: These may need to be recreated as database views or Superset-level calculated columns

For example, if Looker has an explore called "orders" based on a view that joins orders, customers, and products tables, you'd create a Superset dataset with a SQL query like:

SELECT
  o.order_id,
  o.order_date,
  o.total_amount,
  c.customer_name,
  c.customer_segment,
  p.product_category,
  p.product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id

Then, in Superset, you'd define columns for each field, set appropriate data types, and configure which columns are filterable, groupable, or aggregatable.

Creating Database Views for Reusable Logic

If your LookML includes complex derived tables or frequently-reused calculations, create database views in your warehouse rather than embedding them in every Superset query. This approach:

Centralizes business logic in one place
Makes it easier to update calculations across multiple dashboards
Improves query performance by pushing computation to the warehouse
Simplifies Superset dataset definitions

For instance, if Looker has a derived table that calculates monthly cohort retention, create a materialized view in Redshift or PostgreSQL:

CREATE MATERIALIZED VIEW cohort_retention AS
SELECT
  DATE_TRUNC('month', first_order_date) AS cohort_month,
  DATE_TRUNC('month', order_date) AS order_month,
  COUNT(DISTINCT customer_id) AS customers
FROM customer_orders
GROUP BY 1, 2;

Then reference this view directly in your Superset dataset, keeping the definition simple and maintainable.

Handling Row-Level Security (RLS)

Looker's RLS capabilities let you restrict data access based on user attributes. Superset supports RLS through a combination of:

Row-level security rules: Define SQL predicates that filter data based on logged-in user properties
Dataset-level filters: Apply automatic filters to specific datasets for certain user roles
Database-level security: Leverage your warehouse's native RLS if available (Snowflake, PostgreSQL, etc.)

When migrating RLS rules from Looker, map each rule to a Superset RLS configuration. For example, if Looker restricts regional managers to see only their region's data, create a Superset RLS rule:

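Clause: "region" IN (SELECT region FROM user_regions WHERE username = '{{ current_username() }}')

Here user_regions is a hypothetical mapping table in your warehouse that lists the regions each Superset username may see. Superset's built-in Jinja macros expose the logged-in user's ID, username, and email rather than arbitrary attributes such as region, so the user-to-region mapping has to live in your data (or in separate rules per role). Using Jinja in an RLS clause also requires the ENABLE_TEMPLATE_PROCESSING feature flag.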
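One way to sanity-check a per-user region restriction before configuring it in Superset is to simulate the predicate against sample data. The sketch below uses an in-memory SQLite database and a lookup table that maps usernames to regions; the table name `user_regions`, the sample rows, and the helper function are all made up for this example. The username is bound as a query parameter here, standing in for the value a Jinja macro would supply inside Superset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('EMEA', 100), ('APAC', 250), ('EMEA', 50);

    -- Hypothetical mapping table: which user may see which region.
    CREATE TABLE user_regions (username TEXT, region TEXT);
    INSERT INTO user_regions VALUES ('alice', 'EMEA'), ('bob', 'APAC');
    """
)


def rows_visible_to(username):
    """Apply the lookup-table predicate as a WHERE clause for one user."""
    query = """
        SELECT region, amount FROM sales
        WHERE region IN (
            SELECT region FROM user_regions WHERE username = ?
        )
        ORDER BY amount
    """
    return conn.execute(query, (username,)).fetchall()


print(rows_visible_to("alice"))  # [('EMEA', 50), ('EMEA', 100)]
print(rows_visible_to("bob"))    # [('APAC', 250)]
```

A user with no row in the mapping table sees nothing, which is usually the safer default when a rule is misconfigured.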

Migrating Dashboards and Charts

Dashboard recreation is the most time-intensive part of migration. However, the process is straightforward once you understand Superset's chart types and configuration options.

Assessing Dashboard Complexity

Not all dashboards are worth migrating. Evaluate each dashboard based on:

Usage frequency: Check Looker's usage logs to identify dashboards viewed monthly or more
Business criticality: Prioritize dashboards used for decision-making, reporting, or operational monitoring
Complexity: Estimate the effort to recreate, considering number of charts, filters, and drill-down interactions
Audience size: Focus on dashboards used by multiple teams or departments

Create a migration backlog, prioritizing high-usage, critical dashboards. Plan to recreate 60-70% of your Looker dashboards; the remaining 30-40% are often lightly-used or outdated and can be archived.
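The backlog can be drafted with a simple scoring pass over your dashboard inventory. The sketch below is one possible weighting of the four criteria above; the `Dashboard` fields, the weights, and the view threshold are illustrative assumptions to tune against your own usage data.

```python
from dataclasses import dataclass


@dataclass
class Dashboard:
    name: str
    monthly_views: int
    critical: bool
    chart_count: int
    teams: int


def priority_score(d):
    """Higher score means migrate earlier. The weights are illustrative only."""
    score = min(d.monthly_views, 500) / 100   # usage, capped so one outlier cannot dominate
    score += 5 if d.critical else 0           # business criticality
    score += min(d.teams, 5)                  # audience breadth
    score -= d.chart_count / 10               # recreation effort
    return score


def build_backlog(dashboards, min_monthly_views=1):
    """Split dashboards into a prioritized migration list and archive candidates."""
    keep = [d for d in dashboards if d.critical or d.monthly_views >= min_monthly_views]
    archive = [d for d in dashboards if d not in keep]
    keep.sort(key=priority_score, reverse=True)
    return keep, archive


inventory = [
    Dashboard("Ops queue", 120, False, 5, 2),
    Dashboard("Old campaign", 0, False, 12, 1),
    Dashboard("Exec KPIs", 400, True, 8, 4),
]
migrate, archive = build_backlog(inventory)
print([d.name for d in migrate])   # ['Exec KPIs', 'Ops queue']
print([d.name for d in archive])   # ['Old campaign']
```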

Understanding Superset's Chart Types

Superset offers a rich set of visualization types, though not all map directly to Looker's visualizations. Here's a quick mapping:

Looker Visualization	Superset Equivalent
Table	Table
Number/Single Value	Big Number
Bar Chart	Bar Chart
Line Chart	Line Chart
Scatter Plot	Scatter Plot
Map	Map (Deck.gl)
Funnel	Funnel Chart
Pivot Table	Pivot Table
Gauge	Gauge Chart
Custom Visualization	Custom Plugin (requires development)

Most standard charts translate directly. Custom Looker visualizations require either finding a Superset equivalent or developing a custom Superset plugin.

Dashboard Recreation Workflow

For each dashboard you're migrating:

Create a new Superset dashboard and note its name and purpose
Recreate each chart by:
- Selecting the appropriate dataset or writing a custom SQL query
- Choosing the visualization type
- Configuring dimensions (row/grouping columns) and measures (aggregated columns)
- Setting filters and drill-down interactions
Add dashboard-level filters that apply to multiple charts
Configure drill-down and cross-filtering interactions
Test all filters and interactions to ensure they work as expected
Compare visually with the original Looker dashboard to verify accuracy

This process typically takes 30-60 minutes per moderately complex dashboard. Simple dashboards with 3-5 charts may take 15 minutes; complex dashboards with 15+ charts and intricate filtering may take 2+ hours.
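These per-dashboard figures can be rolled up into a rough total for planning. The sketch below applies the ranges quoted above to a count of dashboards per tier; the 240-minute ceiling for complex dashboards is an assumption, since "2+ hours" has no stated upper bound, and the example counts are invented.

```python
# Per-dashboard effort in minutes as (low, high).
EFFORT_MINUTES = {
    "simple": (15, 15),
    "moderate": (30, 60),
    "complex": (120, 240),   # upper bound is an assumption
}


def estimate_hours(counts):
    """Return (low, high) total hours for a dict of tier -> dashboard count."""
    low = sum(EFFORT_MINUTES[tier][0] * n for tier, n in counts.items())
    high = sum(EFFORT_MINUTES[tier][1] * n for tier, n in counts.items())
    return low / 60, high / 60


low, high = estimate_hours({"simple": 20, "moderate": 30, "complex": 10})
print(f"{low:.0f}-{high:.0f} hours")  # 40-75 hours
```

The estimate covers recreation only; dataset preparation, validation, and user sign-off need their own line items.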

Leveraging Superset's Advanced Features

While recreating dashboards, take advantage of Superset capabilities that may exceed Looker's functionality:

Text-to-SQL with AI: D23 integrates AI-powered text-to-SQL capabilities, allowing users to generate queries by typing natural language questions. This can reduce the need for pre-built charts and empower self-serve analysis
Native filters and Jinja templating: Use Superset's dashboard-level native filters, and Jinja templating in SQL-based datasets, to create flexible, reusable dashboard templates
Alerts and reports: Configure automatic alerts when metrics cross thresholds, and schedule dashboard exports to email
Custom CSS and themes: Style dashboards to match your organization's branding

Handling Metadata Migration

Metadata includes user accounts, permissions, dashboard definitions, and dataset configurations. Superset stores all metadata in its backend database (RDS PostgreSQL or MySQL).

Backing Up and Transferring Metadata

Before migrating, back up your Superset metadata database following official best practices. This is critical before any migration or upgrade:

Snapshot your RDS instance (or create a manual backup)

Export database contents using pg_dump or mysqldump:

pg_dump -h your-rds-endpoint.us-east-1.rds.amazonaws.com -U superset_user superset_db > superset_backup.sql

Store the backup in S3 with versioning enabled
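If you want to automate these steps, a small script can assemble the dump command and a timestamped object key. The sketch below only builds the arguments; the function names and the key layout are assumptions for this example. It uses pg_dump's custom format (`-Fc`), which can be restored selectively with pg_restore, instead of the plain SQL output shown above.

```python
from datetime import datetime, timezone


def pg_dump_command(host, user, database, output_path):
    """Build the pg_dump argument list for a custom-format dump."""
    return [
        "pg_dump",
        "-h", host,
        "-U", user,
        "-d", database,
        "-Fc",              # custom archive format, restorable with pg_restore
        "-f", output_path,
    ]


def backup_key(prefix, database, now=None):
    """Return a timestamped S3 object key for the dump file."""
    now = now or datetime.now(timezone.utc)
    return f"{prefix}/{database}-{now:%Y%m%dT%H%M%SZ}.dump"
```

The command list can be passed to `subprocess.run(cmd, check=True)` with the password supplied through the `PGPASSWORD` environment variable or a `.pgpass` file, and the resulting file uploaded with boto3's `upload_file(Filename, Bucket, Key)` on an S3 client.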

For a fresh Superset deployment, you won't import Looker's metadata directly. Instead, you'll recreate dashboards in Superset using the recreation workflow above. However, if you're upgrading an existing Superset instance, use the official upgrade documentation to migrate metadata safely.

User and Permission Management

Superset uses role-based access control (RBAC) with predefined roles:

Admin: Full access to all dashboards, datasets, and configuration
Alpha: Can create and edit dashboards and datasets
Gamma: Can view dashboards and explore data through existing datasets
Public: Can access public dashboards without logging in

Map your Looker user roles to Superset roles:

Looker admins → Superset admins
Looker developers → Superset alphas
Looker viewers → Superset gammas

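For user provisioning, integrate Superset with your identity provider (Okta, Azure AD, Google Workspace) using OAuth/OpenID Connect or LDAP, both of which Superset supports through Flask-AppBuilder; SAML generally requires a custom security manager. This allows users to log in with existing credentials, and roles can be assigned automatically based on group membership.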
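The group-to-role mapping is expressed in `superset_config.py`. In the fragment below, `AUTH_ROLES_MAPPING`, `AUTH_ROLES_SYNC_AT_LOGIN`, `AUTH_USER_REGISTRATION`, and `AUTH_USER_REGISTRATION_ROLE` are Flask-AppBuilder settings, while the group names are hypothetical. The helper `roles_for_groups` is not part of Superset; it only illustrates how a user's groups would resolve under this mapping. Depending on your identity provider, you may also need to customize how group claims are extracted at login.

```python
# superset_config.py (fragment)
AUTH_ROLES_MAPPING = {
    "looker-admins": ["Admin"],
    "looker-developers": ["Alpha"],
    "looker-viewers": ["Gamma"],
}
AUTH_ROLES_SYNC_AT_LOGIN = True         # re-evaluate roles on every login
AUTH_USER_REGISTRATION = True           # create the Superset user on first login
AUTH_USER_REGISTRATION_ROLE = "Gamma"   # fallback role when no group matches


def roles_for_groups(groups):
    """Illustrate how identity-provider groups resolve to Superset roles."""
    roles = set()
    for group in groups:
        roles.update(AUTH_ROLES_MAPPING.get(group, []))
    return sorted(roles) or [AUTH_USER_REGISTRATION_ROLE]


print(roles_for_groups(["looker-developers", "finance"]))  # ['Alpha']
print(roles_for_groups([]))                                # ['Gamma']
```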

Managing the Cutover

The transition from Looker to Superset requires careful planning to minimize disruption and maintain data access during the switch.

Parallel Running Period

Run both systems in parallel for 2-4 weeks before decommissioning Looker. This allows:

Users to familiarize themselves with Superset's interface
Validation that Superset dashboards match Looker's in accuracy and performance
Time to address issues and refine dashboards
Confidence that critical reports are working correctly

During this period, clearly communicate which dashboards are available in Superset and which remain in Looker. Update any documentation, bookmarks, or embedded links to point to Superset.

Testing and Validation

Before full cutover, validate that:

All critical dashboards are available in Superset with correct data
Query performance is acceptable: Compare query times between Looker and Superset for the same underlying queries
Filters and interactions work correctly: Test all dashboard filters, drill-downs, and cross-filtering
RLS is enforced properly: Verify that users see only data they're authorized to access
Scheduled reports and alerts execute on schedule
API integrations (if any) work with Superset's API

Create a test plan document and have a representative from each user group sign off on validation before proceeding to cutover.
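Data accuracy checks lend themselves to automation. The sketch below compares one metric across two result sets, for example a CSV export from a Looker tile and the matching Superset chart, keyed by a shared dimension. The row format (a list of dicts), the function name, and the tolerance are assumptions for this example.

```python
import math


def compare_results(looker_rows, superset_rows, key, metric, rel_tol=1e-6):
    """Compare one metric across two result sets keyed by a dimension.

    Returns a list of human-readable discrepancies; an empty list means a match.
    """
    looker = {row[key]: row[metric] for row in looker_rows}
    superset = {row[key]: row[metric] for row in superset_rows}
    problems = []
    for k in sorted(looker.keys() | superset.keys()):
        if k not in superset:
            problems.append(f"{k}: missing in Superset")
        elif k not in looker:
            problems.append(f"{k}: missing in Looker")
        elif not math.isclose(looker[k], superset[k], rel_tol=rel_tol):
            problems.append(f"{k}: Looker={looker[k]} Superset={superset[k]}")
    return problems


looker_rows = [{"month": "2026-01", "revenue": 100.0}, {"month": "2026-02", "revenue": 200.0}]
superset_rows = [
    {"month": "2026-01", "revenue": 100.0},
    {"month": "2026-02", "revenue": 210.0},
    {"month": "2026-03", "revenue": 5.0},
]
for problem in compare_results(looker_rows, superset_rows, "month", "revenue"):
    print(problem)
# 2026-02: Looker=200.0 Superset=210.0
# 2026-03: missing in Looker
```

Attaching the discrepancy list for each critical dashboard to the test plan gives sign-off reviewers something concrete to approve.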

Decommissioning Looker

Once you're confident in Superset, schedule Looker decommissioning:

Set a cutover date and communicate it broadly
Disable Looker access for all non-admin users 1-2 weeks before cutover
Export any remaining reports or analyses from Looker for archival
Cancel Looker licenses and cloud resources
Document the migration for future reference

Keep Looker available for admins for 1-2 weeks post-cutover as a safety net, in case you need to reference old dashboards or troubleshoot issues.

Optimizing Performance and Costs

One of the primary benefits of migrating to Superset is cost reduction. However, achieving those savings requires thoughtful optimization.

Query Performance Tuning

Superset's performance depends on your warehouse's performance. Optimize queries by:

Using appropriate aggregation levels: Avoid selecting all rows when you need only daily or hourly aggregates
Creating indexes on frequently-filtered columns (date ranges, customer IDs, regions), or the warehouse's equivalent such as sort keys on Redshift and clustering keys on Snowflake
Leveraging materialized views for complex calculations used in multiple dashboards
Configuring query caching: Superset caches query results; set appropriate TTLs based on data freshness requirements
Using result backends: Store large query results in Redis or S3 to avoid re-running expensive queries
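Cache TTLs are set in `superset_config.py`. The fragment below is a sketch using Flask-Caching keys with a Redis backend: `CACHE_CONFIG` and `DATA_CACHE_CONFIG` are Superset setting names, while the Redis URL, the key prefixes, the TTL values, and the helper `redis_cache` are placeholders chosen for this example.

```python
# superset_config.py (fragment)
REDIS_CACHE_URL = "redis://redis.example.internal:6379/2"


def redis_cache(prefix, ttl_seconds):
    """Return a Flask-Caching config dict for a Redis-backed cache."""
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_DEFAULT_TIMEOUT": ttl_seconds,
        "CACHE_KEY_PREFIX": prefix,
        "CACHE_REDIS_URL": REDIS_CACHE_URL,
    }


# Metadata cache (dashboard and dataset lookups).
CACHE_CONFIG = redis_cache("superset_meta_", 60 * 60)

# Chart data cache: keep the TTL no longer than the warehouse refresh interval.
DATA_CACHE_CONFIG = redis_cache("superset_data_", 15 * 60)
```

A 15-minute chart cache suits data that refreshes hourly; dashboards on nightly batch loads can usually tolerate a much longer TTL.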

Cost Comparison: Looker vs. Superset

For a mid-market organization with 100 users, here's a typical cost breakdown:

Looker on AWS:

Licenses: 100 users × $2,000/user/year = $200,000
AWS infrastructure (hosted Looker): ~$5,000-10,000/month = $60,000-120,000/year
Total: $260,000-320,000/year

Superset on AWS:

RDS PostgreSQL (metadata): ~$500-1,000/month = $6,000-12,000/year
ECS/EKS infrastructure (application servers): ~$1,000-2,000/month = $12,000-24,000/year
Redis (Celery backend): ~$200-500/month = $2,400-6,000/year
Data warehouse (unchanged): Same as before
Total: $20,400-42,000/year for BI platform (excluding warehouse)

For teams without in-house DevOps expertise, D23's managed Superset service typically costs $5,000-15,000/month ($60,000-180,000/year) depending on scale. Against the Looker estimate above, that is a saving of roughly 30-80%, while also eliminating operational overhead.
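The arithmetic behind this comparison is easy to rerun with your own figures. The sketch below reproduces the totals from the numbers listed above and computes the implied savings range; it covers platform and infrastructure costs only, not the staff time needed to operate a self-managed deployment. The helper names are illustrative.

```python
def annual_range(monthly_low, monthly_high):
    """Convert a monthly (low, high) cost range to an annual one."""
    return monthly_low * 12, monthly_high * 12


def add_ranges(*ranges):
    """Sum several (low, high) ranges element-wise."""
    return sum(r[0] for r in ranges), sum(r[1] for r in ranges)


def savings_pct(new, baseline):
    """Smallest and largest percentage saving when both costs are ranges."""
    worst = 1 - new[1] / baseline[0]   # most expensive new vs cheapest baseline
    best = 1 - new[0] / baseline[1]    # cheapest new vs most expensive baseline
    return round(worst * 100), round(best * 100)


looker = add_ranges((100 * 2_000,) * 2, annual_range(5_000, 10_000))
superset = add_ranges(
    annual_range(500, 1_000),     # RDS PostgreSQL (metadata)
    annual_range(1_000, 2_000),   # ECS/EKS application servers
    annual_range(200, 500),       # Redis
)
managed = annual_range(5_000, 15_000)

print(looker)                          # (260000, 320000)
print(superset)                        # (20400, 42000)
print(savings_pct(superset, looker))   # (84, 94)
print(savings_pct(managed, looker))    # (31, 81)
```

The width of each savings range comes from comparing opposite ends of the two estimates, so narrowing your own inputs narrows the result.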

Avoiding Common Performance Pitfalls

Don't query raw tables: Always aggregate in the warehouse or use materialized views
Avoid SELECT * queries: Explicitly select needed columns
Don't cache indefinitely: Set appropriate TTLs for cached results
Monitor query execution: Use Superset's query logs to identify slow queries

Leveraging AI and Advanced Features in Superset

Compared with Looker, Superset's open-source nature and modern architecture make it easier to integrate AI and advanced analytics capabilities.

Text-to-SQL and Natural Language Queries

D23 integrates AI-powered text-to-SQL capabilities that let users ask questions in plain English and automatically generate SQL queries. This feature:

Reduces dependency on pre-built dashboards
Enables ad-hoc analysis without SQL knowledge
Accelerates time-to-insight for exploratory questions
Complements your dataset definitions with intelligent query generation

Text-to-SQL works best when your datasets are well-documented with clear column names and descriptions. During migration, invest time in creating meaningful dataset metadata.

API-First Architecture

Superset's comprehensive REST API enables:

Embedded analytics: Embed dashboards and charts directly in your product or internal applications
Programmatic dashboard creation: Build dashboards via API rather than UI
Third-party integrations: Connect Superset with Slack, Teams, or other tools for automated reporting
Custom applications: Build data applications on top of Superset's data layer

If you were embedding Looker dashboards in your product, Superset's API provides equivalent (and often superior) flexibility.
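As a starting point for programmatic access, the sketch below builds the two requests needed to authenticate against Superset's REST API and list dashboards, using only the standard library. The endpoints `/api/v1/security/login` and `/api/v1/dashboard/` are part of Superset's API; the helper names and the base URL are illustrative, and the `"db"` provider assumes a local Superset account rather than SSO.

```python
import json
import urllib.request


def login_request(base_url, username, password):
    """Build the POST request that exchanges credentials for an access token."""
    payload = {
        "username": username,
        "password": password,
        "provider": "db",     # use "ldap" for LDAP-backed accounts
        "refresh": True,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_dashboards_request(base_url, access_token):
    """Build the GET request for the dashboard list endpoint."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/dashboard/",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With a running instance, `token = send(login_request(url, user, password))["access_token"]` followed by `send(list_dashboards_request(url, token))["result"]` returns the dashboards visible to that user, which is also a convenient way to verify that the migrated inventory matches your backlog.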

MCP Server Integration

Superset can be integrated with Model Context Protocol (MCP) servers, enabling:

Semantic layer connections: Link Superset to semantic layer tools like Cube or dbt
Custom data connectors: Build integrations with proprietary data sources
Workflow automation: Trigger external systems based on dashboard interactions

User Training and Adoption

A successful migration requires more than technical preparation; your users need to understand and embrace Superset.

Creating Training Materials

Develop documentation covering:

Getting started: How to log in, navigate dashboards, and run basic filters
Dashboard-specific guides: For critical dashboards, document what each chart shows and how to interpret it
Self-serve analysis: How to create new charts and dashboards (for alpha users)
Common tasks: Exporting data, scheduling reports, sharing dashboards
Troubleshooting: What to do if a dashboard isn't loading or a filter isn't working

Provide both written guides and video walkthroughs. Record screen captures showing common workflows.

Conducting Training Sessions

Hold live training sessions for different user groups:

Admins and alphas: Deep dive into dataset creation, RLS configuration, and advanced features
Business users: Focus on navigating dashboards, using filters, and interpreting results
Executives: Brief overview of available reports and how to access them

Schedule sessions at times convenient for different time zones and departments. Record sessions for asynchronous viewing.

Establishing Support Channels

Set up clear channels for users to ask questions and report issues:

Slack channel: For quick questions and peer support
Email support: For detailed issues requiring investigation
Office hours: Regular sessions where your team is available to help
Feedback form: Allow users to suggest improvements

Respond quickly to issues during the parallel running period to build confidence in Superset.

Post-Migration Operations

After cutover, your focus shifts to maintaining and optimizing Superset.

Monitoring and Alerting

Set up monitoring for:

Application health: Monitor ECS task health, error rates, and response times
Database health: Monitor RDS CPU, connections, and storage
Query performance: Track slow queries and set alerts for queries exceeding thresholds
Cache hit rates: Monitor Superset's cache effectiveness

Use CloudWatch for AWS-native monitoring and consider tools like Datadog or New Relic for comprehensive observability.

Regular Maintenance

Update Superset regularly: Follow the official upgrade documentation to stay current with security patches and new features
Archive old dashboards: Periodically review and archive dashboards with low usage
Optimize datasets: Review dataset queries and optimize those with poor performance
Clean up cache: Periodically clear expired cache entries to maintain performance

Continuous Improvement

Gather user feedback: Regularly ask users what's working well and what could improve
Monitor usage patterns: Identify which dashboards are most valuable and which are unused
Iterate on dashboards: Refine dashboards based on user feedback and changing business needs
Explore new features: As Superset evolves, evaluate new capabilities that could benefit your organization

Conclusion: Charting Your Path Forward

Migrating from Looker on AWS to Apache Superset is a significant undertaking, but it's entirely achievable with proper planning and execution. The migration path is clear: assess your current setup, design your Superset architecture, recreate dashboards, validate thoroughly, and cut over systematically.

The benefits are substantial. You'll reduce BI platform costs by 50-70%, eliminate vendor lock-in, gain architectural flexibility, and access modern capabilities like AI-powered text-to-SQL and embedded analytics. Your data warehouse connection remains stable throughout—you're replacing the visualization layer, not your data infrastructure.

For organizations seeking to minimize operational burden, D23's managed Apache Superset platform provides a middle ground: all the benefits of Superset with the operational simplicity of a managed service. Whether you choose self-managed or managed deployment, the core migration process remains the same.

Start with your assessment, build your migration backlog, and tackle dashboards in priority order. Involve your users early, train thoroughly, and run parallel systems long enough to build confidence. With this approach, you'll successfully transition to Superset while maintaining data access and user satisfaction throughout the process.

The migration is an opportunity to reassess your BI strategy, eliminate unused dashboards, and establish better data governance practices. Use this moment to build a more efficient, flexible, and cost-effective analytics platform for your organization.