Step-by-step operational playbook for adding portfolio companies to a unified PE analytics platform. Covers data integration, user setup, and governance.
When a private equity firm acquires a new portfolio company, the first 100 days are critical. Among the many operational priorities—financial consolidation, cultural integration, talent retention—sits a less visible but equally important task: getting that company's data into your firm's shared analytics platform.
Most PE firms still operate in a fragmented state. Each portfolio company maintains its own reporting stack, its own data warehouse (if it has one at all), and its own spreadsheet-driven KPI tracking. When you acquire a new company, you face a choice: let it continue operating in isolation, or integrate it into a centralized analytics environment where you can monitor performance, surface cross-portfolio patterns, and make faster decisions.
This guide walks through the operational playbook for onboarding a portfolio company to a shared PE analytics platform built on Apache Superset. We'll cover the technical architecture, the people and process side, and the timeline you should expect. Whether you're managing five portfolio companies or fifty, this framework scales.
Before diving into the mechanics of onboarding, it's worth understanding why this matters. A global PE firm that unified 15 portfolio companies into a single shared analytics platform reduced insight lag from weeks to hours and enabled its investment committee to answer critical questions about portfolio health in real time.
The benefits are concrete:
The challenge is execution. Onboarding a new portfolio company to a shared analytics platform is not a one-week project. It requires coordination across finance, IT, the portfolio company's leadership, and your internal analytics team. Get it wrong, and you'll have stale data, frustrated users, and a platform that no one trusts.
Before you touch any data, you need to understand what you're working with. This phase typically takes 1–2 weeks and should happen in parallel with other acquisition integration activities.
Start with a comprehensive inventory. You need to know:
This assessment doesn't require deep technical work. A conversation with the CFO and IT manager, plus 4–6 hours of poking around in their systems, usually surfaces the critical information. Document everything in a simple spreadsheet: system name, data type (financial, operational, customer), estimated record count, last update date, owner, and any known data quality issues.
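If you would rather keep the inventory in version control than in an ad hoc spreadsheet, a minimal sketch could look like the following. The `SourceSystem` class and `inventory_to_csv` function are hypothetical names chosen for illustration; the columns simply mirror the fields listed above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class SourceSystem:
    """One row of the assessment inventory (hypothetical structure)."""
    system_name: str
    data_type: str          # "financial", "operational", or "customer"
    estimated_records: int
    last_updated: date
    owner: str
    known_issues: str = ""


def inventory_to_csv(systems):
    """Render the inventory as CSV text that opens in any spreadsheet tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SourceSystem)])
    writer.writeheader()
    for system in systems:
        writer.writerow(asdict(system))
    return buffer.getvalue()
```

Calling `inventory_to_csv([...])` with one `SourceSystem` per system gives you a file you can share with the CFO and IT manager for review.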
You won't integrate everything on day one. Prioritize ruthlessly.
Tier 1 (must-have for day-one dashboards):
Tier 2 (within 30 days):
Tier 3 (within 90 days):
Scoping prevents you from getting bogged down trying to integrate every system at once. It also sets clear expectations with the portfolio company's team about what "done" looks like for each phase.
Now you know what data you need. The next phase is building the plumbing to get it into your shared platform. This typically takes 2–4 weeks, depending on complexity.
You have a few options, each with trade-offs:
Direct Database Connections: If the portfolio company has a data warehouse (Snowflake, BigQuery, Redshift, Postgres), the simplest approach is to connect your analytics platform directly to it. D23, built on Apache Superset, supports direct connections to all major data warehouses. You define a read-only user with access to specific schemas, and your platform can query that data directly.
Pros: Low latency, real-time or near-real-time data, minimal middleware. Cons: Requires the portfolio company to have a data warehouse already. If they don't, you're building one as part of onboarding.
ETL/ELT Pipelines: If the portfolio company's data is scattered across applications (Salesforce, QuickBooks, Stripe, etc.), you'll use an ETL tool (Fivetran, Airbyte, Stitch, or custom scripts) to extract data from those systems, transform it into a standard schema, and load it into a central data warehouse.
Pros: Flexible, handles multiple source systems, can normalize and clean data as it moves. Cons: More moving parts, requires maintenance, introduces latency (typically 6–24 hours depending on refresh frequency).
API-Driven Approach: For real-time or near-real-time requirements, some portfolio companies expose their operational data via APIs. Your platform can query those APIs directly or use them to feed a data warehouse. This is common for SaaS companies where you want to track customer metrics as they happen.
Pros: Real-time, no data warehouse required, direct source of truth. Cons: Depends on API availability and reliability, rate limits can be an issue at scale.
Here's where standardization happens. You can't just dump each portfolio company's data into separate schemas and call it unified. You need a common structure.
Work with your data team to design a hub-and-spoke schema:
For example, a revenue fact table might look like:
Revenue
├── company_id (maps to which portfolio company)
├── date
├── customer_id
├── product_id
├── amount
├── currency
├── revenue_type (subscription, services, one-time)
└── [company-specific columns]
Every portfolio company's data is transformed to fit this structure. This is what enables cross-portfolio dashboards and comparisons. Without it, you end up with a platform that's just a collection of isolated data sources.
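As an illustration of that transformation step, here is a minimal sketch that maps one source record onto the shared revenue structure. The source field names (`invoice_date`, `cust`, `sku`, `total`, `ccy`, `type`) are invented for the example; every portfolio company will need its own mapping, and company-specific columns are left out.

```python
import datetime as dt
from dataclasses import dataclass
from decimal import Decimal

REVENUE_TYPES = {"subscription", "services", "one-time"}


@dataclass(frozen=True)
class RevenueRow:
    """One row of the shared revenue fact table."""
    company_id: str
    date: dt.date
    customer_id: str
    product_id: str
    amount: Decimal
    currency: str
    revenue_type: str


def to_revenue_row(company_id, record):
    """Map a hypothetical source record (a dict) onto the shared schema."""
    revenue_type = record["type"].strip().lower()
    if revenue_type not in REVENUE_TYPES:
        raise ValueError(f"unknown revenue_type: {revenue_type!r}")
    return RevenueRow(
        company_id=company_id,
        date=dt.date.fromisoformat(record["invoice_date"]),
        customer_id=str(record["cust"]),         # IDs are always stored as text
        product_id=str(record["sku"]),
        amount=Decimal(str(record["total"])),    # Decimal avoids float rounding
        currency=record["ccy"].upper(),
        revenue_type=revenue_type,
    )
```

Rejecting unknown revenue types at this stage, rather than loading them silently, keeps definition mismatches visible before they reach a dashboard.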
If the portfolio company doesn't have a data warehouse, you'll provision one. This could be:
The shared warehouse approach is more cost-efficient and operationally simpler, but requires careful attention to data governance and access control. The dedicated approach gives portfolio companies more autonomy but increases operational overhead.
Most PE firms start with a shared warehouse and move to dedicated only for portfolio companies above a certain revenue or data volume threshold.
You can't have a unified analytics platform without clear rules about who sees what. This is especially critical in PE, where you might have competing portfolio companies or sensitive financial information.
Set up a tiered access model:
Apache Superset's built-in role-based access control (RBAC) lets you control access at the dashboard, dataset, and row level. For example, a row-level security filter can restrict a user to rows where company_id = 'Portfolio_Company_A', which is exactly what you need.
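To make the row-level idea concrete, the sketch below builds the kind of SQL predicate you might attach to a role and applies the same rule to in-memory rows. It does not call Superset's API; `rls_clause` and `visible_rows` are hypothetical helpers written only to show the logic.

```python
def rls_clause(company_ids):
    """Build a SQL predicate limiting rows to the given company IDs."""
    if not company_ids:
        return "1 = 0"  # no companies assigned: the user sees nothing
    quoted = ", ".join(
        "'" + cid.replace("'", "''") + "'"  # escape embedded single quotes
        for cid in sorted(company_ids)
    )
    return f"company_id IN ({quoted})"


def visible_rows(rows, company_ids):
    """Apply the same restriction to rows represented as dicts."""
    allowed = set(company_ids)
    return [row for row in rows if row["company_id"] in allowed]
```

Defaulting to "match nothing" when a role has no companies assigned is a deliberate choice here: a misconfigured role should fail closed rather than expose another company's data.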
Document the rules:
Put these policies in writing and share them with all portfolio companies. This prevents surprises and sets clear expectations.
With the architecture in place, here's the step-by-step technical workflow for bringing a new portfolio company online. This is where the rubber meets the road.
Work with the portfolio company's IT team to:
Do not hardcode credentials. Do not email them. Use a proper secrets management system.
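One common pattern, sketched here under the assumption that your secrets manager injects credentials into the process as environment variables, is to load them at runtime and fail loudly when any are missing. The variable naming scheme (`<COMPANY>_DB_HOST` and so on) and the `load_db_credentials` helper are illustrative, not a standard.

```python
import os


def load_db_credentials(company_key, env=None):
    """Read connection credentials from the environment, never from code."""
    env = os.environ if env is None else env
    prefix = f"{company_key.upper()}_DB_"
    required = ("HOST", "USER", "PASSWORD")
    missing = [prefix + name for name in required if not env.get(prefix + name)]
    if missing:
        # Report which variables are absent without printing any secret values.
        raise RuntimeError("missing secrets: " + ", ".join(missing))
    return {name.lower(): env[prefix + name] for name in required}
```

The optional `env` argument exists so the function can be exercised with a plain dictionary in tests instead of real credentials.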
In your analytics platform, add the portfolio company as a new data source:
If using ETL, configure the extraction jobs:
Run the first data extraction and transformation:
This step usually takes longer than expected. You'll discover data quality issues you didn't catch in the assessment phase. A customer ID that's sometimes numeric, sometimes alphanumeric. Revenue figures that don't reconcile. Missing months of data. This is normal. Budget extra time here.
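Two of those issues, inconsistent customer IDs and missing months, are easy to check for automatically. The helpers below are a minimal sketch with hypothetical names; the normalization rule (trim, uppercase, drop a trailing ".0" from spreadsheet floats) is an assumption you should adapt to the actual data.

```python
import datetime as dt


def normalize_customer_id(raw):
    """Coerce numeric and alphanumeric customer IDs to one text format."""
    if isinstance(raw, float) and raw.is_integer():
        raw = int(raw)  # spreadsheets often turn 1042 into 1042.0
    return str(raw).strip().upper()


def missing_months(dates):
    """Return (year, month) pairs with no records between first and last date."""
    present = {(d.year, d.month) for d in dates}
    if not present:
        return []
    (year, month), last = min(present), max(present)
    gaps = []
    while (year, month) < last:
        if (year, month) not in present:
            gaps.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps
```

Running checks like these on every load, not just the first one, turns one-off discoveries into a repeatable data quality report.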
While data is loading, start building dashboards. You need two types:
Portfolio Company Dashboards: These are for the portfolio company's leadership. They show the KPIs that matter to them: revenue, customer metrics, operational performance, cash position. These dashboards should feel familiar. They are probably the ones the CEO was already looking at, now in your platform instead of Excel.
PE Firm Dashboards: These are for your investment committee and operations team. They show how this portfolio company is performing relative to others, highlight risks or opportunities, and surface metrics relevant to value creation.
Start simple. Your first dashboard for a portfolio company should have 5–8 key metrics, not 50. You can add depth later. The goal is to get something live quickly that people want to use.
Create user accounts for portfolio company staff who need access:
Assign them to the appropriate role (portfolio company user, operations team, etc.). This controls what they can see.
Schedule a training session (30–60 minutes) covering:
Do this training live, not async. You'll catch confusion in real time.
Let users work with the dashboards for a week. Collect feedback:
Expect to find discrepancies. A customer count in Superset doesn't match what the sales leader thinks it should be. Revenue is off by a few percent. These are usually data quality issues or definition mismatches. Work through them methodically.
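A simple way to work through discrepancies methodically is to compare platform totals against the figures the portfolio company reports from its own systems and flag anything outside an agreed tolerance. The sketch below assumes a 0.5% tolerance and a hypothetical `reconcile` helper; both are placeholders.

```python
from decimal import Decimal


def reconcile(platform_totals, source_totals, tolerance=Decimal("0.005")):
    """Return metrics that disagree by more than the relative tolerance.

    Values are the relative difference, or None when it cannot be computed
    (metric missing on one side, or a zero source value).
    """
    mismatches = {}
    for metric in sorted(set(platform_totals) | set(source_totals)):
        platform = platform_totals.get(metric)
        source = source_totals.get(metric)
        if platform is None or source is None:
            mismatches[metric] = None
        elif source == 0:
            if platform != 0:
                mismatches[metric] = None
        else:
            relative_diff = abs(platform - source) / abs(source)
            if relative_diff > tolerance:
                mismatches[metric] = relative_diff
    return mismatches
```

Each flagged metric then becomes a concrete item to trace: either a data quality problem in the pipeline or a definition mismatch to resolve with the metric's owner.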
Make dashboard adjustments based on feedback. Add a missing metric. Fix a calculation. Improve the visual layout.
Once dashboards are validated and users are comfortable, you've completed initial onboarding. But you're not done.
Every onboarding hits friction points. Here are the most common ones and how to handle them.
The problem: The portfolio company's data is messy. Missing values, duplicates, inconsistent formats, unexplained gaps.
Why it happens: Most mid-market companies have never had to think deeply about data quality. Their systems work fine for day-to-day operations but aren't designed for analytics.
How to solve it:
The problem: The CEO or CFO sees this as overhead. They're already reporting to you monthly. They don't want to spend time learning a new platform.
Why it happens: Change is hard, especially for leaders who are already busy integrating a new company.
How to solve it:
The problem: The portfolio company wants real-time data. You're on a daily refresh cycle. They're frustrated.
Why it happens: Different use cases have different requirements. The sales team wants real-time pipeline data. The finance team can live with daily close data.
How to solve it:
The problem: You've successfully onboarded two portfolio companies. Now you're doing your third, fourth, and fifth. The manual work is piling up.
Why it happens: Each portfolio company is slightly different. Their systems are different. Their data is in different places. There's no one-size-fits-all process.
How to solve it:
Once the basics are in place, you can layer on more sophisticated capabilities. This is where modern analytics platforms differentiate.
Instead of requiring users to understand SQL or click through dashboards, they can ask questions in plain English: "What's our customer churn rate by product line?" or "Which portfolio companies are below their EBITDA targets?"
Text-to-SQL technology powered by LLMs translates these natural language questions into SQL queries and returns results. This dramatically lowers the barrier to self-serve analytics. Your portfolio company CFOs don't need to learn SQL or dashboard design. They just ask questions.
Set up automated monitoring on your key metrics. If revenue drops 15% week-over-week, or if customer churn spikes, you want to know immediately. AI-powered analytics platforms can detect these anomalies automatically and alert your investment team.
This is especially valuable for PE because it surfaces problems early, before they become crises. You catch a portfolio company's revenue decline in week 2, not in the monthly close.
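The week-over-week rule described above is straightforward to express in code. This is a minimal threshold check, not a statistical anomaly detector; the function name and the 15% default are taken from the example in the text, and real monitoring would likely account for seasonality.

```python
def week_over_week_alerts(weekly_values, drop_threshold=0.15):
    """Flag weeks where a metric fell by at least drop_threshold vs. the prior week.

    weekly_values is an ordered list of (week_label, value) pairs.
    """
    alerts = []
    for i in range(1, len(weekly_values)):
        _, previous = weekly_values[i - 1]
        week, current = weekly_values[i]
        if previous <= 0:
            continue  # relative change is undefined without a positive baseline
        change = (current - previous) / previous
        if change <= -drop_threshold:
            alerts.append((week, round(change, 4)))
    return alerts
```

For example, weekly revenue of 100, 84, and 80 produces one alert for the second week (a 16% drop) and none for the third (a drop of under 5%).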
With data from multiple portfolio companies in one place, you can run sophisticated analyses:
These insights drive real value creation. They identify best practices you can replicate across the portfolio. They flag underperformance. They inform investment decisions.
Let's talk about what this actually takes.
This assumes the portfolio company has decent data quality and a cooperative IT team. If either of those assumptions breaks down, add 2–4 weeks.
If you're onboarding multiple portfolio companies in parallel, you need dedicated resources. A single analyst can't onboard three companies simultaneously.
This varies widely, but a rough estimate:
For a mid-market PE firm, total cost to onboard a portfolio company is usually $15,000–$40,000 in the first year. After that, ongoing costs are $5,000–$15,000/year.
Private equity firms that have successfully unified their portfolio companies into modern data platforms follow a few consistent patterns:
Start with the CFO: Get the chief financial officer bought in early. They're motivated by better visibility and less manual work. Once they're a champion, the rest of the organization follows.
Lead with quick wins: Your first dashboard should show something the CFO wants to see and is currently getting wrong (or not seeing at all). Make it accurate, make it beautiful, make it useful. That builds credibility.
Standardize ruthlessly: Define a common data schema and enforce it. Every portfolio company's revenue data should be structured the same way. This is what enables cross-portfolio analysis.
Invest in data quality: Spend time upfront getting data clean. It's tempting to rush to dashboards, but garbage in, garbage out. A week spent fixing data quality issues now saves months of credibility problems later.
Make it easy to use: Your users are busy. They're running companies. They don't have time to learn SQL or dashboard design. Make your platform intuitive. Provide training. Respond quickly to questions.
Plan for scale: Design your infrastructure and processes assuming you'll add 10 more portfolio companies. Don't build a one-off solution for each company. Templatize and automate.
Measure adoption: Track who's using the platform, which dashboards are viewed most, what questions are being asked. Use this data to improve the platform and justify continued investment.
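Adoption tracking can start very small. The sketch below assumes you can export dashboard view events as (user, dashboard) pairs from your platform's access logs; the `adoption_summary` helper and the event shape are hypothetical.

```python
from collections import Counter


def adoption_summary(view_events, top_n=3):
    """Summarize active users and most-viewed dashboards from view events."""
    events = list(view_events)
    views = Counter(dashboard for _, dashboard in events)
    active_users = {user for user, _ in events}
    return {
        "active_users": len(active_users),
        "top_dashboards": views.most_common(top_n),
    }
```

Reviewing this summary monthly is usually enough to spot dashboards nobody opens and users who never logged in after training.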
Once you have the fundamentals in place—data flowing reliably, users trained, dashboards validated—you can evolve the platform.
If any of your portfolio companies are B2B SaaS businesses, they might want to embed analytics into their own product for customers. Rather than building a separate analytics infrastructure, they can use your shared platform. Embedded analytics on Apache Superset allows you to white-label dashboards and embed them in your portfolio companies' applications.
This creates a revenue opportunity for your portfolio companies and reduces their infrastructure costs.
Move beyond historical reporting to forward-looking analytics. Use machine learning to forecast revenue, predict churn, identify at-risk customers. These capabilities require more sophisticated data science skills, but they drive real value.
As your platform matures, enable portfolio companies to build their own dashboards. Provide templates and guidelines, but let them customize. This reduces the load on your analytics team and gives portfolio companies ownership of their data.
As your analytics platform becomes more central to your PE firm's operations, you'll need robust governance and compliance controls.
Document everything. Who accessed what data, when, and for what purpose. Your platform should provide audit logs. D23 provides comprehensive audit trails for compliance purposes.
When you're managing financial data for multiple portfolio companies, audit trails aren't optional—they're essential for SOC 2, GDPR, and other compliance frameworks.
Understand where data lives and whether that meets regulatory requirements. Some portfolio companies might be subject to data residency requirements (EU data must stay in the EU, for example). Your platform needs to support this.
If a portfolio company changes a metric definition, that change needs to be tracked and communicated. Document the old definition, the new definition, when the change took effect, and which dashboards are affected. This prevents confusion and ensures consistency.
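One lightweight way to track such changes is an append-only change log that records exactly the four items above. The structure below is a sketch with hypothetical names, not a feature of any particular platform.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinitionChange:
    """One entry in a metric definition change log."""
    metric: str
    old_definition: str
    new_definition: str
    effective_date: dt.date
    affected_dashboards: tuple


def definition_as_of(changes, metric, on_date):
    """Return the definition of a metric that was in force on a given date."""
    history = sorted(
        (c for c in changes if c.metric == metric),
        key=lambda c: c.effective_date,
    )
    if not history:
        raise KeyError(metric)
    current = history[0].old_definition  # definition before any recorded change
    for change in history:
        if change.effective_date <= on_date:
            current = change.new_definition
    return current
```

Being able to answer "what did churn mean when this board report was produced?" is what makes historical comparisons defensible.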
Onboarding a portfolio company to a shared PE analytics platform is a project, but it's a project with clear steps and measurable outcomes. You're moving from a fragmented state—where each portfolio company operates in isolation with limited visibility—to a unified state where you have real-time insight into portfolio performance, can spot patterns across companies, and make faster, better-informed decisions.
The operational playbook is straightforward: assess the portfolio company's current state, design the integration architecture, set up data governance, execute the technical onboarding, and validate with users. Do this methodically, and you'll have a new portfolio company integrated into your platform in 6–12 weeks.
The payoff is significant. PE firms using modern data and analytics platforms report faster decision-making, better visibility into portfolio operations, and measurable improvements in value creation. They also report reduced reporting overhead—less time in spreadsheets, more time on strategy.
If you're building a shared analytics platform for your PE portfolio, D23 is purpose-built for this use case. It's Apache Superset—the open-source BI standard—with production-grade hosting, API-first architecture, and AI-powered analytics built in. You can onboard portfolio companies quickly, maintain strict data governance, and evolve the platform as your firm scales.
The key is to start. Pick your first portfolio company, work through this playbook, and build the muscle memory. By your third or fourth onboarding, this process becomes routine. By your tenth, you'll have it down to a science.
Your investment committee will thank you when they can answer critical questions about portfolio performance in minutes instead of days. Your portfolio companies will thank you when they realize they no longer need to maintain separate reporting infrastructure. And your team will thank you when they're no longer drowning in manual reporting work.