Build an MCP server exposing GitHub data to Claude. Enable AI-powered engineering metrics, text-to-SQL queries, and self-serve analytics without platform overhead.
The Model Context Protocol (MCP) is a standardized way to connect AI models like Claude to external systems and data sources. At its core, MCP allows you to define resources, tools, and prompts that an AI model can interact with—turning your GitHub repositories, pull request data, and engineering metrics into queryable assets that Claude can reason about in real time.
For engineering leaders, this means you can ask Claude natural-language questions like "What's our average PR review time this quarter?" or "Which repositories have the highest defect density?" without building custom dashboards first. The AI model becomes your query interface, and GitHub becomes your analytics backend.
According to the official GitHub documentation on the Model Context Protocol, MCP works by establishing a connection between your AI client (Claude, GitHub Copilot, etc.) and a server that exposes your data. The server implements a set of JSON-RPC methods that define what data the AI can access and what operations it can perform. This is fundamentally different from traditional BI platforms like Looker or Tableau, which require you to pre-build dashboards and define dimensions/measures upfront. With MCP, the AI model negotiates data access dynamically.
Why does this matter for engineering teams? Because your metrics are constantly evolving. Last quarter you cared about deployment frequency; this quarter it's mean time to recovery (MTTR). With an MCP server, you expose the raw GitHub data—commits, pull requests, issues, workflows—and let Claude answer whatever question you ask. You're not locked into a pre-defined schema.
Traditional business intelligence platforms (Looker, Tableau, Power BI) require significant upfront investment to connect GitHub data:
MCP sidesteps these problems. By exposing GitHub's API through an MCP server, you let Claude answer ad-hoc questions without pre-built dashboards. The AI model handles the schema negotiation—it "understands" what GitHub data looks like because you've defined it in the MCP server.
As detailed in Warp's practical guide to GitHub MCP, you can use MCP servers to automate PR summaries and issue creation. That same pattern applies to analytics: define tools that expose aggregated metrics (average review time, deployment frequency, etc.), and Claude can query them conversationally.
The cost difference is stark. An MCP server running on your infrastructure costs pennies per month in compute. Claude API calls cost fractions of a cent per request. Compare that to Looker's minimum $5K/month seat licensing, and the math becomes obvious.
Let's walk through building a minimal MCP server that exposes GitHub repository metrics. This server will:
You'll need:
repo and read:org scopesmcp Python library and the PyGithub library for GitHub API accessStart by installing dependencies:
pip install mcp PyGithub python-dotenvCreate a .env file with your GitHub token:
GITHUB_TOKEN=ghp_your_token_here
The official GitHub MCP server repository provides a reference implementation you can adapt. It demonstrates how to structure MCP tools and resources for GitHub API interactions.
An MCP server exposes "tools"—functions that Claude can call. For GitHub analytics, your tools might look like:
Tool 1: Get Pull Request Metrics
{
"name": "get_pr_metrics",
"description": "Returns PR metrics for a repository: total PRs, average review time, merge rate",
"inputSchema": {
"type": "object",
"properties": {
"owner": {"type": "string", "description": "GitHub organization or user"},
"repo": {"type": "string", "description": "Repository name"},
"days": {"type": "integer", "description": "Look back window in days (default: 30)"}
},
"required": ["owner", "repo"]
}
}When Claude calls this tool, your server queries the GitHub API for closed pull requests in the past 30 days, calculates the time between PR open and merge, and returns:
{
"total_prs": 42,
"average_review_time_hours": 18.5,
"merge_rate": 0.95,
"median_review_time_hours": 12
}Claude receives this data and can reason about it: "Your average PR review time is 18.5 hours, which is faster than the industry median of 24 hours. However, your merge rate of 95% suggests you're being selective about what gets merged—that's good."
Tool 2: Get Deployment Frequency
{
"name": "get_deployment_frequency",
"description": "Returns deployment frequency by analyzing GitHub releases and workflow runs",
"inputSchema": {
"type": "object",
"properties": {
"owner": {"type": "string"},
"repo": {"type": "string"},
"days": {"type": "integer", "description": "Look back window"}
},
"required": ["owner", "repo"]
}
}This tool counts GitHub releases or successful workflow runs to calculate deployments per day:
{
"deployments_per_day": 3.2,
"total_deployments": 96,
"period_days": 30,
"deployment_success_rate": 0.98
}Tool 3: Get Code Churn and Defect Density
{
"name": "get_code_quality_metrics",
"description": "Returns code churn, defect density, and issue metrics",
"inputSchema": {
"type": "object",
"properties": {
"owner": {"type": "string"},
"repo": {"type": "string"},
"days": {"type": "integer"}
},
"required": ["owner", "repo"]
}
}This calculates:
{
"lines_changed_per_commit": 145,
"open_issues": 23,
"closed_issues_30d": 18,
"avg_issue_resolution_hours": 72,
"defect_density": 0.0008
}Each tool is a function your MCP server implements. Claude doesn't call GitHub's API directly—it calls your server, which acts as an intermediary. This gives you control over rate limiting, caching, and data aggregation.
Here's a simplified implementation:
import json
from datetime import datetime, timedelta
from github import Github
from mcp.server import Server
from mcp.types import Tool, TextContent
import os
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
g = Github(GITHUB_TOKEN)
server = Server("github-analytics-mcp")
@server.call_tool()
async def get_pr_metrics(owner: str, repo: str, days: int = 30):
"""Calculate PR metrics for a repository."""
try:
repo_obj = g.get_user(owner).get_repo(repo)
cutoff = datetime.utcnow() - timedelta(days=days)
# Get closed PRs in the window
closed_prs = repo_obj.get_pulls(
state="closed",
sort="updated",
direction="desc"
)
review_times = []
merged_count = 0
for pr in closed_prs:
if pr.updated_at < cutoff:
break
if pr.merged_at:
merged_count += 1
review_time = (pr.merged_at - pr.created_at).total_seconds() / 3600
review_times.append(review_time)
total_prs = closed_prs.totalCount
avg_review_time = sum(review_times) / len(review_times) if review_times else 0
merge_rate = merged_count / total_prs if total_prs > 0 else 0
return {
"total_prs": total_prs,
"average_review_time_hours": round(avg_review_time, 1),
"merge_rate": round(merge_rate, 2),
"merged_prs": merged_count
}
except Exception as e:
return {"error": str(e)}
@server.call_tool()
async def get_deployment_frequency(owner: str, repo: str, days: int = 30):
"""Calculate deployment frequency from releases and workflow runs."""
try:
repo_obj = g.get_user(owner).get_repo(repo)
cutoff = datetime.utcnow() - timedelta(days=days)
# Count releases
releases = repo_obj.get_releases()
recent_releases = [r for r in releases if r.published_at > cutoff]
deployments_per_day = len(recent_releases) / days if days > 0 else 0
return {
"deployments_per_day": round(deployments_per_day, 2),
"total_deployments": len(recent_releases),
"period_days": days
}
except Exception as e:
return {"error": str(e)}
if __name__ == "__main__":
server.run()This server exposes two tools that Claude can call. When you connect Claude to this server, it can ask questions like:
Claude translates natural language into tool calls, receives the results, and synthesizes them into answers.
Once your server is running, you need to configure Claude to access it. There are two main approaches:
If you're using Claude Desktop, add your MCP server to the configuration file (~/.claude/claude_desktop_config.json):
{
"mcpServers": {
"github-analytics": {
"command": "python",
"args": ["/path/to/github_mcp_server.py"]
}
}
}Restart Claude Desktop, and the server will be available in your conversations.
For production use, integrate the MCP server with Anthropic's API. You'll use the stdio transport to communicate with your server:
import anthropic
import subprocess
import json
# Start your MCP server
server_process = subprocess.Popen(
["python", "github_mcp_server.py"],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE
)
# Initialize Anthropic client
client = anthropic.Anthropic()
# Send a message to Claude with MCP tools available
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
tools=[
{
"name": "get_pr_metrics",
"description": "Get PR metrics for a repository",
"input_schema": {
"type": "object",
"properties": {
"owner": {"type": "string"},
"repo": {"type": "string"},
"days": {"type": "integer"}
},
"required": ["owner", "repo"]
}
}
],
messages=[
{
"role": "user",
"content": "What's the average PR review time for my-org/backend-service?"
}
]
)
print(response.content)Claude will call the get_pr_metrics tool, your server will execute it, and Claude will return a natural-language answer.
The Claude 3.5 Sonnet model has excellent tool-use capabilities, making it ideal for MCP integrations. It can understand complex queries, chain multiple tool calls together, and reason about the results.
Let's say your engineering org has five repositories:
backend-service (production API)frontend-app (React web app)data-pipeline (ETL jobs)mobile-sdk (iOS/Android SDK)infrastructure (Terraform and DevOps)Instead of building five separate dashboards in Looker or Tableau, you ask Claude:
Query 1: "Compare PR review times across all five repos. Which has the slowest reviews?"
Claude calls get_pr_metrics for each repo, aggregates the results, and returns:
Average PR Review Times (30-day window):
1. backend-service: 12.3 hours
2. frontend-app: 18.7 hours
3. data-pipeline: 24.5 hours ← Slowest
4. mobile-sdk: 16.2 hours
5. infrastructure: 22.1 hours
The data-pipeline has the slowest reviews. This could indicate:
- Fewer reviewers available
- More complex code requiring thorough review
- Lack of automated testing (requiring manual verification)
Recommendation: Pair a senior data engineer with the data-pipeline PRs for a week to identify bottlenecks.
Query 2: "Which repos are deploying most frequently? Are we shipping code faster this month than last?"
Claude calls get_deployment_frequency for each repo and compares:
Deployment Frequency (30-day window):
- backend-service: 4.2 deployments/day
- frontend-app: 2.8 deployments/day
- data-pipeline: 0.6 deployments/day
- mobile-sdk: 0.3 deployments/day (App Store review delays)
- infrastructure: 1.1 deployments/day
Trend: Deployments are up 18% vs. last month, driven by backend-service.
Mobile-sdk is constrained by App Store review times (not a process issue).
Query 3: "Which repo has the highest defect density? Should we invest in testing?"
Claude calls get_code_quality_metrics and synthesizes:
Defect Density (issues per 1,000 lines of code):
- backend-service: 0.8 defects/1KLOC
- frontend-app: 1.2 defects/1KLOC ← Highest
- data-pipeline: 0.5 defects/1KLOC
- mobile-sdk: 0.9 defects/1KLOC
- infrastructure: 0.4 defects/1KLOC
Frontend-app has the highest defect density. However, this is typical for UI code,
which is harder to test automatically. Consider:
1. Increasing E2E test coverage
2. Implementing visual regression testing
3. Adding Sentry for real-time error tracking
Notice what just happened: You asked Claude three questions and got actionable insights without building any dashboards. The MCP server exposed the raw data; Claude handled the analysis and recommendations.
This is fundamentally different from traditional BI. In Looker, you'd need to:
With MCP + Claude, you skip all that. You define the tools once, and Claude becomes your analytics engine.
As you move from experimentation to production, consider:
GitHub's API has rate limits (5,000 requests/hour for authenticated users). If Claude calls your server frequently, you'll hit these limits. Implement caching:
from functools import lru_cache
from datetime import datetime, timedelta
class CachedGitHubMetrics:
def __init__(self, ttl_minutes=15):
self.cache = {}
self.ttl = timedelta(minutes=ttl_minutes)
def get_pr_metrics(self, owner, repo, days):
cache_key = f"{owner}/{repo}/{days}"
if cache_key in self.cache:
cached_data, timestamp = self.cache[cache_key]
if datetime.utcnow() - timestamp < self.ttl:
return cached_data
# Fetch fresh data
data = self._fetch_pr_metrics(owner, repo, days)
self.cache[cache_key] = (data, datetime.utcnow())
return dataWith 15-minute caching, you reduce GitHub API calls by 90% while keeping data reasonably fresh.
As your org grows, you'll want to query metrics across 10, 20, or 50 repositories. Define a tool that accepts a list:
@server.call_tool()
async def get_org_metrics(org: str, repos: list = None):
"""Get metrics for all repos in an organization or a specific list."""
# If repos not specified, fetch all repos in the org
if not repos:
repos = [r.name for r in g.get_organization(org).get_repos()]
results = {}
for repo in repos:
results[repo] = await self.get_pr_metrics(org, repo, days=30)
return resultsNow Claude can ask: "Get metrics for all backend repos in my-org" and receive aggregated data.
If you're already using D23 for embedded analytics, you can combine MCP with Apache Superset dashboards. Use the MCP server to power Claude conversations, and embed Superset dashboards in your product for team-wide visibility.
For example:
The D23 platform handles embedded analytics without the overhead of Looker or Tableau. You can build dashboards on top of your GitHub data warehouse and combine them with Claude conversations for maximum flexibility.
Once you're comfortable with MCP tools, the next step is text-to-SQL. Instead of defining specific tools, you expose your GitHub data as a SQL-queryable table and let Claude write queries.
This requires:
query_sql toolFor example:
User: "What's the 90th percentile PR review time by team this quarter?"
Claude writes:
SELECT
team,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY review_time_hours) as p90_review_time
FROM github_prs
WHERE merged_at >= DATE_TRUNC('quarter', CURRENT_DATE)
GROUP BY team
ORDER BY p90_review_time DESC;This is more powerful than fixed tools because Claude can ask arbitrary questions. However, it requires careful SQL validation to prevent injection attacks and runaway queries.
The GitHub MCP server directory lists several implementations that support SQL querying. Start with those as references before building your own.
You could build a REST API that Claude calls directly via the Anthropic API. Why use MCP instead?
Traditional BI platforms are better for:
MCP is better for:
The article on three tools to run MCP on GitHub repositories compares GitHub Chat MCP, Git MCP, and the official GitHub MCP Server—each with different strengths.
Metabase and Mode are lighter-weight BI tools than Looker or Tableau, but they still require:
MCP skips all that. You expose data, Claude queries it conversationally. The tradeoff is that non-technical users need Claude access (or a Claude-powered chat interface you build).
If you want non-technical team members to access your MCP server, build a simple web chat interface:
from flask import Flask, request, jsonify
import anthropic
app = Flask(__name__)
client = anthropic.Anthropic()
@app.route("/api/chat", methods=["POST"])
def chat():
user_message = request.json["message"]
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
tools=[
# Your MCP tools here
],
messages=[
{"role": "user", "content": user_message}
]
)
# Process tool calls and return final response
return jsonify({"response": response.content[0].text})
if __name__ == "__main__":
app.run(debug=True)Deploy this to a simple server (AWS Lambda, Heroku, etc.), and your team can ask engineering metrics questions via a chat interface. No dashboard training required.
When exposing GitHub data to Claude via MCP, consider:
Never hardcode GitHub tokens. Use environment variables or a secrets manager:
import os
from dotenv import load_dotenv
load_dotenv()
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")For production, use AWS Secrets Manager, HashiCorp Vault, or your cloud provider's secret storage.
Implement per-user or per-org rate limits to prevent abuse:
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
limiter = Limiter(
app=app,
key_func=get_remote_address,
default_limits=["200 per day", "50 per hour"]
)
@app.route("/api/chat")
@limiter.limit("10 per minute")
def chat():
# Your chat logic
passReview your D23 privacy policy and your organization's data retention policies. If you're logging Claude conversations, ensure you're complying with data protection regulations (GDPR, CCPA, etc.).
GitHub data may include:
Consider masking or filtering sensitive data before exposing it to Claude.
Log all MCP server calls for compliance:
import logging
logger = logging.getLogger(__name__)
@server.call_tool()
async def get_pr_metrics(owner: str, repo: str, days: int = 30):
logger.info(f"get_pr_metrics called: {owner}/{repo} (days={days})")
# Your logic hereStore logs in a secure, tamper-proof location (CloudWatch, Splunk, etc.).
Here's how a mature MCP analytics implementation looks:
MCP Server (running on your infrastructure)
Claude Integration
Complementary Dashboards (optional, via D23)
Data Warehouse (optional, for scale)
This stack gives you:
Once you have a working GitHub MCP server, consider extending it:
Each becomes another tool in your MCP server, and Claude can correlate data across sources.
Use Claude to identify patterns:
Schedule Claude to generate weekly or monthly engineering reports:
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
@scheduler.scheduled_job('cron', day_of_week='fri', hour=17)
def weekly_engineering_report():
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=2048,
messages=[
{
"role": "user",
"content": "Generate a weekly engineering report covering PR review times, deployment frequency, and code quality across all repos. Highlight any concerning trends."
}
]
)
# Email the report to the team
send_email(response.content[0].text)
scheduler.start()The Model Context Protocol represents a fundamental shift in how engineering teams access analytics. Instead of building dashboards, you expose data. Instead of training users on BI tools, you let Claude answer questions conversationally.
For engineering leaders evaluating analytics platforms, MCP offers a compelling alternative to Looker, Tableau, and Power BI:
The official GitHub MCP documentation and the GitHub MCP server repository provide solid starting points. Build your first server this week, and you'll be answering engineering metrics questions through Claude by next week.
For teams that need more polished dashboards for non-technical stakeholders, combine MCP with D23's embedded analytics platform. Use Claude for exploration, Superset for presentation. You get the best of both worlds: conversational AI for engineers, beautiful dashboards for everyone else.
Your GitHub data is valuable. Stop locking it behind expensive BI platforms. Expose it through MCP, and let Claude be your analytics engine.