When to Fire Your MSP and Build an In-House DevOps Team

Updated By Zak Kann

Key takeaways

  • MSPs make sense for seed-stage startups (under $1M ARR, fewer than 10 engineers) but become cost-inefficient and a velocity bottleneck beyond Series A
  • Financial break-even arrives at roughly $150K-$200K of annual MSP spend, the point where a single senior DevOps engineer (about $180K loaded cost) delivers more value and faster iteration
  • Warning signs that it is time to transition include multi-day ticket response times, no IaC adoption, lock-in to MSP-specific tools, and an inability to customize infrastructure
  • The true cost of an MSP includes hidden expenses such as engineering context switching (up to 20% productivity loss), delayed feature launches, technical debt accumulation, and opportunity cost
  • Successful transitions take a phased 6-12 months: hire a first DevOps engineer, document existing infrastructure, implement IaC, gradually reduce MSP scope, and keep an overlap period for knowledge transfer

The MSP Paradox

Your startup just raised a Series A. You have 25 engineers shipping features daily. But every infrastructure change requires a ticket to your Managed Service Provider:

Monday, 9 AM:

Ticket #4521: Please increase RDS instance from db.t3.large to db.r5.xlarge
Priority: High
Estimated response time: 24-48 hours

Wednesday, 3 PM:

MSP Response: "We can schedule this change for Saturday at 2 AM EST.
Downtime estimate: 15-20 minutes. Please confirm."

Your engineering team: Blocked for 4 days waiting for a 30-second Terraform change.

The cost:

  • Feature launch delayed 1 week
  • 3 engineers context-switched to other work
  • $50K revenue opportunity missed

Meanwhile, your competitor with an in-house DevOps team: Made the change in 10 minutes during business hours, with only a brief Multi-AZ failover instead of a scheduled outage.

The Financial Break-Even Point

MSP Cost Structure (Real Numbers)

Typical MSP pricing for a Series A startup:

Base retainer: $5,000/month
- Includes: 40 hours support, basic monitoring, security patching

Per-hour overage: $200/hour
- Average usage: 20 hours/month
- Monthly overage: $4,000

AWS markup: 10-15%
- Your AWS spend: $15,000/month
- MSP markup: $2,250/month

Total MSP cost: $11,250/month = $135,000/year

At scale (Series B, 50 engineers):

Base retainer: $10,000/month
Per-hour overage: $200/hour (50 hours/month) = $10,000
AWS markup (15% of $50K): $7,500/month

Total MSP cost: $27,500/month = $330,000/year
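
To rerun these figures with your own contract terms, the arithmetic above can be sketched as a small function. The function and parameter names are illustrative, not part of any MSP's billing API, and the inputs are simply the two scenarios listed above.

```python
def msp_monthly_cost(retainer, overage_hours, overage_rate, aws_spend, markup_percent):
    """Total monthly MSP bill: retainer + hourly overage + markup on AWS spend."""
    overage = overage_hours * overage_rate
    markup = aws_spend * markup_percent / 100
    return retainer + overage + markup


# Inputs copied from the two pricing scenarios above.
series_a_monthly = msp_monthly_cost(5_000, 20, 200, 15_000, 15)
series_b_monthly = msp_monthly_cost(10_000, 50, 200, 50_000, 15)

for label, monthly in [("Series A", series_a_monthly), ("Series B", series_b_monthly)]:
    print(f"{label}: ${monthly:,.0f}/month = ${monthly * 12:,.0f}/year")
```

Swapping in your actual retainer, overage hours, and markup rate shows quickly which of the three components dominates your bill.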

In-House DevOps Team Cost Structure

Scenario 1: Single Senior DevOps Engineer

Salary: $150,000/year
Benefits (30%): $45,000
Tools/SaaS: $12,000 (Datadog, PagerDuty, GitHub Actions, Terraform Cloud)
Recruitment: $10,000 (amortized over 3 years)

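Total loaded cost: $217,000/year as itemized (the rest of this article uses a leaner $180,000 planning figure)

Break-even vs MSP: roughly $180K-$217K annual MSP spend on direct fees alone, lower once the hidden costs below are counted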

Scenario 2: Two-Person Team (Series B+)

Senior DevOps Engineer: $180,000 loaded
Mid-level Platform Engineer: $140,000 loaded
Tools/SaaS: $18,000

Total: $338,000/year

Break-even vs MSP: $300K+ annual MSP spend
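
A quick way to sanity-check a loaded-cost estimate is to keep the line items explicit and sum them. The sketch below does that for both scenarios; the dictionary keys and function names are made up for illustration. Note that the Scenario 1 items, summed as listed, come to $217,000.

```python
def loaded_cost(line_items):
    """Sum annual cost line items (all values in dollars per year)."""
    return sum(line_items.values())


def annual_gap(msp_annual_spend, in_house_annual_cost):
    """Positive: the MSP costs more than the in-house team on direct fees alone."""
    return msp_annual_spend - in_house_annual_cost


scenario_1 = {"salary": 150_000, "benefits": 45_000, "tools": 12_000, "recruitment": 10_000}
scenario_2 = {"senior_devops": 180_000, "platform_engineer": 140_000, "tools": 18_000}

print("Scenario 1 loaded cost:", loaded_cost(scenario_1))
print("Scenario 2 loaded cost:", loaded_cost(scenario_2))
# Series B MSP bill ($330K/year) against the two-person team.
print("Series B gap:", annual_gap(330_000, loaded_cost(scenario_2)))
```

A negative gap means the in-house team is still slightly more expensive on direct cost, which is why the hidden costs in the next section matter to the decision.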

The Hidden Costs of MSPs

The sticker price doesn't include:

1. Engineering Productivity Loss

Average ticket response time: 24 hours
Number of infrastructure requests/month: 40
Total blocking time: 40 × 24 hours = 40 days/month across the team

Engineering team size: 20
Percentage of time blocked: 10%
Loaded cost per engineer: $180K/year
Annual productivity loss: 20 × $180K × 10% = $360,000/year

2. Delayed Feature Launches

Features delayed by infrastructure: 3/quarter (12/year)
Average revenue lost per delayed feature: $50K
Annual opportunity cost: 12 × $50K = $600,000/year

3. Technical Debt Accumulation

MSP uses ClickOps (no IaC)
Manual configuration = undocumented changes
Debt cleanup: 6 months of one engineer at $180K/year = $90,000

4. Vendor Lock-In

MSP-specific monitoring tools
Proprietary deployment scripts
Migration cost to standard tooling: $50,000
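
The first two hidden costs are simple products, so they are easy to recompute for your own team. This is a minimal sketch using the figures above as inputs; the function names are illustrative, and the percentages and revenue figures are the article's assumptions, not measurements.

```python
def blocking_days(tickets_per_month, avg_response_hours):
    """Cumulative waiting time per month, expressed in 24-hour days."""
    return tickets_per_month * avg_response_hours / 24


def productivity_loss(engineers, loaded_cost_per_engineer, blocked_percent):
    """Annual cost of engineering time spent blocked on infrastructure."""
    return engineers * loaded_cost_per_engineer * blocked_percent / 100


def delayed_feature_cost(features_per_quarter, revenue_per_feature):
    """Annual opportunity cost of features delayed by infrastructure."""
    return features_per_quarter * 4 * revenue_per_feature


print(blocking_days(40, 24))               # days of waiting per month
print(productivity_loss(20, 180_000, 10))  # dollars per year
print(delayed_feature_cost(3, 50_000))     # dollars per year
```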

Total Cost of Ownership (TCO) Comparison

MSP Total Cost (Series A, 25 engineers):

Direct MSP fees: $135,000
Engineering productivity loss: $200,000
Delayed features (3/year): $150,000
Technical debt: $30,000/year (amortized)

Total TCO: $515,000/year

In-House Total Cost (1 DevOps engineer):

Direct cost: $180,000
Productivity loss: $0 (same-day changes)
Delayed features: $0
Technical debt: -$50,000 (IaC implemented)

Total TCO: $130,000/year

Net savings: $385,000/year
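
The comparison above is sensitive to its inputs, especially the in-house zeros. A small sketch like the following, with the line items copied from the two tables, makes it easy to test less optimistic assumptions. The dictionary keys are illustrative.

```python
def total_cost(items):
    """Sum annual TCO line items; negative values are credits."""
    return sum(items.values())


msp_tco_items = {
    "direct_fees": 135_000,
    "productivity_loss": 200_000,
    "delayed_features": 150_000,
    "technical_debt": 30_000,
}
in_house_tco_items = {
    "direct_cost": 180_000,
    "productivity_loss": 0,      # assumes same-day changes
    "delayed_features": 0,       # assumes no infrastructure-caused delays
    "technical_debt": -50_000,   # credit for implementing IaC
}

msp_tco = total_cost(msp_tco_items)
in_house_tco = total_cost(in_house_tco_items)
net_savings = msp_tco - in_house_tco
print(msp_tco, in_house_tco, net_savings)
```

Replacing the zeros with even modest values (for example, some residual productivity loss while the new hire ramps up) gives a more conservative savings estimate.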

Warning Signs You've Outgrown Your MSP

Sign 1: Ticket-Driven Velocity Bottleneck

Red flag scenario:

Your team: "We need to add a new Lambda function for the new API endpoint."

MSP: "Please submit a ticket with the following:
- Function name, runtime, memory allocation
- IAM role requirements
- VPC configuration
- Deployment package (upload to S3)
- Estimated traffic volume
Response time: 2-3 business days"

Your team: "...it's a 50-line function that would take 5 minutes to deploy."

What this looks like at scale:

  • 40+ infrastructure tickets per month
  • Average resolution time: 48 hours
  • Engineers blocked 20% of sprint capacity
  • Features shipped 2-3 weeks late

If you can't terraform apply without a ticket, you've outgrown your MSP.
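
Before making this argument to your MSP, it helps to measure your own ticket history rather than rely on impressions. The sketch below assumes you can export opened and resolved timestamps as ISO-8601 strings; the export format and the sample data are hypothetical.

```python
from datetime import datetime
from statistics import mean


def avg_resolution_hours(tickets):
    """tickets: iterable of (opened, resolved) ISO-8601 timestamp strings."""
    durations = [
        (datetime.fromisoformat(resolved) - datetime.fromisoformat(opened)).total_seconds() / 3600
        for opened, resolved in tickets
    ]
    return mean(durations)


# Hypothetical export: two tickets, resolved after 54 and 42 hours.
sample_tickets = [
    ("2025-03-03T09:00", "2025-03-05T15:00"),
    ("2025-03-04T10:00", "2025-03-06T04:00"),
]
print(f"Average resolution: {avg_resolution_hours(sample_tickets):.1f} hours")
```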

Sign 2: No Infrastructure as Code

MSP approach:

# MSP's "infrastructure documentation"
RDS_INSTANCE_ID=prod-db-1
INSTANCE_CLASS=db.r5.xlarge
STORAGE=500GB
BACKUP_RETENTION=7 days

# Changed manually via AWS Console on 2024-11-15 by MSP Engineer

In-house approach:

# infrastructure/rds.tf
resource "aws_db_instance" "production" {
  identifier              = "prod-db-1"
  instance_class          = "db.r5.xlarge"
  allocated_storage       = 500 # GiB
  backup_retention_period = 7   # days

  # Other required arguments (engine, credentials, networking) omitted for brevity
  # Git history shows who changed what and why
}

Why this matters:

  • MSP changes are invisible (no audit trail)
  • Disaster recovery requires "tribal knowledge"
  • Compliance audits are nightmares
  • Scaling horizontally requires manual replication

If your infrastructure isn't in Git, you don't control it.

Sign 3: AWS Markup Exceeds Tooling Cost

The math:

Your AWS spend: $50,000/month
MSP markup: 15% = $7,500/month = $90,000/year

Alternative tooling cost:
- Datadog: $2,000/month
- PagerDuty: $500/month
- Terraform Cloud: $1,000/month
- GitHub Actions: $500/month
Total: $4,000/month = $48,000/year

Savings by eliminating markup: $42,000/year
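
The same comparison can be parameterized for your own AWS bill. This sketch uses the tooling prices listed above as inputs; they are the article's example figures, not vendor quotes.

```python
def annual_markup(aws_monthly_spend, markup_percent):
    """Yearly amount paid to the MSP as a percentage markup on AWS spend."""
    return aws_monthly_spend * markup_percent / 100 * 12


tooling_monthly = {
    "Datadog": 2_000,
    "PagerDuty": 500,
    "Terraform Cloud": 1_000,
    "GitHub Actions": 500,
}

markup_per_year = annual_markup(50_000, 15)
tooling_per_year = sum(tooling_monthly.values()) * 12
savings_per_year = markup_per_year - tooling_per_year
print(markup_per_year, tooling_per_year, savings_per_year)
```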

Paying $90K/year for AWS console access is indefensible.

Sign 4: Innovation Theater

MSP conversations sound like:

You: "Can we implement blue-green deployments for zero-downtime releases?"
MSP: "We can explore that in Q3 2025 as part of our roadmap."

You: "Can we use Spot Instances to save up to 70% on compute costs?"
MSP: "We recommend against that due to stability concerns."

You: "Can we migrate to Graviton instances for up to 40% better price-performance?"
MSP: "That's not currently supported in our standard offering."

Translation: "We don't want to change our template."

Your competitor's in-house team: Shipped all three in the same quarter.

Sign 5: You're Hiring DevOps Engineers Anyway

The trap:

Your job posting: "Senior DevOps Engineer
- Build CI/CD pipelines
- Optimize cloud costs
- Implement monitoring and alerting
- Work with MSP to provision infrastructure"

Wait... why are you paying both?

If you're hiring DevOps engineers to work around MSP limitations, fire the MSP.

The Transition Plan: 6-Month Roadmap

Month 1-2: Hire Your First DevOps Engineer

The ideal first hire profile:

  • Senior level (5+ years experience)
  • AWS Certified Solutions Architect
  • Terraform expert
  • Experience with both MSP and in-house environments
  • Strong documentation skills (they will be the knowledge bridge)

Compensation benchmark:

Seed stage: $130K-$150K + 0.5-1% equity
Series A: $150K-$180K + 0.25-0.5% equity
Series B: $170K-$200K + 0.1-0.25% equity

Interview question to ask:

"Walk me through how you'd transition infrastructure from an MSP to in-house over 6 months with zero downtime. What would you do first?"

Good answer: "First, I'd audit existing infrastructure and create Terraform for net-new resources. Then gradually import existing resources into Terraform state while maintaining MSP as fallback."

Bad answer: "I'd immediately migrate everything to Kubernetes on day one."

Month 2-3: Infrastructure Audit and Documentation

Your new DevOps engineer's first 60 days:

Week 1-2: Discovery

# What exists? (run per region; add --region as needed)
aws ec2 describe-instances
aws rds describe-db-instances
aws ecs list-clusters
aws lambda list-functions
aws s3 ls  # S3 bucket listing is not region-scoped
 
# Document everything in a spreadsheet:
# Resource Type | ID | Purpose | Owner | Dependencies | Cost/Month
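
Filling that spreadsheet by hand gets tedious, so one option is to seed it from the CLI's JSON output. The sketch below flattens the output of `aws ec2 describe-instances --output json` into rows with the columns above. Using the `Name` tag as the purpose and an `Owner` tag as the owner is an assumption about your tagging conventions; dependencies and cost are left blank to fill in manually.

```python
import csv
import io
import json

COLUMNS = ["Resource Type", "ID", "Purpose", "Owner", "Dependencies", "Cost/Month"]


def instances_to_rows(describe_instances_json):
    """Flatten `aws ec2 describe-instances` JSON into inventory rows."""
    data = json.loads(describe_instances_json)
    rows = []
    for reservation in data.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
            rows.append({
                "Resource Type": "EC2 instance",
                "ID": instance["InstanceId"],
                "Purpose": tags.get("Name", ""),   # assumed tagging convention
                "Owner": tags.get("Owner", ""),    # assumed tagging convention
                "Dependencies": "",                # fill in manually
                "Cost/Month": "",                  # fill in from billing data
            })
    return rows


def rows_to_csv(rows):
    """Render inventory rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The same pattern extends to the other `describe` and `list` commands, each with its own JSON shape.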

Week 3-4: Terraform Pilot

# Start with low-risk, new resources
# Example: New development environment
 
module "dev_vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # Assumed constraint; pin whichever release you have tested

  name = "dev-vpc"
  cidr = "10.1.0.0/16" # Use a CIDR calculator to plan your network

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.1.1.0/24", "10.1.2.0/24"]
  public_subnets  = ["10.1.101.0/24", "10.1.102.0/24"]
}
 
# Don't touch production yet!

Week 5-8: Import Existing Resources

# Gradually import MSP-managed resources
# (each address needs a matching resource block in your .tf files first)
terraform import aws_security_group.api sg-abc123
terraform import aws_db_instance.production prod-db-1
 
# Verify state matches reality
terraform plan  # Should show "No changes"
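
To make the "no changes" check scriptable, `terraform plan` accepts `-detailed-exitcode`, which returns 0 for an empty diff, 1 for an error, and 2 when changes are pending. The wrapper below is a minimal sketch; the function names and the working-directory argument are illustrative, and it assumes `terraform` is on your PATH.

```python
import subprocess

PLAN_MEANING = {0: "no changes", 1: "error", 2: "changes pending"}


def interpret_plan_exit_code(code):
    """Map a `terraform plan -detailed-exitcode` return code to a label."""
    return PLAN_MEANING.get(code, "unknown")


def plan_status(workdir):
    """Run a plan in `workdir` and report whether state matches reality."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)
```

Calling `plan_status("environments/prod")` after each import (the path is a placeholder) gives a quick signal of whether the written configuration still differs from the real resource.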

Month 3-4: Parallel Infrastructure Management

Run both MSP and in-house in parallel:

New Resources (Month 3-4):
- All new services → In-house Terraform
- All new environments → In-house Terraform
- Net-new infrastructure → In-house Terraform

Existing Production Resources:
- RDS databases → Still MSP-managed
- ECS clusters → Still MSP-managed
- VPC/Networking → Still MSP-managed

Emergency Changes:
- MSP remains on-call for production incidents
- In-house engineer shadows all changes

Deliverables:

  • 100% of new infrastructure deployed via Terraform
  • Terraform state for 30% of existing resources
  • Runbooks documenting all operational procedures
  • Monitoring dashboards replicated in Datadog/CloudWatch
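
Tracking the coverage deliverable is easier with a number you can recompute each week. One rough approach, assuming you have the resource IDs from your inventory spreadsheet and the IDs already imported into Terraform state, is a simple set comparison. The function name and sample IDs are hypothetical.

```python
def terraform_coverage(inventory_ids, managed_ids):
    """Percentage of inventoried resources that are already in Terraform state."""
    inventory = set(inventory_ids)
    if not inventory:
        return 0.0
    return 100 * len(inventory & set(managed_ids)) / len(inventory)


# Hypothetical example: 10 known resources, 3 imported so far.
inventory = [f"res-{n}" for n in range(10)]
imported = ["res-0", "res-1", "res-2"]
print(f"Terraform coverage: {terraform_coverage(inventory, imported):.0f}%")
```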

Month 4-5: Production Cutover Planning

Test disaster recovery with MSP as safety net:

Week 1: Test RDS failover
- In-house team: Promotes read replica using Terraform
- MSP: On standby in case of issues
- Result: Success, 3-minute failover

Week 2: Test ECS deployment
- In-house team: Blue-green deployment via Terraform + CodeDeploy
- MSP: Monitoring but not touching
- Result: Success, zero downtime

Week 3: Test full disaster recovery
- In-house team: Rebuild staging environment from Terraform
- MSP: Validates configurations
- Result: Success, 2-hour recovery time

Milestone: Confidence that in-house team can handle production operations

Month 5-6: Renegotiate MSP Contract

Option 1: Gradual reduction

Month 5: Reduce retainer by 50%
- In-house handles all new infrastructure
- MSP provides advisory services only
- Cost reduction: $5,000/month

Month 6: Reduce to on-call support only
- In-house handles 95% of operations
- MSP available for emergency escalations
- Cost reduction: $8,000/month

Month 7: Full termination
- In-house team fully operational
- MSP contract ends
- Savings: $11,250/month

Option 2: Convert to consulting retainer

End managed services: Save $11,250/month
Retain MSP as consultants: $3,000/month (20 hours)
- Architecture reviews
- Security audits
- Capacity planning advice

Net savings: $8,250/month = $99,000/year
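
To compare the two options on an annual run-rate basis, the only input that changes is how much you still pay the MSP each month. A minimal sketch, using the $11,250/month Series A bill from earlier as the baseline (the names are illustrative):

```python
FULL_MSP_MONTHLY = 11_250  # Series A total MSP cost from the pricing section


def annual_run_rate_savings(remaining_msp_monthly):
    """Yearly savings once the contract settles at the given monthly fee."""
    return (FULL_MSP_MONTHLY - remaining_msp_monthly) * 12


option_1 = annual_run_rate_savings(0)      # full termination
option_2 = annual_run_rate_savings(3_000)  # consulting retainer
print(option_1, option_2, option_1 - option_2)
```

The difference between the two results is the yearly price of keeping the MSP available for reviews and escalations.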

Month 6+: Knowledge Transfer and Documentation

Final deliverables before MSP exit:

Infrastructure Documentation:

terraform-infrastructure/
├── docs/
│   ├── architecture.md
│   ├── runbooks/
│   │   ├── rds-failover.md
│   │   ├── ecs-deployment.md
│   │   ├── incident-response.md
│   │   └── disaster-recovery.md
│   └── cost-optimization.md
├── modules/
├── environments/
└── README.md

Operational Procedures:

  • On-call rotation schedule
  • Incident response playbooks
  • Deployment checklists
  • Cost optimization strategies
  • Security patching procedures

Monitoring and Alerting:

  • CloudWatch dashboards
  • PagerDuty escalation policies
  • Slack integrations
  • Weekly infrastructure health reports

Real-World Case Study: Series A SaaS Company

Company: B2B SaaS, $3M ARR, 30 engineers

Before (18 months with MSP):

MSP cost: $18,000/month = $216,000/year
AWS spend: $30,000/month (includes 15% MSP markup)
Infrastructure velocity: 3-day average turnaround
Deployment frequency: 2× per week
On-call incidents: 12/month (MSP handles)

Transition (6 months):

Month 1-2: Hired Senior DevOps Engineer ($165K)
Month 2-4: Terraform implementation (40% coverage)
Month 4-6: Parallel operation (in-house + MSP)
Month 6: MSP contract reduced to advisory ($5K/month)

After (6 months post-transition):

In-house cost: $165,000/year (1 DevOps engineer)
AWS spend: $26,000/month (removed 15% markup)
Infrastructure velocity: Same-day turnaround
Deployment frequency: 10× per week
On-call incidents: 8/month (in-house handles)

Total savings: $216K - $165K = $51,000/year
Hidden savings:
- Engineering productivity: +15% = $300K/year value
- Faster feature shipping: 3 extra features/quarter = $200K/year
Total value: $551,000/year
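
The savings line above compares the MSP fee with the engineer's salary only. The sketch below reproduces that arithmetic and adds one optional input: the $5K/month advisory retainer from the transition timeline. Whether that retainer continued after the transition is not stated, so treating it as an ongoing cost is an assumption for sensitivity testing only.

```python
def direct_savings(msp_annual, in_house_annual, retained_advisory_annual=0):
    """Annual direct savings after subtracting any MSP fees that remain."""
    return msp_annual - in_house_annual - retained_advisory_annual


headline = direct_savings(216_000, 165_000)
# Assumption: the advisory retainer persists for a full year.
with_advisory = direct_savings(216_000, 165_000, retained_advisory_annual=5_000 * 12)
total_value = headline + 300_000 + 200_000

# Check that removing a 15% markup from $30K/month lands near the reported $26K.
aws_without_markup = round(30_000 / 1.15)
print(headline, with_advisory, total_value, aws_without_markup)
```

Under that assumption the direct savings turn slightly negative, which suggests most of the case for the transition rests on the productivity and feature-velocity estimates.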

Unexpected benefits:

  • GitOps workflow improved developer experience
  • Terraform enabled environment parity (dev = staging = prod)
  • Engineers self-serve infrastructure via pull requests
  • Cost optimization (Spot Instances, Graviton) saved $8K/month

Challenges:

  • On-call burden (mitigated with PagerDuty + runbooks)
  • Knowledge gaps (filled with AWS training)
  • First production incident was stressful (but resolved faster than with MSP)

When MSPs Still Make Sense

Don't fire your MSP if:

1. Pre-Product/Market Fit (Under $1M ARR)

You should be focused on customers, not infrastructure:

Your runway: 12 months
Engineering team: 3 developers
AWS spend: $2,000/month

MSP cost: $3,000/month
In-house DevOps engineer: $15,000/month (loaded cost)

Verdict: MSP makes sense

Use an MSP to get to Series A, then reassess.

2. Highly Regulated Industries (Healthcare, Finance)

Compliance expertise has value:

HIPAA/SOC 2/PCI-DSS requirements
MSP provides:
- Pre-configured compliant infrastructure
- Audit documentation
- Security monitoring
- Compliance training

Cost of non-compliance: potentially millions of dollars in fines
Cost of MSP: $200K/year

Verdict: MSP might still make sense

But verify that the MSP uses IaC, or you'll face technical debt later.

3. Expertise Gaps in Specific Technologies

Example: Legacy Windows infrastructure

Your stack: .NET Framework on Windows Server
Your team: Modern cloud-native engineers

MSP expertise:
- Active Directory management
- Windows Server patching
- IIS configuration
- MSSQL administration

Verdict: MSP makes sense temporarily while you modernize

Set a 12-month deadline to migrate to cloud-native.

Hybrid Model: Best of Both Worlds

Pattern: In-house for application infrastructure, MSP for specialized services

In-House Team Handles:
- ECS/Kubernetes clusters
- Lambda functions
- API Gateway
- DynamoDB/RDS
- CI/CD pipelines
- Application monitoring

MSP Handles:
- Network security (WAF, Shield, Firewall)
- Compliance audits (SOC 2, HIPAA)
- Advanced monitoring (Threat detection)
- Disaster recovery testing

Cost: $10,000/month MSP + $165K/year in-house = $285K/year

This works if:

  • MSP provides true specialized expertise
  • Clear ownership boundaries
  • MSP uses IaC (Terraform) that you control
  • No emergency situations require "calling the MSP"

The "One DevOps Engineer" Risk

The concern: "What if they quit?"

Mitigation strategies:

1. Documentation-First Culture

Every change includes:
- Terraform code (reviewable, reversible)
- Pull request description (why, not just what)
- Runbook updates
- Architecture decision records (ADRs)

2. Managed Services for Operational Overhead

Use managed AWS services:
- RDS (not self-managed Postgres)
- ECS Fargate (not EC2-based ECS)
- Lambda (not long-running servers)
- Amazon Managed Service for Prometheus (not self-hosted)

Reduces operational burden by 60%

3. Fractional DevOps Consultant (Backup Plan)

Retainer: $5,000/month (10 hours)
- Monthly architecture review
- Quarterly disaster recovery test
- Emergency escalation contact

Acts as "insurance policy" if engineer leaves

4. Clear Career Path (Retention)

Year 1: Senior DevOps Engineer → Build foundation
Year 2: Staff Engineer → Mentor new hire
Year 3: Engineering Manager → Build team

Compensation growth:
Year 1: $165K
Year 2: $185K + promotion to Staff
Year 3: $210K + team of 3

Reality check: An engineer who built your infrastructure is more likely to stay invested in it than an external MSP that splits its attention across 50 clients.

Conclusion: The Tipping Point is Series A

The data shows:

  • MSPs make sense for seed-stage startups (under $1M ARR, fewer than 10 engineers)
  • Break-even point is $150K-$200K annual MSP spend
  • Velocity bottlenecks emerge at 15-20 engineers
  • In-house teams pay for themselves via productivity gains

The transition trigger:

If (AWS_spend > $15K/month) AND (engineering_team > 15) AND (deploy_frequency > 5/week):
    start_transition_plan()
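
Written as runnable Python, the trigger above might look like the following. The thresholds come straight from the rule; the function and parameter names are illustrative.

```python
def should_start_transition(aws_spend_per_month, engineering_team, deploys_per_week):
    """True when all three thresholds from the transition trigger are exceeded."""
    return (
        aws_spend_per_month > 15_000
        and engineering_team > 15
        and deploys_per_week > 5
    )


# Example inputs taken from the case study and the seed-stage scenario.
print(should_start_transition(30_000, 30, 10))
print(should_start_transition(2_000, 3, 2))
```

All three comparisons are strict, so a company sitting exactly at $15K/month does not trip the rule; treat the thresholds as rough guides rather than hard cutoffs.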

Don't wait until:

  • You've hired 2-3 DevOps engineers working around the MSP
  • Technical debt is so high migration takes 12+ months
  • Your MSP contract auto-renews for another year

Start the conversation now.

Action Items

  1. Calculate your true MSP TCO: Direct fees + productivity loss + opportunity cost
  2. Audit current MSP: Do they use IaC? What's average ticket response time?
  3. Define success metrics: Deployment frequency, incident response time, engineer satisfaction
  4. Interview DevOps candidates: Even if you're not ready to hire, gauge the market
  5. Negotiate MSP contract: Add IaC requirements, reduce AWS markup, shorten auto-renewal
  6. Start Terraform pilot: Deploy one new environment in-house to test feasibility

If you're considering transitioning from an MSP to in-house infrastructure, schedule a consultation. We'll audit your current setup, calculate TCO, and design a risk-free transition plan that maintains operational stability while building internal capabilities.
