Full-Time

Observability & Operations Engineer

Fullbay

51-200 employees

Cloud-based shop management for heavy-duty repairs

No salary listed

Phoenix, AZ, USA

In Person

Category
DevOps & Infrastructure
Required Skills
Kotlin
Datadog
Claude
Kubernetes
Grafana
Node.js
Java
OpenTelemetry
AWS
Prometheus
Terraform
Splunk
Requirements
  • 7–10 years of experience in Software Engineering, Cloud Operations, or Site Reliability Engineering
  • 5+ years of hands-on experience with AWS infrastructure and AWS PaaS services; certifications are a plus
  • Demonstrated experience building repeatable, code-first pipelines and treating operational configuration as first-class software
  • Experience working with polyglot environments including Java, Kotlin, and Node.js
  • Demonstrated experience using AI tools (coding assistants, AI-powered observability platforms, or similar) in a professional setting — we’re an AI-first company and expect this to be part of how you work, not something you’re just exploring
  • Deep experience with enterprise observability platforms — including AWS-native tooling such as CloudWatch, X-Ray, and OpenTelemetry, or comparable platforms such as Datadog, Grafana, or Prometheus
  • Proficiency with distributed tracing frameworks and log management platforms (e.g. ELK Stack, Splunk, Fluent Bit); experience mapping these patterns to AWS-native tooling is a strong plus
  • Strong understanding of SRE principles including SLOs, SLAs, error budgets, and chaos engineering
  • Hands-on FinOps experience — cloud cost allocation, chargeback modeling, rightsizing, and savings plans optimization across AWS
  • Strong working knowledge of AWS PaaS services including Lambda, API Gateway, ECS, RDS, SQS, SNS, and IAM — and how to leverage them to build scalable operational tooling
  • Experience instrumenting polyglot applications (Java, Kotlin, Node.js) and cloud-native microservices for observability
  • Proven ability to build repeatable, code-first pipelines — treating dashboards, alerts, runbooks, and infrastructure configuration as versioned, testable software
  • Experience with CI/CD tooling, specifically Harness
  • Solid understanding of Infrastructure as Code using Terraform
  • Ability to lead incident response, facilitate blameless post-mortems, and drive long-term reliability improvements
  • Strong collaboration skills for working across platform and product engineering teams
  • Knowledge of containerization technologies and microservices architecture
Responsibilities
  • Design and implement a comprehensive observability strategy (logging, metrics, tracing, alerting) across all AWS environments, leveraging AI-powered tools to detect anomalies and surface insights automatically
  • Build and manage monitoring platforms such as Datadog, Grafana, Prometheus, and AWS CloudWatch — actively exploring AI-native features within these tools to reduce alert fatigue and improve signal quality
  • Use AI coding assistants (e.g. GitHub Copilot, Claude) to accelerate development of dashboards, runbooks, and automation scripts
  • Own the incident management lifecycle — on-call rotations, post-mortems, root cause analysis — and apply AI-assisted log analysis to speed up diagnosis and resolution
  • Instrument Java, Kotlin, and Node.js-based cloud-native applications to emit structured logs, distributed traces, and metrics; identify opportunities to use ML-based anomaly detection in place of static thresholds
  • Build repeatable, code-first observability pipelines that treat dashboards, alerts, and runbooks as first-class software — versioned, tested, and deployed through Harness
  • Leverage AWS PaaS services (Lambda, API Gateway, ECS, RDS, SQS, SNS, and others) to build scalable, automated operational tooling
  • Collaborate with development teams to embed observability and AI-assisted quality checks into CI/CD pipelines via Harness
  • Own the FinOps function for our AWS environment — tracking cloud spend, building cost dashboards, identifying waste, and using AI-powered cost analysis tools to surface optimization opportunities and drive accountability across engineering teams
  • Monitor AWS infrastructure for performance, availability, and cost — partnering with finance and engineering to enforce spend governance
  • Develop and maintain Infrastructure as Code using Terraform, using AI pair programming to improve quality and consistency
  • Contribute to architectural decisions with a focus on resilience, automation, and reducing toil through intelligent systems
  • Adhere to all confidentiality and compliance regulations
  • Perform other duties as assigned

Fullbay provides a cloud-based software-as-a-service platform that helps heavy-duty repair shops manage their entire operation. It acts as an end-to-end shop management system, letting shop staff handle service orders, invoicing, parts inventory, and scheduling from a single web-based interface that can be used on any device. It also offers an online customer portal to approve estimates, track repair progress, and pay invoices, plus tools for technicians to receive assignments, log labor time, and access repair guidance. The company differentiates itself by focusing specifically on the heavy‑duty repair market and by offering integrated workflows tailored to fleets and independent repair shops, including QuickBooks integration and automated parts ordering, plus strategic acquisitions to broaden services. Its goal is to digitize and streamline repair shop operations, increase efficiency, and support business growth in the heavy‑duty maintenance sector.

Company Size

51-200

Company Stage

Early VC

Total Funding

$23.2M

Headquarters

Phoenix, Arizona

Founded

2012

Simplify Jobs

Simplify's Take

What believers are saying

  • Heavy-duty shops hit $5.04B revenue in 2025, fueling 68% net growth for Fullbay.
  • Diesel Connect 2026 conference on May 19-21 boosts networking with 5,000 shops.
  • JMI Equity investment accelerates expansion using $6.5B repair data ecosystem.

What critics are saying

  • Technician shortage worsens, with 61% of shops constrained, slashing labor capacity.
  • Whip Around CEO keynotes Diesel Connect, pitching competitors to Fullbay's audience.
  • Pitstop AI integration fails, dropping prediction accuracy and driving churn by Q4 2026.

What makes Fullbay unique

  • Fullbay delivers end-to-end SaaS exclusively for heavy-duty truck repair shops.
  • Pitstop acquisition on March 25, 2026, adds 94% accurate AI predictive maintenance.
  • Cloud platform integrates QuickBooks, Mitchell 1, and FleetCross for seamless workflows.

Benefits

Flexible Work Hours

Professional Development Budget

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

0%

2 year growth

-2%
PR Newswire
Mar 25th, 2026
Fullbay acquires Pitstop to add AI-powered predictive maintenance to $6.5B repair platform

Fullbay, a heavy-duty repair shop management platform, has acquired Pitstop, an AI-powered predictive maintenance and fleet intelligence platform. The acquisition will integrate Pitstop's technology with Fullbay's 10 years of repair data from over 5,000 shops and $6.5 billion in annual service orders. The combined platform will deliver predictive maintenance solutions, enabling fleets to anticipate vehicle failures before they occur. Pitstop's AI technology, which analyzes billions of data points, achieves over 94% accuracy in identifying potential failures weeks in advance. The system monitors units in real time, automatically generates service requests, and predicts parts demand. Founded in 2015, Pitstop joins Fullbay's ecosystem serving commercial repair shops and fleet maintenance departments. Fullbay has promoted Scott Gordon to chief product officer to oversee the expanded portfolio.

PR Newswire
Mar 25th, 2026
Fullbay acquires Pitstop to strengthen AI-powered predictive maintenance

PHOENIX, March 25, 2026 /PRNewswire/ - Fullbay, the largest and most comprehensive turn-key platform that improves the operational efficiency of heavy-duty repair shops, has announced its acquisition of Pitstop, an AI-powered predictive maintenance and fleet intelligence platform that will be integrated into the Fullbay platform. Pitstop's proven AI-driven technology will leverage 10+ years of repair data from Fullbay's shop management software to deliver predictive maintenance solutions for fleets to anticipate failures before they impact operations.
With this acquisition, Fullbay brings together the pieces for the largest, most comprehensive platform at the center of every commercial repair shop and fleet. By combining Fullbay's industry-leading repair data from more than 5,000 shops and $6.5 billion in annual service orders and parts with Pitstop's predictive AI technology, the new AI-powered maintenance module transforms real-world service history into actionable maintenance intelligence. The result is a powerful foundation for true predictive maintenance, helping fleets stay ahead of issues and improving driver safety.
"In our industry, operations are too often reactive, and the repair process can be inefficient due to unpredictability. When a truck breaks down - possibly in a situation unsafe for the driver and other motorists - the unit is towed, the problem is inspected, parts are ordered, the unit is fixed, and the process repeats," said Trent Broberg, CEO of Fullbay. "This acquisition enables us to change that model by delivering predictive maintenance, real-time diagnostics, fault-code management and automated fleet communication directly into Fullbay, revolutionizing the experience for fleets and heavy-duty repair."
Through this acquisition and integration, Fullbay is addressing the gap between reactive repair and proactive maintenance with technology that:
  • Monitors units in real time and flags issues to shop staff before they turn into breakdowns
  • Automatically generates service requests from vehicle issues, including PMs, prioritized fault codes, and predictive alerts
  • Provides predictive insights based on return patterns
  • Produces accurate unit health reports
  • Accurately predicts parts demand so inventory is available before the job starts
"Pitstop was built to help fleets move from 'fix it when it fails' to knowing what's coming next and acting before downtime, cost, or safety risks hit," said Shiva Bhardwaj, CEO and founder of Pitstop. "By integrating Pitstop's AI-powered predictive technology with Fullbay's comprehensive platform, we can scale our impact across the industry while delivering the proactive, data-driven solutions that fleets desperately need to improve safety, reduce costs, and maximize uptime."
Beyond preventing breakdowns, the Pitstop technology helps fleets and internal maintenance teams reduce unplanned downtime and extend asset life, and independent shops better serve their fleet customers and build customer loyalty. By analyzing billions of data points to detect fault patterns and identify issues weeks before a breakdown occurs, the Pitstop system delivers more than 94% accuracy in identifying potential failures to enable large fleets to fundamentally shift from reactive service to proactive maintenance planning.
In addition to this strategic acquisition, Fullbay has promoted Scott Gordon to Chief Product Officer from Vice President of Product. In this new role, he will oversee the expanded product portfolio and drive the company's AI-first vision across the combined organization. As a seasoned SaaS product executive with extensive experience at Microsoft and Amazon prior to joining Fullbay in early 2024, Scott has a proven track record of transforming complex industry challenges into scalable software solutions.
"This acquisition represents an incredible opportunity to accelerate our mission of delivering customer-driven innovation at scale," said Gordon. "By combining our teams' expertise and technologies, we can build even more powerful solutions using traditional and AI-enabled innovation to truly move the needle for our customers."
ABOUT PITSTOP
Founded in 2015 by CEO Shiva Bhardwaj, inspired by his experience in his dad's mechanic shop, Pitstop is a solutions provider that empowers fleets with tools to eliminate vehicle downtime, improve planned maintenance efficiency, and reduce manual inputs. By combining data and other integrations into one easy-to-use platform, Pitstop simplifies fleet maintenance. Its advanced analytics enable fleets to streamline work order and DVIR workflows, generate comprehensive vehicle health reports, provide predictive failure alerts, and much more!
ABOUT FULLBAY
Fullbay revolutionizes the operations of heavy-duty repair shops and internal fleet maintenance departments to create more efficient, focused, and faster organizations. The company employs the latest technology, expertise and an AI-first approach to provide a turn-key platform that connects every function of customers' businesses in real time from any location to improve workflow and create transparency in operations. Founded in 2014 and based in Phoenix, Arizona, Fullbay focuses on delivering operational excellence, preventive maintenance solutions, and inventory management optimization to its wide variety of customers. Through its Fullbay Cares program, the company commits to giving back to the essential workers of heavy-duty repair by supporting charitable organizations and dedicating time and resources to community service.
SOURCE Fullbay

Yahoo Finance
Mar 20th, 2026
Fullbay report: Heavy-duty shops hit $5B revenue amid worsening technician shortage

Fullbay's sixth annual State of Heavy-Duty Repair report reveals the commercial vehicle repair sector achieved $5.04 billion in service order commerce during 2025, with net new revenue reaching $2.05 billion — a 68% increase from 2023 to 2025. Sixty-one percent of shops reported stronger business compared with 2024. The findings, drawn from nearly 900 industry professionals across Fullbay's network of 5,000 shop locations, highlight growth driven by aging truck fleets and increasing equipment complexity. However, the report warns of structural challenges, including rising wages and an aging workforce that could threaten long-term sustainability. The report, previewed at the Technology & Maintenance Council's 2026 Annual Meeting in Nashville, becomes publicly available on 23 March. Seventy percent of respondents operate independent repair shops, with a median workforce of eight employees.

PR Newswire
Sep 24th, 2025
Trent Broberg joins Fullbay as CEO, succeeding Patrick McKittrick

Fullbay
Aug 15th, 2025
New Integration Ahoy: Fullbay + Mitchell 1

Fullbay, Inc. announced that Fullbay now integrates with Mitchell 1's ProDemand and TruckSeries labor time module.