Full-Time

DevOps Engineer

Lightning AI

51-200 employees

AI development platform for coding and deployment

AI & Machine Learning

Compensation Overview

$120k - $190k Annually

Mid, Senior

New York, NY, USA

This is a hybrid role with a two-day in-office requirement.

Category
DevOps & Infrastructure
DevOps Engineering
Required Skills
Kubernetes
Microsoft Azure
Grafana
Git
Docker
CloudFormation
AWS
Go
Prometheus
Jenkins
Terraform
Ansible
Development Operations (DevOps)
CircleCI
Google Cloud Platform

Requirements
  • Proven experience as a DevOps Engineer or in a similar role, with a deep understanding of cloud infrastructure (AWS, GCP, or Azure)
  • Expertise in CI/CD tools such as Jenkins, CircleCI, GitHub, or GitLab
  • Ability to code in Go (Golang)
  • Experience with infrastructure as code tools like Terraform, Ansible, or CloudFormation
  • Familiarity with containerization technologies like Docker and Kubernetes
  • Knowledge of monitoring and logging tools such as Prometheus, Grafana, or ELK stack
  • A strong security mindset with experience in managing secure cloud environments
  • Excellent problem-solving skills, attention to detail, and the ability to work in a fast-paced, collaborative environment
Responsibilities
  • Design, build, and maintain scalable infrastructure for deploying, monitoring, and automating our cloud environments.
  • Collaborate closely with development teams to ensure seamless integration and delivery of new features.
  • Implement and manage CI/CD pipelines to improve deployment frequency and reduce manual intervention.
  • Monitor system performance, identify bottlenecks, and develop strategies to improve reliability and performance.
  • Ensure security best practices are followed across infrastructure and deployment processes.
  • Troubleshoot and resolve infrastructure-related issues in a timely manner.
  • Stay up to date with the latest industry trends and tools to drive innovation in DevOps practices.

Lightning AI provides a platform for developing artificial intelligence applications, covering the entire process from idea generation to deployment. The platform is accessible through web browsers, allowing developers and data scientists to easily code, prototype, and train AI models using GPUs. Users can work on CPUs for coding, debug on GPUs, and scale their models across multiple nodes, all within a cloud-based environment that offers persistent storage. Lightning AI operates on a subscription model, targeting both enterprises and individual developers who need effective tools for AI development. Key features of the platform include PyTorch Lightning, Fabric, Lit-GPT, and torchmetrics, which enhance the scaling and optimization of AI models.

Company Size

51-200

Company Stage

Late Stage VC

Total Funding

$105.6M

Headquarters

New York City, New York

Founded

2015

Simplify's Take

What believers are saying

  • Recent $50M funding round indicates strong market confidence in Lightning AI.
  • Thunder compiler speeds up AI model training, reducing costs significantly.
  • Collaboration with AWS enhances AI model performance using cutting-edge hardware.

What critics are saying

  • Open-source models like Dolly and Alpaca challenge Lightning AI's proprietary offerings.
  • Rapid AI development may outpace Lightning AI's innovation capabilities.
  • Dependence on AWS could affect cost structure if service terms change.

What makes Lightning AI unique

  • Lightning AI offers a comprehensive AI lifecycle platform from ideation to deployment.
  • The platform's integration with AWS Marketplace simplifies enterprise procurement processes.
  • PyTorch Lightning's popularity supports Lightning AI's open-source framework approach.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

Life Insurance

Flexible Paid Time Off

Paid Family Leave

Phone/Internet Stipend

Home Office Stipend

Professional Development Budget

Gym Membership

Mental Health Support

Stock Options

Growth & Insights and Company News

Headcount

6 month growth

4%

1 year growth

4%

2 year growth

-6%
Pulse 2.0
Nov 21st, 2024
Lightning AI Secures $50M for Growth

Lightning AI, creator of the PyTorch Lightning framework, secured $50 million in funding from Cisco Investments, J.P. Morgan, K5 Global, and NVIDIA, bringing its total funding to $103 million. With 240,000 users across 2,000 organizations, Lightning AI offers cloud-based development environments that simplify AI development. The platform integrates with popular ML tools and provides flexible pricing, including a free tier. It helps enterprises reduce infrastructure setup time and streamline AI deployment.

Business Wire
May 9th, 2024
Lightning AI Launches Its Studio Platform in AWS Marketplace

NEW YORK--(BUSINESS WIRE)--Lightning AI, the company behind PyTorch Lightning, with over 100 million downloads, announced today that its all-in-one, web-based AI development platform Lightning AI Studio is now available in AWS Marketplace, a digital catalog with thousands of software listings. AWS Marketplace offers a seamless way to leverage pre-approved budgets and increase business performance through a single view across customers’ IT spend, simplifying procurement processes with flexible options like private pricing and consolidated billing through AWS. Studio, an enterprise-grade suite of tools, fundamentally re-architects how developers work by abstracting away every non-core activity and providing a single interface for all AI development needs. Users can: jumpstart the development of AI products with pre-built templates; scale from CPU to GPU to multi-node, distributed large-scale processing across as many machines as they need at the click of a button; and leverage natively integrated tools or build their own and deploy anywhere, among many other features. Studio also offers seamless integration with the Lightning open source stack that includes PyTorch Lightning, and allows developers to train, fine-tune, and deploy the latest generative AI models across multiple GPUs at maximum efficiency. “Lightning AI Studio is the most advanced all-in-one platform to build, train, and deploy any AI model on your data, in your Cloud or data center, with all the tools you love,” said William Falcon, founder and CEO of Lightning AI. “By accessing Studio in AWS Marketplace, customers will realize greater productivity and faster development times for their AI apps.”

EnterpriseAI
Mar 28th, 2024
Lightning AI Today Announces 'Thunder' to Speed Up AI Model Training

Lightning AI today announced the availability of Thunder, which it hails as a new and powerful source-to-source compiler for PyTorch.

Business Wire
Mar 28th, 2024
Lightning AI Announces Availability of Thunder; a Powerful Source-to-Source Compiler for PyTorch That Speeds Up Training and Serving Generative AI Models Across Multiple GPUs, Built With Support From NVIDIA

NEW YORK--(BUSINESS WIRE)--Following the company’s presentation at the NVIDIA GTC AI conference, Lightning AI, the company behind PyTorch Lightning, which has over 100 million downloads, today announced the availability of Thunder, a new and powerful source-to-source compiler for PyTorch designed for training and serving the latest generative AI models across multiple GPUs at maximum efficiency. Thunder is the culmination of two years of research on the next generation of deep learning compilers, built with support from NVIDIA. Large model training can cost billions of dollars today because of the number of GPUs and the length of time it takes to train these models. Lack of high-performance optimization and profiling tools puts this scale of training out of reach for developers who don’t have the resources of a large technology company. Even at its early stage, Thunder achieves up to a 40% speed-up for training large LLMs, compared to unoptimized code in real-world scenarios. These speed-ups save weeks of training and lower training costs proportionally.

Business Wire
Feb 21st, 2024
Lightning AI Signs Strategic Collaboration Agreement With AWS to Offer Enterprises Optimized Performance for Building and Deploying AI Products

NEW YORK--(BUSINESS WIRE)--Lightning AI, the creator of PyTorch Lightning and Lightning Studios, announced today it has signed a Strategic Collaboration Agreement (SCA) with Amazon Web Services, Inc. (AWS). The SCA allows Lightning AI to leverage AWS compute services to power generative artificial intelligence (AI) services and to provide first-class support for Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instances, powered by AWS Trainium accelerators, directly within the platform. By collaborating with AWS, Lightning AI is able to offer a powerful, enterprise-grade, cloud-based platform for building and deploying AI products. Last month, the company announced Lightning Studios, which are cloud-based virtual environments where AI researchers and developers can code on the browser or from their laptops to develop and ship AI together. Developers today string together 20 platforms to monitor, train, serve, prep data, host apps, etc.