Senior GPU Performance Engineer
Posted on 7/19/2023
INACTIVE
Self-driving car robotics company
Company Overview
Zoox is reinventing personal transportation—making the future safer, cleaner, and more enjoyable for everyone. The company is building the infrastructure for self-driving cars.
AI & Machine Learning
Automotive & Transportation
Company Stage
M&A
Total Funding
$2.3B
Founded
2014
Headquarters
Foster City, California
Growth & Insights
Headcount
6 month growth
↑ 11%
1 year growth
↑ 27%
2 year growth
↑ 86%
Locations
San Mateo, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
CUDA
Data Structures & Algorithms
Linux/Unix
Categories
AI & Machine Learning
Software Engineering
Requirements
- BS in computer science or related field
- Strong knowledge of C++ and experience in large code bases
- Strong knowledge of CUDA as applied to recent GPU microarchitectures (e.g., Turing, Ampere)
- Strong knowledge of linear algebra, 3D geometry, and/or dynamical systems and control
- Experience debugging and optimizing GPU kernels using tools like Nsight Systems and Nsight Compute
- Comfortable in Linux development environments
Responsibilities
- Build real-time instrumentation for performance monitoring (CPU, GPU, latency, memory) of the system at hand and offline benchmarking frameworks to support performance evaluation
- Build tools and scripts to evaluate and analyze performance at scale, both in CI and in the vehicle. Establish budgets for existing architectures, and provide data to define next-gen architectures
- Analyze performance metrics to identify GPU hotspots and root causes, and propose and co-implement actionable solutions with component teams
- Support teams in bringing serial algorithms to the GPU to maximize compute utilization and improve overall latency
- Work as part of the Core team to build a middleware framework that promotes efficient, performant code by default by maximizing CPU and GPU utilization
Desired Qualifications
- GPU kernel development experience in a firm/hard real-time environment
- Experience in development, debugging and profiling of complex multiprocess systems (e.g. game engines, robotic systems)
- Experience with PTX-level programming
- Experience with CPU SIMD instructions (e.g., AVX intrinsics)
- Experience with TensorRT and custom CUDA layers