Cerebras is developing a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.
We are innovating at every level of the stack – from chip, to microcode, to power delivery and cooling, to new algorithms and network architectures at the cutting edge of ML research. Our fully integrated system delivers unprecedented performance because it is built from the ground up for deep learning workloads.
Cerebras is building a team of exceptional people to work together on big problems. Join us!
About The Role
Cerebras’ fully integrated system is built from the ground up with a singular focus on ML: the hardware, software, and ML algorithms are co-designed in tight collaboration. The foundation is the Wafer Scale Engine (WSE), a single chip that is 56x larger than a GPU, with orders of magnitude higher memory bandwidth and acceleration for fully unstructured sparsity. On top of the WSE sits a cluster architecture that scales to train the largest neural networks in the world.
This is an applied research engineer role, working in tight collaboration with senior researchers to co-design state-of-the-art ML algorithms for this unique, specialized architecture. The role focuses on designing the novel software architecture, workflows, analysis tools, and infrastructure behind state-of-the-art ML algorithms. Research areas beyond what is possible on GPUs include new sparsity algorithms, unique approaches to model scaling and parallelism, and novel efficient training techniques.
Our research is focused on improving state-of-the-art large language models (e.g. BERT, GPT) and computer vision models (e.g. ResNet, Vision Transformer) in many dimensions unique to the Cerebras architecture, such as:
- Sparse and low-precision training algorithms for reduced training time and increased accuracy (see the sketch after this list)
- Compute- and memory-efficient training techniques, such as reversible layers and low-rank methods
- Scaling laws for increasing model size: accuracy/loss prediction, architecture scaling, and hyperparameter transfer
- Optimizers, initializers, and normalization schemes that improve distributed training on large-scale clusters
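To make the first of these areas concrete, here is a minimal, illustrative sketch of unstructured magnitude pruning using stock PyTorch utilities. The model shape, 90% sparsity level, and random data are placeholders chosen for illustration; this is not a Cerebras method or internal code.

```python
# Minimal sketch: unstructured magnitude pruning, then a training step.
# Model shape, 90% sparsity, and random data are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Zero out the 90% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)

# Build the optimizer after pruning so it tracks the masked parameters.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One toy training step; the pruning masks are re-applied on every
# forward pass, so the effective weights stay sparse while training.
x, y = torch.randn(32, 512), torch.randint(0, 10, (32,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```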
Responsibilities
- Design and develop ML workflows and user interfaces for novel algorithms
- Design and develop software for scaling models and large-scale experimentation
- Design and develop analysis tools that drive efficient research insight, including dataset cleaning and analysis of training dynamics and gradient quality (a minimal sketch follows this list)
- Publish and present research at leading machine learning conferences
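As one example of the analysis tooling mentioned above, the sketch below records per-layer gradient norms across training steps, the kind of signal used to study training dynamics and gradient quality. The model, random data, and print-based logging are assumptions for illustration only.

```python
# Minimal sketch: track per-layer gradient L2 norms across training steps.
# The model, random data, and print-based logging are illustrative stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

def gradient_norms(model: nn.Module) -> dict:
    """Return the L2 norm of each parameter's gradient, keyed by name."""
    return {
        name: p.grad.norm().item()
        for name, p in model.named_parameters()
        if p.grad is not None
    }

for step in range(3):
    x, y = torch.randn(32, 256), torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    criterion(model(x), y).backward()
    print(step, gradient_norms(model))  # in practice, feed a logger/dashboard
    optimizer.step()
```

Per-layer gradient norms are a cheap first check for vanishing or exploding gradients before reaching for heavier diagnostics.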
Skills & Qualifications
- Strong grasp of machine learning fundamentals and computer science
- Experience with scaling state-of-the-art models on large distributed clusters
- Deep knowledge of machine learning frameworks, such as TensorFlow and PyTorch
- Deep knowledge of distributed training concepts and frameworks such as Megatron and DeepSpeed
- Fluency in a programming language such as Python or C++
- Experience in research environments in academic or industry labs
- Track record of relevant publications/patents
Pay Range
- $140,000 - $200,000 (US only; based on experience, location, and other determining factors)
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.