HPC Platform Engineer
Posted on 10/26/2022
INACTIVE
Locations
Montreal, QC, Canada
Desired Skills
AWS
Development Operations (DevOps)
Docker
Google Cloud Platform
Linux/Unix
Management
Microsoft Azure
SQL
Kubernetes
Python
Ansible
Requirements
- 5-7 years of experience in software engineering, development operations, or Linux administration, with at least 1 year of Kubernetes DevOps experience
- Strong programming experience, with a preference for Go and Python
- Experience in building RESTful applications
- Experience in writing SQL queries
- Understanding of containerization, container networking, and Kubernetes
- Experience in continuous integration/continuous deployment and infrastructure-as-code
- Experience with at least one cloud platform (GCP, AWS or Azure)
- Working knowledge of systems or network automation, monitoring, and alerting
- Working knowledge of some configuration management system (such as Salt or Ansible)
- Demonstrable troubleshooting expertise and interest, a desire to automate, and a focus on end-user experience
Responsibilities
- Developing and enhancing Tower's HPC infrastructure stack - compute, storage, networking, automation and monitoring
- Guiding platform users in designing, building, testing and deploying changes to existing software for the HPC on-premises and cloud-based environments
- Developing and operating the company's containerized application environment (Docker, bare-metal Kubernetes and GKE)
- Developing and operating the company's cloud infrastructure (GCP, AWS)
- Developing, maintaining and improving HPC workload management software (HTCondor)
- Developing, maintaining and improving HPC custom management tools, services, and SDKs
- Developing metric collection capabilities, analyzing the results and using them to improve HPC cluster resource utilization and performance
- Managing code deployments, fixes, updates and related processes
- Updating system processes and designing new processes as needed; identifying manual processes that can be automated and helping to automate them
Electronic trading services.
Company Overview
Tower Research Capital's goal is to optimize finance through technology. The company is building a proprietary trading system based on technological principles.
Benefits
- 5 weeks of paid vacation per year
- 401(k) with company matching
- Free meals and snacks
- Reimbursement for health and wellness expenses
Company Core Values
- Customer focus
- Entrepreneurial spirit