Tower Research Capital LLC, a high-frequency proprietary trading firm founded in 1998, seeks a Linux System Administrator to join our Server Reliability Engineering team. The Server Reliability Engineering organization is responsible for providing innovative processes and tools for the operation of Tower’s high-frequency Linux-based trading platforms and High Performance Computing (HPC) environment. You will also be expected to propose and drive the adoption of Infrastructure as Code (IaC) practices to make our storage solutions scalable and manageable, and to help meet our growing GPU computing needs while balancing on-premises and cloud-based resources.
Responsibilities
- Supporting, maintaining, and enhancing the firm’s trading Linux infrastructure
- Supporting, maintaining, and enhancing the firm’s HPC infrastructure for research
- Providing support specifically for the Linux and HPC environments, including:
  - Emergency response
  - Execution of planned changes, updates, and deployment projects within the Linux server infrastructure
- Managing the HPC systems that support trading operations, including the Condor job scheduler
- Advanced profiling and troubleshooting of performance issues within the Linux server environment
- Contributing to the development and refinement of tools and systems to automate provisioning, configuration, and monitoring of thousands of Linux servers
- Management of essential core services such as DHCP, LDAP, DNS, and NFS for on-prem and hosted data centers as well as public clouds
- Participating in an on-call rotation and occasional weekend shifts
- Engaging in daily direct communication with trading teams and core engineering
- Staying up to date with the latest technologies and best practices in HPC, storage, and GPU computing
Qualifications
- Experience in maintenance, operation, and administration of a sufficiently advanced Linux environment
- Daily use of and contribution to developing automation and monitoring tools
- Comprehensive understanding of Linux OS concepts and internals
- Working knowledge of Intel-based hardware and server components
- Good knowledge of Python and expert knowledge of Bash for scripting and automation tasks in a Linux environment
- Understanding of Linux server-side networking and typical network protocols
- Participation in open source or personal projects is a plus
- Understanding of Linux configuration management, source control, CI/CD, and automated deployment
- Strong communication skills and the ability to work effectively in a team
Preferred Qualifications
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Familiarity with cloud computing platforms and hybrid cloud environments.
- Knowledge of parallel file systems (e.g., GPFS), batch systems (e.g., Slurm, Grid Engine, Condor), and high-performance network interconnects.
- Experience with VAST and Weka storage solutions is highly desirable.
- Solid understanding of trading infrastructure and low-latency systems.
- Excellent problem-solving skills and the ability to work in a fast-paced, dynamic environment.
- Skills in managing hybrid cloud/on-premises environments.
- Experience proposing and implementing Infrastructure as Code (IaC) practices from the ground up.
The anticipated annual base salary range for New York is $100,000-$130,000, plus eligibility for a discretionary bonus.
Benefits
Tower’s dual offices and garden roof decks are located in TriBeCa and SoHo, neighborhoods in downtown Manhattan. While we work hard, Tower’s cubicle-free workplace, jeans-clad workforce, and well-stocked kitchens reflect the premium the firm places on quality of life. Benefits include:
- 401(k) with company matching
- 5 weeks of paid vacation per year plus 11 paid holidays
- Free breakfast, lunch, and snacks on a daily basis
- Reimbursement for health and wellness expenses
- Free events and workshops
- Donation matching program
Tower Research Capital is an equal opportunity employer.