AWS Infrastructure Engineer @ Sendwave

INACTIVE

Full-Time

Posted on 3/16/2023

Sendwave

1,001-5,000 employees

Mobile app for fee-free financial services in Africa

Fintech

Senior

United States

Required Skills

Bash

Agile

Python

React.js

Management

Git

SQL

AWS

Terraform

Customer Service

Development Operations (DevOps)

Splunk

Linux/Unix

About Zepz

Zepz is the group powering two leading global remittance brands: WorldRemit and Sendwave. Since 2010, we have been disrupting an industry previously dominated by offline legacy players with our relentless focus on reducing the cost of remittances and increasing safety and convenience for our users. Every day, our people work to unlock the prosperity of cross-border communities through finance and technology - driven by our vision of a world that celebrates migrants’ impact on prosperity, at home and abroad.

Our brands helped cross-border communities send over $15bn from 50 countries to recipients in 130 countries in 2022. We operate over 5,000 money transfer corridors worldwide and employ over 1,600 people globally. Zepz is a remote-first employer, with team members located across six continents. Our vision is to create a world that celebrates migrants’ impact on prosperity, at home and abroad. Our purpose is to unlock the prosperity of cross-border communities through finance and technology.

Zepz.io

Our Commitments:

We act like owners - We are relentlessly delivering for our users and spending money thoughtfully.
We embrace embarrassing honesty - We function best when we’re open and honest with one another — especially about our challenges and doubts.
We have a bias to action - We get to first outcomes quickly, iterate and learn.
We strive to be better - We may make mistakes, but always learn from them.
We are inclusive - to better reflect and serve our users.

About the role:

You are a Technical Support Engineer, passionate about our customers and their experience with WorldRemit services. You have a breadth of technical knowledge (vs depth) and you are responsible for managing WorldRemit incidents and technical operations on our production platform.

You understand our overall systems architecture and are able to drive swift resolution of incidents by coordinating with various technical teams (DevOps, Infrastructure, Engineering, Suppliers …). You have experience in automation and will drive improvements to streamline monitoring, alerting and incident resolution processes, in collaboration with our DevOps, SRE, and Engineering teams.

We use a modern DevOps and SRE tech stack (Jenkins, K8s, Harness, AppDynamics, Python, Terraform) and Agile working practices to get the job done.

As a member of the Zepz Cloud Operations team you will aim high, embrace challenges and always do what’s right, acting with integrity and building trust as you contribute to the company’s technical direction and long-term decision making.

What you will own:

Reporting to the Cloud Infrastructure Manager, you will:

Monitor our Production systems and react to alerts swiftly.
Ensure 24x7 availability of our product platform working with the Tech teams
Participate in the development of our monitoring & alerting strategies with the SRE team across multiple cloud environments, in particular AWS, using advanced monitoring tools like Grafana, AppDynamics and Splunk
Apply your experience of cloud and on-premises infrastructure to the challenges and considerations of migrating workloads from on-premises to AWS, drawing on an understanding of SQL, Windows Server, Active Directory, DNS, VMware and networking, and a willingness to research and advise on new technologies and developments.
Manage incidents: categorization, triage, resolution and escalation
Communicate appropriately with our business stakeholders on incidents (Customer Service…)
Participate in an on-call/shift rotation
Use code to solve problems: configuration, infrastructure, tooling and automation should all be solved by writing high-quality code that performs and scales.
Apply best practices and standards for observability, monitoring, alerting, capacity planning, availability, performance/latency, change and troubleshooting across all our Tech services.
Work closely with feature teams to ensure that services are correctly monitored, change is delivered in a safe and secure way, resilience is built into our product, and our standards and best practices are adopted.
Lead or be involved in the troubleshooting of complex incidents and problems.
Maintain end-to-end visibility of the service to our customers and ensure their journey is stable and consistent across all the microservices and third-party dependencies, using the observability tool you will implement with the Engineering teams.
Perform various Technical Operations in collaboration with the DevOps and Infrastructure teams (patching, log management, space management …)
Develop various technical runbooks in collaboration with other tech teams.
Participate in the continuous improvement of our operational processes (Incident, Problems, Change …)
Provide input to Post Incident Reviews / Post Mortems and take the initiative to prevent and reduce incidents

What you bring to the table:

A skilled Engineer. At least 5 years in Cloud Operations/Platform Services with a keen interest in solving problems using automation.
Understand SRE and DevOps methodologies. You understand the build and deployment cycle of an application, and how to operate a resilient system.
Strong experience in Incident & Event Management (NOC, App Support…)
Experience with support and troubleshooting of 24x7 high-volume transactional Web applications
Knowledge of Windows and Linux systems
Experience of Cloud infrastructure and platform services (we run on AWS)
Familiarity with Terraform and IaC best practices
Solid experience with GitHub or other version control systems
Experience with APM systems such as Dynatrace, AppDynamics and/or New Relic
Experience with alerting tools such as Grafana OnCall, PagerDuty, or OpsGenie
Experience with scripting languages such as Python, Bash and PowerShell
Strong verbal and written communication skills. Ability to take ownership of issues.
Systematic problem-solving approach. You should have an understanding of how to analyze and troubleshoot large-scale distributed systems.
Happy in the Clouds. Our Cloud Native platform is hosted on AWS. You’ll be comfortable working with a system that supports users from around the world, at scale. Experience working for a Digital company, delivering real-time transactional services (Finance/regulated) is preferred.
Bias for action. You see a problem, you fix a problem. You get buy-in for your solutions and keep tickets moving. We’re always looking for ways to ship at pace.
Growth mindset. A willingness to use your skills and experience to mentor less-experienced engineers. A desire to learn from others and make yourself better every day.
Agile outlook. You need to be excited about working in a fast-changing environment. Products, tools, frameworks and processes change; we evolve and take the best bits with us. The teams drive the evolution.

What we offer you:

Please note that the benefits below do not apply to part-time, contractor or temporary roles.

We have five core benefits for our talent in the US, UK, Philippines, Poland, and South Africa. If you’re not in one of those regions, don’t worry - the Talent team can let you know what is available for you specifically:

Unlimited Annual Leave: Most Zepz team members are eligible for unlimited annual leave. Colleagues in customer-facing roles receive a competitive holiday allowance and four recharge days a year. Feel free to make the most of your time off and maintain a healthy work-life balance!
Private Medical Cover: You can opt-in to a Private Medical Insurance scheme. This provides you with access to thorough medical coverage, so you can feel confident in your health and well-being.
Retirement: We offer pension schemes to help you plan for and secure your future.
Life Assurance: Life assurance is available to give you peace of mind and protect your loved ones in case of the unexpected.
Parental Leave: We offer competitive parental leave schemes to ensure you are spending as much quality time with your new bundle of joy as possible.

We are also remote-first as an organisation, offering flexibility for you to work where you need to be most productive. In many locations, we have workspaces, which you can use as you desire.

Most roles in the Philippines are predominantly office-based; for these roles we offer free meals to those who are 100% on-site.

In addition to the above, you will discover that we have a range of secondary perks (such as the cycle-to-work scheme and employee discounts) depending on your location, to help you thrive at Zepz!

Why choose Zepz?

Our team of over 1,600 employees is fully distributed across the world. We work from coffee shops, homes, and co-working spaces, making us one of the larger fully distributed growth-stage startups in the world. We also offer workspaces in our talent cluster locations, where we can meet, collaborate and connect.
We are proud parents, community organizers, farmers, band members, yoga teachers, YouTube influencers, former Olympians, and serial entrepreneurs.
We collectively speak over twenty languages, including Akuapem, Amharic, Bengali, Ewe, Fante, Ga, Igbo, Kalenjin, Luganda, Oromo, Somali, Swahili, Wolof, Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish and Swedish.
At Zepz, embodying our commitments binds us together. We are collectively passionate about striving to achieve our vision and purpose - to continue to provide the best service to our users.

Ready to apply?

Applications will be reviewed on a rolling basis. If interested, please submit your resume along with a cover letter (optional), highlighting why your experience demonstrates you meet the requirements of the role. Please also indicate the countries in which you have work authorization. While Zepz supports visa sponsorship, sponsorship opportunities may be limited to certain roles and skills.

At Zepz we record interviews using Metaview (https://metaview.ai). It helps us become better interviewers by recording and transcribing our interviews, and ensures we interview candidates in a fair & consistent manner. It is not required. Please let us know if you’d like to opt out of the use of Metaview - this will not affect the outcome of your interview.

Confidence can sometimes hold us back from applying for a job. But we’ll let you in on a secret: there’s no such thing as a ’perfect’ candidate. Zepz is a place where everyone can thrive.

So however you identify and whatever background you bring with you, give us a shot by applying. If you need any form of support to make the process as comfortable as possible, please let us know. We want you to be excited to wake up to make an impact every day.

Sendwave

Wave Mobile Money is a trailblazer in the financial technology sector, addressing the significant issue of financial inclusion in Africa by providing accessible, fee-free services. The company's robust mobile app, which allows for cash deposits, withdrawals, and both peer-to-peer and business payments, has gained millions of users in Senegal and Cote D'Ivoire, demonstrating its wide acceptance and effectiveness. Wave's commitment to making Africa a cashless continent, coupled with its user-friendly technology that simplifies transactions such as school fee payments, sets it apart in the industry and makes it an exciting place to work.

Company Stage

Series A

Total Funding

$403.7M

Headquarters

Dakar, Senegal

Founded

2017

Growth & Insights

Headcount

6 month growth

↑ 0%

1 year growth

↑ 0%

2 year growth

↓ -3%
