Site Reliability Engineer (*) REMOTE

WEX Inc | Portland, ME, United States

Posted Date 4/19/2024
Description

(*) This is a remote position; however, the candidate must reside within 30 miles of one of the following locations: Boston, MA; Dallas, TX; San Francisco Bay Area, CA; Portland, ME; or Washington, D.C.

About the Team/Role

The WEX Site Reliability Engineering (SRE) team is looking for individuals passionate about developing software and solutions focused on observability, incident response, reliability and performance, operational excellence, and compliance. The team will be part of the Platform Reliability organization, which supports our internal stakeholders and our Funding Platform teams. As part of the Platform Reliability organization, you’ll have the opportunity to solve complex challenges and improve the quality of life of our engineering teams as well as our ability to serve our customers.

The successful candidate should have a strong aptitude for learning new technologies and the ability to drive complex and meaningful projects to a conclusion. Tight-knit collaboration with the engineering teams and an ability to thrive under pressure are key skills required to succeed in this role.

How you’ll make an impact

  • Dig deep into code, networking, operating systems, and/or storage solutions to solve complex issues
  • Develop automation and utilize monitoring tools to ensure system reliability
  • Participate in incident response and troubleshooting
  • Participate in 24x7 Site Reliability rotations and escalation workflows
  • Identify and address performance bottlenecks through code optimization, configuration changes, or infrastructure upgrade recommendations
  • Collaborate with development teams to ensure software design meets operational requirements
  • Continuously improve processes and procedures to increase system reliability and efficiency
  • Stay up-to-date with the latest industry trends and technologies

Experience you’ll bring

  • 2+ years of hands-on experience as a Site Reliability Engineer or equivalent role
  • 2+ years of development experience with at least one major programming language
  • Experience with Cloud Computing platforms (AWS, Azure, GCP)
  • Ability to thrive in a fast-paced development and operations environment
  • Strong communication and collaboration skills
  • Experience with observability and logging technologies
  • Experience with at least one major RDBMS and at least one NoSQL data store
  • Experience with container technologies such as Docker or Kubernetes
  • BA/BS degree in Computer Science or related technical field, or equivalent job experience

Nice to have

  • Experience with one or more of the following languages: C#, Java, Go, Python
  • Experience with infrastructure as code, preferably Terraform
  • Working knowledge of designing and building RESTful APIs
  • Experience with Datadog, Grafana and Splunk
  • Familiarity with Agile methodologies and practices
  • Experience with GitOps
  • Experience with Apache Kafka
Job Type
Remote Work (from Maine)
Industry
Engineering | Information Technology
