Cloud Site Reliability Engineer (SRE) Job at ZipRecruiter, Washington DC

  • ZipRecruiter
  • Washington DC

Job Description

Company Overview

Promise empowers utilities and government agencies to create flexible, affordable solutions for individuals struggling with debt. Our innovative approach to payment plans and relief distribution significantly improves enrollment and recovery rates, helping individuals clear debts faster and reducing delinquencies for our partners. We treat people facing financial difficulties with respect and dignity, providing the tools and resources they need to thrive. Our team includes experts from companies like Palantir, Google, and Stripe, as well as esteemed government leaders. Backed by over $50 million in funding from top investors (such as Reid Hoffman, Howard Schultz, Michael Seibel, Y Combinator, 8VC, The General Partnership, First Round Capital, Kapor Capital, XYZ Ventures, and Bronze Investments), Promise has been recognized as one of Fast Company's "World's Most Innovative Companies of 2022," "Forbes Next Billion-Dollar Startups 2024," and Y Combinator’s #1 GovTech startup.

Role Overview

We’re looking for a Cloud Site Reliability Engineer (SRE) to build, operate, and optimize the infrastructure that powers our products. You’ll be responsible for ensuring high reliability, performance, and scalability of our cloud-based systems. The ideal candidate is self-sufficient, detail-oriented, and execution-driven, with a strong background in software development, site reliability engineering (SRE), and infrastructure-as-code (IaC). You’ll collaborate closely with product and engineering teams to improve system architecture, troubleshoot issues, and automate operational processes. This role is ideal for someone who thrives in a hard-working, fast-moving environment, enjoys solving complex technical challenges, and takes personal responsibility for ensuring security outcomes are achieved and aligned to business goals.

What You’ll Do

  • Design, implement, and manage cloud infrastructure to ensure reliability, scalability, and security.
  • Automate infrastructure and operations using Terraform, scripting, and configuration management tools.
  • Develop strong relationships with engineering teams to define system reliability goals and best practices.
  • Troubleshoot and resolve complex network and system issues using observability tools, stack traces, and system logs.
  • Monitor and optimize system performance, implementing best practices for high availability and disaster recovery.
  • Formalize a security design review process and guide the Engineering team through it.
  • Ensure the security and stability of Linux-based production systems.
  • Provide essential support in aligning our technology projects with compliance requirements, navigating the complexities of state and federal regulations while fostering an environment of innovation.
  • Serve as a bridge between technical teams and non-technical stakeholders, translating security and compliance needs into actionable plans that support our broader business objectives.

What Will Enable You

  • 4+ years of experience in Linux system administration, managing large-scale production environments.
  • Strong debugging skills, with experience in performance tuning, observability, and system-level troubleshooting.
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP).
  • Expertise in Infrastructure-as-Code (IaC) using Terraform or similar tools.
  • Proficiency in monitoring tools (e.g., Prometheus, Datadog) and health check implementation.
  • Experience with containerization (Docker, Podman, Kubernetes).
  • Scripting experience (Python, Bash, or equivalent) to automate infrastructure management.
  • Knowledge of networking and security best practices for cloud environments.

Benefits and Work Environment

At Promise, we invest in our team’s well-being, growth, and sense of ownership.

  • Equity for All: All full-time employees receive stock options to share in our company’s success.
  • 100% Paid Health Coverage: We cover 100% of base medical, dental, and vision insurance plans for employees and their dependents.
  • Flexible Hybrid Work: We collaborate in office at least three days a week to stay connected and aligned as a team.

Please note: Benefits are reviewed periodically and may be updated at the sole discretion of Promise.

Promise is an equal opportunity employer and does not discriminate against any applicant or employee because of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or veteran status. Additionally, the Company complies with applicable state and local laws governing non-discrimination in employment in every jurisdiction in which it operates. Promise is committed to promoting an inclusive and equitable workplace. We also provide reasonable accommodations to qualified individuals with disabilities and to individuals with sincerely held religious beliefs, in accordance with applicable laws.

Promise engages in US government contracts and restricts hiring to US persons, which includes US and permanent residents (e.g., Green Card holders). Additionally, candidates must reside in the US.

Compensation Range: $149K - $195K

Job Tags

Permanent employment, Full time, Relief, Local area, Flexible hours, 3 days per week

Similar Jobs

Hunter Hamilton

Customer Service Specialist Job at Hunter Hamilton

 ...job aides and process documents Review flagged screenings, determine next steps for students, and send disclosure documents via DocuSign Keys to Success: ~ Minimum 3 years customer service experience with a focus on customer satisfaction and one-contact... 

Pentangle Tech

Systems Test Engineer Job at Pentangle Tech

 ...Review System requirements and provide feedback Develop test scenarios and test cases based on system feature requirements, architecture designs Execute Software Test Cases on vehicle and bench set-ups. Analyzing test data and evaluating the results... 

LaSalle Network

EPIC Analyst Job at LaSalle Network

 ...We are seeking a certified Epic Analyst with expertise in Beacon, Professional Billing (PB), or Ambulatory to join our clients' healthcare...  ...hire faster and connect top talent with opportunities, from entry-level positions to the C-suite. With units specializing in... 

Roblox

Sr. Site Reliability Engineer, Compute SRE Job at Roblox

 ...technical challenges at scale, and helping to create safer, more civil shared experiences for everyone. What You'll Do: As a Site Reliability Engineer (SRE) on the Infra Compute Orchestration (ICO) team, you will create, support, and evolve the infrastructure at Roblox as we... 

ECHN

Clinical Nurse Educator, Peri-Operative Services Job at ECHN

 ...interventions that result in improvements in clinical outcomes, patient /family satisfaction, resource allocation, staff knowledge and skills, health care team collaboration, and organizational efficiency. Collaborates with members of the healthcare team to develop standards of...