Site Reliability Engineer, II

Coursera • Canada

Posted 5 months, 1 week ago

Deadline: Not specified

Full Time Mid-Level Software Engineering Remote

At Coursera, we are committed to building a globally diverse team and are thrilled to extend employment opportunities to individuals in any country where we have a legal entity. We require candidates to possess eligible working rights and have a compatible timezone overlap with their team to facilitate seamless collaboration.

Coursera is committed to enabling flexibility and workspace choices for employees. Our interviews and onboarding are entirely virtual, providing a smooth and efficient experience for our candidates. As an employee, you can select your main way of working, whether it's from home, one of our offices or hubs, or a co-working space near you.

Job Overview:

Our SRE team is part of the Coursera Infrastructure group that builds the foundation that keeps Coursera reliable, scalable, and efficient. We partner with product and platform teams to deliver resilient systems through automation, observability, and operational excellence. From incident response to infrastructure as code, we enable fast, safe, and cost-aware delivery of global learning experiences. We are hiring an IC3 Site Reliability Engineer (SRE) based in Canada to join our SRE team. This role will support reliability, observability, infrastructure automation, and cost optimization efforts across multiple services. The engineer will work closely with senior SREs to build scalable and efficient systems using our AWS-based tech stack, and gain hands-on experience with real-world SRE projects. Joining this team means working on high-impact projects that keep Coursera running smoothly for millions of learners and partners.

Applications are accepted on an ongoing basis until the position is filled.

Requirements

2+ years of experience in Site Reliability, DevOps, or Backend Engineering roles
Hands-on experience with at least one cloud platform (e.g., AWS, GCP, Azure)
Experience with monitoring and logging tools (e.g., Datadog, CloudWatch, Sumo Logic, Grafana)
Familiarity with Infrastructure as Code tools (e.g., Terraform, Ansible)
Experience writing automation scripts and backend systems in Java, Python, Bash or similar languages

Preferred Qualifications:

Exposure to incident management processes and tools (e.g., PagerDuty)
Familiarity with containerized infrastructure (e.g., Docker, Kubernetes)
Experience working on cost visibility or optimization in cloud environments
Knowledge of version control systems and CI/CD practices
Experience contributing to disaster recovery or multi-region infrastructure
Knowledge of security/compliance practices (e.g., audit logging, access controls)

If this opportunity interests you, you might like these courses on Coursera:

Site Reliability Engineering: Measuring and Managing Reliability – Learn SRE fundamentals including SLIs, SLOs, and error budgets
Introduction to Cloud Computing – Understand core cloud concepts, including AWS services and architecture
Getting Started with Terraform for Cloud Infrastructure Automation – Learn infrastructure-as-code using Terraform with hands-on AWS examples

Responsibilities

Contribute to building and maintaining observability systems (e.g., metrics, logs, dashboards)
Assist in automating infrastructure provisioning, system configuration, and reducing toil
Participate in on-call rotations and support incident response processes
Collaborate with senior engineers on improving the reliability and scalability of services
Implement cost monitoring tools and assist in cloud resource optimization
Support disaster recovery planning, compliance tasks, and documentation

Salary

USD 113,600 - USD 150,000

per year

Company Size

1000+ employees

Employment Type

Full Time

Work Mode

Remote (Canada)
