Shopify Careers

Senior Specialist, Resiliency Incident Response

  • Remote - EMEA
  • Support

About Shopify

Opportunity is not evenly distributed. Shopify puts independence within reach for anyone with a dream to start a business. We propel entrepreneurs and enterprises to scale the heights of their potential. Since 2006, we’ve grown to over 8,300 employees and generated over $1 trillion in sales for millions of merchants in 175 countries.

This is life-defining work that directly impacts people’s lives as much as it transforms your own. It means putting the power of the few in the hands of the many, building a future with more voices rather than fewer, and creating more choices instead of an elite option.

About you

Moving at our pace brings a lot of change, complexity, and ambiguity—and a little bit of chaos. Shopifolk thrive on that and are comfortable being uncomfortable. That means Shopify is not the right place for everyone.

Before you apply, consider if you can:
  • Care deeply about what you do and about making commerce better for everyone
  • Excel by seeking professional and personal hypergrowth
  • Keep up with an unrelenting pace (the week, not the quarter)
  • Be resilient and resourceful in the face of ambiguity and thrive on (rather than endure) change
  • Bring critical thought and opinion
  • Embrace differences and disagreement to get shit done and move forward
  • Work digital-first for your daily work

About the role

About the Infrastructure Engineering Group

The Infrastructure Engineering Group at Shopify is on a mission to build a reliable, trusted, intuitive, and scalable infrastructure platform that powers global commerce. Our work ensures that merchants can focus on running their businesses while Shopify developers focus on creating tools that empower them to do so.

As part of the Infrastructure Engineering Group, the Resiliency Incident Response team handles incidents with precision and urgency. This team collaborates with Resiliency Engineers and the broader engineering organization to ensure Shopify's infrastructure remains robust and reliable.

Key Responsibilities:

  • Respond to automated alerts, analyze data, and broadcast information to manage and resolve incidents swiftly.

  • Coordinate ongoing incident management by engaging appropriate teams for quick resolution.

  • Conduct post-incident analyses to ensure action items are prioritized and addressed.

  • Investigate potential incidents using monitoring tools such as Grafana and Bugsnag, along with SQL queries.

  • Collaborate with the support organization to address cross-cutting concerns.

  • Prepare incident logs and facilitate post-incident retrospectives, including post-incident communications.

  • Analyze past incidents to provide insights for future resilience.

Key Technology Stack:

Our infrastructure is built on a robust technology stack, including:

  • Platform: Google Cloud Platform, Kubernetes

  • Storage: Google Cloud Storage, MySQL, Redis, Memcached, Elasticsearch

  • Data Distribution & Processing: Kafka

  • Programming Languages: Go, Python, Ruby

  • Monitoring: Grafana, Bugsnag

What’s in it for you:

  • Contribute to creating resilient systems for Shopify.

  • Engage with unique and challenging problems.

  • Gain deep knowledge of Shopify’s systems.

  • Directly impact our millions of merchants’ ability to generate revenue for their livelihoods, their families, and their employees through the businesses they’ve built on our platform.

Qualifications

  • Detail-oriented with excellent verbal and written communication skills.

  • Ability to communicate complex technical concepts to varied stakeholders.

  • Proficient in navigating data-heavy dashboards and identifying trends.

  • Familiarity with engineering vernacular and comfortable with inquiry.

  • Proven ability to prioritize and execute in high-pressure environments.

  • Excellent coordination and leadership skills.

Bonus Experience:

  • Experience with logging and metrics systems like Grafana and Bugsnag.

  • Skilled in building reports and knowledge bases.

  • Understanding of infrastructure platforms, services, and their metrics.

  • Familiarity with open-source software (nginx, Redis, Memcached, MySQL).

  • Working knowledge of GitHub or git, and writing SQL queries.

  • Experience with Google Cloud Platform and creating tickets.

Team Culture

We are a passionate group committed to building the best commerce infrastructure platform. We prioritize sustainable practices and long-term success, fostering an environment of continuous learning and professional growth. If you have a growth mindset, an avid curiosity, and a desire to build planet-scale infrastructure, you'll find a fulfilling home at Shopify.

Join Us

If you're excited to tackle complex challenges, learn new technologies, and contribute to a platform that powers millions of businesses worldwide, we encourage you to apply. At Shopify, we're dedicated to supporting your growth as we build the future of commerce together.

We hire people, not resumes. If you think you’re right for the role, apply now.