Head of Cloud Product Technology Operations & Site Reliability Engineering




Requisition ID: 101695

Join a purpose-driven, winning team, committed to results, in an inclusive and high-performing culture.


The team:

Scotiabank’s Global Technology Services (GTS) Cloud Product Technology Operations & Site Reliability Engineering (SRE) is responsible for the operations engineering required to provide highly available and resilient systems.


The role:

Reporting to a VP, you are accountable for implementing and maintaining a portfolio of Scotiabank-developed infrastructure products that enable Scotiabank’s use of the public cloud. As a strong technical leader, you will lead System Operations, Site Reliability Engineering and DevOps teams in the deployment of applications and foundational platforms.


Is this role right for you?

  • You thrive in a people management role & want to lead teams responsible for 24x7 operations of Public Cloud based infrastructure. This would include: 1) fostering the talent of employees by providing effective coaching and development planning discussions and taking an active role in addressing skills gaps; and 2) coaching and leading teams in investigating, analyzing, and resolving system problems.
  • You can provide technical leadership on bridge calls during major incidents until they are resolved. You will do this by driving Root Cause Analysis (RCA) sessions to identify the root cause and enable resolution.
  • You strive to identify opportunities to improve resilience & availability within the Cloud Products and partner with the Engineering teams to enhance them.
  • You gain immense satisfaction from ensuring the portfolio of products attains the service levels defined to meet both technical and business objectives.
  • You can provide direction in the implementation of performance management, disaster recovery, monitoring and access management services.
  • You can maintain and update policies and procedures for public cloud specific operations functions.
  • You’re able to proactively identify short- and long-term product improvements, and bring forward innovative ideas and opportunities aligned to the Cloud product portfolio.
  • You can create Service Level Objectives, Service Level Indicators and Error Budgets for all Cloud Products.
  • You will ensure the implementation of all Telemetry, KPIs and Dashboards for each SLO related to the Cloud Products.
  • Seeking input and obtaining buy-in to business plans with internal partners to ensure support for priorities is a part of your DNA.
  • You are passionate about building resilience tools for a Cloud ecosystem, such as probes, tracing, and Machine Learning systems for microservices & containers, to support failure attribution and create circuit breakers.
  • You can build a testing methodology for a library of microservices and modules to ensure applications using the Cloud ecosystem are thoroughly tested for resiliency prior to deployment.
  • You like the challenge of managing the contract deliverables of outsourced service vendors.
  • You can communicate effectively with diverse audiences, departments and levels, acting as an advocate for product-driven infrastructure solutions.


Do you have the skills that will enable you to succeed in this role?

  • You have strong communication (verbal/written/presentation) skills in English.  The same in Spanish is an asset.
  • You have at least 10 years of hands-on working experience in the people management of 20 or more technical resources.
  • You can demonstrate the ability to train & manage a team of Site Reliability Engineers, and you have a passion for driving teams towards high performance and a deep pride in quality craftsmanship that delights users.
  • You have at least 5 years of hands-on working experience in executing IT Operations management best practices, combining ITIL and IT Service Management (ITSM) with a focus on Agile development and other frameworks like Scrum & Continuous Delivery.
  • You have at least 2 years of technical working knowledge of production operations of applications built in public cloud.
  • You consider yourself to be an expert at leading technical teams to resolve major incidents on bridge calls.
  • You possess good interpersonal skills to build relationships with internal / external business partners and vendors.
  • You possess strong cross-domain knowledge of operating scalable, enterprise-level systems, including:
    • Container-based systems (Docker, Kubernetes, Cloud Foundry)
    • Cloud storage, network, and security resources of Azure and GCP
    • DevOps tools such as Git, Jenkins, Anthos, Terraform, Ansible
    • Monitoring & analysis tools such as ELK, Dynatrace, Splunk, PowerBI or BigQuery
  • You have working experience using an Infrastructure as Code (IaC) approach to the deployment and maintenance of systems.
  • You can demonstrate critical thinking & thought leadership as a senior technical leader.
  • You can demonstrate working experience with technology roadmaps & strategies.
  • You have a strong desire to learn, to grow yourself and your team.
  • Experience in Risk Management is an asset.
  • You have completed a post-secondary education in computer science, engineering or a related technology field.


What's in it for you?

  • You'll get to work with and learn from diverse industry leaders who hail from top technology companies around the world.
  • We have an inclusive and collaborative working environment that encourages creativity and curiosity, and celebrates success! We also foster an environment of innovation and continuous learning.
  • We care about our people, allowing them to design how they work to deliver amazing results.
  • We offer a competitive total rewards package, including a performance bonus, company matching programs (pension & Employee Share Ownership), generous vacation, health/medical/wellness benefits, and employee banking privileges.
  • We currently work remotely from home. When it is safe to return to the office, our primary location in downtown Toronto is:
    • Designed to enable collaboration through both environment and technology.
    • Located in the heart of Toronto’s financial district, right above the TTC’s Line 1 King subway station. The site has access to The PATH & is minutes from the GO Transit/VIA Rail hub at Union Station, as well as the TTC’s King 504 streetcar line.
    • Minutes from the Gardiner Expressway & the DVP.
    • Next door is The Commons, a dining space for employees, where breakfast & lunch are served. The Bean serves hot/cold beverages & snacks, with plenty of room to lounge & recharge. The PATH also offers many meal/snack options, plus shopping & services for your everyday needs, without venturing outside.



Location(s):  Canada : Ontario : Toronto 

Scotiabank is a leading bank in the Americas. Guided by our purpose: "for every future", we help our customers, their families and their communities achieve success through a broad range of advice, products and services, including personal and commercial banking, wealth management and private banking, corporate and investment banking, and capital markets.  

At Scotiabank, we value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. If you require accommodation (including, but not limited to, an accessible interview site, alternate format documents, ASL Interpreter, or Assistive Technology) during the recruitment and selection process, please let our Recruitment team know. Candidates must apply directly online to be considered for this role. We thank all applicants for their interest in a career at Scotiabank; however, only those candidates who are selected for an interview will be contacted.

Toronto, ON M5H1B6, CA