Infrastructure

Production Operations Lead

Contract-to-Hire

Brooksource

The Opportunity

The IT Operations team at Brooksource’s client is looking for an outstanding Production Operations Lead. The Production Operations Lead will serve as a highly specialized, senior-level technical application lead focused on operational stability, driving IT operations readiness through continuous improvement of our products. This role involves working closely with development teams and business partners, coaching junior operations teams, and implementing enhanced monitoring and alerting capabilities for our distributed platforms. The Production Operations Lead will also help develop automation to reduce mean time to recovery (MTTR) and manual tasks. The ideal candidate will have experience driving large-scale development efforts in an agile environment, as well as a thorough understanding of DevOps practices with a focus on managing production environments. We are looking for a high-energy team player with an innovative mindset who is interested in joining a group of IT professionals dedicated to enhancing IT operations. This position reports to a Director of IT Production Operations. A passion for technology and problem solving is a must.

The Work Itself

  • Collaborates with Cloud Engineering, Agile squads and developers, sustain teams, and business partners, and contributes significantly to developing specifications that resolve problems and address enhancement needs, focusing on logging, monitoring, and metrics for operational readiness
  • Uses technical knowledge, creativity, and company practices to drive down the occurrence of incidents by developing proactive alerting and monitoring
  • Develops runbooks and patterns for on-prem/AWS/DevOps operations
  • Serves as a mentor to early-talent developers and IT operations team members
  • Participates in technical discussions with the development team for deployments and code reviews
  • Ensures knowledge transition from development to operations teams for functional deployments
  • Works with business and development partners to gather input for new capabilities in displaying, monitoring, and alerting on key performance indicators (KPIs) by tracking business transactions (BT) in real time
  • Plans for validation and verification of changes deployed by infrastructure, development, and security teams
  • Facilitates day-to-day execution of real-time technical support and troubleshooting
  • Attends change advisory board meetings and approves changes
  • Supports business continuity and disaster recovery activities
  • Leads maintenance of master documents (e.g., runbooks and playbooks) and helps maintain accurate application documentation
  • Provides guidance in resolving performance-related issues and designing solutions for any technical issues faced by the application
  • Resolves any technical issues faced by the application
  • Conducts real-time queue monitoring and decision making
  • Represents application requirements for infrastructure activities
  • Assesses, plans, and communicates steady-state upgrades and patches
  • Participates in the Discovery Phase to assist with scoping efforts for application teams
  • Devises and develops solutions to meet or exceed business requirements
  • Participates in technical meetings and status meetings with the business
  • Coordinates between upstream applications to resolve incidents
  • Communicates with several different technology areas in a highly matrixed organization

The Skills You Bring

  • BS (preferably MS) in Computer Science or a related field preferred
  • 5+ years of experience in a similar sustain role and extensive knowledge of associated processes
  • Shows deep knowledge and understanding of enterprise-scale platforms and architectures
  • Possesses strong analytical and problem-solving skills and exhibits strong leadership skills
  • Experience coordinating between upstream applications to resolve incidents
  • Grasps new technologies and can adapt to rapid shifts in priorities
  • Experience with implementing sustainable, audit-ready processes to support IT controls such as executing deployment, access management, audits, incident management, change management, etc.
  • Applied experience with as many of the following as possible: Unix and Windows platforms, Java EE, JavaScript, Spring, Spring Boot, REST APIs/microservices, shell scripting, Python, PL/SQL, and databases, specifically Oracle
  • Hands-on AWS/cloud experience preferred
  • Hands-on experience with DevOps tools such as Python, Terraform, Jenkins, GitLab, Ansible, and Docker preferred
  • Experience with Splunk, AppDynamics, or other similar monitoring tools preferred
  • Ability to correlate environment conditions and metrics to application events
  • Experience debugging problems in a distributed system
  • Experience with source control management and build tools including SVN

Brooksource provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, national origin, age, sex, citizenship, disability, genetic information, gender, sexual orientation, gender identity, marital status, amnesty or status as a covered veteran in accordance with applicable federal, state, and local laws.

JO-2111-117662
