Site Reliability Engineer
Remote with a home office in Iselin, NJ or Charlotte, NC
We are seeking a Site Reliability Engineer within the Digital Distribution and Mobile Technology group at our Fortune 100 Financial Services client.
As a team member with this client, you will be joining a Scaled Agile (SAFe) environment. This is the ideal opportunity for a fast-paced engineer who wants to play an integral role within a highly visible team. This individual should be an avid problem-solver who is passionate about modernizing systems and is looking for an opportunity to grow within this established team for years to come.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead TIAA in DevOps automation and best practices.
- Practice sustainable incident response and blameless postmortems.
- Take a holistic approach to problem solving by connecting the dots during a production event through the various technology stacks that make up the platform, to optimize mean time to recover.
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Share knowledge and mentor junior resources.
- Design, implement, and enhance our deployment automation based on Chef. We need proven experience writing Chef recipes/cookbooks as well as designing and implementing an overall Chef-based release and deployment process.
- Support deployments of code into multiple lower environments. Support current processes as needed, with an emphasis on automating everything as soon as possible.
- Design and implement a Git-based code management strategy that will support multiple environment deployments in parallel. Experience with automation for branch management, code promotions, and version management is a plus.
- BS degree in Computer Science or a related technical field involving coding, or equivalent practical experience.
- Experience with algorithms, data structures, scripting, pipeline management, and software design.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Ability to help debug and optimize code and automate routine tasks.
- We support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed.
- Experience in one or more of the following is preferred: C, C++, Java, Python, Go, Perl or Ruby.
- Interest in designing, analyzing and troubleshooting large-scale distributed systems.
- We need team members with an appetite for change and for pushing the boundaries of what can be done with automation.
- Experience in working across development, operations, and product teams to prioritize needs and to build relationships is a must.
- Strong verbal and written communication skills.
Brooksource provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, national origin, age, sex, citizenship, disability, genetic information, gender, sexual orientation, gender identity, marital status, amnesty or status as a covered veteran in accordance with applicable federal, state, and local laws.