Senior Engineer - SRE - United Arab Emirates

Job description

Bridge the gap between operations and development teams, aiming to expedite development while improving reliability and quality.

Qualifications

- Build a site reliability engineering culture across the organization by sharing best practices, approaches, documentation, and code with other engineering teams.
- Create software that improves the reliability of systems in production, fixing issues and responding to incidents.
- Apply automation and software to any tasks or parts of the system that would benefit from it or are performed manually.
- Troubleshoot complicated cross-platform issues involving OS, networking, and databases in a cloud-based SaaS environment, and handle live production incidents.
- Debug and troubleshoot application and infrastructure issues; follow and implement SRE best practices.
- Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation.
- Ensure SLA/SLO error margins are adhered to by the teams before new features are released; take corrective action if error margins are out of bounds.
- Conduct system analysis and configuration management, and develop improvements for system software performance, availability, and reliability.
- Design, write, ship, and motivate the creation of software and systems that increase observability, product reliability, and efficiency.
- Work closely with software engineers and testers to ensure the system meets non-functional requirements such as performance, security, and availability.
- Document your system knowledge as you acquire it, create runbooks, and ensure critical system information is readily available to those who need it.
- Maintain and monitor the deployment and orchestration of servers, Docker containers, databases, and general backend infrastructure.
- Keep up to date with security and proactively identify, diagnose, and solve complex security issues.

Additional information

- Bachelor's degree in computer science or another highly technical, scientific discipline.
- Overall experience of 7+ years, including 4+ years as an SRE/DevOps engineer working closely with engineering teams to understand their product requirements and how they build, test, and deploy their software applications.
- Demonstrable experience in containerization (Docker) and orchestration (Kubernetes).
- Demonstrable experience with CI/CD tools such as Bitbucket, Bamboo, Nexus, and Helm.
- Experience with infrastructure as code (Terraform, CloudFormation, Ansible).
- Knowledge of and proven hands-on experience with large-scale databases and distributed technologies, such as Kafka and Confluent Platform.
- Basic programming and scripting skills (preferably Golang, Bash, shell, etc.).
- Ability to provide advice, best practices, and recommendations for the operation and deployment of Microsoft Azure.
- Experience in monitoring and analyzing infrastructure performance using standard performance monitoring tools: Nagios, New Relic, PerfMon, PerfView, ProcDump, DebugDiag.
- Familiarity with Linux and Unix systems (e.g. CentOS, Red Hat) and command-line system administration tools such as Bash, Vim, and SSH.
- Hands-on experience in configuration management of server farms (using tools such as Puppet, Chef, Ansible, etc.).
- Knowledge of network routing, load balancing, and networking protocols: a base knowledge of TCP/IP, with an understanding of HTTP and DNS.
- Knowledge of SRE and Agile methodologies.

Preferred skills (good to have)

- Demonstrated understanding of ITIL methodologies; ITIL v3 or v4 certification.
- Kubernetes CKA or CKAD certification.

First Abu Dhabi Bank
United Arab Emirates