Site Reliability Engineer
- Ortigas Center, Pasig, Metro Manila, Philippines
- Full time
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run production systems. SRE ensures that August 99’s services, both internally critical and externally visible systems (e.g. GitLab/developer tooling and hosted client sites), have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping a close watch on capacity and performance.
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
- Install new / rebuild existing servers and configure hardware, services, settings, directories, storage, etc. in accordance with standards and project/operational requirements.
- Perform daily system monitoring, verifying the integrity and availability of all hardware, server resources, systems and key processes, reviewing system and application logs, and verifying completion of scheduled jobs such as backups.
- Perform regular security monitoring to identify any possible intrusions.
- Regularly work on improving August 99’s security practices, including:
  - Recommending new technologies to improve threat assessment and mitigation.
  - Assisting in the migration to new technologies.
  - Assisting coworkers with infosec best practices to ensure cross-coverage within the team.
- Practice sustainable incident response and blameless postmortems.
- Perform ongoing performance tuning and resource optimization as required.
- Apply OS patches and upgrades on a regular basis, and upgrade administrative tools and utilities. Configure / add new services as necessary.
- Develop and maintain installation and configuration procedures, especially related to automation.
- Design, develop, troubleshoot and debug software programs for databases, applications, tools, networks etc.
- As a member of the site reliability and IT team, you will assist in defining and developing software for tasks associated with developing, debugging, or designing software applications or operating systems.
- Provide technical leadership to other software developers.
- Specify, design and implement modest changes to existing software architecture to meet changing needs.
- Analyze system and software security and change procedures or code when necessary.
- Stay informed about new and relevant CVEs, potential bugs, viruses, worms, etc., and how to take preventive or corrective measures for each.
- Duties and tasks are varied and complex, requiring independent judgment.
- Candidates should be fully competent in their own areas of expertise. May take a project lead role and/or supervise lower-level personnel. BS or MS degree or equivalent experience relevant to functional area.
- Extensive (5 years) software engineering or related experience.
- Required Skill(s): DevOps, Site Reliability Engineer, SRE, Python, Bash, Jenkins, Jira, PHP, WordPress, Git, Java, AWS, Azure, Linux, Unix, Microsoft SQL Server Database Administrator, Active Directory, MongoDB, C++, and GitLab/developer tooling.
- Willing to work a mid-shift schedule, from 10:00 am to 7:00 pm, and on weekends.
- People oriented and results driven
- Proactive and detail-oriented
- Strong interpersonal and analytical skills