
What You’ll Take On

Windows Administration
- Manage and maintain Windows servers, ensuring their stability, security, and performance.
CheckMK
- Utilize CheckMK for comprehensive monitoring and alerting, ensuring all systems are functioning optimally.
Linux Administration
- Diagnose and resolve issues on Linux systems, ensuring minimal downtime and maximum efficiency.
VMware
- Manage virtual environments using VMware, ensuring resources are optimized and available.
vSAN Understanding
- Demonstrate a solid understanding of vSAN for effective storage management and troubleshooting.
Cloud Administration
- Administer and manage cloud services across AWS, Azure, Splunk, and GCP, ensuring seamless integration and operation.
Risk Assessment
- Assess potential risks and impacts on game services and revenue, taking proactive measures to mitigate them.
Issue Identification
- Identify issues, alerts, and critical service incidents using provided dashboards and monitoring tools.
Service Troubleshooting
- Utilize studio playbooks to troubleshoot and diagnose basic issues across various services.
Communication
- Relay accurate and timely information regarding service impacts to game studios, ensuring effective communication during incidents.
Incident Management
- Spearhead outage management, including communication, triage, and escalation.
Daily On Call
- Triage and troubleshoot critical alerts from critical systems.

What You Bring

Experience:
- Live Services Knowledge: Understanding of live services and their operational requirements.
- Change/Crisis Management: Experience in managing change and crisis situations, ensuring minimal disruption to services.
- Effective Communicator: Able to relay accurate and timely information to the game studio and other stakeholders.
- Team Player: Works well in a collaborative environment, sharing knowledge and supporting team members.
Proactive Problem-Solving:
- A commitment to continuous improvement and proactive issue resolution.
- Proven experience in troubleshooting production problems affecting live services.
- Able to identify potential issues before they become critical and manage details effectively.
Background:
- At least 1 year of experience in a similar role and/or 3 years of experience in a relevant role.
