Senior Operations Engineer, Retail Store Operations & Support - Retail & Marcom Engineering
Apple
At Apple, we don’t just build products - we revolutionize entire industries. Our innovation is driven by the diverse ideas and people that form the foundation of everything we do, from cutting-edge technology to our environmental leadership.
Retail and Marcom Engineering, an IS&T team, builds and operates the systems and experiences that connect Apple's products with its customers. The team owns the technology behind both Apple's online and physical stores and drives the interactive marketing experiences and tools that keep creative operations moving. Together, those functions deliver the technology behind every product story Apple tells and every purchase a customer makes.
Description
The IS&T Retail Engineering Operations & Support team is the foundation that makes the magic of Apple Retail possible, both in-store and online. We support the Apple Online Store and over 500 Apple Retail Stores with exceptional, business-focused technical services and innovative engineering solutions. We build, operate and support the systems, applications, and tools that power Apple Store operations, and we run a global follow-the-sun support model that keeps every store and every customer connected to what they need, every minute of every day.
As part of Retail Engineering Store Operations & Support, you will play a crucial role in detecting and resolving issues that impact our global retail environment. This role sits at the intersection of production support, SRE, and applied AI engineering. This is a team scaling fast, going deep technically, and investing heavily in the next generation of automation, agentic operations orchestration, and operational intelligence.
This is a hands-on engineering role with high visibility across Apple's global retail technology landscape and an opportunity to shape how production support operates at scale.
As a Senior Operations Engineer on the transactional operations vertical, you will own deep technical expertise across some of Apple Retail's most critical systems, including Point of Sale, Apple Financial Services, Carrier Services, Runner, and Catalog. You will combine that domain expertise with hands-on troubleshooting while building and extending GenAI agents, screening tools, and automation that fundamentally change how our team detects, investigates, and resolves issues.
You will partner closely with Engineering, SRE, and business teams to drive root cause analysis, deliver process improvements, and bring clarity to complex technical problems for both technical and non-technical stakeholders.
The ideal candidate is a proactive problem solver who is energized by fast-moving production environments where speed, scale, and precision all matter, who codes when needed, and who communicates clearly across audiences. If you are passionate about supporting reliable, high-impact systems that serve millions of customers worldwide, this may be the perfect opportunity for you.
Responsibilities
Build and extend GenAI agents and automation that scale incident detection, triage, and resolution across the team.
Develop deep technical ownership of transactional systems (Point of Sale, Apple Financial Services, Carrier Services, Runner, Catalog) to investigate, debug, and resolve complex production incidents with speed and precision.
Lead root cause analysis using application logs, telemetry, observability platforms, and ML-driven insights, partnering with engineering and leadership to prioritize and drive resolution.
Contribute to the design and development of functional requirements, technical specifications, and support documentation for automation and process improvement initiatives.
Partner cross-functionally with engineering teams and business stakeholders to enhance product delivery, support strategy for new launches, and continuously improve service quality.
Coordinate with our globally distributed team of consultants to ensure consistent execution of support functions.
Participate in an on-call rotation to support critical applications and services, which may include infrequent non-standard hours and weekend coverage.
Preferred Qualifications
Hands-on experience building and orchestrating agentic workflows with SOTA language models, LLM-based automation, or AI-augmented operational tooling.
Demonstrated ability to work across distributed systems, APIs, and microservices at an architectural level - understanding how failures propagate across system boundaries.
Strong understanding of networking fundamentals (TCP/IP, DNS, HTTP/TLS, load balancing) with the ability to diagnose connectivity issues between distributed systems.
Strong root cause analysis skills using diverse data sources including application logs, telemetry, and customer feedback.
Experience with scripting languages (Python preferred) for log analysis, data investigation, and lightweight automation of operational workflows.
Excellent ability to communicate complex technical issues clearly and concisely to both technical and non-technical stakeholders.
Experience coordinating with distributed teams across multiple time zones.
Familiarity with CI/CD pipelines, Git-based version control, and release management/deployment processes.
Experience with data visualization and analysis tools such as Tableau or Power BI.
Proficiency in ITIL practices, including incident, problem, and change management.
Experience validating data feeds between SAP (MM, SD, FI) and downstream retail systems, with the ability to identify and troubleshoot discrepancies in inventory, pricing, and order data.
Previous experience supporting eCommerce platforms or Retail / Payment systems at scale is a plus.
Minimum Qualifications
Bachelor's degree or higher in Computer Science, Information Technology, or a related field, or equivalent work experience.
5+ years of experience supporting critical, customer-facing systems in a high-volume production environment.
3+ years of hands-on experience with incident management platforms (e.g. ServiceNow) and issue tracking tools (e.g. Jira).
3+ years of practical experience with Splunk, including dashboard creation, SPL querying, and alert configuration for production triage, performance degradation analysis, and incident resolution.
3+ years of experience performing structured root cause analysis using application logs, telemetry, distributed traces, and customer feedback across complex, multi-system environments.