Prithiviraj S

Technical Operations Manager • CloudOps • SRE • Integration Specialist • Automation • Jenkins


About Me

Accomplished Technical Operations Manager with strong expertise in production stability, SRE practices, automation, and large-scale CloudOps. Proven leader with experience managing 21+ team members across Production Support, Project Onboarding, and Production Integration teams, working closely with Engineering, QA, Product, Cloud, and third-party partners to resolve issues and improve operational maturity. Highly skilled in monitoring, RCA, incident lifecycle management, and proactive alerting, ensuring reliable and resilient systems. Experienced with ITSM tools (JIRA, ServiceNow, Freshdesk, GLPI) and known for improving workflows through custom dashboards, CI/CD optimization, and Jenkins automation to enhance SLA compliance. Strong background in Grafana, Zabbix, Elasticsearch, AWS CloudWatch, synthetic monitoring, and cloud-native workloads, with a track record of leading high-impact incidents, mentoring teams, and building repeatable processes that reduce recurring issues and improve KPIs.

Experience

Nandu’s Food Pvt Ltd — Technical Operations Manager (2025–Present)

• Centralized monitoring (Zabbix)

• Desktop Application development

• Python script automation

• Jenkins job scheduling

• Automated Reporting

• GLPI ticketing tool configuration

• EDC machine integration


Role Responsibilities

• Manage production monitoring, alerts, and incident response to ensure high system availability and SLA compliance.

• Oversee cloud and infrastructure operations including servers, applications, networking, and security.

• Implement Python-based automation to reduce manual effort and improve operational efficiency.

• Perform root cause analysis for recurring incidents and drive long-term corrective actions.

• Coordinate with development, QA, and vendors while leading the TechOps team to ensure smooth deployments and stable operations.


Key Achievements

• Reduced recurring production issues by implementing Python-based automation and monitoring enhancements.

• Improved system uptime and SLA compliance through proactive alert tuning and incident prevention.

• Built and enhanced operations dashboards providing real-time visibility to management.

• Improved operational SLA adherence to 99%, ensuring consistent on-time issue resolution over a three-month period.

• Implemented auto-escalation alerts, reducing communication delays and significantly improving response times.

• Reduced manual effort by 60% through API-based reporting, automated monitoring, and desktop automation tools.

BOSCH L.OS — Senior Technical Lead (2024–2025)

• SRE

• Grafana dashboard creation and monitoring

• Synthetic monitoring (Canaries)

• ServiceNow (SNOW) ticketing tool

• Downtime reduced by 95%


Role Responsibilities

• Led the end-to-end L3 technical operations team for the BOSCH L.OS platform, overseeing incident management, RCA, performance optimization, and overall system stability.

• Drove seamless project onboarding while managing complex technical issues through advanced monitoring, proactive alerting, and detailed incident reporting categorized by severity, root cause, and resolution time.

• Conducted in-depth Root Cause Analysis (RCA) and configured AWS Synthetic Monitoring Canaries using CloudWatch and log-based analysis to eliminate recurring issues and ensure early failure detection.

• Designed and deployed advanced Grafana dashboards to enhance observability, real-time tracking, and microservice-level analytics while promoting operational excellence through automation and performance tuning.

• Collaborated closely with Dev, QA, Cloud, and business teams to optimize workflows, strengthen system resilience, and implement proactive monitoring mechanisms that reduced downtime and prevented customer-impacting incidents.


Key Achievements

• Reduced system downtime by 95% through proactive measures, performance tuning, and automation.

• Enhanced Operations Dashboard visibility, improving team efficiency.

Shiprocket Omuni — Technical Ops Lead / Manager (2021–2024)

• Team Leadership and Collaboration

• Incident and Problem Management

• Integration onboarding for new projects

• Reporting and Documentation

• Python Script Automation

• Problem Solving and Critical Thinking

• Service-Level Agreements (SLA)

• Jira Service Desk and Freshdesk ticketing tool configuration

• 98% of recurring issues fixed


Role Responsibilities

• Managed end-to-end application server production issues, performing detailed root cause analysis to ensure stability and prevent recurring incidents.

• Served as the primary point of contact for clients and partners during onboarding, addressing challenges that arose during integration promptly and effectively.

• Performed regression testing in pre-prod to ensure that changes did not negatively impact existing systems.

• Worked with the technical aspects of the integration process, including data flows, APIs, and system configurations, and collaborated with technical teams to troubleshoot and resolve technical problems.

• Handled end-to-end integration for Myntra, Flipkart, and Amazon within the IMS ecosystem, including inventory data synchronization via OMUI, order processing through OMS, and timely invoice generation and delivery to clients.

• Collaborated with cross-functional teams to resolve critical issues, support product enhancements, and maintain seamless business operations.

• Supervised and guided application support teams, reducing recurring issues through strategic automation and process improvements.

• Implemented a cross-functional collaboration framework, improving communication between technical and product teams and achieving a 95% reduction in incident resolution time.

• Strengthened operational efficiency by leveraging Jenkins CI/CD automation and Kubernetes log monitoring, and by building a knowledge-sharing platform that enhanced collaboration and reduced troubleshooting time.


Key Achievements

• Developed Python automation scripts and proactive monitoring dashboards that reduced daily ticket volume from 60+ to 20+, while cutting recurring issues by 98% and enabling 100% faster onboarding across teams.

• Implemented automation solutions that increased operational efficiency by 100%, significantly reducing manual effort for the 8-member support team and accelerating issue resolution.

• Enhanced CI/CD practices and cross-team collaboration, resulting in improved operational performance and a substantial reduction in Mean Time To Resolution (MTTR).

Arvind Internet Private Limited — Technical Support Engineer - II (2017–2021)

• CI/CD Pipeline Automation

• Kubernetes Setup

• AWS (S3 Browser, EC2 Instance)

• Elasticsearch, Kibana

• RESTful APIs

• SFTP & FTP Configurations

• Linux Commands

• Postman Setup

• Resolved complex application and integration issues through detailed RCA using Postman APIs, MySQL, Elasticsearch, Kibana, and Linux logs, ensuring optimal performance and high availability.

• Developed and integrated Python + REST API automation scripts with Jenkins cron scheduling to streamline operational workflows and reduce manual intervention.

• Managed AWS EC2 instances, including deployment, configuration, optimization, and scalability improvements, to support seamless cloud operations.

• Documented technical solutions on Confluence and provided L2/L3 support for escalated issues, ensuring smooth operations and rapid issue resolution.

Arvind Internet Private Limited — Software Solution Engineer (2014–2017)

• Resolved complex application and integration issues through detailed RCA using Postman APIs, MySQL, Elasticsearch, Kibana, and Linux logs, ensuring optimal performance and high availability.

• Developed and integrated Python + REST API automation scripts with Jenkins cron scheduling to streamline operational workflows and reduce manual intervention.

• Managed AWS EC2 instances, including deployment, configuration, optimization, and scalability improvements, to support seamless cloud operations.

• Documented technical solutions on Confluence and provided L2/L3 support for escalated issues, ensuring smooth operations and rapid issue resolution.

Skills

☁️ Cloud & Infrastructure

• AWS (EC2, ECS, Lambda, S3, IAM, CloudWatch)

• Cloud Operations & Cost Optimization

• Server & Application Migrations

• Kubernetes (Pods, Scaling, Monitoring)

⚙️ Operations & SRE

• Production Support (24×7)

• Site Reliability Engineering (SRE)

• Incident & Problem Management

• Root Cause Analysis (RCA)

• SLA / KPI Management

• Change & Release Management

📊 Monitoring & Observability

• Zabbix

• Grafana

• AWS CloudWatch

• Synthetic Monitoring (Canaries)

• Proactive Alerting & Health Checks

• Performance Monitoring & Optimization

🤖 Automation & CI/CD

• Python Automation

• Jenkins (Pipelines, Jobs, Automation)

• CI/CD Optimization

• Cron Jobs & Scheduled Tasks

🗂️ Logging & Analytics

• Elasticsearch

• Logstash

• Kibana (ELK Stack)

• Log Analysis & Troubleshooting

🛠️ ITSM & Collaboration Tools

• JIRA

• ServiceNow (SNOW)

• Freshdesk

• GLPI

• Phabricator (Code Review & Change Management)

👥 Leadership & Process

• Team Leadership (21+ members)

• Incident Bridge Leadership

• Shift & Roster Management

• Process Improvement & Documentation

• Cross-team & Vendor Coordination

• Mentoring & Knowledge Sharing

Projects

Contact

Mobile: +91 9066660537

Email: sprithiviraj.mca@gmail.com
