Site Reliability Leader
Full Stack Problem Solver
Finance & payments background across industry leaders including PayPal, GEICO, USAA, and American Express. I build reliable, observable platforms that protect revenue and accelerate delivery.
Career Profile
Results-driven Site Reliability Engineer specializing in observability engineering. Experienced in building scalable monitoring, logging, and tracing systems with Prometheus, Grafana, and OpenTelemetry to improve system reliability and performance. Focused on driving automation, improving incident response, and strengthening operational excellence. Seeking to grow into leadership roles and deliver greater value to the organization.
Experience
Professional Highlights
Grafana OSS Stack
- Grafana dashboarding and alerting
- Prometheus/Blackbox Exporters
- Service Discovery Automation
Python and Flask
- FastAPI/Swagger
- Pandas data management
- SQLAlchemy
Incident Management
- RCA and Postmortems
- Incident Triage
- Automation & Remediation
Docker and K8s
- Containerized app development
- Deployment with Helm
- Flux & ArgoCD