Senior DevOps Engineer with LLM Experience

8 - 13 years

0.5 - 0.6 Lacs P.A.

Pune, Bengaluru, Gurgaon

Posted: 2 months ago | Platform: Naukri

Skills Required

Docker, AWS Cloud, DevOps, Kubernetes

Work Mode

Hybrid

Job Type

Full Time

Job Description

DevOps++, Kubernetes, Docker, AWS, Cloud

What You Will Do

Design, implement, and maintain LLM operations workflows using tools like Langfuse to monitor performance, track usage, and create feedback loops for continuous improvement
Develop and maintain infrastructure-as-code for AI deployments using Terraform and AWS services (Lambda, SQS, API Gateway, OpenSearch, CloudWatch)
Build and enhance monitoring, logging, and alerting systems to ensure optimal performance and reliability of our LLM infrastructure
Collaborate with AI engineers to design and implement evaluation frameworks (including LLM-as-judge systems) to measure and improve model performance
Manage prompt versioning, testing, and deployment pipelines through CI/CD and custom tooling
Implement and maintain security guardrails for LLM interactions, ensuring compliance with best practices
Create comprehensive documentation for LLM operations, including runbooks for production incidents
Participate in on-call rotations to support mission-critical AI systems
Drive innovation in LLM operations by researching and implementing best practices and emerging tools in the rapidly evolving GenAI space
Kubernetes and Docker implementation

What You Will Bring

To succeed in this role, you will need a combination of experience, technology skills, personal qualities, and education.

Required Qualifications

3+ years of experience in DevOps, SRE, or similar roles, with at least 1 year specifically working with LLMs or AI systems in production
Kubernetes and Docker experience
Strong hands-on experience with AWS cloud services, particularly Bedrock, Lambda, SQS, API Gateway, OpenSearch, and CloudWatch
Experience with infrastructure-as-code using Terraform, CloudFormation, or similar tools
Proficiency in Python and experience building automation tooling and pipelines
Familiarity with LangOps platforms such as Langfuse for LLM observability and evaluation
Experience with CI/CD pipelines
Knowledge of logging, monitoring, and alerting systems
Understanding of security best practices for AI systems, including prompt injection mitigation techniques
Excellent troubleshooting and problem-solving skills
Strong communication skills and ability to work effectively with cross-functional teams
Must be legally entitled to work in the country where the role is located

Preferred Qualifications

Experience with prompt engineering and testing tools like Promptfoo
Familiarity with vector databases and retrieval-augmented generation (RAG) systems
Knowledge of serverless architectures and event-driven systems
Experience with AWS Guardrails for LLM security
Background in data engineering or machine learning operations
Understanding of financial systems and data security requirements in the finance industry
Familiarity with implementing technical solutions to meet compliance requirements outlined in SOC2, ISAE 3402, and ISO 27001

IT Services and IT Consulting
Sunnyvale, CA
