5 - 8 years
35.0 - 60.0 Lacs P.A.
Hyderabad
Posted: 2 weeks ago
Hybrid
Full Time
ABOUT THE TEAM

The Generative AI Quality & Experience team owns ABC's end-to-end conversational stack: the prompt library, multi-turn orchestration layer, and the evaluation platform that guards against hallucinations, logic regressions, and prompt drift. We build and run the automated pipelines that score accuracy, safety, latency, and cost for every agent interaction across Evo, Glofox, Ignite, and Trainerize. Working hand-in-hand with product managers, UX writers, backend engineers, and LLM researchers, we turn breakthrough language-model capabilities into reliable, brand-aligned user experiences, at scale and under tight latency budgets.

As our Prompt Engineer / Conversation Designer (Backend), you'll be the technical linchpin: designing conversation flows, hardening prompts, and proving quality through rigorous, metrics-driven testing.

At ABC, we love entrepreneurs because we are entrepreneurs. We know how much grit it takes to start your own business and grow it into something that lasts. We roll our sleeves up, we act fast, and we learn together. You will be working closely with talented software engineers and senior engineering leaders to optimize and build upon our single integration platform strategy.

WHAT YOU WILL DO

•Build automated evaluation pipelines (Promptfoo, OpenAI Evals, custom test harnesses) that score prompts and agent flows for correctness, relevance, safety and cost.
•Act as the bridge between backend engineering and product interaction teams, making sure every proof-of-concept behaves exactly as designed and agent logic remains consistent from prototype to production.
•Define Gen-AI quality metrics (answer accuracy, reasoning-step coverage, hallucination rate, latency and user-satisfaction scores) and surface them in live dashboards.
•Own regression-test coverage: schedule nightly eval runs, track logic regressions and prompt-stability issues, and coordinate “human-in-the-loop” reviews for gray-area failures.
•Rapidly iterate and fine-tune prompts for GPT-4, Claude and open-weights models, using the evaluation pipeline to lift accuracy and suppress hallucinations.
•Architect multi-turn conversation flows and function-call chains that invoke micro-services or tool APIs to fulfil user intents.
•Design agent personas, guardrails and JSON-schema response formats that downstream services can reliably parse.
•Document prompt patterns, failure modes and best practices in an internal library adopted by all product squads.
•Collaborate with UX conversation designers to match tone, brand voice and accessibility guidelines.
•Mentor engineers on prompt safety, system-prompt layering and structured output design; champion an experimentation culture across teams.

WHAT YOU WILL NEED

•5–8 years of total software/ML engineering experience, including 2–3 years focused on QA or evaluation of ML/LLM systems.
•Hands-on experience building test harnesses for LLM behaviour using OpenAI Evals, Promptfoo or similar frameworks; comfortable designing A/B and regression suites.
•Proven success shipping features on GPT-4, Claude or similar models in production.
•Deep skills with LangChain/LlamaIndex, OpenAI function-calling and JSON-schema prompts, plus retrieval-augmented-generation (RAG) pipelines.
•Strong coding in Python (preferred) or Node.js; familiarity with CI/CD and serverless runtimes on AWS or Azure.
•Meticulous attention to detail and documentation; able to codify evaluation protocols so results are reproducible and debuggable.
•Solid grasp of conversation-design heuristics (intent recognition, turn-taking, error recovery) and evaluation metrics (BLEU/F1, toxicity scores).
•Exceptional written and verbal communication: able to translate model behaviours and trade-offs for both technical and non-technical audiences.
•Demonstrated mentoring skills: cultivating curiosity, psychological safety and innovation within diverse teams.