AI / ML Engineer

Puneeth N Ail

I design production-ready AI systems across Voice AI, LLMs, RAG, and agentic pipelines. Co-author of an arXiv-published voice-security model reaching 99.16% accuracy at 90 ms latency.

About

I'm an AI/ML Engineer focused on the parts of machine learning that reach real users — real-time voice agents, retrieval-augmented assistants, and agentic systems that reason and act. I care about latency, robustness, and shipping models that hold up in production.

Currently pursuing an MSc in Data Science, I co-authored VoiceSHIELD-Small, a single-pass model for real-time malicious-speech detection published on arXiv. My work spans the full stack of applied AI: STT/TTS pipelines, LLM orchestration, RAG, and the evaluation discipline needed to trust what these systems do.

Experience

Dec 2025 — April 2026

Tech Intern · Emvo AI

Bangalore, India

  • Architected and deployed production-ready AI voice agent pipelines using Pipecat, enabling real-time conversational AI at sub-100 ms latency.
  • Co-authored VoiceSHIELD-Small (arXiv:2603.07708) — a real-time malicious-speech detection model achieving 99.16% threat-detection accuracy at 90 ms inference latency using the Whisper-small encoder architecture.
  • Optimized speech-processing pipelines for low-latency real-time voice AI, reducing processing overhead across multi-turn agent workflows.

Skills & Expertise

Technical Skills

Focus Areas

LLMs & Prompt Engineering · RAG & Vector Search · Agentic Systems · Voice AI (STT / TTS) · Deep Learning · Model Evaluation · Tableau / Power BI · Pandas / NumPy

Selected Work

Production AI Voice Agent Pipeline

Pipecat · WebRTC · Whisper · LLMs · TTS

End-to-end real-time voice agents built on Pipecat at sub-100 ms latency: STT, LLM reasoning, and TTS run in one streaming pipeline, with VoiceSHIELD-Small inline to neutralize adversarial inputs.

AI Mental Health Coach

Flask · Gemini · Whisper · Murf.ai · LangChain

Full-stack multimodal coach for 24/7 empathetic dialogue using Gemini (reasoning), Whisper (STT), and Murf.ai (TTS). Ranked Top 5 of 200+ teams at Christ University Hackathon 2025.

CardioMind — RAG Cardiology Assistant

LangChain · Gemini 2.5 · ChromaDB · RAG

A RAG-powered assistant that retrieves and synthesizes cardiology literature for evidence-based decision support. It uses Gemini text-embedding-004 embeddings with ChromaDB for semantic search, and attributes each answer to its sources.

AIMIT Admission Assistant

Gemini · Chainlit · Prompt Eng.

A conversational chatbot integrating LLM APIs with Chainlit to automate student enquiries about admissions, fees, scholarships, and hostel facilities.

Face Attendance System

TensorFlow · OpenCV · Flask · SQLite

Real-time facial recognition at 98% accuracy, deployed for a live student cohort. Cut manual attendance time by 90%, serving 200+ students per session.

Telecom Churn Prediction

scikit-learn · XGBoost · Pandas

Predicted customer churn at 85% accuracy (vs. 72% baseline), enabling targeted retention across a dataset of 7,000+ customers.

Research

cs.SD · cs.CR · eess.AS

VoiceSHIELD-Small: Real-Time Malicious Speech Detection and Transcription

Puneeth N Ail et al.

arXiv:2603.07708 [cs.SD] · Submitted June 2026

Abstract — We present a single-pass model that combines speech transcription and malicious-intent detection over the Whisper-small encoder, protecting voice AI systems from prompt injection, social engineering, and adversarial voice commands without a second model in the loop. The unified architecture reaches 99.16% threat-detection accuracy at 90 ms inference latency.

99.16% · Threat detection
90 ms · Inference latency
Single-pass · Architecture

Education

Sep 2024 — 2026

MSc in Data Science

St. Aloysius Institute of Management & IT (AIMIT)

Specializing in Voice AI, LLMs, and Agentic Systems. CGPA 9.27

Oct 2021 — Jul 2024

BCA — Bachelor of Computer Applications

SDM College of Business Management

Advanced problem-solving for technical challenges. CGPA 8.44

Certifications