AI / GenAI Specialist · Full-Stack Engineering Lead

I build production AI & full-stack systems that scale.

6+ years shipping LLM, RAG and agent systems on a React/Next.js, Node/NestJS, Python and AWS stack. I lead the engineering team at India's largest dental e-commerce platform — where I've proven these systems at real scale. The same patterns power any high-traffic product: SaaS, fintech, marketplaces and healthtech.

5,000+ AI queries handled / day
5x web performance gain
87% search cost cut
Leading a team of 6
Impact at a glance

Numbers from real production systems

5,000+
Daily customer queries auto-resolved by my AI assistant
−70%
Support tickets after deploying the GenAI chatbot
15→80+
Core Web Vitals after Next.js migration (5x)
−87%
Monthly search cost ($2,300→$300)
−80%
Cloud egress cost (S3→Cloudflare R2)
50+
AWS Lambda functions for event-driven scale
25+
Containerized microservices on AWS ECS
30+
CI/CD pipelines (incl. AI code review)
Problems → Solutions → Results

Real engineering problems I've solved

These come from running India's largest dental e-commerce platform, but every fast-growing product hits the same walls: support can't keep up, pages get slow, search gets expensive, traffic spikes break things, and the cloud bill spirals. Here's how I solved each at scale.

Support team drowning in repetitive order queries

"Where's my order?", returns, refunds, cancellations — thousands of repetitive tickets per day, slow responses, rising support headcount.

Python · FastAPI · LangChain · LlamaIndex · RAG · MCP · Vector DB · AWS ECS
Built
A production AI assistant with intent classification + tool-calling (Claude/OpenAI), a RAG layer over policy/FAQ docs, and MCP servers exposing live order data from MySQL/MongoDB — so it can actually track, place, search, return and cancel orders, not just chat. Deployed on AWS ECS with Redis caching; integrated into WhatsApp and the storefront widget.
Result
Handles 5,000+ queries/day and resolves them end-to-end — support tickets down 70%.
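
As a rough illustration of the intent-classification and tool-calling flow described above, here is a minimal Python sketch. It is not the production code: a keyword matcher stands in for the LLM classifier, an in-memory dict stands in for the live order data exposed through MCP, and every name in it (`ORDERS`, `classify_intent`, `track_order`, `cancel_order`, `handle_query`) is hypothetical.

```python
import re
from typing import Callable, Dict

# Hypothetical in-memory order store; the real assistant reads live order
# data through MCP servers backed by MySQL/MongoDB.
ORDERS: Dict[str, Dict[str, str]] = {
    "A1001": {"status": "shipped"},
    "A1002": {"status": "processing"},
}

# Keyword rules standing in for the LLM-based intent classifier.
INTENT_KEYWORDS = {
    "cancel_order": ("cancel",),
    "track_order": ("where", "track", "status"),
}


def classify_intent(text: str) -> str:
    """Return the first intent whose keywords appear in the text."""
    lowered = text.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return intent
    return "fallback"


def track_order(order_id: str) -> str:
    order = ORDERS.get(order_id)
    if order is None:
        return f"Order {order_id} was not found."
    return f"Order {order_id} is {order['status']}."


def cancel_order(order_id: str) -> str:
    order = ORDERS.get(order_id)
    if order is None:
        return f"Order {order_id} was not found."
    if order["status"] == "shipped":
        return f"Order {order_id} has already shipped and cannot be cancelled."
    order["status"] = "cancelled"
    return f"Order {order_id} has been cancelled."


# Tool registry: each intent maps to one callable that takes an order ID.
TOOLS: Dict[str, Callable[[str], str]] = {
    "track_order": track_order,
    "cancel_order": cancel_order,
}


def handle_query(text: str) -> str:
    """Classify the query, extract an order ID, and call the matching tool."""
    intent = classify_intent(text)
    match = re.search(r"\b[A-Z]\d{4}\b", text)
    if intent == "fallback" or match is None:
        # In the system described above, this path would go to the RAG layer.
        return "Let me look that up in our help articles."
    return TOOLS[intent](match.group(0))
```

The point of the registry is that the assistant acts on orders through a small set of named tools rather than free-form text, which is what lets it resolve a query end-to-end instead of only replying to it.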

Slow storefront killing SEO and conversions

A heavy client-rendered React site meant poor Core Web Vitals, weak Google rankings, and a costly third-party prerender service just to be crawlable.

Next.js · SSR / SSG / ISR · React · TypeScript · SEO
Built
Architected and led the full migration to Next.js with server-side rendering and incremental static regeneration, eliminating the prerender service and shrinking build size.
Result
Core Web Vitals jumped 15 → 80+ (5x), with improved SEO rankings and organic traffic.

Product search that was slow, weak, and expensive

A $2,300/month hosted search bill, limited relevance control, and customers who couldn't find products by image or voice.

Typesense · Vector Search · Embeddings · AWS Bedrock · Node/NestJS
Built
Re-architected search onto self-managed Typesense with real-time sync pipelines and a full search dashboard; added image and voice product search using vector embeddings and AWS Bedrock.
Result
Search cost cut 87% ($2,300 → $300/mo) with richer discovery (text, image, voice).
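
To show the idea behind embedding-based product search, here is a minimal Python sketch that ranks items by cosine similarity. It makes several assumptions: the three-dimensional vectors and product names are made up, the `CATALOGUE` dict stands in for a real vector index, and `search` is a hypothetical helper. In production the embeddings come from a model and the nearest-neighbour lookup runs inside the search engine.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Toy 3-dimensional "embeddings" with made-up values; real embeddings come
# from a model and have hundreds of dimensions.
CATALOGUE: Dict[str, List[float]] = {
    "dental mirror": [0.9, 0.1, 0.0],
    "composite resin": [0.1, 0.9, 0.2],
    "curing light": [0.2, 0.7, 0.6],
}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector: Sequence[float], top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k catalogue items most similar to the query vector."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in CATALOGUE.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Because an image query and a text query can both be mapped to a vector, the same ranking step can serve either one.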

A legacy monolith that couldn't scale the catalogue

The Magento catalogue service was slow to extend, expensive to run, and a bottleneck for new features.

NestJS · GraphQL · REST · KrakenD · Docker
Built
Replaced the Magento catalogue with a purpose-built NestJS microservice, exposed via REST and GraphQL behind a KrakenD API gateway, deployed on AWS ECS/ECR.
Result
Lower AWS bill and faster feature delivery, with a clean path to a microservices architecture.

A cloud bill spiralling out of control

Storage egress, an external ETL pipeline, and over-provisioned compute were quietly inflating monthly AWS spend.

Cloudflare R2 · AWS Zero-ETL · Redshift · ECS/RDS · CloudWatch
Built
Migrated static assets from S3 to Cloudflare R2, replaced a Hevo ETL pipeline with AWS Zero-ETL (RDS → Redshift), right-sized ECS tasks and RDS instances, and shortened log retention.
Result
80% egress reduction and significant, recurring monthly savings.
Built for peak load

Handling high-concurrency traffic

E-commerce traffic isn't steady — sales, festival rushes and marketing pushes create sudden floods of concurrent requests. I architected the platform to absorb those spikes without falling over.

How it stays up under load

Compute
25+ containerized microservices on AWS ECS/ECR with load balancing via ELB, so capacity scales horizontally during spikes.
Event-driven
50+ AWS Lambda functions wired through API Gateway, EventBridge, SNS and SQS — absorbing bursts asynchronously instead of blocking users.
Speed
Redis caching and Next.js ISR cut repeated load on the origin; the KrakenD gateway routes requests to backend services and protects them.
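
The caching idea above can be sketched as a cache-aside lookup with a time-to-live. This is a minimal illustration under stated assumptions, not the production setup: an in-memory dict stands in for Redis, and `TTLCache`, `get_or_load` and the injectable `clock` parameter are hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for Redis: each key expires after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader only on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the origin is not touched
        value = loader()  # cache miss: go to the origin once
        self._store[key] = (now + self._ttl, value)
        return value
```

During a traffic spike, many requests for the same product or page then cost one origin call per TTL window instead of one call per request.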

Why it matters

The AI assistant alone fields 5,000+ live queries a day against real order data — concurrent reads/writes across MySQL, MongoDB and microservice APIs — while the storefront serves peak shopping traffic. The result: stable response times and no degradation when load surges.

Autoscaling · Async / queues · Caching · Load balancing · Observability
Toolbox

Full-stack, AI-first

AI / GenAI

LLMs (Claude, OpenAI) · RAG · AI Agents · MCP · LangChain · LlamaIndex · Vector Search · AWS Bedrock · Embeddings

Backend

Node.js · NestJS · Python · FastAPI · Django · GraphQL · REST · KrakenD · Microservices

Frontend

React · Next.js · TypeScript · JavaScript (ES6+) · SSR / SSG / ISR · HTML5 / CSS3

Cloud, Data & DevOps

AWS (ECS, Lambda, RDS, Redshift, S3) · Docker · GitLab CI/CD · Cloudflare R2 · MongoDB · PostgreSQL · MySQL · Redis · Typesense
Always leveling up

What I'm sharpening next

The roadmap I'm following to stay at the front of production AI engineering.

Agentic AI & LLMOps

Multi-agent orchestration (LangGraph, CrewAI), evaluation & guardrails, prompt/version management, and observability for LLM apps (LangSmith, tracing, cost/latency monitoring).

Scale & reliability

Kubernetes (EKS) for orchestration beyond ECS, load/stress testing (k6), and resilience patterns — circuit breakers, rate limiting, and graceful degradation under burst traffic.
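
For one of the resilience patterns named above, here is a minimal circuit-breaker sketch in Python. It is a learning illustration rather than code from any deployed system, and the names (`CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, `reset_timeout`, `clock`) are hypothetical choices for the example.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down period has passed."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock  # injectable so the timeout can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, so one trial call is let through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Rejecting calls fast while a dependency is down is what allows graceful degradation: the caller can serve a cached or reduced response instead of waiting on timeouts.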

Credentials & depth

AWS Certified Solutions Architect, deeper vector-DB tuning (pgvector, Pinecone, Qdrant), fine-tuning vs. RAG trade-offs, and streaming data (Kafka) for real-time personalization.

Let's build something

Have an AI or full-stack problem to solve?

I take ideas from architecture to production — for SaaS, fintech, marketplaces, healthtech and e-commerce. Available for senior full-time roles and select freelance/contract engagements.

New Delhi, India · Open to Remote