Resume
My Resume on GitHub
Install / Use
/learn @puneetsl/ResumeREADME
Puneet Ludu's Resume
Latest Version: v10.8
Resume in Markdown
Puneet Ludu
ML Engineer & Tech Lead | 13+ years in ML/AI systems
puneet [dot] ludu [at] gmail [dot] com | New York, NY | +1-(716) eight six seven four three four four
puneet.io
Experience (13+ years)
Zillow (Zestimate)
Machine Learning Engineer (Tech Lead), Sep 2021 - Present, Remote
-
Listing IQ: Interactive CMA Platform (Django, DocumentDB, PyTorch) [Real-time Valuations + Embeddings]
Led proof-of-concept and core architecture for platform now live as Listing IQ CMA: AI-powered tool enabling agents to customize property comparisons with map-based filtering, editable valuations, and amenity-level explanations. Designed APIs integrating real-time Zestimate valuations, property embeddings, and comparative analysis. Mentored junior engineer through production launch.
Impact: POC → shipped product powering agent pricing decisions; foundation for new agent revenue products
-
Active Listing Comps Engine (Apache Spark, Metaflow, H3 Geospatial) [Similarity Scoring + Batch/REST APIs]
Architected and built similar listings comparison engine from ground up: batch pipeline and real-time API using H3 geospatial indexing and customizable ranking algorithms. Processes 3M+ listings, powers Zillow Showcase dashboards (8M monthly views) helping agents demonstrate listing performance vs non-Showcase homes. Led migration to append-only listing data source; built lifecycle management and deduplication layer to track active, sold, and removed listings across markets. Production in 5 weeks. Later led algorithm improvements.
Impact: 5 weeks to production; 95% complaint reduction, coverage 90%→98.5%; metrics cited in sales, marketing, and investor communications
-
Infrastructure & Engineering Leadership (Terraform, AWS, Metaflow, Docker)
Built Valuation API from POC to production: FastAPI service handling ~6K requests/day with zero P2 alerts since launch. Led Zestimate 6.6 deprecation (Redis→hybrid cache feature store for real-time valuations, legacy system shutdown), saving $350K annually. Drove several Zestimate model point releases to completion; created MR tracking system, led ETL optimizations. Completed CI/CD pipeline modernization across all team services with zero production incidents.
Impact: $500K+ combined annual savings, 61% alert reduction (641→249 YoY)
-
AI/ML Initiatives (In Progress) (PyTorch, CLIP, LangChain, Databricks) [Multimodal Explainability → Agentic AI]
Built LLM-powered customer care agent (React + LangChain): first AI assistant deployed to Zillow customer support for personalized Zestimate explanations. Leading Image Embeddings project: experimenting with DualLossAutoEncoder to predict property condition from CLIP embeddings (3.3M listings) for Zestimate integration. Prototyping explainable valuation architecture to produce per-feature dollar-level contributions for downstream AI agents. Led core Zestimate neural network ETL pipeline migration to Databricks; identified and resolved 8x latency and 10x cost regression before production deployment.
Impact: First LLM tool deployed to customer care; building multimodal and explainability infrastructure for next-generation valuation products
-
Mentorship & Technical Leadership
Managed summer intern (2023): designed project plan, weekly check-ins, received "strongly favorable" feedback. Mentored 4+ engineers across many projects, deployments, and onboardings. Created RFC templates, Stacked MR best practices etc. adopted by the team. Conduct technical interviews, lead code reviews, and coordinate cross-team design discussions.
Match Group (OkCupid)
Machine Learning Engineer, May 2020 - Sep 2021, New York City
-
Discount Optimization (Python, Keras, TensorFlow, Weights and Biases) [Wide&Deep]
Owned end-to-end ML pipeline for subscription discount optimization: feature engineering, model training, A/B testing, deployment, and production monitoring. Discovered high prediction variance across model runs; designed ensemble uncertainty estimation using 100-model bagging to quantify and stabilize outputs for production deployment.
Impact: 6% overall revenue increase through A/B tested pricing models
FactSet
ML Engineer → Senior ML Engineer, Apr 2015 - May 2020, New York City
-
ML-Powered Financial Data Extraction (Python, TensorFlow, Keras, Sagemaker) [CNN, ELMo, BiLSTM]
Led multiple ML initiatives: (1) Speaker identification system for earnings calls using spectrograms and CNNs, (2) Private company fact extraction from 1.6M websites using ELMo/BiLSTM. Rewrote MLangID language identification service. Led machine translation infrastructure (Polish SMT achieving BLEU 69.10).
Impact: 20% reduction in human-hours for earnings call processing, automated extraction from millions of documents
-
Financial Document Search & Ranking Systems (Apache Spark, Java, Python) [Distributed Trie, N-gram LM, Vector Space]
Led team of 3 engineers on Document Screening: built autosuggestion and concept similarity systems. Created FingerPrinter deduplication service (10× response improvement: 1000ms→100ms). Architected Formula Lookup using distributed trie and n-gram language models on Spark.
Impact: Improved formula ranking from 5.6 to 2.3, 66% faster document processing, powered StreetAccount trending news
-
Technical Leadership
Established engineering best practices: Jenkins CI, comprehensive test suites, documentation standards. Mentored new hires and junior engineers. FingerPrinter became the model Java project within the ML group.
Tata Research Development and Design Centre
ML Research Engineer, July 2011 - July 2013, India
-
Event Detection in Time Series (Java, Python, RapidMiner) [SVM - RBF]
Wrote an algorithm based on Shape Context for finding frequently occurring patterns and events, with as good results as SAX, DTW etc. with 7% better results in the particular domain of car sensors.
-
Data Harmonization Framework (DHF) (Java, Apache Pig)
Implemented an ETL framework that exploits the power of map-reduce and big-databases to fuse incongruous enterprise data from disparate sources in near real time.
Skills
| Languages | Python • Java • C/C++ • Bash • SQL | |---|---| | ML/AI | PyTorch • TensorFlow • CLIP • LangChain • RAG • A/B Testing • Uncertainty Estimation • Monitoring | | Data & Infra | PySpark • Databricks • MLflow • Metaflow • FastAPI • Django • Docker • Kubernetes • Terraform | | Cloud & CI/CD | AWS (S3, EC2, SageMaker) • GitLab CI • W&B • Pinecone | | Leadership | System Design • Technical Interviews • Mentoring • RFC Authorship • Cross-team Coordination |
Publications
- Inferring Latent Attributes of an Indian Twitter user using Celebrities and Class Influencers, ACM Hypertext 2015
- Inferring gender of a Twitter user using celebrities it follows, CORR 2014
- Architecture for Automated Tagging and Clustering of Song Files According to Mood, IJCSI, 2010
Education
- Master of Science in Computer Science, State University of New York, Buffalo, NY
- B. Tech. in Computer Science and Engineering, JIIT, India
Open Source & Community
| Organizer @ MUFin | Program committee member, paper reviewer at top ML conferences: Workshop on Modeling Uncertainty in the Financial Sector (AAAI 2023, ECML-PKDD 2022) | |---|---| | Lotion | Unofficial Notion.so Desktop app for Linux (2K+ GitHub stars / 60K+ Clones & Downloads) | | Romadeva | Tool to convert Roman script to Indic(Devanagari) script (Used by Translators Without Borders) | | Quena | Question and Answering system -- Indexed 1.6 Million Wikipedia documents, designed a question parser and a ranking algorithm based on popularity. (Apache Solr, NER, POS tagger) |
Download Options
Previous Versions
All previous versions are available in the versions directory.
Compilation Instructions
For instructions on how to compile this resume, see RUNSTEPS.md.
Related Skills
node-connect
351.4kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
110.7kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
351.4kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
351.4kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
