Sarah Sair

Generative AI & Data Engineer · LLM Systems · RAG · Prompt Engineering · GPT-4 · Claude · Gemini · Python · SQL · Chicago, Illinois
Email Sarah
LinkedIn Profile
Most AI demos break. I build the systems that don't.
I'm a Generative AI & Data Engineer specializing in LLM-powered applications, prompt engineering, and AI automation systems built to perform — not just impress in a sandbox. With 10+ years of IT infrastructure behind me, I ask whether systems are reliable under edge cases, scalable under load, and measurable over time.
What I Build
LLM Systems
End-to-end pipelines using GPT-4, Claude, Gemini, and open-source models for production-ready AI workflows.
RAG Pipelines
Retrieval-Augmented Generation with optimized vector retrieval and context ranking for grounded AI responses.
Prompt Engineering
Zero-shot, few-shot, chain-of-thought, and ReAct workflows focused on consistency and output quality.
Data Pipelines
SQL and Python-based ETL pipelines and data models supporting AI applications and business intelligence.
Core Tech Stack
AI / GenAI
GPT-4 · Claude · Gemini
Prompt Engineering & RAG
LLM Evaluation & Hallucination Detection
Model Testing & Scoring
Programming
Python · SQL · REST APIs
scikit-learn · Pandas · NumPy
Data & Analytics
PostgreSQL · T-SQL · ETL Pipelines
Data Modeling · Star Schema
Power BI · DAX · KPI Design
Tools
Git · GitHub · Jupyter
AI Automation · API Integration
AI & Data Engineering — Freelance
January 2024 – Present · Chicago, IL
~70%
Less Manual Work
Reduction in manual content generation time via LLM pipelines.
~40%
Fewer Hallucinations
Estimated cut in hallucination rates through optimized RAG pipelines.
~60%
Faster Pipelines
Reduction in pipeline turnaround time via AI-to-business workflow automation.
10+
AI Use Cases
Distinct use cases served by advanced prompt workflows and evaluation systems.
Built multilingual prompt chains, structured JSON output systems, reusable Python utility libraries, and model evaluation frameworks — cutting evaluation cycles from days to hours.
April 2025 – Present · Chicago, IL
Data Engineering & Analytics
SQL & ETL Pipelines
Scalable pipelines using CTEs, window functions, and star schema models — cutting ad-hoc reporting time by ~50%.
Power BI Dashboards
Self-serve analytics tracking engagement, retention, and user performance — no SQL expertise required.
Lifecycle Analytics
Cohort analysis and churn risk monitoring enabling proactive identification of behavioral patterns.
March 2025 – May 2026 · Independent · Chicago, IL
Machine Learning & AI Projects
Built end-to-end ML pipelines — ingestion, preprocessing, feature engineering, training, and evaluation — using Python and scikit-learn, reducing development cycles from weeks to days.
Achieved consistent F1-scores above 0.85 on classification models, and validated regression and clustering models with rigorous cross-validation and hyperparameter tuning.
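For readers less familiar with the metric, here is a minimal sketch of how an F1-score is computed for a binary classifier. It is a plain-Python illustration, not the project's code; the projects above used scikit-learn, where `sklearn.metrics.f1_score` does the same job. The function name and the `positive` parameter are illustrative assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only: 1 = positive class, 0 = negative class.
example = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

F1 is defined for classification; regression and clustering models need other metrics, such as mean absolute error or silhouette score.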
Key Outcomes
~50% reduction in manual analysis via LLM-integrated workflows
40% faster project setup with reusable Python script library
Complex results communicated via dashboards for non-technical stakeholders
Education, Certifications & Awards
🎓 Education
DePaul University
Bachelor's Degree, Information Technology
📜 Certifications
Microsoft Azure AI Essentials Professional Certificate
Azure AI Engineer Associate Practice (AI-102)
Machine Learning in Telecommunication
Introduction to Prompt Engineering for Generative AI
🏆 Honors & Awards
TRIO Outstanding Achievement Award
Peer Leadership Award
Mercer Minority Scholarship
CTI Innovative Scholarship
NSF Scholarship
🌐 Languages
Urdu (Native) · Punjabi · English · Hindi — all at full professional proficiency.
AI automation system
Built an AI-powered content generation pipeline using Python, Zapier, and the OpenAI API that automatically turns a raw content brief into multi-channel marketing assets in seconds.
Developed a custom automation workflow with Google Sheets ingestion, dynamic prompt engineering, Regex-based data sanitization, and API cost optimization using Python in Zapier.
Integrated OpenAI API with structured JSON Mode to generate reliable, production-ready outputs for scalable AI workflow orchestration.
Designed an automated distribution system that parsed AI-generated JSON responses and published formatted content directly to WordPress, LinkedIn, and Slack simultaneously.
Tech Stack: OpenAI GPT-4o · Zapier · Node.js · Express · Airtable · React · Google Docs API
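The sanitize-then-parse steps described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the pipeline's actual code: the channel keys, function names, and cleaning rules are hypothetical, and the model response is assumed to be a JSON object with one string per channel.

```python
from __future__ import annotations

import json
import re

# Hypothetical channel keys; the description names WordPress, LinkedIn, and Slack as targets.
CHANNELS = ("wordpress", "linkedin", "slack")


def sanitize_brief(raw: str) -> str:
    """Strip HTML-like tags and control characters, then collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)            # drop HTML-like tags
    text = re.sub(r"[\x00-\x1f\x7f]", " ", text)   # drop control characters
    return re.sub(r"\s+", " ", text).strip()       # collapse runs of whitespace


def parse_assets(model_output: str) -> dict[str, str]:
    """Parse a JSON response and keep one non-empty string asset per known channel."""
    data = json.loads(model_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by channel")
    missing = [
        c for c in CHANNELS
        if not isinstance(data.get(c), str) or not data[c].strip()
    ]
    if missing:
        raise ValueError(f"missing or empty assets for: {', '.join(missing)}")
    return {c: data[c].strip() for c in CHANNELS}
```

Failing loudly on a missing channel keeps a partial response from being published to only some destinations.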

AI-powered YouTube Script Automation System
Built an AI-powered YouTube script automation system using GPT-4o and Zapier.
Tech: Zapier · OpenAI API · Prompt Engineering

Prompt Engineering Evaluation Framework
A production-grade harness for testing LLM prompts systematically, not by eyeballing output.
Four-dimensional weighted scoring (JSON structure, required keys, safety compliance, content rules).
Simulated OpenAI provider for continuous testing without API costs. 100% pass rate on test suite.
Reduced hallucinations 30–40% through structured prompt optimization.
Tech: Python · OpenAI API · Prompt Engineering · LLM Evaluation
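A minimal sketch of what four-dimensional weighted scoring with a simulated provider might look like. The four dimensions come from the description above; the weights, required keys, banned terms, word limit, and function names are all hypothetical and are not the framework's actual values.

```python
import json

# Hypothetical weights: the four dimensions are named above, their weights are not.
WEIGHTS = {"json_structure": 0.4, "required_keys": 0.3, "safety": 0.2, "content_rules": 0.1}
REQUIRED_KEYS = {"title", "summary"}         # hypothetical output schema
BANNED_TERMS = {"guaranteed", "risk-free"}   # hypothetical safety list
MAX_SUMMARY_WORDS = 50                       # hypothetical content rule


def simulated_provider(prompt: str) -> str:
    """Stand-in for a real LLM call: returns canned JSON with no network access or cost."""
    return json.dumps({"title": "Demo", "summary": f"Response to: {prompt}"})


def score_output(raw: str) -> float:
    """Return a weighted score between 0 and 1 across the four dimensions."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return 0.0  # nothing else can be checked if the JSON does not parse
    if not isinstance(data, dict):
        return 0.0
    text = " ".join(str(v) for v in data.values()).lower()
    summary_words = len(str(data.get("summary", "")).split())
    checks = {
        "json_structure": 1.0,  # parsed as a JSON object
        "required_keys": len(REQUIRED_KEYS & set(data)) / len(REQUIRED_KEYS),
        "safety": 0.0 if any(term in text for term in BANNED_TERMS) else 1.0,
        "content_rules": 1.0 if summary_words <= MAX_SUMMARY_WORDS else 0.0,
    }
    return sum(WEIGHTS[name] * value for name, value in checks.items())


# The simulated provider lets the scorer run in a test suite without API calls.
demo_score = score_output(simulated_provider("Summarize the launch plan"))
```

Because the provider is a plain function, a real API client with the same signature can be swapped in without changing the scoring code.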
Project
AI-powered Prompt 
A full-stack system for evaluating and refining prompts with structured scoring, iterative comparison, and AI-assisted optimization.
Designed a multi-dimensional prompt scoring framework measuring clarity, specificity, robustness, efficiency, and alignment, with hallucination-risk analysis and AI-generated optimization suggestions.
Implemented prompt versioning and head-to-head comparison workflows to track iterative improvements and evaluate performance across task types such as RAG, summarization, reasoning, and extraction.
Built a full-stack React application integrated with the Anthropic Claude API for real-time prompt evaluation, refinement, and predicted output simulation.
Tech: React 18 · JavaScript · Prompt Engineering · LLM Workflows · Tailwind CSS · Anthropic Claude API (Claude Sonnet 4) · AI Evaluation Systems
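The versioning and head-to-head comparison idea can be sketched in a few lines. The project itself is a React application; this Python version only illustrates the data model. The five dimensions come from the description above, while the 0 to 10 scale, equal weighting, and all names are assumptions.

```python
from dataclasses import dataclass

# Dimensions named in the project description; equal weighting is an assumption.
DIMENSIONS = ("clarity", "specificity", "robustness", "efficiency", "alignment")


@dataclass
class PromptVersion:
    label: str
    text: str
    scores: dict  # dimension name -> score on a hypothetical 0 to 10 scale

    def overall(self) -> float:
        """Unweighted mean across the five dimensions."""
        return sum(self.scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


def head_to_head(a: PromptVersion, b: PromptVersion):
    """Return per-dimension deltas (b minus a) and the label of the stronger version.

    Ties go to the earlier version `a`.
    """
    deltas = {d: b.scores[d] - a.scores[d] for d in DIMENSIONS}
    winner = b.label if b.overall() > a.overall() else a.label
    return deltas, winner


v1 = PromptVersion("v1", "Summarize the text.", dict.fromkeys(DIMENSIONS, 6))
v2 = PromptVersion(
    "v2",
    "Summarize the text in three bullet points for an executive reader.",
    {**dict.fromkeys(DIMENSIONS, 6), "clarity": 8, "efficiency": 5},
)
deltas, winner = head_to_head(v1, v2)
```

Keeping the per-dimension deltas alongside the winner shows where a revision traded one quality for another, here more clarity for slightly lower efficiency.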
Portfolio Project
SQL Mentor — User Performance Analysis
End-to-end analytics pipeline for modeling user performance and engagement.
Built leaderboard, streak, and rolling 7-day metrics to track user progress and retention.
Developed question difficulty modeling and query optimization to improve performance insights.
Tech: PostgreSQL · Advanced SQL · Window Functions · KPI Modeling
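A rolling 7-day metric of the kind described above can be expressed with a single window function. The sketch below runs the query against an in-memory SQLite database as a stand-in for PostgreSQL (it needs SQLite 3.25 or newer for window functions). The table, column names, and sample data are hypothetical, and the frame assumes one row per user per day.

```python
import sqlite3

# Rolling sum over the current row and the six preceding rows per user.
# With exactly one row per user per day, that is a 7-day window.
ROLLING_7_DAY = """
SELECT user_id, day,
       SUM(solved) OVER (
           PARTITION BY user_id
           ORDER BY day
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS solved_7d
FROM daily_activity
ORDER BY user_id, day
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical daily_activity table with eight days for one user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_activity (user_id INTEGER, day TEXT, solved INTEGER)")
    solved_per_day = [2, 1, 0, 3, 1, 1, 2, 4]
    conn.executemany(
        "INSERT INTO daily_activity VALUES (?, ?, ?)",
        [(1, f"2025-01-{d:02d}", n) for d, n in enumerate(solved_per_day, start=1)],
    )
    return conn


rows = build_demo_db().execute(ROLLING_7_DAY).fetchall()
```

If some days have no row for a user, a row-based frame spans more than seven calendar days; joining against a date spine first avoids that.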
Machine Learning
Telecom
Customer Churn Prediction
End-to-end machine learning pipeline for predicting telecom customer churn.
Feature engineering, model training, and evaluation
Business-focused retention insights to support decision-making
Tech: Python · Pandas · scikit-learn · Classification
Let's Build Something That Actually Works
If you're building LLM-powered products, AI workflows, or data-driven AI systems, I'd be glad to connect.
📧 Email
[email protected]
💼 LinkedIn
linkedin.com/in/sarahsair
💻 GitHub
github.com/sarahsair25