Donna Glassbrenner, Ph.D.

Statistician & Data Scientist | Math PhD + Deep Stats + ML

Anomaly Detection • Impact Analysis • Risk Quantification

25 Years Solving High-Stakes Analytical Challenges | Fully Remote

GitHub LinkedIn

About Me

Statistician & Data Scientist with 25+ Years Solving High-Stakes Analytical Challenges

PhD mathematician with a deep understanding of the mathematics behind machine learning, which lets me build custom statistical solutions that outperform standard ML approaches. A recent hybrid fraud detection system illustrates the production side: a 48% cost reduction versus a rules-only baseline on simulated data, achieved by optimizing for business cost rather than standard metrics. Foundation projects show the statistical side through custom methods achieving 30% improvements beyond typical approaches.

What I bring:

Domain experience:

Specialized in anomaly detection, risk quantification, and impact measurement. Recognized with 21 federal awards for analytical innovation and cross-functional collaboration.

Technical tools: Python (pandas, numpy, scikit-learn, matplotlib), SQL, SAS, Tableau, Git/GitHub. Deployment experience: dbt, Databricks, Snowflake, Streamlit, Hugging Face.

Explore my projects below to see real-world examples of my analytical approach and impact.

Fraud Detection Projects

Self-Initiated R&D via Analysis Insights, LLC (6 months intensive focus): These projects pair production thinking with statistical methods that lift ML results beyond what standard approaches achieve. My current flagship project is built around business cost rather than model metrics, while the foundation projects explore how rigorous statistical methods improve ML accuracy, reveal hidden patterns, and drive better outcomes.

Hybrid Fraud Detection: Production-Ready ML + Rules System (Current Project)

While most portfolio projects optimize for F1-score, this system optimizes for total business cost, reflecting a production and business orientation that goes beyond typical portfolio work.

Hybrid Architecture: Combined rule-based logic (impossible travel, burst detection, velocity thresholds) with Random Forest ML (18 features) for explainability + nuance. Rules provide instant explanations for blocked transactions; ML captures subtle patterns rules miss.
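A minimal sketch of how a rules-plus-model decision of this kind could be wired together. The field names, thresholds, and the `hybrid_decision` helper are hypothetical illustrations, not the project's actual code, and the model score is assumed to come from an already-trained classifier such as a Random Forest.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Transaction:
    amount: float
    km_from_last_txn: float       # distance from the previous transaction
    hours_since_last_txn: float   # elapsed time since the previous transaction
    txns_last_10_min: int         # simple burst/velocity counter


def rule_reason(txn: Transaction,
                max_speed_kmh: float = 900.0,
                max_burst: int = 5) -> Optional[str]:
    """Return a readable reason if a hard rule fires, else None.

    Both thresholds are illustrative assumptions.
    """
    # Guard against division by zero for back-to-back transactions.
    speed = txn.km_from_last_txn / max(txn.hours_since_last_txn, 1e-9)
    if speed > max_speed_kmh:
        return "impossible travel"
    if txn.txns_last_10_min > max_burst:
        return "burst of transactions"
    return None


def hybrid_decision(txn: Transaction,
                    ml_score: float,
                    ml_threshold: float = 0.5) -> Tuple[str, str]:
    """Rules decide first and explain themselves; the model handles the rest."""
    reason = rule_reason(txn)
    if reason is not None:
        return ("block", reason)
    if ml_score >= ml_threshold:
        return ("review", f"model score {ml_score:.2f}")
    return ("approve", "no rule fired; low model score")
```

Checking the rules before the model means every rule-based block carries its own explanation, while the model score only decides the cases the rules leave open.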

Rigorous Production Methodology: Grid search over 512 threshold combinations using proper validation methodology. The cost function incorporates realistic payment industry economics (EMV liability, dispute modeling, customer churn). This business-driven optimization yielded $40K in additional savings (a 20% improvement) vs. arbitrary thresholds.
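The cost-based search can be sketched roughly as follows. This assumes three thresholds with eight candidate values each (8 × 8 × 8 = 512, one possible way to reach that count; the actual grid is not stated here), a simplified cost function, and randomly generated data standing in for a held-out validation split. All names and numbers in the block are illustrative.

```python
import itertools
import numpy as np


def total_cost(y_true, flagged, amounts, fp_cost=40.0):
    """Simplified cost: a missed fraud costs its amount, a false positive a flat fee."""
    missed = (y_true == 1) & ~flagged
    false_pos = (y_true == 0) & flagged
    return amounts[missed].sum() + fp_cost * false_pos.sum()


def grid_search(y_true, scores, amounts, velocity, grids):
    """Evaluate every threshold combination and keep the cheapest one."""
    best_cost, best_combo = np.inf, None
    for score_t, amount_t, velocity_t in itertools.product(*grids):
        # Flag if the model score is high OR an amount-and-velocity rule fires.
        flagged = (scores >= score_t) | ((amounts >= amount_t) & (velocity >= velocity_t))
        cost = total_cost(y_true, flagged, amounts)
        if cost < best_cost:
            best_cost, best_combo = cost, (score_t, amount_t, velocity_t)
    return best_combo, best_cost


# Synthetic stand-in for a validation split (assumption, not project data).
rng = np.random.default_rng(0)
n = 2000
y_true = (rng.random(n) < 0.02).astype(int)
scores = np.clip(0.6 * y_true + rng.normal(0.2, 0.15, n), 0.0, 1.0)
amounts = rng.gamma(2.0, 50.0, n)
velocity = rng.poisson(1 + 3 * y_true)

grids = (np.linspace(0.1, 0.8, 8), np.linspace(50, 400, 8), np.arange(1, 9))
best_combo, best_cost = grid_search(y_true, scores, amounts, velocity, grids)
print(best_combo, round(best_cost, 2))
```

Thresholds chosen this way should then be scored once on a separate test split, so the reported cost is not biased by the search itself.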

Realistic Issuer Economics: Modeled EMV liability shift (merchant pays for 85% card-present fraud post-2015), 3D Secure adoption (15% U.S. merchants vs. 80% Europe), dispute probability patterns (30% for <$10, 95% for >$500), customer churn (2% after false positive × $2,000 LTV), interchange revenue loss (2%).
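As a rough illustration of how figures like these translate into per-transaction costs, the sketch below uses the churn, lifetime value, interchange, and liability numbers quoted above. How they are combined is an assumption made here for illustration (for example, applying the 2% interchange rate to the declined amount, and treating the issuer as bearing the remaining 15% of card-present fraud); the project's actual cost function may differ.

```python
CHURN_PROB_AFTER_FP = 0.02           # from the text: 2% churn after a false positive
CUSTOMER_LTV = 2_000.0               # from the text: $2,000 lifetime value
INTERCHANGE_RATE = 0.02              # from the text: 2% interchange revenue loss
MERCHANT_SHARE_CARD_PRESENT = 0.85   # from the text: merchant liability share


def false_positive_cost(amount: float) -> float:
    """Expected issuer cost of wrongly declining a legitimate transaction."""
    expected_churn = CHURN_PROB_AFTER_FP * CUSTOMER_LTV
    # Assumption: interchange is lost on the full declined amount.
    lost_interchange = INTERCHANGE_RATE * amount
    return expected_churn + lost_interchange


def missed_card_present_fraud_cost(amount: float) -> float:
    """Issuer share of a missed card-present fraud (assumed to be the other 15%)."""
    return (1.0 - MERCHANT_SHARE_CARD_PRESENT) * amount


print(false_positive_cost(100.0))
print(missed_card_present_fraud_cost(200.0))
```

Under these assumptions, wrongly declining a $100 purchase costs about $42 in expectation, most of it from churn risk rather than lost interchange.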

8 Fraud Typologies: Analyzed detection rates across card testing (98%), stolen card CNP (95%), account takeover (92%), friendly fraud (90%), synthetic identity (85%), refund fraud (88%), application fraud (87%), lost/stolen card (95%). Detailed pattern specifications for each type.

Results: 48% cost reduction vs. rules-only ($314K → $162K) and 10% vs. ML-only ($180K → $162K), on simulated transaction data (47,109 transactions, 1.63% fraud rate, 500 cardholders, 6 months).
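The headline percentages follow directly from the quoted cost totals. A short check, using only the dollar figures above (the helper name is illustrative):

```python
def pct_reduction(baseline: float, new: float) -> float:
    """Percentage cost reduction of `new` relative to `baseline`."""
    return 100.0 * (baseline - new) / baseline


rules_only, ml_only, hybrid = 314_000, 180_000, 162_000

print(round(pct_reduction(rules_only, hybrid)))  # 48
print(round(pct_reduction(ml_only, hybrid)))     # 10
```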

🧠Random Forest 🧠cost-based optimization grid search 🔍business-driven ML 🐍Python 🐍scikit-learn

Foundation Work: How Deep Statistics Elevates ML

Supporting research demonstrating the rare combination of mathematical rigor + modern ML that drives superior results. These projects show how statistical expertise reveals insights ML alone misses, achieves accuracy improvements beyond standard approaches, and enables analytical solutions requiring mathematical depth most data scientists lack.

Statistical Enhancements to Machine Learning

Demonstrating how deep statistical expertise reveals insights and improves accuracy beyond what ML alone achieves: rigorous mathematical methods applied to elevate ML results.

Business Optimization

Analyzing trade-offs between fraud capture, false positives, and investigation costs to optimize business outcomes.

ML Model Benchmarking & Deployment

Systematic model comparison and hands-on deployment demonstrations using modern data platforms.

Technical Expositions

Deep dives into the mathematical foundations and domain-specific considerations of fraud detection ML.

Customized Data Analyses

📋 Coming soon: Detailed write-ups for these projects. To illustrate the analytical challenges I have solved, I use fully simulated data and altered contexts so as not to reveal any non-public information. These examples showcase my problem-solving approach and custom analytical solutions.

This section spotlights challenging, custom data analysis problems I have solved in settings where standard approaches often fall short.

When I was learning and teaching math and statistics, I might have wondered how often the problems I would later encounter in the "real world" would be solved by simple cookie-cutter applications of the formulas and techniques I was learning or teaching. It turns out, not very often.

Most of the time, the data being used or the question being asked deviated from standard protocols in some way (e.g., involving a ratio, rare events, or reporting lag). Or the client knew what they wanted in general but somewhat ambiguous terms that didn't quite translate into math. Or the technique involved an approximation and it wasn't clear whether the approximation would be good enough for the client's requirements.

I don't know what "most" analysts do in these situations. Some (or many?) might lack the in-depth math understanding needed to address such issues head-on and instead default to applying cookie-cutter techniques that might not give accurate answers. They might or might not be able to explain the limitations of their simplified analysis to their client. The client walks away with what they think are solid conclusions, but they aren't.

I approach these situations differently. I enjoy the challenge of formalizing ambiguous problems, identifying and addressing deviations from standard protocols, and determining whether approximations are good enough. I have the math and statistics background needed to tackle these issues head-on, and I can explain the limitations of various approaches to my clients so they can make informed decisions.

This section highlights some of the types of customized data analysis problems I have solved. I use made-up numbers and have hidden contextual details so as not to reveal non-public information.

To showcase the value of my custom analyses, I sometimes include what a cookie-cutter approach would have produced, the kind you might have gotten from generative AI, a Stats 101 website, or a lesser-equipped analyst.

Vehicle Safety Work

My vehicle safety work includes extensive collaboration with engineers and behavioral scientists to design studies, analyze data, and evaluate safety interventions. While some analyses were unpublished or advisory, the published projects below demonstrate impactful modeling and statistical innovation used to improve vehicle safety and policy. For illustrative examples of unpublished analytic work using simulated data and made-up contexts—including fraud-themed examples—see the Customized Data Analyses section and the Fraud Detection Projects section.

Contact

With expertise in anomaly detection, risk assessment, and impact measurement, I bring deep statistical rigor and production-oriented thinking to complex analytical challenges. Currently available for consulting through Analysis Insights, LLC and open to full-time opportunities.

Best way to reach me: Connect on LinkedIn or send me a message there