Consulting
I help clients turn messy records, qualitative assessments, text, geospatial information, and small or imbalanced datasets into auditable workflows and decision-ready evidence.
My work is strongest where the data are difficult to structure, the categories require domain judgment, and the results need to be credible to both technical and nontechnical stakeholders. I bring more than a decade of research and applied analysis experience across law, policy, defense, international security, and nonprofit work.
Quality assurance, rubric design, structured evaluation, and review workflows for LLM-supported analysis.
Text classification, coding workflows, training-data reconstruction, and model evaluation for legal or policy corpora.
Analysis workflows linking administrative, survey, social media, and geospatial data for stakeholder-facing evidence.
Risk-score validation, review consistency, rare-outcome prediction, model diagnostics, and reproducible reporting.
Practical analytical systems, not just one-off analysis.
Build review systems that check AI-assisted or analyst-written outputs for omissions, internal consistency, score alignment, evidence support, and category-specific quality standards.
Turn legal, policy, compliance, survey, or research text into structured data that can be reviewed, measured, and modeled.
Design evaluation workflows for programs where outcomes are indirect, hard to measure, or scattered across administrative, survey, social media, and contextual data.
Evaluate whether qualitative assessments, written justifications, and structured scores line up in ways that are consistent and defensible.
Build spatial and contextual features when records need to be linked to districts, counties, precincts, conflict zones, or other geographic units.
Move analysis out of fragile spreadsheets and one-off scripts into documented workflows that can be rerun, inspected, and extended.
Clarify the decision problem, audit available data, identify the right analytical path, and separate what can be measured now from what would require better data or labels.
Build a first version of the workflow, test model options, document assumptions, and produce outputs that stakeholders can inspect before a larger implementation.
Review an existing model, LLM workflow, evaluation design, or analytical report for methodological weaknesses, data leakage, unclear assumptions, or missing diagnostics.
Turn complex analysis into memos, visuals, presentations, and recommendations for legal, policy, executive, research, or program teams.
Built LLM-based validation and quality-control workflows for compliance assessment outputs, including category-specific rubrics, structured context, comparison examples, and schema-constrained review outputs. Earlier work developed a validation framework for private-equity risk scoring using embeddings, clustering, statistical testing, NLP features, and predictive modeling.
Evaluated a high-profile political engagement program by linking candidate records, vendor social media data, and geospatial covariates, then modeling program sign-up and post-signing sentiment across roughly 1.15 million social media posts. Also built a legal NLP workflow for automating the coding of self-expression laws.
Audited and replicated regression and machine-learning models for defense cost estimation, including support vector regression, neural networks, and decision trees.
I can scope the problem, build the data workflow, evaluate modeling options, communicate uncertainty, and translate results into recommendations that stakeholders can use.
Start a conversation