Data Scientist
Data Science
About Arcadia
We simplify energy management, so businesses can focus on everything else
Arcadia is the energy intelligence platform for businesses. One place to pay utility bills, buy energy, and advance sustainability — so teams can stop chasing data and start saving money.
About The Role
Arcadia's Applied AI team sits within R&D and is responsible for the ML and AI systems that power the company's utility data platform. The team is small and delivery-focused: a Director (Dixon Bross), one full-time ML engineer (Akshat), and a part-time contractor (Amy). The team ships production models and is embedded directly with engineering and data; this is not a research or advisory function.
The team is at capacity. With a major extraction pipeline release (Hades) reaching production this month, the next set of workstreams (forecasting, audit optimization, and agent-powered data workflows) is ready to start, but the team has no bandwidth to take it on. This hire expands delivery capacity so we can run additional workstreams in parallel.
What You’ll Do
Statement data extraction. Arcadia processes millions of utility bills, converting highly variable PDF formats into a standardized data object. This person will contribute to the classifiers, routing logic, and LLM extraction prompts that modernize existing classification paths and add AI-enabled extraction where cost-effective. We have two AI extraction methods under development; leading continued refinement of either one, or developing a new approach, is in scope.
Statement acquisition. Arcadia uses largely deterministic navigation logic to find PDF bills on utility provider websites. Supplementing these templates with automated agentic self-healing, or developing agent-driven navigation and/or template creation, could save substantial time and create direct value by improving on the manual review and correction of broken templates in place today.
Forecasting. Several business operations depend on predicting when bills will arrive and what they'll contain. This person will build and iterate on forecasting models for bill availability and spend estimation, and help develop prioritization signals that guide human and automated interventions or inform changes to business practices based on more sophisticated customer segmentation.
Audit and anomaly detection. A significant manual review queue exists today because our current detection logic generates too many false positives. This person will analyze flagged bills, quantify error rates, and help tune thresholds, laying the groundwork for a larger reduction in manual review costs.
What You’ll Bring
We're looking for someone with 1–3 years of hands-on ML experience who is comfortable working iteratively in a production codebase. The strongest candidates will have:
• Solid applied ML fundamentals (statistics, classification, regression, evaluation methodology)
• Python and SQL proficiency; scikit-learn, pandas, numpy
• Some exposure to time series or forecasting work
• Familiarity with LLMs and prompt-based workflows
• Strong written communication and data visualization skills
A Bachelor's or Master's in CS, Statistics, Math, or equivalent experience is expected.
*This job posting exists to fill a vacancy.