Who They Are
Our client is a UK-based Net Zero consultancy that helps organisations measure and manage their energy consumption and operational expenses across multiple sites.
Each month, they receive energy and expense data from numerous locations across the UK. However, this data arrives in a wide range of spreadsheet formats sourced from different platforms, email submissions, and files stored in OneDrive.
Because the structure of these spreadsheets varies significantly, consultants previously spent a large amount of time manually preparing, restructuring, and validating the data before it could be analysed.
Missing months of data were also common. To maintain reporting continuity, consultants relied on manual estimation logic to fill the gaps. However, this logic was rarely captured in a consistent or structured way. When team members changed roles or left the organisation, the reasoning behind earlier transformations was often lost.
As a result, new consultants had to reverse engineer previous work, understand how estimations had been applied, and ensure the methodology remained consistent.
In addition to energy data, the consultancy received a large volume of invoice PDFs containing important information that needed to be extracted manually. This created another time-consuming administrative burden, preventing consultants from focusing on the higher-value analytical work their clients expected.
Tools Used
Automated ingestion of data from OneDrive
Data cleaning and restructuring
Built‑in estimation logic for missing months
PDF parsing and data extraction for invoices
PowerBI
Interactive dashboard visualising energy and expenses per site
Snowflake
Centralised repository for processed and transformed data
Detailed Solution
Data Preparation
We began by automatically ingesting the monthly energy and expenses spreadsheets directly from OneDrive into Alteryx.
We then consolidated multi‑row records into single lines and restructured the data using dataset headings.
Cleaning steps removed null rows and columns, alongside unwanted leading or trailing whitespace.
The dataset was then split into two clearly defined tables, energy and expenses, making downstream reporting more efficient and intuitive.
Estimation Logic
The estimation process is now fully documented, auditable, and repeatable, ensuring consistent application regardless of staffing changes.
We also generated a dedicated report showing every value that was estimated, giving the client full transparency and traceability.
Invoicing PDFs
Using Alteryx, all invoice PDFs were imported directly from OneDrive.
Computer Vision tools extracted the text, which we then structured using a combination of parsing and transformation tools.
We grouped data by client name, removed unnecessary fields, and prepared a clean, reliable dataset ready for analysis.
The transformed invoice data was then loaded into Snowflake as part of the client's central data warehouse.
Output
Once transformed, all datasets were pushed into Power BI to create dynamic dashboards visualising energy usage and expenses by site. Timestamping and user tracking allow the client to see exactly when and by whom changes were made, ensuring ongoing alignment with transformation rationale.
Finally, all processed data was written into the client’s Snowflake warehouse, giving them a single source of truth for all energy and expenses reporting.
Impact
Full Traceability
Automated estimation logic and structured workflows preserve the transformation rationale end‑to‑end.
Efficiency Boost
Replacing manual estimation, restructuring, and invoice extraction greatly reduced consultant workload and freed them to focus on higher‑value activities.
User-Friendly
Timestamping and user metadata ensure that every change is transparent, consistent, and fully auditable.

