The majority of AI projects do not fail at the modeling stage. They fail months earlier — in the conversation where a stakeholder said “let’s do AI” and a team of engineers nodded and opened a Jupyter notebook without asking a single clarifying question. The problem is always upstream.
At the Engrammers, we have observed this pattern enough times to build a structured four-phase workflow that every engagement must pass through before a single line of modeling code is written. This dispatch documents that workflow in full. It is not proprietary. We are publishing it because the industry needs more teams building for production and fewer teams building for demos.
// 01 The Problem With “Just Build A Model”
AI projects fail at deployment, not at modeling. They fail for three compounding reasons that are almost always present together: no clear business question was defined at the outset, data was assumed to exist and to be clean rather than audited, and success was never formally defined before the work began. The consequence is a technically functional model that solves a problem no one actually has, built from data no one fully understood, evaluated against a metric no one agreed to.
This is not a failure of engineering competence. It is a failure of process. The model is often excellent. The engagement is a waste.
A model that no one uses is worth less than a spreadsheet that gets checked every Monday morning. Deployment and adoption are part of the technical spec.
// 02 Phase 1: Scoping
Scoping is the most important phase and the most frequently skipped. Its purpose is to translate a vague request — “we want AI” — into a precise technical brief. Nothing moves forward until four questions have been answered in writing and agreed upon by all stakeholders.
- Define the decisions that need to be made — not “do AI”, but “predict X to make decision Y”
- Identify available data sources and their current state: format, location, access, freshness
- Define success in business terms — e.g. “reduce stock-out events by 20%”
- Agree on what “done” means before starting — including who signs off and when
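The four answers above can be captured as a short structured brief that is versioned alongside the code. A minimal sketch, with illustrative field names and example values rather than a prescribed schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScopingBrief:
    """The written output of Phase 1, agreed before any modeling work."""
    decision: str         # "predict X to make decision Y", never "do AI"
    data_sources: list    # format, location, access, freshness per source
    success_metric: str   # business terms, with a number attached
    done_criteria: str    # who signs off, and when

# Hypothetical example for a demand-forecasting engagement
brief = ScopingBrief(
    decision="Predict weekly SKU demand to set reorder quantities",
    data_sources=[
        "ERP order history (Postgres, daily refresh)",
        "Supplier lead times (shared spreadsheet, weekly)",
    ],
    success_metric="Reduce stock-out events by 20% within two quarters",
    done_criteria="Ops lead signs off after a 4-week shadow deployment",
)
```

Committing the brief to the repository makes the agreement auditable: if the target metric changes mid-engagement, the change shows up in version control.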
Scoping typically takes one to two working days for a well-organised client. For clients with fragmented knowledge ownership, it can take a full week. That time is never wasted: every hour spent in scoping saves at least five hours of rework downstream.
// 03 Phase 2: Data Engineering
Data engineering is where most optimistic project plans collide with reality. The assumption that data “exists” is almost never entirely true. Data exists in some form, in some state — and the gap between that and “data the model can learn from” is precisely what this phase closes.
- Data audit: examine schema, null rates, duplicate records, and feature distributions to understand what you actually have
- Cleaning pipeline: standardise date and categorical formats, handle missing values with documented imputation strategies, remove or flag duplicates
- Feature engineering: transform raw data columns into model-relevant inputs — lag features, rolling aggregates, binary encodings, interaction terms
- Pipeline automation: build reproducible, scheduled pipelines so that the same transformations run automatically on new data at inference time
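The audit step above can be sketched in a few lines of pandas; the column names and sample rows here are hypothetical:

```python
import pandas as pd

def audit(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise what we actually have: types, null rates, cardinality."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_rate": df.isna().mean().round(3),
        "n_unique": df.nunique(),
    })

# Hypothetical raw extract with a missing value and a duplicated row
raw = pd.DataFrame({
    "order_date": ["2026-01-03", "2026-01-03", None, "2026-01-05"],
    "sku": ["A1", "A1", "B2", "B2"],
    "qty": [10, 10, None, 7],
})
report = audit(raw)
duplicates = int(raw.duplicated().sum())
```

Running the audit before writing any cleaning code turns "the data exists" from an assumption into a measured statement.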
```python
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime, timedelta

# Default arguments for the DAG
default_args = {
    "owner": "engrammers",
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
    "start_date": datetime(2026, 3, 4),
}

# The run_* callables are the project's own task functions, defined elsewhere
with DAG(
    dag_id="data_engineering_pipeline",
    default_args=default_args,
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Task 1: Ingest raw data from source system
    ingest = PythonOperator(task_id="ingest_raw_data", python_callable=run_ingestion)
    # Task 2: Clean — nulls, types, duplicates
    clean = PythonOperator(task_id="clean_and_standardise", python_callable=run_cleaning)
    # Task 3: Validate schema and distribution expectations
    validate = PythonOperator(task_id="validate_pipeline_output", python_callable=run_validation)
    # Task 4: Store feature-engineered dataset to feature store
    store = PythonOperator(task_id="store_to_feature_store", python_callable=run_storage)

    # Pipeline execution order: ingest → clean → validate → store
    ingest >> clean >> validate >> store
```
Each task is a discrete, testable unit. If validation fails, the DAG stops and alerts — the downstream store task never runs with bad data. Every transformation applied at training time must be replicated exactly at inference time. This is the rule most quick-and-dirty ML projects violate, causing models that validate well historically to degrade immediately in production.
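One way to make the train/inference parity rule concrete, sketched here without any particular framework: fit transformation parameters once on training data, persist them with the model, and apply the identical fitted object at inference.

```python
import json

class MeanImputer:
    """Fit-once, apply-everywhere: the imputation means learned at
    training time are the same ones applied at inference time."""
    def __init__(self):
        self.means = {}

    def fit(self, rows):
        for col in rows[0]:
            vals = [r[col] for r in rows if r[col] is not None]
            self.means[col] = sum(vals) / len(vals)
        return self

    def transform(self, rows):
        return [{c: (v if v is not None else self.means[c])
                 for c, v in r.items()} for r in rows]

    def save(self, path):
        # Persist fitted state alongside the model artifact
        with open(path, "w") as f:
            json.dump(self.means, f)

train_rows = [{"qty": 10.0}, {"qty": None}, {"qty": 20.0}]
imputer = MeanImputer().fit(train_rows)
live = imputer.transform([{"qty": None}])  # filled with the *training* mean
```

The quick-and-dirty violation is re-fitting the imputer on live data, which silently changes the feature distribution the model was trained on.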
// 04 Phase 3: Modeling & Evaluation
By the time modeling begins, the business question is pinned, the data is understood, and a success metric has been agreed. Modeling can therefore focus entirely on the technical question: what is the simplest model that meets the target metric on held-out data?
- Baseline model first — always start with a simple linear or logistic regression before introducing any complexity. It sets the performance floor and is often good enough.
- Iterate with complexity only if the baseline fails to meet the target RMSE, accuracy, or AUC defined in scoping
- Cross-validate on temporal splits, not random splits — business data has time structure and a random split leaks future information into the training set
- Document every experiment: hyperparameters, feature sets, validation scores — so results are reproducible and decisions are auditable
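The temporal-split rule can be sketched without any ML framework: each fold validates on a later window and trains only on indices strictly earlier than that window.

```python
def temporal_splits(n_samples, n_folds=3, test_size=2):
    """Expanding-window splits: train on everything before each window."""
    splits = []
    for k in range(n_folds):
        test_end = n_samples - k * test_size
        test_start = test_end - test_size
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_end))
        splits.append((train_idx, test_idx))
    return splits[::-1]  # earliest fold first

for train_idx, test_idx in temporal_splits(10):
    # No future leakage: every training index precedes the test window
    assert max(train_idx) < min(test_idx)
```

A random split on the same data would scatter future rows into the training set, producing validation scores the production model can never reproduce.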
“We never deploy a model that cannot explain itself to the business owner who will use it. Interpretability is not optional — it is a feature requirement.”
The interpretability requirement is practical risk management. A model a business owner cannot interrogate is a model they will not trust. A model they do not trust is a model they will not use. SHAP values, partial dependence plots, and coefficient tables are not supplementary visualisations — they are part of the deliverable.
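For a linear baseline, the coefficient table is nearly free to produce. A minimal sketch, with hypothetical feature names and coefficients standing in for a fitted model's values:

```python
def coefficient_table(feature_names, coefficients):
    """Rank features by absolute effect size so the business owner
    can see what drives the prediction."""
    rows = sorted(zip(feature_names, coefficients),
                  key=lambda pair: abs(pair[1]), reverse=True)
    return "\n".join(f"{name:<24}{coef:+.3f}" for name, coef in rows)

# Hypothetical fitted coefficients for a demand model
table = coefficient_table(
    ["lag_7_sales", "promo_active", "day_of_week"],
    [0.82, 1.45, -0.10],
)
```

Printing this table at every review meeting keeps the conversation anchored to what the model actually learned rather than to its headline metric.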
// 05 Phase 4: Deployment & Monitoring
Deployment is where an experiment becomes a product. The model leaves the notebook environment and enters a system called by real users, on real data, with real consequences. The engineering requirements shift completely at this point.
- Wrap the model in a FastAPI endpoint with clear input/output schemas and structured error handling
- Connect the endpoint to existing business tooling — ERP system, internal spreadsheet, operational dashboard — so users interact with familiar interfaces, not raw API responses
- Set up data drift detection using statistical distribution comparisons between training-time and live inference feature distributions
- Schedule an automated retraining trigger when drift exceeds a defined threshold, so the model stays calibrated without manual intervention
Production monitoring is not a post-launch add-on. It is designed during phase one. The business metric defined in scoping must be measurable in the live system. If the operational data pipeline does not capture the outcomes the model was built to influence, the model cannot be evaluated against its actual purpose and retraining triggers will be based on proxy signals rather than ground truth.
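A drift check of the kind Phase 4 describes can be sketched with a plain Kolmogorov-Smirnov statistic; the threshold value here is illustrative, not a recommendation.

```python
def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    cdf = lambda s, v: sum(x <= v for x in s) / len(s)
    return max(abs(cdf(a, v) - cdf(b, v)) for v in sorted(set(a) | set(b)))

def drift_detected(training_feature, live_feature, threshold=0.2):
    """True when the live distribution has moved past the threshold,
    the signal that fires the automated retraining trigger."""
    return ks_statistic(training_feature, live_feature) > threshold

training = [1, 2, 3, 4, 5, 6, 7, 8]
stable   = [2, 3, 4, 5, 6, 7]       # same regime as training
shifted  = [11, 12, 13, 14, 15, 16]  # distribution has moved
```

Running this comparison per feature on a schedule turns "the model degraded" from a user complaint into an alert with a named cause.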
// 06 Why This Matters For You
Whether you are working with the Engrammers or with another team, the four-phase structure described here is not a proprietary methodology. It is the minimum viable framework for any AI engagement serious about production outcomes rather than proof-of-concept demos.
Demand this process. Ask any team you work with to show you the scoped business question in writing before modeling begins. Ask them to show you the data audit results. Ask them how they plan to monitor model performance after deployment. Any team that jumps straight to modeling without auditing the data and defining success is building for a demo, not for production.
The Engrammers publish this not because it is new, but because it is not yet standard. The gap between how AI projects are described in conference talks and how they actually perform in the field is still largely a process gap. We intend to close it, one engagement at a time.