The Last Mile of AI: Why “Almost Solved” Isn’t Good Enough 

AI is advancing rapidly, but one reality continues to get overlooked: the last mile of automation is almost always the hardest. Early wins, like getting a model to do something impressive most of the time, create excitement. But it's the last mile, the final stretch to match or exceed human-level performance across real-world scenarios, that proves truly difficult.

Again and again, that "last mile" is where progress slows and hype collides with reality. Convincing an industry to adopt autonomous, cognitive AI requires a level of performance at least as good as a highly skilled knowledge worker. That is where the real complexity and challenge lie.

You can see this pattern of underestimation across decades of AI progress. Chess was understood in theory long before it was conquered in practice: early engines could beat amateurs by the 1970s, but it took until 1997 for IBM's Deep Blue to defeat world champion Garry Kasparov. That symbolic-AI milestone came nearly 50 years after researchers first built chess-playing programs. Likewise, the foundations for machine translation were laid in the 1950s, but high-quality, near-human translation emerged only with deep learning and the Transformer architecture, after decades of disappointing progress.

And then there are self-driving cars. They've been "almost here" for decades. Billions have been poured into the space, yet most of us are still driving ourselves to work. Why? Because while 95% of driving is predictable and solvable, the other 5% (the ambiguous, unexpected, messy real-world scenarios) is incredibly hard to model. I'd love to tell my car to drive me from my office in New York City to a favorite beach in La Jolla, CA over the weekend while I read a novel in the back seat, but current self-driving AI technology is not quite there yet.

The “last mile” of automation is not just difficult, it’s often what separates a cool demo from a commercially viable, trusted solution.

Complexity Hides in the Outliers

At the heart of the last-mile problem is the challenge of generalization. AI systems are great at learning from large datasets and handling typical cases reasonably well. But they struggle with outliers: cases that are rare but matter enormously in practice.

Consider chess again. The number of possible chess games is estimated at 10^120 (the Shannon number), far more than the number of atoms in the observable universe; even the number of legal board positions is on the order of 10^44. You can't solve the game by brute force. Chess engines had to learn how to prune possibilities intelligently and evaluate complex positions with nuance.
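To make "pruning intelligently" concrete, here is a minimal sketch of minimax search with alpha-beta pruning, the classic idea behind engines like Deep Blue. The toy game tree and scores are illustrative stand-ins, not a real engine:

```python
# Minimal minimax search with alpha-beta pruning. `children` and
# `evaluate` are toy stand-ins for a real move generator and evaluator.

def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    moves = children(state)
    if depth == 0 or not moves:
        return evaluate(state)  # heuristic score of a leaf position
    if maximizing:
        best = float("-inf")
        for child in moves:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # prune: the minimizing player will avoid this line
        return best
    best = float("inf")
    for child in moves:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, evaluate))
        beta = min(beta, best)
        if beta <= alpha:
            break  # prune: the maximizing player already has better
    return best

# Toy game tree: internal nodes list children, leaves carry scores.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
scores = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
value = alphabeta("root", 2, float("-inf"), float("inf"), True,
                  lambda s: tree.get(s, []), lambda s: scores.get(s, 0))
print(value)  # 3: node "b2" is never even evaluated, thanks to pruning
```

Pruning lets the engine skip branches that provably cannot affect the result, which is how search stays tractable despite the astronomical game tree.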

Now consider the outlier complexity of self-driving cars. There are effectively infinite driving scenarios, shaped by weather, lighting, pedestrian behavior, road construction, and local driving norms. You're not just modeling one problem space, but an unbounded set of them. No dataset can fully cover that territory, and no LLM or deep learning model can reason about every rare combination on its own.

This same challenge is emerging today in generative AI applications across finance, insurance, law, and healthcare. Companies are racing to automate knowledge work with large language models. And there’s clear value: LLMs are powerful tools for drafting, summarizing, pattern recognition, and much more.

But here’s the catch: just because a model can complete 90% of a task doesn’t mean it’s ready for production. Building a fully automated system means handling the unexpected. And LLMs, while impressive, don’t reason like humans. They don’t have intuition. They generate statistically likely sequences of text or actions, not robust decisions under ambiguity.

Mortgage Automation and the Illusion of Simplicity

Nowhere is this last-mile problem more relevant than in home lending.

On paper, using AI to build an automated financial profile of a borrower seems straightforward. Feed in income data, bank statements, credit reports, and employment history, and let the model generate a lending decision. The top 80% of applicants follow a fairly clear pattern. Where is the complexity in automating this process?

In reality, building a truly robust system means accounting for edge cases: gig workers with irregular income, applicants with thin credit files, applications where documents and data are out of sync, borrowers with strong cash flow but low FICO scores, or applicants who switch jobs frequently. Cases like these, among others, can make or break real-world lending operations.
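To illustrate what "accounting for edge cases" can look like in code, here is a hedged sketch of deterministic checks layered on top of model-extracted borrower data. The field names and thresholds are hypothetical, not actual underwriting policy:

```python
from dataclasses import dataclass

# Hypothetical borrower profile; field names and thresholds are
# illustrative, not real underwriting policy.
@dataclass
class BorrowerProfile:
    monthly_incomes: list    # last 12 months of documented income
    credit_file_months: int  # depth of credit history, in months
    fico: int
    monthly_cash_flow: float
    jobs_last_24_months: int

def edge_case_flags(p: BorrowerProfile) -> list:
    """Deterministic checks layered on top of model-extracted data."""
    flags = []
    mean = sum(p.monthly_incomes) / len(p.monthly_incomes)
    spread = max(p.monthly_incomes) - min(p.monthly_incomes)
    if mean > 0 and spread / mean > 0.5:
        flags.append("irregular_income")  # e.g., gig workers
    if p.credit_file_months < 24:
        flags.append("thin_credit_file")
    if p.monthly_cash_flow > 0 and p.fico < 640:
        flags.append("strong_cash_flow_low_fico")
    if p.jobs_last_24_months >= 3:
        flags.append("frequent_job_changes")
    return flags  # any flag routes the file to a human underwriter
```

Each flag corresponds to one of the edge cases above; in production, rules like these act as guardrails that catch files the model cannot safely decide on its own.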

An LLM might summarize a tax return or extract paystub data with reasonable accuracy, but it has no reliable sense of how certain it should be about its own output. And deciding how to interpret a complex income situation, apply policy rules, or flag inconsistencies is difficult for an LLM on its own. Loan officers bring judgment, context, and institutional memory. GenAI doesn't have that out of the box.

Solvable, But Not Simple

This doesn’t mean AI won’t eventually transform lending and other areas of finance and knowledge work. It will. But we must be realistic about timelines and complexity. These problems are solvable, but not by strapping together some open-source LLMs and calling it a day. They require strong AI expertise, thoughtful engineering, deep domain knowledge, careful data curation, and layers of domain-specific logic to catch what the model misses.

Most importantly, they require humility. The last mile will take time.

Too many teams underestimate this. They assume that because the first demo works, the problem is basically solved. But production-grade automation, especially in regulated, high-stakes environments like lending and finance, is a different game entirely. It's not about getting a good answer most of the time. It's about getting the right answer every time.

The Path Forward

If you’re in lending or financial services and thinking about AI, here’s the bottom line: aim for augmentation, not replacement. Use models to boost productivity, improve data access, and reduce manual work. But understand that solutions performing at human levels of expertise often take many years to develop, especially where edge cases and judgment calls matter.
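One concrete augmentation pattern is confidence-gated routing: let the model handle what it can do reliably, and escalate everything else to a person. The sketch below assumes a hypothetical extract_fields model call that returns values with confidence scores; the function name and threshold are illustrative:

```python
CONFIDENCE_THRESHOLD = 0.95  # illustrative; tune against measured error rates

def route_extraction(doc, extract_fields):
    """Auto-accept confident model output; escalate the rest to a person.

    `extract_fields` is a hypothetical model call returning
    {field_name: (value, confidence)} for a document.
    """
    results = extract_fields(doc)
    uncertain = [name for name, (_, conf) in results.items()
                 if conf < CONFIDENCE_THRESHOLD]
    if uncertain:
        # The model still did most of the work; a person verifies the rest.
        return {"status": "needs_review", "fields": results,
                "review_queue": sorted(uncertain)}
    return {"status": "auto_accepted", "fields": results}
```

The model still delivers the productivity gain on the easy majority of documents, while humans keep ownership of the judgment calls that define the last mile.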

Over time, with better models, richer data, and smarter systems, we'll close the gap. We'll automate more of the last mile. But it won't happen overnight. It will take the same kind of persistent progress we've seen in chess, machine translation, and self-driving cars.

Progress is real. But complexity is real too. Let’s not confuse the first step with the finish line.

Join the Conversation!

Subscribe to our newsletter for my future blogs on AI in mortgage lending, where I’ll explore technical advancements like intelligent automation, AI-driven analytics, and self-learning systems. 

Subscribe on LinkedIn