How to Evaluate an AI Engineering Partner: 10 Questions to Ask Before You Sign

There's a version of this story that has become depressingly common. A company hires an AI consultancy and spends four to six months and a significant budget on discovery workshops, strategy sessions, and a proof-of-concept (POC) demo that impresses the board. Then the vendor rolls off the engagement, the POC sits on a staging server no one maintains, and the operations team is back to doing everything manually.

The problem usually isn't that the vendor was dishonest. It's that the company didn't know what questions to ask before they signed. Strategy firms are good at strategy. They're not always good at shipping and supporting production software - and the two are not the same thing.

If you're evaluating AI development partners right now, these ten questions will tell you faster than any sales pitch whether you're talking to someone who builds things or someone who sells ideas about building things.

1. Who owns the code and IP when the engagement ends?

This one should be non-negotiable, and yet it regularly catches buyers off guard. Some vendors build on proprietary frameworks that create ongoing licensing dependencies. Some retain joint ownership. Some own the code outright and license it back to you.

Ask for the IP clause in plain language before the contract goes to a lawyer. You want full, unencumbered ownership of everything built on your behalf - the code, the models trained on your data, the documentation. If there are carve-outs, understand exactly what they cover and why.

2. Can you show me something you've built that's running in production today?

Demos are easy. Running software serving real users under real conditions is hard. Ask to see a live system - not a recorded walkthrough, not a staging environment - and ask how long it's been running.

The follow-up question matters as much as the answer: ask what broke in the first three months after deployment and how they fixed it. A vendor with real production experience will have a clear, specific answer. A vendor who's only ever delivered proofs of concept will give you something vague about edge cases and continuous improvement.

3. Who is responsible for maintaining the system after launch?

AI systems are not install-and-forget software. Models drift as real-world inputs diverge from training data. APIs change. The underlying LLM your vendor built on releases a new version that behaves differently. Integrations break when your other systems update.

Who owns that? Is there a support retainer in the proposal? Is the maintenance team the same team that built it, or does it get handed off to a lower-cost support function that didn't write the code?

Get this in writing before you sign, not as an afterthought during the handoff conversation.

4. What does failure look like, and how would I know?

This question reveals more about a vendor's engineering maturity than almost anything else. A well-built AI system has monitoring, alerting, and defined failure thresholds. If the model's accuracy degrades past a certain point, someone gets notified. If a workflow automation fails to trigger, there's a log and an alert, not a silent error that sits undetected for two weeks.

Ask specifically: how will I know if this system stops working correctly? The answer should include concrete monitoring tools, alert conditions, and a defined response process. "We'll check in regularly" is not an answer.
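To make "alert conditions" concrete, here is a minimal sketch in Python of what a threshold check might look like. Everything in it is an assumption for illustration: the AlertRule type, the metric names, and the notify callback are hypothetical, and a real deployment would more likely wire the same idea into an existing monitoring stack than hand-roll it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class AlertRule:
    """A hypothetical alert condition: fire when a metric drops below a floor."""
    metric: str
    minimum: float


def rolling_accuracy(outcomes: Sequence[bool], window: int) -> float:
    """Share of correct predictions among the most recent `window` outcomes."""
    recent = outcomes[-window:]
    if not recent:
        raise ValueError("no outcomes recorded yet")
    return sum(recent) / len(recent)


def check_rules(
    metrics: Dict[str, float],
    rules: List[AlertRule],
    notify: Callable[[str], None],
) -> List[str]:
    """Compare metrics against rules; send each breach to `notify` and return them all."""
    breaches = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            # A metric that stopped reporting is treated as a failure, not ignored.
            message = f"{rule.metric}: no data reported"
        elif value < rule.minimum:
            message = f"{rule.metric}: {value:.2f} is below the floor of {rule.minimum:.2f}"
        else:
            continue
        breaches.append(message)
        notify(message)
    return breaches
```

The detail worth noticing is the missing-metric branch: a check that only compares values it receives will stay quiet when the pipeline itself dies, which is exactly the silent failure described above. In practice notify would page someone or post to a channel; passing print is enough to try it out.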

5. How do you handle my data during the build?

If the system involves training on your proprietary data - customer records, transaction history, internal documents - you need clear answers here. Where does that data go during training? Is it used to improve the vendor's own models or shared with any third parties? What happens to it after the engagement ends?

For Gulf-region companies, this intersects with compliance under Saudi Arabia's Personal Data Protection Law (PDPL) and equivalent regulations across the GCC. Your vendor should be able to speak to this fluently, not treat it as a procurement formality.

6. What decisions are you making that I won't see?

Every AI system involves architectural choices that aren't visible in the demo: which model to build on, how to structure the database, where to run inference, how to handle edge cases. Some of those choices optimise for the vendor's convenience or familiarity rather than your long-term interests.

Ask your vendor to walk you through the three or four biggest architecture decisions for your project and explain the trade-offs they considered. If they can't - or if every answer is "we use our standard stack" without further justification - that's a flag.

7. Can I speak with a client you built something for 18 months ago?

Not a recent client who's still in the honeymoon phase. An older one, far enough post-launch that the initial excitement has worn off and they know whether the system actually delivered.

Ask specifically: is the system still running? Did it get adopted by the team it was built for? What would they do differently? A vendor with a genuine production track record will have references who can answer those questions. A vendor who's mostly delivering proofs of concept will struggle to find one.

8. What happens if the underlying AI model changes or gets deprecated?

Most AI systems today are built on top of third-party models - OpenAI's GPT, Anthropic's Claude, Google's Gemini. These models update, release new versions, and occasionally change behaviour in ways that break downstream applications. What's the plan when that happens?

The answer should include an abstraction layer that isolates your application logic from the specific model version, a testing process for model updates before they go live, and a clear owner for that process. "We'll deal with it when it happens" is the answer of a vendor who won't be around when it happens.
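If "abstraction layer" sounds abstract, the sketch below shows roughly what it can mean, in Python. It is an illustration under assumptions, not any vendor's actual design: TextModel, StubModel, classify_ticket, and regression_failures are hypothetical names, and StubModel stands in for an adapter that would wrap a real provider's SDK.

```python
from typing import Dict, List, Protocol, Tuple


class TextModel(Protocol):
    """The only interface application code is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class StubModel:
    """Stand-in for a provider adapter; a real one would call the vendor SDK here."""

    def __init__(self, version: str, replies: Dict[str, str]):
        self.version = version
        self._replies = replies

    def complete(self, prompt: str) -> str:
        # Unknown prompts return an empty string rather than raising.
        return self._replies.get(prompt, "")


def classify_ticket(model: TextModel, ticket: str) -> str:
    """Application logic: unaware of which provider or version sits behind `model`."""
    return model.complete(f"Classify this ticket: {ticket}").strip().lower()


def regression_failures(model: TextModel, cases: List[Tuple[str, str]]) -> List[str]:
    """Run known input/expected pairs against a candidate model before promoting it."""
    return [
        ticket
        for ticket, expected in cases
        if classify_ticket(model, ticket) != expected
    ]
```

Because classify_ticket only sees the TextModel interface, moving to a new model version means writing one new adapter and running regression_failures against it with a saved set of cases. An empty list means the candidate behaves the same on those cases; anything else is a list of inputs to investigate before the change goes live.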

9. How is ongoing support priced relative to the initial build?

Some vendors price the initial build attractively and recoup margin on support contracts. Others include a maintenance period and walk away. Neither model is inherently wrong - but you need to understand which one you're entering before you commit.

Get a concrete number for year-two costs, not a range. If the vendor can't give you a year-two estimate before you sign, it means they haven't thought carefully about what ongoing support will actually require. That's a gap in planning you'll pay for later.

10. What would you tell me not to build with AI right now?

This is the most important question on this list, and most buyers never ask it.

A vendor who is confident in their own pipeline can afford to be honest about what AI is and isn't ready for, because they don't need to say yes to every project to stay busy. If every problem you raise gets met with "yes, we can build that," you're not getting advice - you're getting a sales conversation.

The right answer sounds something like: here's what we'd automate immediately, here's what we'd approach differently, and here's what isn't worth the investment at your current data maturity or operational scale. That kind of specific, honest scoping is how you know you're dealing with engineers rather than salespeople.

One Last Thing

These questions protect you from the common failure mode. But they also describe, pretty accurately, how a good vendor should behave regardless of whether you asked. If you're going through this list and a vendor is giving you clear, specific, honest answers without hedging - that's your signal.

The AI industry has a credibility problem right now because too many companies sold strategy and delivered slides. The way out of that for buyers is specificity. The more concrete the question, the harder it is to answer with confidence you don't have.

If you want to run through these questions with us, we're happy to answer all of them - including the last one.