Methodology
How tools are evaluated on this site
This page covers the classification rubric, the diagnostic questions, the review cadence, and the conflict-of-interest disclosure. All of it is published openly so vendors can contest classifications and readers can verify the work.
Every classification on this site is the output of the same structured process. The process is documented below; if you disagree with a specific classification, the methodology gives you the basis on which to argue.
The diagnostic
For every tool reviewed, six diagnostic questions are put to a product manager, a sales engineer, or a customer-success lead, and preferably to all three. The questions are deliberately blunt and pointed at architecture rather than outcomes:
- Is there a machine-learning model in the product? If yes, what type? (Decision tree? Gradient-boosted? Neural network? Transformer?)
- What does the model take as inputs? What signals does it consume?
- What does the model output? What decision does it produce?
- How is the model trained? On what data? With what cadence of retraining?
- If the “AI” were removed, would the product still function? What capabilities would disappear?
- Can you provide technical documentation that supports the above answers, that I could share with my readers?
The questions are repeated, slightly rephrased, until either I have specific answers or the conversation makes it clear that specific answers won’t be forthcoming. Both outcomes are informative.
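As an illustration only, the outcome of a diagnostic conversation could be recorded in a structure like the one below. This is a minimal sketch, not the tooling actually used for this site: the question keys, the `DiagnosticRecord` class, and its methods are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical short keys for the six diagnostic questions listed above.
QUESTIONS = (
    "model_present_and_type",
    "model_inputs",
    "model_outputs",
    "training_data_and_cadence",
    "function_without_ai",
    "shareable_documentation",
)


@dataclass
class DiagnosticRecord:
    vendor: str
    # Maps a question key to the specific answer obtained, if any.
    answers: dict = field(default_factory=dict)

    def record(self, question: str, answer: str) -> None:
        """Store an answer; reject keys that are not one of the six questions."""
        if question not in QUESTIONS:
            raise KeyError(f"unknown question: {question}")
        self.answers[question] = answer.strip()

    def unanswered(self) -> list:
        """Questions with no answer, or only a blank one, count as unanswered."""
        return [q for q in QUESTIONS if not self.answers.get(q)]

    def is_complete(self) -> bool:
        return not self.unanswered()
```

A record that still has entries in `unanswered()` after repeated rephrasing corresponds to the second outcome described above: specific answers were not forthcoming, which is itself recorded as a finding.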
The classification rubric
Tools are classified into one of three tiers based on the diagnostic responses: Real AI, Hybrid, or Marketing AI.
What I evaluate against
The diagnostic answers are cross-referenced against three external sources where possible:
- Published technical documentation. If the vendor publishes engineering or product documentation that describes the model, the documentation is reviewed for consistency with what the diagnostic produced.
- Customer references. Where I can speak to current or former customers, they’re asked operational questions: how often the system makes decisions, what those decisions look like, and what happens when they intervene.
- Industry-published case studies. Peer-reviewed work, conference papers, or independent benchmarks where they exist (rare in this category) are factored in.
Update cadence
Each classification is reviewed quarterly. Vendors make material changes to their products, so the classified set is meant to stay current rather than frozen at publication. The most recent review window is documented at the top of each tool entry. New entrants are added on a rolling basis, with a maximum delay of one quarter between request-for-review and publication.
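The one-quarter ceiling can be checked mechanically. The sketch below assumes "one quarter" means calendar quarters, so a request is on time if it is published in the same quarter or the next one. That reading is an assumption made for the example, and the function names are hypothetical.

```python
from datetime import date


def quarter_index(d: date) -> int:
    """Place a date on a single axis of calendar quarters so years roll over cleanly."""
    return d.year * 4 + (d.month - 1) // 3


def within_one_quarter(requested: date, published: date) -> bool:
    """True if publication falls in the request's quarter or the one after it.

    Assumption: "one quarter" is measured in calendar quarters, not in days.
    """
    gap = quarter_index(published) - quarter_index(requested)
    return 0 <= gap <= 1
```

Under this reading, a review requested in November and published the following January is within the limit, while one published the following April is not. A day-based definition (for example, roughly 90 days) would give different results near quarter boundaries.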
Conflicts of interest
I run a performance agency that holds active commercial engagements with vendors in this directory. Currently, that applies to one vendor: Groas.ai, which is deployed on a subset of agency client accounts.
The Groas classification was finalized before any commercial engagement began and has not been altered post-engagement. The classification can be independently verified against Groas’s published technical documentation. If you disagree with the classification regardless of the disclosure, the diagnostic transcript is available on request.
No other vendor in the directory has an active commercial relationship with me or the agency at the time of publication.
Corrections & vendor response
If a vendor disputes a classification, the standard process is:
- The vendor submits the dispute in writing, with the specific points contested.
- The dispute is published verbatim alongside the original entry, with my response if applicable.
- If the dispute introduces new technical documentation or architecture changes, the classification is re-evaluated against the rubric. The classification may change.
Vendor responses are never edited for content. They’re published as the vendor submits them, with editorial response separated visually.
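The dispute steps above can be sketched as a small data model. This is an illustrative sketch under stated assumptions, not the site's actual publishing code: the `Dispute` class, its field names, and the plain-text labels used to separate the two voices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dispute:
    vendor: str
    text: str  # the vendor's written dispute, stored and published unmodified
    new_documentation: bool = False     # dispute includes new technical documentation
    architecture_changed: bool = False  # dispute reports an architecture change


def needs_reevaluation(d: Dispute) -> bool:
    """Re-evaluate against the rubric only when the dispute brings new evidence."""
    return d.new_documentation or d.architecture_changed


def render_entry(d: Dispute, editorial_response: str = "") -> str:
    """Publish the vendor text verbatim, with any editorial response kept separate."""
    parts = ["VENDOR RESPONSE (verbatim):", d.text]
    if editorial_response:
        parts += ["", "EDITORIAL RESPONSE:", editorial_response]
    return "\n".join(parts)
```

Making `Dispute` frozen is a deliberate choice in this sketch: once a vendor's text is stored, nothing downstream can alter it, which mirrors the rule that responses are never edited for content.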
Limitations
The methodology has known limitations:
- The diagnostic depends on the vendor providing access to technical staff who can answer architectural questions. Vendors who refuse access are flagged as “refused diagnostic” and excluded from the classified set.
- The Real AI / Hybrid / Marketing AI partition is categorical, not continuous. Some vendors sit on the boundary between tiers; the rubric resolves these by deferring to the “could the product function without the ML?” question.
- Architecture changes faster than the quarterly cadence can fully capture. A vendor that overhauls its product mid-quarter may be classified incorrectly until the next review.
The methodology is itself reviewed annually. Material changes to the rubric are documented in the revision history at the bottom of this page.
Revision history
- 2026-05-13 — Current version. Six-question diagnostic; three-tier classification.
- 2025-Q3 — Diagnostic expanded from four to six questions; added documentation requirement.
- 2025-Q1 — Original publication; four-question diagnostic; two-tier classification.