What we test
Every comparison piece on this blog scores tools across the same seven dimensions. We do not let a tool win a category by being good at one thing and untested at the rest. The full scorecard runs against this list before any verdict is written.
Deliverability
Inbox vs spam placement across Gmail, Outlook, GMX, Free.fr, Libero, and corporate Microsoft 365 tenants. Bounce rate, hard vs soft, catch-all behaviour.
Multichannel coverage
Email, LinkedIn, calls, WhatsApp, SMS - what is native, what runs through Zapier duct tape, what breaks at step 7 of a sequence.
Pricing transparency
Price at 1 seat, 3 seats, 10 seats. Hidden credit caps. Annual vs monthly. We re-verify against the vendor's pricing page every quarter.
EU compliance
GDPR (RGPD) posture, data hosting region, DPA quality, opt-out mechanics. Read against actual CNIL / AEPD / Garante / Wettbewerbszentrale enforcement.
AI quality
Personalisation output an SDR would actually send. Hallucinations, signal quality, model behaviour at scale. Tested against 50+ real ICP profiles.
Integrations
Native vs Zapier vs API. CRM (HubSpot, Pipedrive, Salesforce), data providers, calling, calendar. What syncs both directions, what does not.
Support response time
We open a real ticket on day 7. Stopwatch starts. We measure first human reply, time to resolution, and whether the answer was useful or canned.
How long we test
Speed-running a tool in two hours is how marketing-fluff comparisons get written. We do not do that. The minimum bar before any verdict ships:
- 14 calendar days minimum hands-on with the tool, signed up like a normal customer (no vendor handshake, no insider account).
- 1,000+ sequences sent through the tool's primary inbox flow during the test window. This is non-negotiable - deliverability problems do not show up at 50 sends.
- 2 full weekly cycles so we catch Monday-morning rollout issues, Friday support response time, and weekend warmup behaviour.
- 3 ICPs minimum tested in parallel - French mid-market, DACH enterprise, US SMB - so we surface market-specific data quality gaps.
For deeper dives on category leaders (Apollo, Outreach, Salesloft, Lemlist, Instantly, Cognism), the test window stretches to 30+ days because the surface area is bigger and the corner cases matter more.
Who tests
Three expert reviewers, three distinct lenses: Vincenzo Ruggiero on deliverability and sequence design, Nicolas Finet on pricing math, EU compliance, and integrations, and Nathalie Saikali on AI quality, support, and the operator pass. No piece ships without all three signing off.
The three lenses are deliberate. A tool can have great deliverability (Vincenzo passes it), but flunk pricing math at 10 seats (Nicolas catches it), or fail the French CMO test on subject lines (Nathalie kills it). If any one of the three says no, we go back to the test, not to the article.
Where data comes from
We do not paraphrase G2 reviews and call it research. Every claim in a comparison piece traces back to a source we name in the article: our own hands-on test data, the vendor's own published pricing, regulator enforcement actions we link directly, or a named, on-the-record quote with a transcript.
What we don't accept
The fastest way to wreck reader trust is to source-launder. None of the following ever makes it into a comparison piece on this blog:
- Paid placements or "sponsored ranking"
- Affiliate-only reviews (kickback bias)
- Vendor-supplied screenshots without us re-shooting
- Anonymous reviewer claims (no name = no claim)
- Marketing fluff masquerading as data
- Untested-on-our-side feature claims
- "X said on a podcast" without a transcript link
- G2 / Capterra rating imports without our own test
Tools sometimes ask to be added to a comparison in exchange for a backlink swap or a co-marketing push. The answer is no. Our position cannot be bought, and that is the only reason readers should keep paying attention.
Update cadence
A comparison published in Q1 is wrong by Q3 if nobody touches it, so every piece sits on a fixed update rhythm: pricing claims are re-verified against the vendor's pricing page every quarter, and reader corrections that check out are patched within 7 days.
How a verdict is built
Once a tool has finished its 14+ day test window, the verdict pipeline runs through four stages - independent scoring, cross-review, contradiction resolution, and the final ship decision.
Stage 1 - Independent scoring
Each of the three reviewers scores the tool independently across the seven dimensions. We deliberately do not share scores until everyone is done - collective drift toward a "consensus" view is exactly the bias that wrecks comparison content. Vincenzo scores deliverability and sequence design from the lab data. Nicolas scores pricing math, EU compliance posture, and integrations from the platform / market lens. Nathalie scores AI quality, support response time, and the operator pass.
Stage 2 - Cross-review
The three score sheets land on the same table. Disagreements are flagged. This is where the most useful editorial work happens - when two reviewers say a tool is great and one reviewer says it is unusable, the article is not done until we understand which buyer the contradiction reflects. A tool can win for high-volume Anglo SaaS and lose for French mid-market. We say so explicitly.
Stage 3 - Contradiction resolution
When stage 2 surfaces a real disagreement (not a minor scoring delta), one of two things happens. Either we add a second mini-test that closes the question, or we publish the disagreement explicitly as part of the verdict. We do not paper over it. Most "best for" callouts in our comparisons exist because of a stage-3 split - the tool wins for buyer A and loses for buyer B, and we name both.
Stage 4 - The verdict ships
Only after all three reviewers have signed off does a comparison go live. The byline reflects who did what. If a piece is single-authored, the other two have still reviewed it - we just credit the lead writer. If you spot a verdict that contradicts your own real-world experience, we want to hear it: corrections@overloop.com.
Conflicts of interest disclosure
Two facts you should know about this blog before reading any comparison:
- Overloop is one of the tools we cover. When Overloop appears in a comparison piece, the same scoring rules apply. We have published comparisons where Overloop loses on a specific dimension to a specific competitor - and we have left those verdicts unchanged because changing them would defeat the whole point of running this blog. When Overloop wins, we still source the claim against the same evidence bar as any other tool.
- We do not run an affiliate program on this blog. No "buy via this link, we get a kickback" layer. The CTAs on the blog point to Overloop because we own Overloop, not because there is a hidden commission tier behind any vendor link.
If we ever change either of those policies, the change will be disclosed at the top of every affected article and dated.
Author bylines
Every article on this blog is signed. The byline is a real person, with a real face, a real role at a real company, and a real public profile you can vet. The author bylines on this blog link to:
- Nicolas Finet - CEO Sortlist + Overloop (founder voice, market data, pricing math, regulatory posture)
- Vincenzo Ruggiero - Co-founder Overloop (deliverability, sequence design, hands-on testing protocol)
- Nathalie Saikali - Head of Sales Overloop (operator pass, French / Belgian / Swiss market relevance)
If a piece is co-authored, the schema reflects all three. If a guest contributor writes a piece, their full bio and conflict-of-interest statement are published alongside.
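For readers who care about the mechanics of that schema claim, here is a minimal sketch of what multi-author schema.org markup can look like, written as a TypeScript object literal. The author names and roles come from the bylines above; the headline and date are placeholders, not our production markup.

```typescript
// Illustrative schema.org Article markup for a co-authored comparison piece.
// Author names and roles mirror the bylines above; every other value is a placeholder.
const articleSchema = {
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example comparison piece", // placeholder
  "datePublished": "2025-01-01",          // placeholder
  "author": [
    { "@type": "Person", "name": "Nicolas Finet", "jobTitle": "CEO, Sortlist + Overloop" },
    { "@type": "Person", "name": "Vincenzo Ruggiero", "jobTitle": "Co-founder, Overloop" },
    { "@type": "Person", "name": "Nathalie Saikali", "jobTitle": "Head of Sales, Overloop" },
  ],
};

// Serialised into a <script type="application/ld+json"> tag, this is the
// machine-readable version of "all three reviewers stand behind this piece".
console.log(JSON.stringify(articleSchema, null, 2));
```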
Contact for corrections
Spotted an error or a stale claim?
Email corrections@overloop.com with the article URL, the issue, and a source. We acknowledge within 48 hours and patch within 7 days when the source checks out.