Engineering

How We Built a Website Benchmark Tool with Playwright and GPT-4o

April 1, 2026 · 7 min read

BenchmarkHQ analyzes thousands of websites using AI. Here's the technical stack, from browser automation to API deployment — all running on a $5/month infrastructure.

The Pipeline

Research — Custom feature checklist per industry (45 for e-commerce, 50 for SaaS, 151 for news)
Crawl — Playwright (headless Chromium) visits each site, takes 3-6 screenshots
Analyze — GPT-4o Vision examines screenshots against the checklist, marks each feature Y/N/P
Aggregate — Frequency analysis classifies features as CRITICAL/REQUIRED/RECOMMENDED/OPTIONAL

Tech Stack

Component	Technology	Cost
Browser automation	Playwright (Python)	Free
AI analysis	GPT-4o Vision	~$0.01/site
API server	FastAPI	Free
Hosting	Railway	$5/month
Landing page	GitHub Pages	Free

Why GPT-4o Vision Instead of HTML Parsing

We tested three approaches: HTML parsing (too fragile — every site has different markup), Lighthouse-style audits (can't detect product features like wishlists), and GPT-4o Vision (looks at the page like a human). Vision-based analysis correctly identified 90%+ of features from screenshots alone, regardless of underlying technology.

Everything is open source: github.com/abdur-rehman10/benchmarkhq

Try the API yourself

42 industries. No signup required. Free and open source.

Explore API Docs →