Your Build Benchmarked
WillMyGPURunIt is a free tool that tells you what your PC can really do. Enter your CPU and GPU below and get a straight answer: which local AI models your card can run (and how fast), whether anything bottlenecks, the power supply you actually need, part compatibility, and the games you can play. Prefer a focused page? The full build calculator lives on its own too.
Benchmark Your Build
What WillMyGPURunIt Checks
One form, one report: everything you need to understand a build, whether you're buying parts, upgrading, or seeing what you already own can handle.
Local AI models + tokens/sec
Every popular LLM (Llama, Qwen, Gemma, DeepSeek): whether it fits your VRAM, the best quant, and an estimated speed in tokens/sec.
AI & gaming bottlenecks
Whether your CPU and GPU are a balanced match, in plain language, with a concrete upgrade suggestion either way.
Power supply wattage
The PSU size your parts actually need, sized for real transient spikes and ~10% headroom, never below the GPU's minimum.
Part compatibility
Socket, RAM type and PSU checks based on your CPU, motherboard chipset and power supply, so you catch a mismatch before you build.
Games you can run
110 popular games scored against your GPU at 1080p on high settings, plus a search box for anything not listed.
1-100 build scores
At-a-glance Local AI and gaming ratings so you can size up a build in a single number each.
New to Local AI? Start Here
Plain-English guides to running AI on your own machine, no jargon assumed.
Why Run AI Locally — and Why Not
Privacy, cost, offline access and control versus the real downsides: setup, hardware cost and speed. An honest both-sides look.
How Much VRAM Do You Need to Run an LLM?
A practical VRAM-by-model-size table — from 7B chat models on 8 GB cards to 70B on 24 GB — plus how quantization and context change the math.
Best GPUs for Local LLMs
Ranked GPU picks for running local AI by budget and VRAM tier — built from real benchmark data and the actual model each card can run.
What Is CUDA? And Why It Matters for Local AI
CUDA is the reason NVIDIA dominates AI. A plain-English explainer on what it is, what it does, and how AMD (ROCm) and Apple (Metal) compare.
Deciding Between Two Builds?
Compare two PCs side by side: every score, bottleneck and number at once.
Frequently Asked Questions
Is WillMyGPURunIt free?
Yes, completely free. Enter your parts and get every result with no account or sign-up.
How much VRAM do I need to run AI locally?
Roughly 0.6 GB of VRAM per billion parameters at 4-bit, plus overhead for context and the runtime. In practice an 8B model fits on an 8 GB card, a 14B wants a 12-16 GB card, and a 32B needs 24 GB. The calculator shows exactly what your card runs.
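For the curious, here is a rough sketch of that arithmetic: it applies the 0.6 GB-per-billion rule, adds a flat overhead, and rounds up to a common card size. The 1.5 GB overhead, the list of card sizes and the function names are illustrative assumptions, not the calculator's actual values.

```python
from typing import Optional, Sequence

# Rule of thumb from the FAQ: ~0.6 GB of VRAM per billion parameters at 4-bit.
GB_PER_BILLION_4BIT = 0.6

# Assumption for illustration only: flat allowance for context and runtime.
ASSUMED_OVERHEAD_GB = 1.5

# Assumption for illustration only: common consumer VRAM sizes, ascending.
ASSUMED_CARD_SIZES_GB = (8, 12, 16, 24, 32, 48)


def estimate_vram_gb(params_billion: float,
                     overhead_gb: float = ASSUMED_OVERHEAD_GB) -> float:
    """Estimate VRAM needed for a 4-bit model of the given size."""
    return params_billion * GB_PER_BILLION_4BIT + overhead_gb


def smallest_card_gb(needed_gb: float,
                     card_sizes: Sequence[int] = ASSUMED_CARD_SIZES_GB
                     ) -> Optional[int]:
    """Return the smallest listed card that fits, or None if none does."""
    for size in card_sizes:
        if needed_gb <= size:
            return size
    return None


if __name__ == "__main__":
    for params in (8, 14, 32):
        needed = estimate_vram_gb(params)
        card = smallest_card_gb(needed)
        print(f"{params}B: ~{needed:.1f} GB needed, smallest fit: {card} GB")
```

Under these assumptions the 8B, 14B and 32B cases land on 8 GB, 12 GB and 24 GB cards, in line with the figures above. A longer context window raises the overhead, which is why a 14B model can push into 16 GB territory.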
What is a CPU or GPU bottleneck?
A bottleneck is when one part holds the other back, for example a CPU too slow to keep a fast GPU fed with frames. For gaming, a build limited by the GPU is the healthy state; for local AI, the limiter is almost always VRAM capacity.
How accurate are the numbers?
They're estimates built from published benchmark data (PassMark G3D, single-thread ratings) and real GPU specs (VRAM, bandwidth, board power). They're a reliable guide for comparing builds, not a guarantee of exact frame rates or speeds.
What does tokens per second mean?
It's how fast a model writes. A token is about ¾ of a word, so ~40 tokens/sec already outpaces your reading speed. We estimate it from your GPU's memory bandwidth.
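A common rule of thumb, offered here as an assumption rather than the exact formula this tool uses: generating each token means reading roughly the whole model out of VRAM, so speed is about memory bandwidth divided by model size, scaled down by an efficiency factor. The sketch below applies that idea and converts tokens to words using the ¾ ratio above. The 0.6 efficiency factor, the example card and the function names are hypothetical.

```python
# Assumption for illustration only: fraction of peak bandwidth achieved.
ASSUMED_EFFICIENCY = 0.6

# From the FAQ: a token is about three quarters of a word.
WORDS_PER_TOKEN = 0.75


def estimate_tokens_per_sec(bandwidth_gb_s: float,
                            model_size_gb: float,
                            efficiency: float = ASSUMED_EFFICIENCY) -> float:
    """Bandwidth-bound estimate: one full pass over the weights per token."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s * efficiency / model_size_gb


def words_per_minute(tokens_per_sec: float) -> float:
    """Convert a token rate into an approximate words-per-minute rate."""
    return tokens_per_sec * WORDS_PER_TOKEN * 60


if __name__ == "__main__":
    # Hypothetical card with 500 GB/s of bandwidth running a 5 GB model.
    tps = estimate_tokens_per_sec(bandwidth_gb_s=500, model_size_gb=5)
    print(f"~{tps:.0f} tokens/sec, ~{words_per_minute(tps):.0f} words/min")
```

By the same conversion, 40 tokens/sec works out to about 1,800 words per minute, several times faster than most people read.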