Project Peak
Studies

Controlled comparisons

Each study holds everything fixed except one variable and shows how that variable moves the result. Contenders are ranked by the hardest step each one holds, and each ranking comes with a written analysis.

4 levels · model analysis

The GPT-5 family, ranked by capability

How the GPT-5 variants compare on the capability ceiling for long-context code reasoning, from nano up to the full model and the Codex variant. Same instrument and scoring; only the model changes.
