Best SWE-bench Verified Score

Highest score on autonomously resolving real GitHub issues, a measure of core software engineering capability

93.9%

Trend: YoY growth is +17.2% on the fitted trend and has been slowing by 23.5 pp per year over the last 2Y. Latest reading: +21.2%, which is 4.0 pp above trend (a 0.7σ deviation).
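As a quick check on the arithmetic in the trend summary, the short Python sketch below recomputes the gap between the latest reading and the trend, then backs out the spread implied by the quoted 0.7σ. The variable names are illustrative, and reading σ as the standard deviation of residuals around the trend line is an assumption, not something the summary states.

```python
# Figures quoted in the trend summary (YoY change, in the summary's units).
latest_yoy = 21.2   # latest YoY reading
trend_yoy = 17.2    # trend value at the latest date
z_score = 0.7       # quoted deviation, in sigma units (rounded)

# Deviation from trend, in percentage points.
deviation_pp = latest_yoy - trend_yoy

# Assumption: sigma is the standard deviation of residuals around the trend,
# so sigma = deviation / z-score. Because 0.7 is rounded, this is approximate.
implied_sigma_pp = deviation_pp / z_score

print(f"deviation from trend: {deviation_pp:.1f} pp")
print(f"implied sigma: about {implied_sigma_pp:.1f} pp")
```

Under that reading, a 4.0 pp gap at 0.7σ implies a residual spread of a little under 6 pp, so the latest reading sits well within the usual scatter around the trend.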

Charts: Level; YoY Change (trend line y = 54.0% - 23.5 pp/yr · t); Deviation from trend

Forecast

Projected value by forecast vintage (%)

Projected value (%)

Columns run monthly from Jan '25 to May '27, followed by MAPE. Each row lists its populated cells in column order.

| Forecast made in | Projected values (%) | MAPE |
|---|---|---|
| Jan '25 | 71.7, 70.9, 88.8, 103.5, 108.4, 115.5, 134.4, 134.2, 136.8, 139.8, 143.4, 148.7, 181.0, 152.9, 210.4, 211.2, 159.0, 173.1, 220.1 | 147% |
| Feb '25 | 70.3, 71.3, 89.4, 104.3, 109.3, 116.5, 135.6, 135.4, 138.7, 141.3, 145.0, 150.6, 183.8, 154.9, 213.7, 214.7, 161.3, 175.5, 224.0 | 176% |
| May '25 | 72.7, 79.6, 90.0, 93.5, 99.2, 115.2, 113.5, 115.0, 117.2, 118.2, 134.2, 119.5, 154.9, 152.9, 120.0, 132.6, 153.4 | 108% |
| Jun '25 | 75.2, 87.2, 90.4, 95.8, 111.2, 109.2, 109.7, 111.6, 111.7, 124.2, 112.4, 143.1, 140.5, 111.6, 124.0, 139.2 | 110% |
| Sep '25 | 77.2, 84.8, 89.6, 103.8, 101.2, 100.0, 101.3, 99.6, 106.0, 99.2, 121.3, 117.5, 96.1, 107.8, 112.8 | 86% |
| Nov '25 | 79.2, 93.1, 89.1, 84.2, 84.2, 78.6, 75.6, 84.5, 78.0, 67.5, 77.9, 65.8 | 45% |
| Mar '26 | 80.9, 78.5, 77.9, 70.7, 66.6, 56.3, 66.2, 47.7 | 58% |
| Apr '26 | 93.9, 85.4, 85.5, 80.5, 77.8, 70.4, 81.0, 70.2 | |
| May '26 | 85.4, 85.5, 80.5, 77.8, 70.4, 81.0, 70.2 | |

YoY change forecast

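| Forecast made in | Jan '25 | Feb '25 | Mar '25 | Apr '25 | May '25 | Jun '25 | Jul '25 | Aug '25 | Sep '25 | Oct '25 | Nov '25 | Dec '25 | Jan '26 | Feb '26 | Mar '26 | Apr '26 | May '26 | Jun '26 | Jul '26 | Aug '26 | Sep '26 | Oct '26 | Nov '26 | Dec '26 | Jan '27 | Feb '27 | Mar '27 | Apr '27 | May '27 | MAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Jan '25 | +53.6% | +50.6% | +51.6% | +52.8% | +53.8% | +55.0% | +56.1% | +57.2% | +58.3% | +59.4% | +60.5% | +61.6% | +62.7% | +63.8% | +64.9% | +66.0% | +67.1% | +68.2% | +69.3% | +70.4% | +71.5% | +72.6% | +73.7% | +74.8% | +75.9% | +77.1% | +78.1% | +79.2% | +80.3% | 147% |
| Feb '25 | | +52.2% | +52.0% | +53.2% | +54.4% | +55.6% | +56.7% | +57.9% | +59.1% | +60.3% | +61.5% | +62.7% | +63.9% | +65.1% | +66.2% | +67.4% | +68.6% | +69.8% | +70.9% | +72.1% | +73.4% | +74.5% | +75.7% | +76.9% | +78.1% | +79.3% | +80.4% | +81.6% | +82.8% | 176% |
| May '25 | | | | | +38.9% | +45.8% | +45.5% | +45.2% | +44.8% | +44.5% | +44.2% | +43.9% | +43.5% | +43.2% | +42.9% | +42.6% | +42.3% | +41.9% | +41.6% | +41.3% | +41.0% | +40.7% | +40.3% | +40.0% | +39.7% | +39.4% | +39.1% | +38.7% | +38.4% | 108% |
| Jun '25 | | | | | | +41.4% | +43.3% | +42.6% | +42.0% | +41.4% | +40.7% | +40.1% | +39.5% | +38.8% | +38.3% | +37.6% | +37.0% | +36.4% | +35.8% | +35.1% | +34.5% | +33.9% | +33.2% | +32.6% | +32.0% | +31.3% | +30.7% | +30.1% | +29.5% | 110% |
| Sep '25 | | | | | | | | | +32.0% | +35.8% | +34.6% | +33.4% | +32.1% | +30.9% | +29.8% | +28.5% | +27.3% | +26.1% | +24.9% | +23.6% | +22.4% | +21.2% | +20.0% | +18.8% | +17.5% | +16.3% | +15.2% | +13.9% | +12.7% | 86% |
| Nov '25 | | | | | | | | | | | +24.2% | +23.9% | +21.4% | +18.8% | +16.5% | +14.0% | +11.5% | +9.0% | +6.5% | +4.0% | +1.4% | -1.1% | -3.6% | -6.1% | -8.6% | -11.1% | -13.4% | -16.0% | -18.5% | 45% |
| Mar '26 | | | | | | | | | | | | | | | +10.6% | +8.8% | +5.8% | +2.7% | -0.3% | -3.4% | -6.5% | -9.5% | -12.6% | -15.6% | -18.7% | -21.8% | -24.6% | -27.7% | -30.7% | 58% |
| Apr '26 | | | | | | | | | | | | | | | | +21.2% | +12.7% | +10.3% | +8.0% | +5.7% | +3.3% | +1.0% | -1.3% | -3.6% | -6.0% | -8.4% | -10.5% | -12.9% | -15.2% | |
| May '26 | | | | | | | | | | | | | | | | | +12.7% | +10.3% | +8.0% | +5.7% | +3.3% | +1.0% | -1.3% | -3.6% | -6.0% | -8.4% | -10.5% | -12.9% | -15.2% | |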

Forecasts use ordinary least-squares (OLS) linear regression fitted to the YoY change series over a rolling 1Y window. Each row shows a vintage, that is, the forecast as it would have appeared at that point in time. Projected values apply the forecasted YoY change to the prior year's level, chaining forward where actuals are unavailable. MAPE (mean absolute percentage error) measures forecast accuracy against realized values. These are mechanical trend extrapolations, not economic models.
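The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dashboard's actual code: months are plain integer indices, the function names (`ols_fit`, `forecast_yoy`, `project_levels`, `mape`) are hypothetical, the YoY change is applied additively in percentage points (a multiplicative reading would scale the prior-year level instead), and the toy inputs are made up for the example.

```python
from typing import Dict, List, Tuple


def ols_fit(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Closed-form simple OLS; returns (intercept, slope)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def forecast_yoy(yoy: Dict[int, float], window: int = 12,
                 horizon: int = 12) -> Dict[int, float]:
    """Fit a line to the YoY observations in the trailing `window` months
    and extrapolate it `horizon` months past the last observation."""
    last = max(yoy)
    pts = [(m, v) for m, v in sorted(yoy.items()) if m > last - window]
    intercept, slope = ols_fit([m for m, _ in pts], [v for _, v in pts])
    return {m: intercept + slope * m
            for m in range(last + 1, last + horizon + 1)}


def project_levels(levels: Dict[int, float],
                   yoy_forecast: Dict[int, float]) -> Dict[int, float]:
    """Add each forecast YoY change (assumed additive, in pp) to the level
    12 months earlier. Projected levels are fed back in, so later months
    chain off earlier projections when no actual is available."""
    known = dict(levels)
    projected: Dict[int, float] = {}
    for m in sorted(yoy_forecast):
        base = known.get(m - 12)
        if base is None:
            continue  # no prior-year level to build on
        projected[m] = base + yoy_forecast[m]
        known[m] = projected[m]
    return projected


def mape(forecast: Dict[int, float], actual: Dict[int, float]) -> float:
    """Mean absolute percentage error over months present in both series."""
    common = [m for m in forecast if m in actual and actual[m] != 0]
    if not common:
        raise ValueError("no overlapping months to score")
    return 100 * sum(abs(forecast[m] - actual[m]) / abs(actual[m])
                     for m in common) / len(common)


# Toy data (made up): YoY change falls 1 pp per month, level rises 1 per month.
toy_yoy = {m: 20.0 - m for m in range(12)}
toy_levels = {m: 50.0 + m for m in range(12)}

toy_forecast = forecast_yoy(toy_yoy, window=12, horizon=14)
toy_projection = project_levels(toy_levels, toy_forecast)
toy_error = mape({12: toy_projection[12]}, {12: 50.0})
```

In the toy run, month 24 has no actual level 12 months earlier, so its projection builds on the projected level for month 12. That is the chaining step described above, and it is also why errors in a long-horizon vintage can compound.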