Moonshot AI Releases Kimi K2.7-Code: Coding Model Gains +21.8% on Kimi Code Bench v2 Over K2.6

admin June 13, 2026

This week, Moonshot AI released Kimi K2.7-Code, a coding-focused agentic model. The model weights are published on Hugging Face under a Modified MIT license. You can also access it through the Kimi API and Kimi Code.

K2.7-Code targets long-horizon software engineering, not general chat. It plans, organizes, implements, and corrects errors over multiple steps. Moonshot pairs the model with its surrounding coding platform, Kimi Code.

Kimi K2.7-Code

K2.7-Code is a Mixture-of-Experts (MoE) model. It holds 1T total parameters and activates 32B per token. The design uses 384 experts, with 8 selected per token plus 1 shared expert. It has 61 layers, including 1 dense layer.

Attention uses MLA, and the feed-forward activation is SwiGLU. A MoonViT vision encoder adds 400M parameters for image and video input. The model ships with native INT4 quantization. The context window is 256K (262,144) tokens.

Two constraints matter. Thinking mode is mandatory; disabling it returns an API error. Sampling is fixed: temperature 1.0, top_p 0.95, n 1, penalties 0.0. The default maximum output is 32,768 tokens.
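These constraints can be checked on the client before a request goes out. The sketch below is illustrative and not part of any Kimi SDK: `FIXED_SAMPLING` and `clean_request` are hypothetical names, and the penalty key names (`presence_penalty`, `frequency_penalty`) are assumed from the OpenAI-compatible request format. It drops sampling parameters that match the fixed values and raises on any that conflict.

```python
# Fixed server-side values as reported in the article. The penalty key
# names are an assumption based on the OpenAI-compatible request format.
FIXED_SAMPLING = {
    "temperature": 1.0,
    "top_p": 0.95,
    "n": 1,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
}


def clean_request(kwargs: dict) -> dict:
    """Return a copy of kwargs with the fixed sampling parameters removed.

    Raises ValueError if a fixed parameter is set to a different value,
    since the article says such requests are rejected.
    """
    cleaned = {}
    for key, value in kwargs.items():
        if key in FIXED_SAMPLING:
            if value != FIXED_SAMPLING[key]:
                raise ValueError(
                    f"{key}={value!r} conflicts with the fixed value "
                    f"{FIXED_SAMPLING[key]!r}"
                )
            continue  # matches the fixed value, so it is safe to omit
        cleaned[key] = value
    return cleaned


request = clean_request(
    {"model": "kimi-k2.7-code", "temperature": 1.0, "max_tokens": 4096}
)
print(request)  # {'model': 'kimi-k2.7-code', 'max_tokens': 4096}
```

Omitting the parameters entirely is the conservative choice here, since the server applies the fixed values either way.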

You can self-host it with vLLM, SGLang, or KTransformers. The Hugging Face download is large, about 595 GB on disk. This is a server-class deployment target, not a laptop model.
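A quick back-of-the-envelope calculation puts the architecture and disk figures in perspective. It is a rough sketch using only the numbers quoted above, and it assumes every weight is stored at 4 bits, which the article does not confirm.

```python
TOTAL_PARAMS = 1e12      # 1T total parameters (article figure)
ACTIVE_PARAMS = 32e9     # 32B activated per token (article figure)
ROUTED_EXPERTS = 384
SELECTED_EXPERTS = 8
SHARED_EXPERTS = 1
BITS_PER_WEIGHT = 4      # assumption: all weights stored as INT4

# Share of the model that does work on any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Experts consulted per token: the routed picks plus the shared expert.
experts_per_token = SELECTED_EXPERTS + SHARED_EXPERTS

# Storage needed if every parameter took exactly 4 bits.
int4_gb = TOTAL_PARAMS * BITS_PER_WEIGHT / 8 / 1e9

print(f"Active parameters per token: {active_fraction:.1%}")
print(
    f"Experts per token: {experts_per_token} "
    f"({SELECTED_EXPERTS} of {ROUTED_EXPERTS} routed + {SHARED_EXPERTS} shared)"
)
print(f"Rough INT4 weight size: {int4_gb:.0f} GB")
```

Under that assumption the weights alone come to roughly 500 GB, in the same range as the reported 595 GB. The article does not break down the difference.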

Benchmark

The Moonshot team published six benchmark rows. They compare K2.7-Code with K2.6, GPT-5.5, and Claude Opus 4.8. K2.7-Code beats K2.6 on every row. The largest gain in absolute points is on Kimi Code Bench v2, from 50.9 to 62.0.

Benchmark	Kimi K2.6	Kimi K2.7-Code	GPT-5.5	Claude Opus 4.8	K2.7 vs K2.6
Kimi Code Bench v2	50.9	62.0	69.0	67.4	+21.8%
Program Bench	48.3	53.6	69.1	63.8	+11.0%
MLS Bench Lite	26.7	35.1	35.5	42.8	+31.5%
Kimi Claw 24/7 Bench	42.9	46.9	52.8	50.4	+9.3%
MCP Atlas	69.4	76.0	79.4	81.3	+9.5%
MCP Mark Verified	72.8	81.1	92.9	76.4	+11.4%

K2.7-Code beats Opus 4.8 on MCP Mark Verified, 81.1 vs. 76.4. It is also close to GPT-5.5 on MLS Bench Lite, 35.1 vs. 35.5. K2.7-Code was run with Kimi Code CLI, GPT-5.5 with Codex at xhigh, and Opus 4.8 with Claude Code at xhigh.
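The last column of the table is a relative gain over the K2.6 score, not a difference in points. A short check, using only the scores from the table, reproduces both views (`relative_gain` is a helper name chosen for this example):

```python
# (K2.6 score, K2.7-Code score) per benchmark, copied from the table above.
scores = {
    "Kimi Code Bench v2": (50.9, 62.0),
    "Program Bench": (48.3, 53.6),
    "MLS Bench Lite": (26.7, 35.1),
    "Kimi Claw 24/7 Bench": (42.9, 46.9),
    "MCP Atlas": (69.4, 76.0),
    "MCP Mark Verified": (72.8, 81.1),
}


def relative_gain(old: float, new: float) -> float:
    """Percentage improvement of new over old."""
    return (new - old) / old * 100


for name, (old, new) in scores.items():
    print(f"{name}: {new - old:+.1f} points, {relative_gain(old, new):+.1f}%")
```

Kimi Code Bench v2 has the largest gain in points (+11.1), while MLS Bench Lite has the largest relative gain (+31.5%) because it starts from a lower base.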

Reasoning Token Efficiency: A Cost Claim, Not Just a Quality Claim

The Moonshot team reports about 30% lower thinking-token usage than K2.6. It describes this as reduced ‘overthinking.’

Thinking tokens are billed as output tokens on most rate cards. Agentic coding runs hundreds or thousands of steps. Every plan, retry, and verification pays the thinking cost again. A 30% reduction compounds over a long run.

The result shows up in three places at once. First, lower token cost per task. Second, faster steps, which enables interactive CLI sessions. Third, more steps before reaching context limits.
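To make the compounding concrete, here is a small worked example. The run length and per-step token count are invented for illustration; only the 30% figure comes from Moonshot's report.

```python
STEPS = 500                 # hypothetical agent run length
BASELINE_PER_STEP = 2_000   # hypothetical K2.6 thinking tokens per step
REDUCTION = 0.30            # ~30% lower usage, as reported by Moonshot

reduced_per_step = BASELINE_PER_STEP * (1 - REDUCTION)

# Thinking tokens spent over the whole run, before and after.
baseline_total = STEPS * BASELINE_PER_STEP
reduced_total = STEPS * reduced_per_step
saved = baseline_total - reduced_total

# With the same thinking-token budget, cheaper steps allow more steps.
steps_in_same_budget = baseline_total / reduced_per_step

print(f"Baseline thinking tokens: {baseline_total:,.0f}")
print(f"Reduced thinking tokens:  {reduced_total:,.0f}")
print(f"Tokens saved:             {saved:,.0f}")
print(f"Steps in the same budget: {steps_in_same_budget:.0f} vs {STEPS}")
```

Whatever the per-step figure, a 30% cut means the same budget covers 1 / 0.7, or about 43% more steps, if thinking tokens are the limiting factor.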

Use Cases with Examples

Repo-scale refactors are the main use case. Point the agent at a failing test suite. It reads the files, edits across modules, and runs the tests until they are green.
Code review is the second. Feed it the pull request diff and ask for a risk analysis. A 256K window holds large diffs, logs, and related files together.
MCP tool-use workflows are the third. K2.7-Code scored 81.1 on MCP Mark Verified. That suite checks for correct tool invocation over the Model Context Protocol. Think CI checks, ticket lookups, and file edits in one loop.
Long-context multimodal analysis is the fourth. The model accepts text, image, and video input. Documents, screenshots, and repro recordings can share a single prompt.
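For the review and long-context cases, it helps to check whether a bundle of diffs and logs plausibly fits before sending it. The sketch below uses a crude four-characters-per-token heuristic, which is an assumption and not the model's real tokenizer. It also assumes the output allowance counts against the same window. `estimate_tokens` and `fits_in_window` are hypothetical helper names.

```python
CONTEXT_WINDOW = 262_144   # tokens, from the article
MAX_OUTPUT = 32_768        # default maximum output, from the article
CHARS_PER_TOKEN = 4        # rough heuristic, NOT the model's tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count with ceiling division on character length."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_window(parts: dict, reserve_output: int = MAX_OUTPUT) -> tuple:
    """Return (fits, estimated_tokens) for a set of named text parts."""
    used = sum(estimate_tokens(text) for text in parts.values())
    budget = CONTEXT_WINDOW - reserve_output
    return used <= budget, used


# Placeholder strings stand in for a real diff and CI log.
bundle = {"diff": "x" * 400_000, "ci_log": "y" * 200_000}
ok, used = fits_in_window(bundle)
print(ok, used)  # True 150000
```

A real check should use the tokenizer that ships with the model, since code and logs often tokenize less efficiently than prose.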

Company-reported benchmarks and official API pricing. Released June 12, 2026. Verified June 12, 2026.

Source: Moonshot AI model card for Kimi K2.7-Code. K2.7-Code was run with Kimi Code CLI, GPT-5.5 with Codex at xhigh, and Claude Opus 4.8 with Claude Code at xhigh. These are first-party numbers, not an independent leaderboard.

Prices: cached input $0.19 / 1M, uncached input $0.95 / 1M, output $4.00 / 1M (official Kimi pricing). The savings estimate applies the reported ~30% lower thinking-token usage of K2.7-Code versus K2.6 to the thinking share of the output. Estimate only.
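The same estimate is easy to reproduce offline. This is a minimal sketch, reading the $0.95 figure as the uncached input rate. The monthly token volumes and the thinking share are made-up inputs, and `estimate_cost` and `output_after_savings` are hypothetical helper names.

```python
# USD per 1M tokens, from the pricing line above.
PRICE_CACHED_INPUT = 0.19
PRICE_UNCACHED_INPUT = 0.95
PRICE_OUTPUT = 4.00


def estimate_cost(cached_in: float, uncached_in: float, output: float) -> float:
    """Cost in USD for the given token counts."""
    return (
        cached_in * PRICE_CACHED_INPUT
        + uncached_in * PRICE_UNCACHED_INPUT
        + output * PRICE_OUTPUT
    ) / 1_000_000


def output_after_savings(output: float, thinking_share: float,
                         reduction: float = 0.30) -> float:
    """Shrink only the thinking part of the output by the reported reduction."""
    return output * (1 - thinking_share * reduction)


# Hypothetical monthly workload, sized as if it ran on K2.6.
cached_in, uncached_in, output = 200e6, 50e6, 20e6
thinking_share = 0.6  # assumed fraction of output that is thinking tokens

before = estimate_cost(cached_in, uncached_in, output)
after = estimate_cost(
    cached_in, uncached_in, output_after_savings(output, thinking_share)
)
print(f"Without savings: ${before:,.2f}")
print(f"With savings:    ${after:,.2f}")
```

Only the output line moves in this model. Input costs are unchanged because the reported reduction applies to thinking tokens, which are billed as output.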

Source: Kimi K2.7-Code Hugging Face model card and Kimi API documentation.

A Short Quickstart

The Kimi API is OpenAI-compatible. The model string is kimi-k2.7-code. Do not send the fixed sampling parameters, or the request returns an error.

import os
from openai import OpenAI

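# Base URL and key per the Kimi API docs at platform.moonshot.ai.
# MOONSHOT_BASE_URL is a placeholder variable name; set it to the base URL from those docs.
client = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url=os.environ["MOONSHOT_BASE_URL"],
)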

messages = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Refactor utils.py to remove duplicate code."},
]

resp = client.chat.completions.create(
    model="kimi-k2.7-code",
    messages=messages,
    max_tokens=32768,  # default cap; also the maximum
    # thinking is enabled by default and cannot be disabled.
    # temperature (1.0), top_p (0.95), n (1), and penalties (0.0) are
    # fixed server-side. Passing any other value returns an error.
)

msg = resp.choices[0].message
print(msg.content)

# Multi-step tool calls: append the full assistant message so that
# reasoning_content is preserved. Dropping it errors on the next turn.
# messages.append(msg.model_dump())

Two rules for tool use appear in the documentation. Keep reasoning_content from the current turn in the context. And set tool_choice to either "auto" or "none".
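Both rules can be enforced with a few lines of plain Python around the message list. The sketch below uses plain dictionaries in the OpenAI chat format rather than live API calls. The `run_tests` tool, the message contents, and the two helper names are hypothetical.

```python
ALLOWED_TOOL_CHOICE = {"auto", "none"}


def validate_tool_choice(tool_choice: str) -> str:
    """Reject tool_choice values other than "auto" or "none"."""
    if tool_choice not in ALLOWED_TOOL_CHOICE:
        raise ValueError(
            f"tool_choice must be one of {sorted(ALLOWED_TOOL_CHOICE)}"
        )
    return tool_choice


def append_assistant_turn(messages: list, assistant_message: dict) -> None:
    """Append the whole assistant message so reasoning_content stays in context."""
    if "reasoning_content" not in assistant_message:
        raise ValueError("assistant message is missing reasoning_content")
    messages.append(dict(assistant_message))


# Hypothetical assistant turn that requests a tool call.
assistant_message = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "The tests must run before any edit.",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "run_tests", "arguments": "{}"},
        }
    ],
}

messages = [{"role": "user", "content": "Fix the failing tests."}]
append_assistant_turn(messages, assistant_message)
# The tool result follows, keyed to the tool call id.
messages.append(
    {"role": "tool", "tool_call_id": "call_1", "content": "2 failed, 40 passed"}
)

tool_choice = validate_tool_choice("auto")
print(len(messages), "reasoning_content" in messages[1])  # 3 True
```

With the OpenAI Python client, `msg.model_dump()` from the quickstart produces a dictionary of this kind, so it can be passed to `append_assistant_turn` directly.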

How K2.7-Code Compares

Model	License	Parameters	Context	API price (in/out per 1M)
Kimi K2.7-Code	Modified MIT (open weights)	1T total / 32B active	256K	$0.95 / $4.00
Kimi K2.6	Open weights	1T-class MoE	256K	~$0.67–0.95 / ~$3.39–4.00
GPT-5.5	Closed	Not disclosed	–	Not in Moonshot's table
Claude Opus 4.8	Closed	Not disclosed	1M	$5.00 / $25.00
Qwen3-Coder-480B-A35B	Open (Qwen license)	480B / 35B active	256K native	Varies by host

K2.7-Code lists cached input at $0.19 per 1M tokens.

Strengths and Weaknesses

Strengths:

Open weights under a Modified MIT license, with real self-hosting options.
Broad, consistent gains over K2.6 on coding and agentic benchmarks.
Low API cost relative to closed frontier models.
It beats Opus 4.8 on the MCP Mark Verified benchmark (company-reported).

Weaknesses:

All headline numbers are first-party at launch.
Thinking mode cannot be disabled.
Sampling controls are locked to fixed values.
Multi-step tool calls must preserve reasoning_content.
595 GB of weights make self-hosting a big commitment.

Key Takeaways

All benchmarks are vendor-run; independent results are pending.
K2.7-Code is an open-weight, coding-specialized model built on Kimi K2.6.
Moonshot reports +21.8% on Kimi Code Bench v2 over K2.6.
The model uses about 30% fewer thinking tokens than K2.6.
