China’s Z.ai Releases GLM-5.2: An Open-Source Coding Model That Matches GPT-5.5 for a Fraction of the Cost

On June 13, Z.ai, the Chinese AI company formerly known as Zhipu AI, released GLM-5.2, its latest flagship large language model. The model is available under the MIT open-source license with no regional restrictions, making it one of the most capable openly downloadable AI systems ever released. Early benchmark results show it matching or beating OpenAI’s GPT-5.5 on several long-horizon coding tasks, while API pricing comes in at roughly one-sixth the cost of the leading proprietary alternatives.

GLM-5.2 is the third major release in Z.ai’s GLM-5 family this year, following GLM-5 in February and GLM-5.1 in April. The model uses a Mixture of Experts architecture with approximately 744 billion total parameters and 40 billion active per forward pass. Its headline feature is a usable 1-million-token context window, up from 200,000 tokens in previous versions, enabling the model to work across entire codebases and multi-hour engineering sessions.

Z.ai released GLM-5.2 under the MIT license, allowing unrestricted use, modification, commercial deployment, and self-hosting on private infrastructure. The model weights are available on Hugging Face as `zai-org/GLM-5.2`.

The open licensing carries extra weight given the current geopolitical climate. A week before GLM-5.2’s launch, the Trump administration issued an export-control directive that forced Anthropic to take its Claude Fable 5 model entirely offline. Z.ai explicitly markets its MIT license as protection against such disruptions: a model hosted on your own servers cannot be remotely revoked.

“Technical access without borders,” the company writes in its technical documentation.

Benchmarks: How It Stacks Up

Z.ai published results across a wide range of coding and reasoning benchmarks. The table below compares GLM-5.2 against OpenAI’s GPT-5.5 and Anthropic’s Claude Opus 4.8 on key evaluations:

| Benchmark | GLM-5.2 | GPT-5.5 | Claude Opus 4.8 |
|---|---|---|---|
| FrontierSWE (long-horizon) | 74.4% | 72.6% | 75.1% |
| SWE-bench Pro | 62.1 | 58.6 | N/A |
| Terminal-Bench 2.1 | 81.0 | N/A | 85.0 |
| MCP-Atlas (tool use) | 77.0 | 75.3 | 77.8 |
| HLE w/ tools | 54.7 | 52.2 | 57.9 |
| AIME 2026 (math) | 99.2 | 98.3 | 95.7 |

On FrontierSWE, a benchmark designed to measure an AI agent’s ability to complete open-ended engineering projects over hours or tens of hours, GLM-5.2 scored 74.4%, edging out GPT-5.5’s 72.6% and landing within 0.7 points of Claude Opus 4.8 (75.1%).

On SWE-bench Pro, which evaluates real-world software engineering tasks across 41 professional repositories, GLM-5.2 scored 62.1, ahead of GPT-5.5 at 58.6. This is the first time an open-weights model has led a top-tier proprietary model on this benchmark.

Terminal-Bench 2.1, which tests command-line task completion, shows GLM-5.2 at 81.0, 4 points behind Claude Opus 4.8 at 85.0. According to the official Z.ai blog, that makes GLM-5.2 “the strongest open-source model” on the benchmark.

On MCP-Atlas, a benchmark measuring tool-orchestration via the Model Context Protocol, GLM-5.2 scored 77.0, ahead of GPT-5.5’s 75.3 and within 0.8 points of Opus 4.8.

The model also secured first place on the crowdsourced Design Arena benchmark with an Elo rating of 1360, outperforming Claude Fable 5. On the AIME 2026 math competition, it scored 99.2, the highest among the three systems in the comparison.

The picture is not universally favorable to Z.ai. On SWE-Marathon, an ultra-long-horizon benchmark covering compiler construction and production-grade service development, GLM-5.2 scored 13.0, half of Claude Opus 4.8’s 26.0. Z.ai itself describes the model as “still having room to grow” on that evaluation.

Six Times Cheaper

Z.ai’s API pricing for GLM-5.2 is $1.40 per million input tokens and $4.40 per million output tokens, for a combined $5.80. Cached inputs cost just $0.26 per million tokens with a limited-time free storage promotion.

By comparison, GPT-5.5 costs $5.00 input and $30.00 output for a combined $35.00. Claude Opus 4.8 runs $5.00 input and $25.00 output, for a combined $30.00. At equal input and output token counts, that makes GLM-5.2 roughly six times cheaper than GPT-5.5 and about five times cheaper than Claude Opus 4.8.
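The "six times" figure depends on the mix of input and output tokens. The sketch below reproduces the arithmetic from the list prices quoted above. The two workloads are hypothetical, and cache discounts and subscription plans are ignored.

```python
# Prices in USD per million tokens, as quoted in this article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload at list price, ignoring cache discounts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def ratio_vs_glm(model: str, input_tokens: int, output_tokens: int) -> float:
    """How many times more expensive `model` is than GLM-5.2 for this workload."""
    return cost_usd(model, input_tokens, output_tokens) / cost_usd(
        "GLM-5.2", input_tokens, output_tokens
    )


if __name__ == "__main__":
    # Hypothetical workloads: (input tokens, output tokens).
    workloads = {
        "balanced": (1_000_000, 1_000_000),
        "input-heavy": (9_000_000, 1_000_000),
    }
    for name, (n_in, n_out) in workloads.items():
        for model in ("GPT-5.5", "Claude Opus 4.8"):
            ratio = ratio_vs_glm(model, n_in, n_out)
            print(f"{name:12s} {model:16s} {ratio:.2f}x")
```

On the balanced workload the GPT-5.5 ratio is about 6.0x. On the input-heavy workload it falls to about 4.4x, because the price gap is narrower on input tokens (about 3.6x) than on output tokens (about 6.8x).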

For self-hosted deployments, the cost drops to electricity and hardware alone, making it practical for enterprises processing millions of codebase interactions per month.

Architecture: IndexShare and Thinking Modes

GLM-5.2 introduces IndexShare, a sparsified attention mechanism that reuses a single lightweight indexer across every four transformer layers. At the full 1-million-token context length, this reduces per-token floating-point operations by 2.9x.
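The sharing pattern can be illustrated with a toy example. This is not Z.ai's implementation: the article gives only the group size of four. The `indexer` callable, the top-k selection rule, and the choice to run the indexer at the first layer of each group are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

GROUP_SIZE = 4  # from the article: one indexer shared across four layers


def top_k_indices(scores: List[float], k: int) -> List[int]:
    """Positions of the k highest scores, returned in ascending position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def run_layers(
    num_layers: int,
    indexer: Callable[[int], List[float]],  # hypothetical: layer -> score per position
    k: int,
) -> Tuple[List[List[int]], int]:
    """Return the token positions each layer attends to, plus the indexer call count."""
    selections: List[List[int]] = []
    shared: List[int] = []
    calls = 0
    for layer in range(num_layers):
        if layer % GROUP_SIZE == 0:
            # Assumed: the first layer of each group runs the indexer.
            shared = top_k_indices(indexer(layer), k)
            calls += 1
        # The other layers in the group reuse the same selection.
        selections.append(shared)
    return selections, calls


if __name__ == "__main__":
    def toy_indexer(layer: int) -> List[float]:
        # Made-up relevance scores for a 16-token context.
        return [float((pos * 7 + layer) % 11) for pos in range(16)]

    selections, calls = run_layers(num_layers=8, indexer=toy_indexer, k=4)
    print(calls)  # 2 indexer calls for 8 layers
```

In this toy version, the indexing work drops to a quarter of what one indexer per layer would cost. The 2.9x figure quoted above is Z.ai's own measurement and cannot be derived from this sketch.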

A multi-token prediction layer increases the number of tokens accepted during speculative decoding by 20%, accelerating inference.
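For readers unfamiliar with the metric, the following sketch shows what "tokens accepted" means in a simplified, greedy form of speculative decoding: a cheap drafter proposes several tokens, and the full model keeps the longest prefix it agrees with. The token ids and the baseline acceptance length are hypothetical, and the article does not say how Z.ai's verification step works.

```python
from typing import Sequence


def accepted_prefix_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Number of leading draft tokens that match the target model's own choices."""
    n = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        n += 1
    return n


def tokens_per_step(mean_accepted: float) -> float:
    """Each verification pass yields the accepted draft tokens plus one token
    produced by the target model itself (a correction or a bonus token)."""
    return mean_accepted + 1.0


if __name__ == "__main__":
    draft = [11, 42, 7, 99]   # hypothetical token ids proposed by the drafter
    target = [11, 42, 8, 99]  # hypothetical token ids the full model would pick
    print(accepted_prefix_length(draft, target))  # 2: the third token disagrees

    baseline = 2.5              # hypothetical mean accepted tokens per step
    improved = baseline * 1.20  # the 20% increase cited in the article
    print(tokens_per_step(improved) / tokens_per_step(baseline))
```

Under these assumptions, a 20% rise in accepted tokens gives a smaller rise in tokens emitted per verification pass, since every pass already yields one token from the full model.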

The model offers two selectable thinking modes: Max for peak reasoning performance (averaging around 85,000 output tokens per complex task) and High for a balance of speed and token efficiency. Z.ai recommends Max as the default for coding work.

The company says GLM-5.2 was trained using slime, its asynchronous reinforcement learning infrastructure, which merged expertise from over 10 specialist models in roughly two days of parallel training. A critic-based PPO formulation with token-level loss handles the variable-length outputs typical of agentic coding sessions.
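The article does not define "token-level loss". One common reading is that the loss is averaged over every token in the batch, rather than averaged per sequence first. The sketch below contrasts the two aggregation schemes on made-up per-token losses. It assumes that reading and does not reproduce the PPO objective or any of Z.ai's training code.

```python
from typing import List


def sequence_level_mean(per_token_losses: List[List[float]]) -> float:
    """Average each sequence first, then average across sequences.
    Every sequence gets equal weight regardless of its length."""
    seq_means = [sum(seq) / len(seq) for seq in per_token_losses]
    return sum(seq_means) / len(seq_means)


def token_level_mean(per_token_losses: List[List[float]]) -> float:
    """Pool every token in the batch and average once.
    Every token gets equal weight, so long outputs count for more."""
    total = sum(sum(seq) for seq in per_token_losses)
    count = sum(len(seq) for seq in per_token_losses)
    return total / count


if __name__ == "__main__":
    # Hypothetical batch: one short output (2 tokens) and one long output (8 tokens).
    batch = [[1.0, 1.0], [0.5] * 8]
    print(sequence_level_mean(batch))  # 0.75
    print(token_level_mean(batch))     # 0.6
```

With per-sequence averaging, a token in a very long agentic trace contributes far less than a token in a short one. Pooling across tokens removes that length-dependent dilution.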

Z.ai also implemented an “anti-hacking” module that detects when the model tries to cheat on evaluation benchmarks by reading protected artifacts, copying answers from upstream commits, or fetching target source files. This two-stage detection system runs during training, not at inference time.

Developer Ecosystem and Adoption

GLM-5.2 is compatible from day one with over 20 third-party coding environments, including Claude Code, Cline, Kilo Code, OpenClaw, Roo Code, and Goose. Developers can point their existing configuration files at the model and begin using it immediately.

Cline IDE described it as “the first open-weights model to cross 80% on Terminal-Bench” and “a frontier-level model for a fraction of the cost.”

Z.ai also offers GLM Coding Plan subscriptions ranging from $12.60 per month (Lite) to $112 per month (Max) for users who prefer managed access over self-hosting.

What It Means

GLM-5.2 is the strongest case yet that open-weights models can compete with proprietary frontier systems on real software engineering. The combination of MIT licensing, list pricing five to six times below the closed alternatives compared here, and benchmark results that match or beat GPT-5.5 on multiple coding evaluations creates a compelling option for enterprises concerned about vendor lock-in and export-control exposure.

The gap to Claude Opus 4.8 remains real, particularly on ultra-long-horizon tasks like SWE-Marathon where Z.ai’s model scored half of Anthropic’s. But with Z.ai having shipped four flagship model releases in roughly four months, the distance between Chinese open-source AI and American proprietary frontier models is shrinking faster than many expected.

We are in a new phase of the AI race: one where the most capable model you can download and run yourself may not come from Silicon Valley.


Sources: Z.ai blog (June 16, 2026); VentureBeat (June 16, 2026); Tech Startups (June 17, 2026); Frandroid (June 17, 2026)
