Kimi K2.5 brings a "swarm moment"
The K2.5 update has generated a lot of discussion in China and abroad over the past two days. It is a native multimodal model with state-of-the-art coding and vision capabilities, and it introduces an autonomous agent swarm paradigm: summoning a group of agents to complete a task. It sounds incredibly cool.
Multiple agents with various skills, so cool and fun!
K2.5 is now fully released and available right away in the Kimi clients. K2.5 Agent offers a free trial, while the K2.5 cluster mode (Agent Swarm) is a paid feature, currently available only on the Allegretto plan. Subscriptions also come with a points quota: starting at 47 points per month, with each task consuming 3 points.
Overall, that is enough for ordinary use. If you're unsure, you can enter today's giveaway and try it out first.
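For a sense of scale, here is the quota arithmetic worked out. This is a minimal sketch that uses only the two figures quoted above; treating the quota as a simple integer budget is an assumption, and the function name is made up for illustration.

```python
# Figures quoted in the article. Modeling the quota as a plain integer
# budget with a flat per-task cost is an assumption.
POINTS_PER_MONTH = 47
POINTS_PER_TASK = 3


def tasks_per_month(points: int = POINTS_PER_MONTH, cost: int = POINTS_PER_TASK) -> tuple[int, int]:
    """Return (whole tasks affordable, points left over)."""
    return divmod(points, cost)


tasks, leftover = tasks_per_month()
print(f"{tasks} swarm tasks per month, {leftover} points left over")
```

On the entry quota that works out to 15 tasks a month, with 2 points to spare.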

However, as a long-time Kimi user, of course I had to buy it. I happened to have a bunch of files that needed merging, and I was too lazy to copy and paste them manually, so I sent them to Kimi and enabled cluster mode to handle them all at once.

In cluster mode, Kimi has added a design touch: a name tag drops down for each agent, so you can see which "person in charge" is handling each task.

The merged document turned out quite well. I then asked Kimi to reorganize and adjust the subheadings at each level, which let it work through an analyze, propose, execute workflow. It's best to download the document and check the formatting locally, though, because Kimi's built-in preview sometimes doesn't accurately reflect the changes made in each round.
To further test its concurrency, I borrowed a task from the official demo: retrieve all the literature on agent swarms from the past three months, compile it into an Excel spreadsheet, and extract each paper's core findings and research innovations.

This time more "personnel" were lined up, with various agents rushing in to help, each with its own assigned task.

This took significantly longer than the first task, but that's okay; I could leave it running in the background. Meanwhile, I gave Kimi another task to test its multimodal capabilities.

This is the source design I uploaded to Kimi; the video version has more animation. Kimi's task was to convert the design into a webpage while preserving all of its design elements and style. The prompt is simple, but the work behind it is complex: recognizing and understanding the image, generating the image assets, and writing the front-end code.

This task took a considerable amount of time, but the final result was excellent. There were a few minor issues with image layout, hover effects, and navigation. However, the core design elements were retained, and the website was fully functional.

Checking back, the literature search task had also finished and produced a neat Excel spreadsheet:

The final test was to find influencers on Xiaohongshu (Little Red Book), specifically tech bloggers with more than 5,000 followers and more than 100 posts. Both conditions are quite lenient, so the search space is very broad.

Kimi's first problem was that it couldn't access Xiaohongshu. This could have been handled by proactively asking the user for help, similar to the approach GPT Agent takes.
But that isn't what happened. Kimi instead went to Newrank to scrape data, which sidestepped the site's access restrictions and let it read the numbers directly. This wasn't a very good strategy: it could find only a small number of bloggers, obviously far fewer than actually exist on Xiaohongshu. Being shut out of the platform also meant it couldn't show off its vision capabilities, since it was only scraping ready-made data.

Overall, however, Agent Swarm gives a sense of reliability. Can a single agent do these tasks? Of course it can, but it takes longer and is more prone to errors. Having a group work on them is more reassuring.
Where is the "innovation"?
At this point, you might ask: isn't this just multi-agent? Plenty of companies are doing that.
The key difference lies in "who will be the boss".
In traditional multi-agent systems, humans need to pre-design the entire workflow: who is responsible for what, what comes first, and how the results are summarized. It's like building with blocks; you have to draw the blueprints first. The core innovation of Agent Swarm lies in the fact that AI itself is the designer.
Kimi's team used a training method called PARL (Parallel-Agent Reinforcement Learning) to teach the model to "decompose tasks" and "allocate resources". You don't need to tell it "send 3 people to search for information first, then send 2 people to write the summary". It can decide on its own how many parts the task should be broken into, who should do each part, and which parts should run in parallel and which sequentially.
In other words, Multi-Agent is a "symphony orchestra arranged by humans," while Agent Swarm is a jazz ensemble assembled by AI itself.
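The contrast can be sketched in a few lines. Everything here is illustrative: `Step`, `fixed_workflow`, `toy_planner`, and `swarm_workflow` are hypothetical names, and the comma-splitting planner is only a stand-in for the trained orchestrator. It says nothing about how PARL actually works. The point is just where the plan comes from.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Step:
    role: str       # which kind of sub-agent takes the step
    goal: str       # what that sub-agent is asked to do
    parallel: bool  # True if the step may run alongside its siblings


def fixed_workflow(task: str) -> list[Step]:
    """Traditional multi-agent: a human drew this blueprint in advance."""
    searchers = [Step("searcher", f"search part {i} of: {task}", True) for i in range(1, 4)]
    writers = [Step("writer", f"summarize part {i} of: {task}", False) for i in range(1, 3)]
    return searchers + writers


def toy_planner(task: str) -> list[Step]:
    """Hypothetical stand-in for the orchestrator model (not PARL itself):
    one parallel worker per comma-separated item, then a sequential merge."""
    items = [part.strip() for part in task.split(",") if part.strip()]
    steps = [Step("worker", f"handle {item}", True) for item in items]
    steps.append(Step("lead", "merge the workers' results", False))
    return steps


def swarm_workflow(task: str, planner: Callable[[str], list[Step]] = toy_planner) -> list[Step]:
    """Swarm style: the plan is produced at run time by the planner, not by a human."""
    return planner(task)


for task in ["papers, patents", "papers, patents, blog posts, benchmarks"]:
    print(len(fixed_workflow(task)), len(swarm_workflow(task)))
```

The hand-written plan always has five steps (three searchers, two writers) no matter what it is given, while the planner's output grows or shrinks with the task: three steps for the first input, five for the second.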

Another easily confused concept is MoE: Mixture of Experts. Many mainstream large models use the MoE architecture internally, but MoE is a completely different thing from Agent Swarm.
MoE happens inside the model. You can think of it as a group of "experts" living inside the model: each time an input is processed, the model dynamically decides which experts to activate. However, these experts have no independent identities and do not collaborate with each other; they are simply different computational paths within the model.
Agent Swarm happens outside the model. Each sub-agent is a relatively independent execution unit with its own task objective; it can run in parallel with the others and can even invoke tools (such as searching web pages or writing code). The relationship between sub-agents is genuine "collaboration", not mere "activation".
To use a somewhat imprecise analogy: MoE is like the division of labor between regions of one person's brain, while Agent Swarm is like team collaboration in a company.
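To make the "activation" side of that analogy concrete, here is a toy top-k MoE layer in plain Python. It is a sketch under heavy simplification: real MoE layers work on tensors and compute the gate scores from the input with a learned router, whereas here the scores are passed in by hand and the "experts" are just small functions chosen for illustration.

```python
import math
from typing import Callable


def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_layer(
    x: float,
    gate_logits: list[float],
    experts: list[Callable[[float], float]],
    top_k: int = 2,
) -> float:
    """Send x through the top_k highest-scoring experts and mix their
    outputs using the renormalized gate weights."""
    probs = softmax(gate_logits)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return sum(probs[i] / norm * experts[i](x) for i in chosen)


# Four toy "experts"; only two are activated for any given input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
y = moe_layer(3.0, gate_logits=[0.0, 5.0, 5.0, -5.0], experts=experts)
print(y)  # experts 1 and 2 tie for the top score: 0.5 * 6 + 0.5 * 9 = 7.5
```

Note what is missing: no expert has a goal of its own, calls a tool, or hands anything to another expert. The whole thing is a weighted sum inside a single forward pass, which is exactly why it is not a swarm.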
Based on real-world testing and official demonstrations, Agent Swarm performs exceptionally well in at least the following task categories:
The first category is large-scale information collection. Examples include the survey of creators across 100 fields in the official demo and the Xiaohongshu blogger search in our test. What these tasks have in common is that they are "parallelizable": each subtask is relatively independent and needs little intermediate coordination.

The second category is complex tasks involving both vision and code. Kimi K2.5 emphasizes that it is a "native multimodal" model, capable of understanding images and videos. When combined with Agent Swarm, it can analyze UI screenshots while dispatching different agents to handle layout, style, and interaction logic, ultimately generating complete front-end code.

The third category is long document processing. The official documentation states that Kimi Agent can handle "a 10,000-word paper or a 100-page document," supporting advanced features such as Word annotations, Excel pivot tables, and LaTeX formulas. Agent Swarm can break a long document into chapters, have different agents process them in parallel, and then merge the results into a unified format, just as in the first test above.
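The "parallelizable" pattern behind the first and third categories is a plain fan-out and gather. Below is a minimal sketch using Python's `asyncio`; `sub_agent` and `collect` are hypothetical names, the sleep stands in for real search and model latency, and the cap of 10 concurrent workers is an arbitrary assumption rather than anything Kimi has published.

```python
import asyncio


async def sub_agent(field: str) -> dict:
    """Hypothetical sub-agent: a real one would search the web or call tools here."""
    await asyncio.sleep(0.01)  # stands in for network and model latency
    return {"field": field, "creators": [f"{field}-creator-{i}" for i in range(3)]}


async def collect(fields: list[str], max_parallel: int = 10) -> list[dict]:
    """Fan out one sub-agent per field, at most max_parallel at a time,
    then gather the results in the same order as the input."""
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(field: str) -> dict:
        async with limit:
            return await sub_agent(field)

    return await asyncio.gather(*(guarded(f) for f in fields))


results = asyncio.run(collect([f"field-{n}" for n in range(100)]))
print(len(results))
```

Because no subtask needs another's output, the only coordination is the final gather, which is what makes this kind of job such a comfortable fit for a swarm.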
However, don't get too excited yet; Agent Swarm isn't a cheat code. In practical use, you'll find several obvious limitations:
First, the task itself must be decomposable. If there are strong dependencies between its steps, such as "work out the argument first, then find evidence, and finally write the conclusion", forcing them to run in parallel will do more harm than good.
Second, costs increase significantly. 100 agents working simultaneously means roughly 100 times as many API calls. Although the total time is reduced, the token consumption is substantial.
Third, the quality isn't necessarily better than a single agent's. For certain tasks requiring deep reasoning, such as mathematical proofs or complex programming problems, the "deep thinking mode" of a single agent is actually more reliable. Agent Swarm's advantage lies in "breadth" and "speed", not "depth". In actual testing, Kimi automatically switched to single-agent mode for some tasks, a behavior confirmed by Kimi team members in their Q&A on Reddit.
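The first two limitations can be seen in a small scheduling sketch. It uses the standard-library `graphlib` module (Python 3.9+) to group tasks into "waves" that can run together; the task names and the `parallel_waves` helper are made up for illustration, and counting one agent call per task is a simplifying assumption.

```python
from graphlib import TopologicalSorter


def parallel_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves. Every task in a wave has all of its
    dependencies finished, so the whole wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Strongly dependent steps: each one needs the previous one's output.
chain = {"argument": set(), "evidence": {"argument"}, "conclusion": {"evidence"}}

# Independent subtasks followed by one merge step.
survey: dict[str, set[str]] = {f"field-{i}": set() for i in range(4)}
survey["merge"] = set(survey)

for name, deps in [("chain", chain), ("survey", survey)]:
    waves = parallel_waves(deps)
    calls = sum(len(wave) for wave in waves)
    print(f"{name}: {calls} agent calls in {len(waves)} sequential waves")
```

The chain needs three calls in three waves, so extra agents buy nothing. The survey needs five calls in two waves: the wall-clock time drops, but the number of calls, and therefore the token bill, does not.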

The future as seen by Kimi's team
During a Reddit AMA (Ask Me Anything) session, Kimi's team answered numerous questions about technology, products, and vision. Through these answers, we can piece together their thoughts on Agent Swarm and even the future of AI as a whole.
When asked about the next steps for Agent Swarm, Kimi's team revealed several directions:
[Smarter Scheduling] The current Agent Swarm can automatically decompose tasks and create sub-agents, but its scheduling strategy is still relatively "coarse-grained". The team hopes to build finer-grained resource allocation in the future, for example dynamically deciding "how many people to send and how long they should work" based on the urgency, complexity, and dependencies of the task.
[Deeper Collaboration] Currently, communication between sub-agents is limited and mostly consists of "each one finishing its work and submitting the result to the lead for aggregation". In the future, direct collaboration between sub-agents may be supported, such as "Agent A, on discovering a problem, proactively calling Agent B for assistance".
[Wider Tool Integration] The Kimi team stated that they are expanding the library of tools the agent can call, including but not limited to more office software, development environments, and data analysis tools. The goal is to enable Agent Swarm to truly complete complex workflows "end-to-end".
Another interesting question from the AMA was: many people say that scaling laws have reached their limit. How does Kimi's team view this?
Kimi's team responded that Agent Swarm is their first attempt in this direction. Looking to the future, perhaps a model will emerge that requires few or no human priors.

This vision may sound idealistic, but on closer examination it has profound implications. For the past two years, the AI field has been focused on "parameter scaling": models keep getting larger and more expensive. Agent Swarm represents a different approach. Instead of having a single superbrain do everything, have a group of brains work together, each with its own task.
This may be a more pragmatic path to AGI: a single bee may seem insignificant, but when thousands of bees work together, they can build intricate hives.
#Welcome to follow iFanr's official WeChat account: iFanr (WeChat ID: ifanr), where more exciting content will be presented to you as soon as possible.










