Agent - LLM Notes

Agent

Related Work

[2026.01] Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces (a benchmark for agents in command-line environments)
[2026.02] SWE-Universe: Scale Real-World Verifiable Environments to Millions (large-scale, real-world, verifiable software-development environments)
[2026.02] GLM-5: from Vibe Coding to Agentic Engineering
[2026.01] When Single-Agent with Skills Replace Multi-Agent Systems and When They Fail
[2025.11] [Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory](https://arxiv.org/abs/2511.20857)
[2025.09] The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
[2025.09] Effective context engineering for AI agents (Anthropic)
[2025.08] A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence
[2025.05] Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration (System 1 and System 2)
[2025.03] Why Do Multi-Agent LLM Systems Fail?
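- Why do multi-agent systems keep failing?
[2024.10] AutoGLM: Autonomous Foundation Agents for GUIs (three insights: intermediate interface design, self-evolving curriculum RL, and policy distribution drift)
- AutoGLM demo video
[2024.03] "In-depth long read" Andrew Ng: the 4 most common AI agent design patterns (reflection, tool use, planning, multi-agent collaboration)
[2024.02] Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models (somewhat similar to deep research)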
- stanford-oval/storm
[2024.01] Agent AI: Surveying the Horizons of Multimodal Interaction
- [Agent AI: A Survey of the Frontiers of Multimodal Interaction (Fei-Fei Li's team)](https://zhuanlan.zhihu.com/p/12759357195)
[2023.12] An LLM Compiler for Parallel Function Calling
- SqueezeAILab/LLMCompiler
[2023.03] Reflexion: Language Agents with Verbal Reinforcement Learning
- noahshinn/reflexion
[2022.10] ReAct: Synergizing Reasoning and Acting in Language Models (ReAct, from Google: query, think, action, result)
- Nine agent design patterns (diagrams + code)
- ysymyth/ReAct
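
The query, think, action, result cycle noted for ReAct above can be sketched as a small loop. This is only an illustrative sketch, not the code from the paper or from ysymyth/ReAct: `scripted_model`, `TOOLS`, and `react_loop` are hypothetical names, and the "model" is a hard-coded stub standing in for a real LLM call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: tool name -> function from string input to string result.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

# One step of history: (thought, action, result).
Step = Tuple[str, str, str]


def scripted_model(query: str, history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call (assumption: a real agent would prompt a model here).

    Returns (thought, action, action_input).
    """
    if not history:
        return ("I need to compute the sum first.", "add", query)
    # A tool result is already available, so finish with it.
    last_result = history[-1][2]
    return ("I have the result, so I can answer.", "finish", last_result)


def react_loop(query: str,
               model: Callable[[str, List[Step]], Tuple[str, str, str]] = scripted_model,
               max_steps: int = 5) -> str:
    """Alternate think -> action -> result until the model chooses 'finish'."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, action_input = model(query, history)
        if action == "finish":
            return action_input
        result = TOOLS[action](action_input)  # run the chosen tool
        history.append((thought, action, result))  # feed the result back next turn
    raise RuntimeError("no answer within max_steps")


if __name__ == "__main__":
    print(react_loop("2+3"))
```

Swapping `scripted_model` for a function that prompts an actual model, and parsing its text into a thought, action, and action input, would be the main change needed to turn this into a working agent; the `max_steps` cap is there so a model that never finishes cannot loop forever.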

Personalization

[2025.04] ADAPT: Actively Discovering and Adapting to Preferences for any Task

Memory

[2025.12] Nested Learning: The Illusion of Deep Learning Architecture

Open-Source Projects

mem0ai/mem0 (a memory layer for agents)
Perplexica (an AI search engine)
bytedance/deer-flow
