Making Agents Call Tools and Reuse Skills More Reliably; Identifying and Mitigating Model Safety, Jailbreak, and Alignment Risks

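This issue draws on the 2026-06-01 paper feeds: 273 candidate papers were fetched and deduplicated, and from these 5 featured papers and 15 additional papers of interest were selected.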

Directions Most Worth Following Today

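Today's top-scoring papers mainly point to two directions: making agents call tools and reuse skills more reliably, and identifying and mitigating model safety, jailbreak, and alignment risks. The entries below are organized by core problem, method thread, main claims, and keywords.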

Featured Papers: Titles, Highlights, and Verification Leads

Making agents call tools and reuse skills more reliably

Cosmos 3: Omnimodal World Models for Physical AI (Aditi, Niket Agarwal, Arslan Ali, Jon Allen, Martin Antolini, Adeline Aubame, et al.) 2606.02800

Key thread: We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. Check the original paper to confirm code and data availability.

Making agents call tools and reuse skills more reliably

Thinking Past the Answer: Evaluating Harmful Overthinking in Large Reasoning Models (Simone Caldarella, Davide Talon, Rahaf Aljundi, Elisa Ricci, Massimiliano Mancini) 2606.02835

Key thread: Large Reasoning Models (LRMs) improve performance by generating explicit intermediate reasoning traces through increased test-time compute, yet the assumption that longer reasoning is consistently beneficial remains under-examined. Check the original paper to confirm code and data availability.

Identifying and mitigating model safety, jailbreak, and alignment risks

Breaking the Information Silo: Semantic Personas for Cross-Domain Recommendation (Jonathan Mayo, Moshe Unger, Konstantin Bauman) 2606.01783

Key thread: Digital platforms increasingly operate as isolated information silos, limiting their ability to construct comprehensive user representations across domains. Check the original paper to confirm code and data availability.

Making agents call tools and reuse skills more reliably

KForge: LLM-Driven Cross-Platform Kernel Generation for AI Accelerators (Taras Sereda, Burak Bartan, Ankita Nayak, Tom St. John, Natalie Serrino, Zain Asgar) 2606.02963

Key thread: Production inference increasingly targets a heterogeneous mix of accelerators. Check the original paper to confirm code and data availability.

Improving code generation, execution feedback, and automated repair

EntangleCodec: A Unified Discrete Audio Tokenizer via Semantic-Acoustic Entanglement (Hui Li, Yangfan Gao, Junlin Shang, Changhao Jiang, Tao Gui, Qi Zhang, et al.) 2606.02739

Key thread: Audio tokenizers serve as the discrete interface between continuous audio and Audio Language Models (ALMs), but existing tokenizers often struggle to support both understanding and generation. Check the original paper to confirm code and data availability.

Also Worth Noting

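Large AI Models in Dental Healthcare: From General-Purpose Systems to Domain-Specific Foundation Models: focuses on task setup, metrics, and failure cases; useful for supplementing model evaluation and regression testing.

GloResNet: A lightweight 3D CNN with global topological features for preterm brain injury prediction: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

What Benchmarks Don't Measure: The Case for Evaluating Abstention Competence in Autonomous Agents: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

ATLAS: A Large-Scale Evaluation Benchmark for Adversarial LiDAR Perception: focuses on retrieval, knowledge-base question answering, and evidence reliability; worth following for RAG evaluation and enterprise knowledge systems.

Tiny Collaborative Inference for Occlusion-Robust Object Detection: focuses on retrieval, knowledge-base question answering, and evidence reliability; worth following for RAG evaluation and enterprise knowledge systems.

Do Transformers Need Three Projections? Systematic Study of QKV Variants: focuses on inference cost, latency, throughput, and deployment constraints; worth following for system optimization.

Pathway-Structured Privileged Distillation for Deployable Computational Pathology: focuses on retrieval, knowledge-base question answering, and evidence reliability; worth following for RAG evaluation and enterprise knowledge systems.

RRISE: Robust Radius Inference via a Surrogate Estimator: focuses on task setup, metrics, and failure cases; useful for supplementing model evaluation and regression testing.

Toward a Modular Architecture for Embedded AI Agent Systems at the Edge: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

Which Defense Closes Which Threat? Attributing OWASP-LLM-Top-10 Coverage and Its Brittleness Under Paraphrasing: focuses on retrieval, knowledge-base question answering, and evidence reliability; worth following for RAG evaluation and enterprise knowledge systems.

Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.

Acceptance-Test-Driven Evaluation Protocols for Business-Centric LLM Systems: focuses on tool calling, execution feedback, and reusable capabilities; worth following for agent workflows and engineering reliability.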

Scope and Limitations

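Automatic ranking favors papers with community signals, code signals, and engineering keywords.
By default, the digest is based on titles, abstracts, and public metadata; it is not a substitute for a close reading of the full text.
When an external API is rate-limited or unavailable, the affected signals degrade to empty, and a note is kept in the internal records.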
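
The degradation behavior described above can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the names SignalResult, fetch_signal, and score, the signal names, and the weights are hypothetical, and the digest's actual ranking pipeline is not described in this issue.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SignalResult:
    value: Optional[float]  # None means the signal degraded to empty
    note: str = ""          # internal note explaining why it is empty


def fetch_signal(name: str, fetch: Callable[[], float]) -> SignalResult:
    """Call a (hypothetical) external source; degrade to empty on failure."""
    try:
        return SignalResult(value=fetch())
    except Exception as exc:  # e.g. rate limiting or an unavailable API
        return SignalResult(value=None, note=f"{name} unavailable: {exc}")


def score(signals: dict[str, SignalResult], weights: dict[str, float]) -> float:
    """Weighted sum over present signals; empty signals contribute nothing."""
    return sum(
        weights.get(name, 0.0) * result.value
        for name, result in signals.items()
        if result.value is not None
    )


def rate_limited() -> float:
    # Stand-in for an external API call that is being rate-limited.
    raise RuntimeError("HTTP 429")


signals = {
    "community": fetch_signal("community", lambda: 0.8),
    "code": fetch_signal("code", rate_limited),
}
weights = {"community": 0.5, "code": 0.5}  # illustrative weights only
total = score(signals, weights)
notes = [result.note for result in signals.values() if result.note]
```

Under these assumptions, a paper whose code signal could not be fetched is scored on the remaining signals alone, so its rank may be lower than it would be with complete data; the retained note is what lets a reader of the internal records tell a missing signal apart from a genuinely weak one.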