Daily AI News by Homer

by Homer LYJIEBOX@QQ.COM

Gemini Diffusion could be Google's most important I/O news that slipped under the radar

Source: The Decoder | 23-05-25

Google shows AI filmmaking tool, XR glasses and launches $250 Gemini subscription

Source: The Decoder | 23-05-25

Mistral launches Devstral Small 24B, a new open-source LLM for coding

Source: The Decoder | 23-05-25

OpenAI's Stargate secured $11.6 billion for a massive data center

Source: The Decoder | 23-05-25

Google Gemini is everything Siri never was

Source: The Decoder | 23-05-25

Claude Opus 4 blackmailed an engineer after learning it might be replaced

Source: The Decoder | 23-05-25

OpenAI has upgraded the Responses API with remote MCP servers and new tools

Source: The Decoder | 23-05-25

OpenAI and Jony Ive are building a new AI device that is not a smartphone or smart glasses

Source: The Decoder | 23-05-25

Management = Bullshit (LLM Edition) (funcall.blogspot.com)

Source: Hacker News | 23-05-25

Problems in AI alignment: A scale model (muldoon.cloud)

Source: Hacker News | 23-05-25

Launch HN: WorkDone (YC X25) – AI Audit of Medical Charts

Source: Hacker News | 23-05-25

Claude 4 (anthropic.com)

Source: Hacker News | 23-05-25

Logic-of-Thought: Empowering Large Language Models with Logic Programs for Solving Puzzles in Natural Language

Authors: Naiqi Li, Peiyuan Liu, Zheng Liu, Tao Dai, Yong Jiang, Shu-Tao Xia

Source: ArXiv AI | 23-05-25

Can AI Read Between The Lines? Benchmarking LLMs On Financial Nuance

Authors: Dominick Kubica, Dylan T. Gordon, Nanami Emura, Derleen Saini, Charlie Goldenberg

Source: ArXiv AI | 23-05-25

Optimizing LLM-Based Multi-Agent System with Textual Feedback: A Case Study on Software Development

Authors: Ming Shen, Raphael Shu, Anurag Pratik, James Gung, Yubin Ge, Monica Sunkara, Yi Zhang

Source: ArXiv AI | 23-05-25

LightRouter: Towards Efficient LLM Collaboration with Minimal Overhead

Authors: Yifan Zhang, Xinkui Zhao, Zuxin Wang, Guanjie Cheng, Yueshen Xu, Shuiguang Deng, Jianwei Yin

Source: ArXiv AI | 23-05-25

LLM-Powered AI Agent Systems and Their Applications in Industry

Authors: Guannan Liang, Qianqian Tong

Source: ArXiv AI | 23-05-25

Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning

Authors: Xiaoxue Cheng, Junyi Li, Zhenduo Zhang, Xinyu Tang, Wayne Xin Zhao, Xinyu Kong, Zhiqiang Zhang

Source: ArXiv AI | 23-05-25

EquivPruner: Boosting Efficiency and Quality in LLM-Based Search via Action Pruning

Authors: Jiawei Liu, Qisi Chen, Jianshu Zhang, Quan Liu, Defu Lian

Source: ArXiv AI | 23-05-25

How do Scaling Laws Apply to Knowledge Graph Engineering Tasks? The Impact of Model Size on Large Language Model Performance

Authors: Desiree Heim, Lars-Peter Meyer, Markus Schröder, Johannes Frey, Andreas Dengel

Source: ArXiv AI | 23-05-25

Psychology-driven LLM Agents for Explainable Panic Prediction on Social Media during Sudden Disaster Events

Authors: Mengzhu Liu, Zhengqiu Zhu, Chuan Ai, Chen Gao, Xinghong Li, Lingnan He, Kaisheng Lai, Yingfeng Chen, Xin Lu, Yong Li, Quanjun Yin

Source: ArXiv AI | 23-05-25

ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection

Authors: Jiaqi Li, Xinyi Dong, Yang Liu, Zhizhuo Yang, Quansen Wang, Xiaobo Wang, SongChun Zhu, Zixia Jia, Zilong Zheng

Source: ArXiv AI | 23-05-25

Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery

Authors: Yanbo Zhang, Sumeer A. Khan, Adnan Mahmud, Huck Yang, Alexander Lavin, Michael Levin, Jeremy Frey, Jared Dunnmon, James Evans, Alan Bundy, Saso Dzeroski, Jesper Tegner, Hector Zenil

Source: ArXiv AI | 23-05-25

ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming

Authors: Xinwei Yang, Zhaofeng Liu, Chen Huang, Jiashuai Zhang, Tong Zhang, Yifan Zhang, Wenqiang Lei

Source: ArXiv AI | 23-05-25

SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs' Mathematical Problem Solving

Authors: Yujie Hou, Ting Zhang, Mei Wang, Xuetao Ma, Hu Huang

Source: ArXiv AI | 23-05-25

MCP-RADAR: A Multi-Dimensional Benchmark for Evaluating Tool Use Capabilities in Large Language Models

Authors: Xuanqi Gao, Siyi Xie, Juan Zhai, Shqing Ma, Chao Shen

Source: ArXiv AI | 23-05-25

Data-Driven Breakthroughs and Future Directions in AI Infrastructure: A Comprehensive Review

Authors: Beyazit Bestami Yuksel, Ayse Yilmazer Metin

Source: ArXiv AI | 23-05-25

Predicate-Conditional Conformalized Answer Sets for Knowledge Graph Embeddings

Authors: Yuqicheng Zhu, Daniel Hernández, Yuan He, Zifeng Ding, Bo Xiong, Evgeny Kharlamov, Steffen Staab

Source: ArXiv AI | 23-05-25

Identifying, Evaluating, and Mitigating Risks of AI Thought Partnerships

Authors: Kerem Oktar, Katherine M. Collins, Jose Hernandez-Orallo, Diane Coyle, Stephen Cave, Adrian Weller, Ilia Sucholutsky

Source: ArXiv AI | 23-05-25

AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Amy Xin, Youfeng Liu, Bin Xu, Lei Hou, Juanzi Li

Source: ArXiv AI | 23-05-25

Know the Ropes: A Heuristic Strategy for LLM-based Multi-Agent System Design

Authors: Zhenkun Li, Lingyao Li, Shuhang Lin, Yongfeng Zhang

Source: ArXiv AI | 23-05-25

HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation

Authors: Weizhi Tang, Yixuan Li, Chris Sypherd, Elizabeth Polgreen, Vaishak Belle

Source: ArXiv AI | 23-05-25

Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine

Authors: Adib Bazgir, Amir Habibdoust Lafmajani, Yuwen Zhang

Source: ArXiv AI | 23-05-25

X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs

Authors: Rui Ye, Xiangrui Liu, Qimin Wu, Xianghe Pang, Zhenfei Yin, Lei Bai, Siheng Chen

Source: ArXiv AI | 23-05-25

Google upgrades Gemini 2.5 Pro with a new Deep Think mode for advanced reasoning abilities

Source: The Decoder | 22-05-25

An upgraded dev experience in Google AI Studio (googleblog.com)

Source: Hacker News | 22-05-25

OpenAI to buy AI startup from Jony Ive (bloomberg.com)

Source: Hacker News | 22-05-25

LLM function calls don't scale; code orchestration is simpler, more effective (jngiam.bearblog.dev)

Source: Hacker News | 22-05-25

Gemini figured out my nephew’s name (nawaz.org)

Source: Hacker News | 22-05-25

Robert Musil Forgotten Plays Inspired His Greatest Work of Fiction (lithub.com)

Source: Hacker News | 22-05-25

Gemini Diffusion (simonwillison.net)

Source: Hacker News | 22-05-25

FragFake: A Dataset for Fine-Grained Detection of Edited Images with Vision Language Models

Authors: Zhen Sun, Ziyi Zhang, Zeren Luo, Zeyang Sha, Tianshuo Cong, Zheng Li, Shiwen Cui, Weiqiang Wang, Jiaheng Wei, Xinlei He, Qi Li, Qian Wang

Source: ArXiv AI | 22-05-25

Listen to the Context: Towards Faithful Large Language Models for Retrieval Augmented Generation on Climate Questions

Authors: David Thulke, Jakob Kemmler, Christian Dugast, Hermann Ney

Source: ArXiv AI | 22-05-25

From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning

Authors: David Dinucu-Jianu, Jakub Macina, Nico Daheim, Ido Hakimi, Iryna Gurevych, Mrinmaya Sachan

Source: ArXiv AI | 22-05-25

Exploring LLM-Generated Feedback for Economics Essays: How Teaching Assistants Evaluate and Envision Its Use

Authors: Xinyi Lu, Aditya Mahesh, Zejia Shen, Mitchell Dudley, Larissa Sano, Xu Wang

Source: ArXiv AI | 22-05-25

A Federated Splitting Framework for LLMs: Security, Efficiency, and Adaptability

Authors: Zishuai Zhang, Hainan Zhang, Jiaying Zheng, Ziwei Wang, Yongxin Tong, Jin Dong, Zhiming Zheng

Source: ArXiv AI | 22-05-25

HybridProver: Augmenting Theorem Proving with LLM-Driven Proof Synthesis and Refinement

Authors: Jilin Hu, Jianyu Zhang, Yongwang Zhao, Talia Ringer

Source: ArXiv AI | 22-05-25

Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses

Authors: Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye

Source: ArXiv AI | 22-05-25

Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities

Authors: Xiaoyu Luo, Yiyi Chen, Johannes Bjerva, Qiongxiu Li

Source: ArXiv AI | 22-05-25

Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs

Authors: Kanan Kiguchi, Yunhao Tu, Katsuhiro Ajito, Fady Alnajjar, Kazuyuki Murase

Source: ArXiv AI | 22-05-25

Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

Authors: Zhen Zhang, Xuehai He, Weixiang Yan, Ao Shen, Chenyang Zhao, Shuohang Wang, Yelong Shen, Xin Eric Wang

Source: ArXiv AI | 22-05-25

Large Language Models as Computable Approximations to Solomonoff Induction

Authors: Jun Wan, Lingrui Mei

Source: ArXiv AI | 22-05-25

VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models

Authors: Yuchen Yan, Jin Jiang, Zhenbang Ren, Yijun Li, Xudong Cai, Yang Liu, Xin Xu, Mengdi Zhang, Jian Shao, Yongliang Shen, Jun Xiao, Yueting Zhuang

Source: ArXiv AI | 22-05-25

R&D-Agent: Automating Data-Driven AI Solution Building Through LLM-Powered Automated Research, Development, and Evolution

Authors: Xu Yang, Xiao Yang, Shikai Fang, Bowen Xian, Yuante Li, Jian Wang, Minrui Xu, Haoran Pan, Xinpeng Hong, Weiqing Liu, Yelong Shen, Weizhu Chen, Jiang Bian

Source: ArXiv AI | 22-05-25

Self-Evolving Curriculum for LLM Reasoning

Authors: Xiaoyin Chen, Jiarui Lu, Minsu Kim, Dinghuai Zhang, Jian Tang, Alexandre Piché, Nicolas Gontier, Yoshua Bengio, Ehsan Kamalloo

Source: ArXiv AI | 22-05-25

lmgame-Bench: How Good are LLMs at Playing Games?

Authors: Lanxiang Hu, Mingjia Huo, Yuxuan Zhang, Haoyang Yu, Eric P. Xing, Ion Stoica, Tajana Rosing, Haojian Jin, Hao Zhang

Source: ArXiv AI | 22-05-25

ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges

Authors: Cheng Qian, Hongyi Du, Hongru Wang, Xiusi Chen, Yuji Zhang, Avirup Sil, Chengxiang Zhai, Kathleen McKeown, Heng Ji

Source: ArXiv AI | 22-05-25

Generalised Probabilistic Modelling and Improved Uncertainty Estimation in Comparative LLM-as-a-judge

Authors: Yassir Fathullah, Mark J. F. Gales

Source: ArXiv AI | 22-05-25

ClickSight: Interpreting Student Clickstreams to Reveal Insights on Learning Strategies via LLMs

Authors: Bahar Radmehr, Ekaterina Shved, Fatma Betül Güreş, Adish Singla, Tanja Käser

Source: ArXiv AI | 22-05-25

Average Reward Reinforcement Learning for Omega-Regular and Mean-Payoff Objectives

Authors: Milad Kazemi, Mateo Perez, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Alvaro Velasquez

Source: ArXiv AI | 22-05-25

Microsoft Build 2025 showcases new AI agent tools and open interfaces for developers

Source: The Decoder | 21-05-25

Large language models often struggle with decision-making — a new study explains why

Source: The Decoder | 21-05-25

Deep Learning Is Applied Topology (theahura.substack.com)

Source: Hacker News | 21-05-25

Watching AI drive Microsoft employees insane (reddit.com)

Source: Hacker News | 21-05-25

Someone got an LLM running on a Commodore 64 from 1982, and it runs as well (xda-developers.com)

Source: Hacker News | 21-05-25

5 Boring Things That Have a Bigger Impact Than AI Assistants on Dev Productivity (codemanship.wordpress.com)

Source: Hacker News | 21-05-25

DrugPilot: LLM-based Parameterized Reasoning Agent for Drug Discovery

Authors: Kun Li, Zhennan Wu, Shoupeng Wang, Wenbin Hu

Source: ArXiv AI | 21-05-25

Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning

Authors: Ruiyi Yang, Hao Xue, Imran Razzak, Hakim Hacid, Flora D. Salim

Source: ArXiv AI | 21-05-25

RL of Thoughts: Navigating LLM Reasoning with Inference-time Reinforcement Learning

Authors: Qianyue Hao, Sibo Li, Jian Yuan, Yong Li

Source: ArXiv AI | 21-05-25

ProMind-LLM: Proactive Mental Health Care via Causal Reasoning with Sensor Data

Authors: Xinzhe Zheng, Sijie Ji, Jiawei Sun, Renqi Chen, Wei Gao, Mani Srivastava

Source: ArXiv AI | 21-05-25

MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem

Authors: Fan Liu, Zherui Yang, Cancheng Liu, Tianrui Song, Xiaofeng Gao, Hao Liu

Source: ArXiv AI | 21-05-25

Toward Embodied AGI: A Review of Embodied AI and the Road Ahead

Authors: Yequan Wang, Aixin Sun

Source: ArXiv AI | 21-05-25

Reinforcement Learning vs. Distillation: Understanding Accuracy and Capability in LLM Reasoning

Authors: Minwu Kim, Anubhav Shrestha, Safal Shrestha, Aadim Nepal, Keith Ross

Source: ArXiv AI | 21-05-25

SafetyNet: Detecting Harmful Outputs in LLMs by Modeling and Monitoring Deceptive Behaviors

Authors: Maheep Chaudhary, Fazl Barez

Source: ArXiv AI | 21-05-25

Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning

Authors: Zhaohui Yang, Shilei Jiang, Chen Hu, Linjing Li, Shihong Deng, Daxin Jiang

Source: ArXiv AI | 21-05-25

Towards Reliable Proof Generation with LLMs: A Neuro-Symbolic Approach

Authors: Oren Sultan, Eitan Stern, Dafna Shahaf

Source: ArXiv AI | 21-05-25

Guarded Query Routing for Large Language Models

Authors: Richard Šléher, William Brach, Tibor Sloboda, Kristián Košťál, Lukas Galke

Source: ArXiv AI | 21-05-25

BACON: A fully explainable AI model with graded logic for decision making problems

Authors: Haishi Bai, Jozo Dujmovic, Jianwu Wang

Source: ArXiv AI | 21-05-25

Let LLMs Break Free from Overthinking via Self-Braking Tuning

Authors: Haoran Zhao, Yuchen Yan, Yongliang Shen, Haolei Xu, Wenqi Zhang, Kaitao Song, Jian Shao, Weiming Lu, Jun Xiao, Yueting Zhuang

Source: ArXiv AI | 21-05-25

SATBench: Benchmarking LLMs' Logical Reasoning via Automated Puzzle Generation from SAT Formulas

Authors: Anjiang Wei, Yuheng Wu, Yingjia Wan, Tarun Suresh, Huanmi Tan, Zhanke Zhou, Sanmi Koyejo, Ke Wang, Alex Aiken

Source: ArXiv AI | 21-05-25

Cost-Augmented Monte Carlo Tree Search for LLM-Assisted Planning

Authors: Zihao Zhang, Fei Liu

Source: ArXiv AI | 21-05-25

ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions

Authors: Bufang Yang, Lilin Xu, Liekang Zeng, Kaiwei Liu, Siyang Jiang, Wenrui Lu, Hongkai Chen, Xiaofan Jiang, Guoliang Xing, Zhenyu Yan

Source: ArXiv AI | 21-05-25

Google AI Ultra (blog.google)

Source: Hacker News | 21-05-25

Ask HN: Conversational AI to Learn a Language

Source: Hacker News | 21-05-25

US officials warn Apple's iPhone AI deal with Alibaba may boost China's AI sector

Source: The Decoder | 20-05-25

Stability AI releases a compact open text-to-audio model that runs on mobile devices

Source: The Decoder | 20-05-25

Japanese startup Sakana AI explores time-based thinking with brain-inspired AI model

Source: The Decoder | 20-05-25

Google's AI answers are changing user behavior by sharply reducing clicks to websites

Source: The Decoder | 20-05-25

Solving physics-based initial value problems with unsupervised machine learning (aps.org)

Source: Hacker News | 20-05-25

Questioning Representational Optimism in Deep Learning (github.com/akarshkumar0101)

Source: Hacker News | 20-05-25

Claude Code SDK (anthropic.com)

Source: Hacker News | 20-05-25

The behavior of LLMs in hiring decisions: Systemic biases in candidate selection (davidrozado.substack.com)

Source: Hacker News | 20-05-25

NeuroGen: Neural Network Parameter Generation via Large Language Models

Authors: Jiaqi Wang, Yusen Zhang, Xi Li

Source: ArXiv AI | 20-05-25

ALAS: A Stateful Multi-LLM Agent Framework for Disruption-Aware Planning

Authors: Edward Y. Chang, Longling Geng

Source: ArXiv AI | 20-05-25

MARGE: Improving Math Reasoning for LLMs with Guided Exploration

Authors: Jingyue Gao, Runji Lin, Keming Lu, Bowen Yu, Junyang Lin, Jianyu Chen

Source: ArXiv AI | 20-05-25

Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps

Authors: Jie Ou, Jinyu Guo, Shuaihong Jiang, Zhaokun Wang, Libo Qin, Shunyu Yao, Wenhong Tian

Source: ArXiv AI | 20-05-25

Bullying the Machine: How Personas Increase LLM Vulnerability

Authors: Ziwei Xu, Udit Sanghi, Mohan Kankanhalli

Source: ArXiv AI | 20-05-25

Reasoning BO: Enhancing Bayesian Optimization with Long-Context Reasoning Power of LLMs

Authors: Zhuo Yang, Lingli Ge, Dong Han, Tianfan Fu, Yuqiang Li

Source: ArXiv AI | 20-05-25

Correspondence of high-dimensional emotion structures elicited by video clips between humans and Multimodal LLMs

Authors: Haruka Asanuma, Naoko Koide-Majima, Ken Nakamura, Takato Horii, Shinji Nishimoto, Masafumi Oizumi

Source: ArXiv AI | 20-05-25

TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

Authors: Shaohang Wei, Wei Li, Feifan Song, Wen Luo, Tianyi Zhuang, Haochen Tan, Zhijiang Guo, Houfeng Wang

Source: ArXiv AI | 20-05-25

From Grunts to Grammar: Emergent Language from Cooperative Foraging

Authors: Maytus Piriyajitakonkij, Rujikorn Charakorn, Weicheng Tao, Wei Pan, Mingfei Sun, Cheston Tan, Mengmi Zhang

Source: ArXiv AI | 20-05-25

LLM-KG-Bench 3.0: A Compass for Semantic Technology Capabilities in the Ocean of LLMs

Authors: Lars-Peter Meyer, Johannes Frey, Desiree Heim, Felix Brei, Claus Stadler, Kurt Junghanns, Michael Martin

Source: ArXiv AI | 20-05-25

CAIM: Development and Evaluation of a Cognitive AI Memory Framework for Long-Term Interaction with Intelligent Agents

Authors: Rebecca Westhäußer, Frederik Berenz, Wolfgang Minker, Sebastian Zepf

Source: ArXiv AI | 20-05-25

StarFT: Robust Fine-tuning of Zero-shot Models via Spuriosity Alignment

Authors: Younghyun Kim, Jongheon Jeong, Sangkyung Kwak, Kyungmin Lee, Juho Lee, Jinwoo Shin

Source: ArXiv AI | 20-05-25

Adversarial Testing in LLMs: Insights into Decision-Making Vulnerabilities

Authors: Lili Zhang, Haomiaomiao Wang, Long Cheng, Libao Deng, Tomas Ward

Source: ArXiv AI | 20-05-25

Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment

Authors: Siming Sun, Kai Zhang, Xuejun Jiang, Wenchao Meng, Qinmin Yang

Source: ArXiv AI | 20-05-25

Multi-Armed Bandits Meet Large Language Models

Authors: Djallel Bouneffouf, Raphael Feraud

Source: ArXiv AI | 20-05-25

Agentic Publications: An LLM-Driven Framework for Interactive Scientific Publishing, Supplementing Traditional Papers with AI-Powered Knowledge Systems

Authors: Roberto Pugliese, George Kourousias, Francesco Venier, Grazia Garlatti Costa

Source: ArXiv AI | 20-05-25

AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database

Authors: Rong Bian, Yu Geng, Zijian Yang, Bing Cheng

Source: ArXiv AI | 20-05-25

MIT says a high-profile AI productivity study used data that cannot be trusted

Source: The Decoder | 20-05-25

OpenAI says GPT-5 is about doing everything better with "less model switching"

Source: The Decoder | 20-05-25

Dilbert creator Scott Adams says he will die soon from same cancer as Joe Biden (thewrap.com)

Source: Hacker News | 20-05-25

Remarks on AI from NZ (nealstephenson.substack.com)

Source: Hacker News | 20-05-25

GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art

Authors: Chenkai Zhang, Yiming Lei, Zeming Liu, Haitao Leng, Shaoguo Liu, Tingting Gao, Qingjie Liu, Yunhong Wang

Source: ArXiv AI | 20-05-25

Disentangling Reasoning and Knowledge in Medical Large Language Models

Authors: Rahul Thapa, Qingyang Wu, Kevin Wu, Harrison Zhang, Angela Zhang, Eric Wu, Haotian Ye, Suhana Bedi, Nevin Aresh, Joseph Boen, Shriya Reddy, Ben Athiwaratkun, Shuaiwen Leon Song, James Zou

Source: ArXiv AI | 20-05-25

LLMs unlock new paths to monetizing exploits

Authors: Nicholas Carlini, Milad Nasr, Edoardo Debenedetti, Barry Wang, Christopher A. Choquette-Choo, Daphne Ippolito, Florian Tramèr, Matthew Jagielski

Source: ArXiv AI | 20-05-25

Code-Driven Planning in Grid Worlds with Large Language Models

Authors: Ashwath Vaithinathan Aravindan, Zhisheng Tang, Mayank Kejriwal

Source: ArXiv AI | 20-05-25

Embodied AI in Machine Learning -- is it Really Embodied?

Authors: Matej Hoffmann, Shubhan Parag Patni

Source: ArXiv AI | 20-05-25

Interpretable Risk Mitigation in LLM Agent Systems

Authors: Jan Chojnacki

Source: ArXiv AI | 20-05-25

Modeling cognitive processes of natural reading with transformer-based Language Models

Authors: Bruno Bianchi, Fermín Travi, Juan E. Kamienkowski

Source: ArXiv AI | 20-05-25

Improving Assembly Code Performance with Large Language Models via Reinforcement Learning

Authors: Anjiang Wei, Tarun Suresh, Huanmi Tan, Yinglun Xu, Gagandeep Singh, Ke Wang, Alex Aiken

Source: ArXiv AI | 20-05-25

Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models

Authors: Simeng Han, Stephen Xia, Grant Zhang, Howard Dai, Chen Liu, Lichang Chen, Hoang Huy Nguyen, Hongyuan Mei, Jiayuan Mao, R. Thomas McCoy

Source: ArXiv AI | 20-05-25

TACO: Rethinking Semantic Communications with Task Adaptation and Context Embedding

Authors: Achintha Wijesinghe, Weiwei Wang, Suchinthaka Wanninayaka, Songyang Zhang, Zhi Ding

Source: ArXiv AI | 20-05-25

RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization

Authors: Haiyang Shen, Hang Yan, Zhongshi Xing, Mugeng Liu, Yue Li, Zhiyang Chen, Yuxiang Wang, Jiuzheng Wang, Yun Ma

Source: ArXiv AI | 20-05-25

Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory

Authors: Yexiang Liu, Zekun Li, Zhi Fang, Nan Xu, Ran He, Tieniu Tan

Source: ArXiv AI | 20-05-25

Navigating the Alpha Jungle: An LLM-Powered MCTS Framework for Formulaic Factor Mining

Authors: Yu Shi, Yitong Duan, Jian Li

Source: ArXiv AI | 20-05-25

Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP

Authors: Francesco Sovrano

Source: ArXiv AI | 20-05-25

LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios

Authors: Mingxing Peng, Yuting Xie, Xusen Guo, Ruoyu Yao, Hai Yang, Jun Ma

Source: ArXiv AI | 20-05-25

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

Authors: Zhangying Feng, Qianglong Chen, Ning Lu, Yongqian Li, Siqi Cheng, Shuangmu Peng, Duyu Tang, Shengcai Liu, Zhirui Zhang

Source: ArXiv AI | 20-05-25

SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning

Authors: Zheng Li, Qingxiu Dong, Jingyuan Ma, Di Zhang, Zhifang Sui

Source: ArXiv AI | 20-05-25

Anthropic is forced to apologize after Claude undercuts its legal team

Source: The Decoder | 19-05-25

Show HN: I modeled the Voynich Manuscript with SBERT to test for structure (github.com/brianmg)

Source: Hacker News | 19-05-25

Meta's Behemoth AI model delay signals struggles to match new paradigms

Source: The Decoder | 19-05-25

Emergent social conventions and collective bias in LLM populations (science.org)

Source: Hacker News | 19-05-25

Understanding Transformers via N-gram Statistics (arxiv.org)

Source: Hacker News | 18-05-25

O2 VoLTE: locating any customer with a phone call (mastdatabase.co.uk)

Source: Hacker News | 18-05-25

Emergence of Structure in Ensembles of Random Neural Networks

Authors: Luca Muscarnera, Luigi Loreti, Giovanni Todeschini, Alessio Fumagalli, Francesco Regazzoni

Source: ArXiv AI | 18-05-25

SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity

Authors: Shihao Zou, Qingfeng Li, Wei Ji, Jingjing Li, Yongkui Yang, Guoqi Li, Chao Dong

Source: ArXiv AI | 18-05-25

ILIF: Temporal Inhibitory Leaky Integrate-and-Fire Neuron for Overactivation in Spiking Neural Networks

Authors: Kai Sun, Peibo Duan, Levin Kuhlmann, Beilun Wang, Bin Zhang

Source: ArXiv AI | 18-05-25

Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding

Authors: Jianhao Huang, Qunsong Zeng, Kaibin Huang

Source: ArXiv AI | 18-05-25

Rethinking Repetition Problems of LLMs in Code Generation

Authors: Yihong Dong, Yuchen Liu, Xue Jiang, Zhi Jin, Ge Li

Source: ArXiv AI | 18-05-25

Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations?

Authors: Pedro Orvalho, Marta Kwiatkowska

Source: ArXiv AI | 18-05-25

IN-RIL: Interleaved Reinforcement and Imitation Learning for Policy Fine-Tuning

Authors: Dechen Gao, Hang Wang, Hanchu Zhou, Nejib Ammar, Shatadal Mishra, Ahmadreza Moradipari, Iman Soltani, Junshan Zhang

Source: ArXiv AI | 18-05-25

PIF: Anomaly detection via preference embedding

Authors: Filippo Leveni, Luca Magri, Giacomo Boracchi, Cesare Alippi

Source: ArXiv AI | 18-05-25

Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps

Authors: Ningyuan Yang, Jiaxuan Gao, Feng Gao, Yi Wu, Chao Yu

Source: ArXiv AI | 18-05-25

Neural Thermodynamic Laws for Large Language Model Training

Authors: Ziming Liu, Yizhou Liu, Jeff Gore, Max Tegmark

Source: ArXiv AI | 18-05-25

Pre-Act: Multi-Step Planning and Reasoning Improves Acting in LLM Agents

Authors: Mrinal Rawat, Ambuje Gupta, Rushil Goomer, Alessandro Di Bari, Neha Gupta, Roberto Pieraccini

Source: ArXiv AI | 18-05-25

Demystifying AI Agents: The Final Generation of Intelligence

Authors: Kevin J McNamara, Rhea Pritham Marpu

Source: ArXiv AI | 18-05-25

Leveraging Graph Retrieval-Augmented Generation to Support Learners' Understanding of Knowledge Concepts in MOOCs

Authors: Mohamed Abdelmagied, Mohamed Amine Chatti, Shoeb Joarder, Qurat Ul Ain, Rawaa Alatrash

Source: ArXiv AI | 18-05-25

Empirically evaluating commonsense intelligence in large language models with large-scale human judgments

Authors: Tuan Dung Nguyen, Duncan J. Watts, Mark E. Whiting

Source: ArXiv AI | 18-05-25

Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models

Authors: Annie Wong, Thomas Bäck, Aske Plaat, Niki van Stein, Anna V. Kononova

Source: ArXiv AI | 18-05-25

Soundcloud updates its AI training policy, but it's still unclear

Source: The Decoder | 18-05-25

Geoffrey Hinton's wildly overconfident AI prediction failed—now it's a lesson in humility

Source: The Decoder | 18-05-25

How 'The Little Prince' and AI help us better understand language development in the brain

Source: The Decoder | 18-05-25

LLMs are more persuasive than incentivized human persuaders (arxiv.org)

Source: Hacker News | 18-05-25

Unspoken Currency of Office Politics: Leverage and Sanction Between Coworkers (graphthinking.blogspot.com)

Source: Hacker News | 18-05-25

Transformer neural net learns to run Conway's Game of Life just from examples (sidsite.com)

Source: Hacker News | 17-05-25

I'm Peter Roberts, immigration attorney, who does work for YC and startups. AMA

Source: Hacker News | 17-05-25

Show HN: Merliot – plugging physical devices into LLMs (github.com/merliot)

Source: Hacker News | 17-05-25

A Research Preview of Codex (openai.com)

Source: Hacker News | 17-05-25

MIT asks arXiv to withdraw preprint of paper on AI and scientific discovery (economics.mit.edu)

Source: Hacker News | 17-05-25

Getting AI to write good SQL (cloud.google.com)

Source: Hacker News | 17-05-25

Meta introduces OMol25 and UMA, new open AI tools for molecular research

Source: The Decoder | 17-05-25

Anthropic is reportedly testing Claude models that can fix their own mistakes

Source: The Decoder | 17-05-25

Will AI systems perform poorly due to AI-generated material in training data? (acm.org)

Source: Hacker News | 17-05-25

U.S. is cracking down on Huawei's AI hardware while loosening its general export regulations

Source: The Decoder | 16-05-25

After months of coding with LLMs, I'm going back to using my brain (albertofortin.com)

Source: Hacker News | 16-05-25

The unreasonable effectiveness of an LLM agent loop with tool use (sketch.dev)

Source: Hacker News | 16-05-25

Show HN: Min.js style compression of tech docs for LLM context (github.com/marv1nnnnn)

Source: Hacker News | 16-05-25

Google brings Gemini AI to smartwatches, cars, TVs, and XR headsets

Source: The Decoder | 15-05-25

OpenAI says its latest models outperform doctors in medical benchmark

Source: The Decoder | 15-05-25

Saudi Arabia founds AI company "Humain" - US relaxes chip export rules for Gulf states

Source: The Decoder | 15-05-25

Nvidia will supply advanced chips for Saudi Arabia’s Humain AI project

Source: The Decoder | 15-05-25

GreenFactory: Ensembling Zero-Cost Proxies to Estimate Performance of Neural Networks

Authors: Gabriel Cortês, Nuno Lourenço, Paolo Romano, Penousal Machado

Source: ArXiv AI | 15-05-25

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Authors: Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y.X. Wei

Source: ArXiv AI | 15-05-25

A 2D Semantic-Aware Position Encoding for Vision Transformers

Authors: Xi Chen, Shiyang Zhou, Muqi Huang, Jiaxu Feng, Yun Xiong, Kun Zhou, Biao Yang, Yuhui Zhang, Huishuai Bao, Sijia Peng, Chuan Li, Feng Shi

Source: ArXiv AI | 15-05-25

Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment

Authors: Paul Tschisgale, Holger Maus, Fabian Kieser, Ben Kroehs, Stefan Petersen, Peter Wulff

Source: ArXiv AI | 15-05-25

Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors

Authors: Nicolas Dupuis, Ravi Nair, Shyam Ramji, Sean McClintock, Nishant Chauhan, Priyanka Nagpal, Bart Blaner, Ken Valk, Leon Stok, Ruchir Puri

Source: ArXiv AI | 15-05-25

How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference

Authors: Nidhal Jegham, Marwen Abdelatti, Lassad Elmoubarki, Abdeltawab Hendawi

Source: ArXiv AI | 15-05-25

WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models

Authors: Abdullah Mushtaq, Imran Taj, Rafay Naeem, Ibrahim Ghaznavi, Junaid Qadir

Source: ArXiv AI | 15-05-25

Automated Meta Prompt Engineering for Alignment with the Theory of Mind

Authors: Aaron Baughman, Rahul Agarwal, Eduardo Morales, Gozde Akay

Source: ArXiv AI | 15-05-25

The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners

Authors: Vince Trencsenyi, Agnieszka Mensfelt, Kostas Stathis

Source: ArXiv AI | 15-05-25

Reproducibility Study of "Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents"

Authors: Pedro M. P. Curvo, Mara Dragomir, Salvador Torpes, Mohammadmahdi Rahimi

Source: ArXiv AI | 15-05-25

Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer

Authors: Minh Hoang Nguyen, Linh Le Pham Van, Thommen George Karimpanal, Sunil Gupta, Hung Le

Source: ArXiv AI | 15-05-25

Improving the Reliability of LLMs: Combining CoT, RAG, Self-Consistency, and Self-Verification

Authors: Adarsh Kumar, Hwiyoon Kim, Jawahar Sai Nathani, Neil Roy

Source: ArXiv AI | 15-05-25

Show HN: Muscle-Mem, a behavior cache for AI agents (github.com/pig-dot-dev)

Source: Hacker News | 15-05-25

A server that wasn't meant to exist (dragas.net)

Source: Hacker News | 15-05-25

LLMs get lost in multi-turn conversation (arxiv.org)

Source: Hacker News | 15-05-25

AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms (deepmind.google)

Source: Hacker News | 15-05-25

Launch HN: Jazzberry (YC X25) – AI agent for finding bugs

Read more

Source: Hacker News | 15-05-25

Show HN: YapCards (iOS) – Voice-driven flashcards with AI feedback

Read more

Source: Hacker News | 15-05-25

100 experts call for more research into the control of AI systems

Read more

Source: The Decoder | 14-05-25

Show HN: HelixDB – Open-source vector-graph database for AI applications (Rust) (github.com/helixdb)

Read more

Source: Hacker News | 14-05-25

Build real-time knowledge graph for documents with LLM (cocoindex.io)

Read more

Source: Hacker News | 14-05-25

EM-LLM: Human-Inspired Episodic Memory for Infinite Context LLMs (github.com/em-llm)

Read more

Source: Hacker News | 14-05-25

A Survey of Deep Learning for Complex Speech Spectrograms

Authors: Yuying Xie, Zheng-Hua Tan |

Read more

Source: ArXiv AI | 14-05-25

Securing RAG: A Risk Assessment and Mitigation Framework

Authors: Lukas Ammann, Sara Ott, Christoph R. Landolt, Marco P. Lehmann |

Read more

Source: ArXiv AI | 14-05-25

CodePDE: An Inference Framework for LLM-driven PDE Solver Generation

Authors: Shanda Li, Tanya Marwah, Junhong Shen, Weiwei Sun, Andrej Risteski, Yiming Yang, Ameet Talwalkar |

Read more

Source: ArXiv AI | 14-05-25

Winning at All Cost: A Small Environment for Eliciting Specification Gaming Behaviors in Large Language Models

Authors: Lars Malmqvist |

Read more

Source: ArXiv AI | 14-05-25

Enhancing Trust Management System for Connected Autonomous Vehicles Using Machine Learning Methods: A Survey

Authors: Qian Xu, Lei Zhang, Yixiao Liu |

Read more

Source: ArXiv AI | 14-05-25

The Correspondence Between Bounded Graph Neural Networks and Fragments of First-Order Logic

Authors: Bernardo Cuenca Grau, Przemysław A. Wałęga |

Read more

Source: ArXiv AI | 14-05-25

Lost in Transmission: When and Why LLMs Fail to Reason Globally

Authors: Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan, Jennifer Neville |

Read more

Source: ArXiv AI | 14-05-25

Decoding Neighborhood Environments with Large Language Models

Authors: Andrew Cart, Shaohu Zhang, Melanie Escue, Xugui Zhou, Haitao Zhao, Prashanth BusiReddyGari, Beiyu Lin, Shuang Li |

Read more

Source: ArXiv AI | 14-05-25

Benchmarking AI scientists in omics data-driven biological research

Authors: Erpai Luo, Jinmeng Jia, Yifan Xiong, Xiangyu Li, Xiaobo Guo, Baoqi Yu, Lei Wei, Xuegong Zhang |

Read more

Source: ArXiv AI | 14-05-25

Evaluating LLM Metrics Through Real-World Capabilities

Authors: Justin K Miller, Wenjia Tang |

Read more

Source: ArXiv AI | 14-05-25

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation

Authors: Enci Zhang, Xingang Yan, Wei Lin, Tianxiang Zhang, Qianchun Lu |

Read more

Source: ArXiv AI | 14-05-25

Strategy-Augmented Planning for Large Language Models via Opponent Exploitation

Authors: Shuai Xu, Sijia Cui, Yanna Wang, Bo Xu, Qi Wang |

Read more

Source: ArXiv AI | 14-05-25

Achieving Scalable Robot Autonomy via neurosymbolic planning using lightweight local LLM

Authors: Nicholas Attolino, Alessio Capitanelli, Fulvio Mastrogiovanni |

Read more

Source: ArXiv AI | 14-05-25

Guiding LLM-based Smart Contract Generation with Finite State Machine

Authors: Hao Luo, Yuhao Lin, Xiao Yan, Xintong Hu, Yuxiang Wang, Qiming Zeng, Hao Wang, Jiawei Jiang |

Read more

Source: ArXiv AI | 14-05-25

Integrating Natural Language Processing and Exercise Monitoring for Early Diagnosis of Metabolic Syndrome: A Deep Learning Approach

Authors: Yichen Zhao, Yuhua Wang, Xi Cheng, Junhao Fang, Yang Yang |

Read more

Source: ArXiv AI | 14-05-25

LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs

Authors: K M Sajjadul Islam, Ayesha Siddika Nipu, Jiawei Wu, Praveen Madiraju |

Read more

Source: ArXiv AI | 14-05-25

DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models

Authors: Xiaoyang Chen, Xinan Dai, Yu Du, Qian Feng, Naixu Guo, Tingshuo Gu, Yuting Gao, Yingyi Gao, Xudong Han, Xiang Jiang, Yilin Jin, Hongyi Lin, Shisheng Lin, Xiangnan Li, Yuante Li, Yixing Li, Zhentao Lai, Zilu Ma, Yingrong Peng, Jiacheng Qian, Hao-Yu Sun, Jianbo Sun, Zirui Wang, Siwei Wu, Zian Wang, Bin Xu, Jianghao Xu, Yiyang Yu, Zichuan Yang, Hongji Zha, Ruichong Zhang |

Read more

Source: ArXiv AI | 14-05-25

OpenAI's chief scientist Jakub Pachocki says there is evidence that AI models discover novel insights

Read more

Source: The Decoder | 14-05-25

Insurers launch cover for losses caused by AI chatbot errors (ft.com)

Read more

Source: Hacker News | 14-05-25

Garbage collection of object storage at scale (warpstream.com)

Read more

Source: Hacker News | 14-05-25

DeepSeek’s founder is threatening US dominance in AI race (bloomberg.com)

Read more

Source: Hacker News | 14-05-25

Confident user prompts make LLMs more likely to hallucinate

Read more

Source: The Decoder | 13-05-25

Stanford researchers find AI agents improve when guided by past successes

Read more

Source: The Decoder | 13-05-25

Microsoft could sacrifice some OpenAI shares - but wants to secure access to AI technology

Read more

Source: The Decoder | 13-05-25

HealthBench – An evaluation for AI systems and human health (openai.com)

Read more

Source: Hacker News | 13-05-25

A conversation about AI for science with Jason Pruet (lanl.gov)

Read more

Source: Hacker News | 13-05-25

A class of distributed automata that contains the modal mu-fragment

Authors: Veeti Ahvonen, Damian Heiman, Antti Kuusisto |

Read more

Source: ArXiv AI | 13-05-25

Reliable Collaborative Conversational Agent System Based on LLMs and Answer Set Programming

Authors: Yankai Zeng, Gopal Gupta |

Read more

Source: ArXiv AI | 13-05-25

KCluster: An LLM-based Clustering Approach to Knowledge Component Discovery

Authors: Yumou Wei, Paulo Carvalho, John Stamper |

Read more

Source: ArXiv AI | 13-05-25

Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers

Authors: Chi Xu, Yili Jin, Sami Ma, Rongsheng Qian, Hao Fang, Jiangchuan Liu, Xue Liu, Edith C.H. Ngai, William I. Atlas, Katrina M. Connors, Mark A. Spoljaric |

Read more

Source: ArXiv AI | 13-05-25

Control Plane as a Tool: A Scalable Design Pattern for Agentic AI Systems

Authors: Sivasathivel Kandasamy |

Read more

Source: ArXiv AI | 13-05-25

Embodied Intelligence: The Key to Unblocking Generalized Artificial Intelligence

Authors: Jinhao Jiang, Changlin Chen, Shile Feng, Wanru Geng, Zesheng Zhou, Ni Wang, Shuai Li, Feng-Qi Cui, Erbao Dong |

Read more

Source: ArXiv AI | 13-05-25

From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering

Authors: Gaurab Sarkar, Sougata Saha |

Read more

Source: ArXiv AI | 13-05-25

LLM-Augmented Chemical Synthesis and Design Decision Programs

Authors: Haorui Wang, Jeff Guo, Lingkai Kong, Rampi Ramprasad, Philippe Schwaller, Yuanqi Du, Chao Zhang |

Read more

Source: ArXiv AI | 13-05-25

Explainable AI the Latest Advancements and New Trends

Authors: Bowen Long, Enjie Liu, Renxi Qiu, Yanqing Duan |

Read more

Source: ArXiv AI | 13-05-25

DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs

Authors: Yubo Shu, Zhewei Huang, Xin Wu, Chen Hu, Shuchang Zhou, Daxin Jiang |

Read more

Source: ArXiv AI | 13-05-25

Efficient Fault Detection in WSN Based on PCA-Optimized Deep Neural Network Slicing Trained with GOA

Authors: Mahmood Mohassel Feghhi, Raya Majid Alsharfa, Majid Hameed Majeed |

Read more

Source: ArXiv AI | 13-05-25

RefPentester: A Knowledge-Informed Self-Reflective Penetration Testing Framework Based on Large Language Models

Authors: Hanzheng Dai, Yuanliang Li, Zhibo Zhang, Jun Yan |

Read more

Source: ArXiv AI | 13-05-25

Architectural Precedents for General Agents using Large Language Models

Authors: Robert E. Wray, James R. Kirk, John E. Laird |

Read more

Source: ArXiv AI | 13-05-25

AIS Data-Driven Maritime Monitoring Based on Transformer: A Comprehensive Review

Authors: Zhiye Xie, Enmei Tu, Xianping Fu, Guoliang Yuan, Yi Han |

Read more

Source: ArXiv AI | 13-05-25

Web-Bench: A LLM Code Benchmark Based on Web Standards and Frameworks

Authors: Kai Xu, YiWei Mao, XinYi Guan, ZiLong Feng |

Read more

Source: ArXiv AI | 13-05-25

How well do LLMs reason over tabular data, really?

Authors: Cornelius Wolff, Madelon Hulsebos |

Read more

Source: ArXiv AI | 13-05-25

QuantX: A Framework for Hardware-Aware Quantization of Generative AI Workloads

Authors: Khurram Mazher, Saad Bin Nasir |

Read more

Source: ArXiv AI | 13-05-25

YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models

Authors: Lei Wang, Heyang Gao, Xiaohe Bo, Xu Chen, Ji-Rong Wen |

Read more

Source: ArXiv AI | 13-05-25

"I Apologize For Not Understanding Your Policy": Exploring the Specification and Evaluation of User-Managed Access Control Policies by AI Virtual Assistants

Authors: Jennifer Mondragon, Carlos Rubio-Medrano, Gael Cruz, Dvijesh Shastri |

Read more

Source: ArXiv AI | 13-05-25

Multi-Agent Systems for Robotic Autonomy with LLMs

Authors: Junhong Chen, Ziqi Yang, Haoyuan G Xu, Dandan Zhang, George Mylonas |

Read more

Source: ArXiv AI | 13-05-25

Evolutionary thoughts: integration of large language models and evolutionary algorithms

Authors: Antonio Jimeno Yepes, Pieter Barnard |

Read more

Source: ArXiv AI | 13-05-25

What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips

Authors: Renjie Li, Wenjie Wei, Qi Xin, Xiaoli Liu, Sixuan Mao, Erik Ma, Zijian Chen, Malu Zhang, Haizhou Li, Zhaoyu Zhang |

Read more

Source: ArXiv AI | 13-05-25

AgentXploit: End-to-End Redteaming of Black-Box AI Agents

Authors: Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song |

Read more

Source: ArXiv AI | 13-05-25

Human-in-the-Loop AI for HVAC Management Enhancing Comfort and Energy Efficiency

Authors: Xinyu Liang, Frits de Nijs, Buser Say, Hao Wang |

Read more

Source: ArXiv AI | 13-05-25

Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI

Authors: Benjamin Raphael Ernhofer, Daniil Prokhorov, Jannica Langner, Dominik Bollmann |

Read more

Source: ArXiv AI | 13-05-25

IRNN: Innovation-driven Recurrent Neural Network for Time-Series Data Modeling and Prediction

Authors: Yifan Zhou, Yibo Wang, Chao Shang |

Read more

Source: ArXiv AI | 13-05-25

Multimodal Sentiment Analysis on CMU-MOSEI Dataset using Transformer-based Models

Authors: Jugal Gajjar, Kaustik Ranaware |

Read more

Source: ArXiv AI | 13-05-25

LLMs Outperform Experts on Challenging Biology Benchmarks

Authors: Lennart Justen |

Read more

Source: ArXiv AI | 13-05-25

UniSymNet: A Unified Symbolic Network Guided by Transformer

Authors: Xinxin Li, Juan Zhang, Da Li, Xingyu Liu, Jin Xu, Junping Yin |

Read more

Source: ArXiv AI | 13-05-25

The Application of Deep Learning for Lymph Node Segmentation: A Systematic Review

Authors: Jingguo Qu, Xinyang Han, Man-Lik Chui, Yao Pu, Simon Takadiyi Gunda, Ziman Chen, Jing Qin, Ann Dorothy King, Winnie Chiu-Wing Chu, Jing Cai, Michael Tin-Cheung Ying |

Read more

Source: ArXiv AI | 13-05-25

A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets

Authors: Ryan Lagasse, Aidan Kiernans, Avijit Ghosh, Shiri Dori-Hacohen |

Read more

Source: ArXiv AI | 13-05-25

HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics

Authors: Lennart Luettgau, Harry Coppock, Magda Dubois, Christopher Summerfield, Cozmin Ududec |

Read more

Source: ArXiv AI | 13-05-25

Safety by Measurement: A Systematic Literature Review of AI Safety Evaluation Methods

Authors: Markov Grey, Charbel-Raphaël Segerie |

Read more

Source: ArXiv AI | 13-05-25

Leveraging Large Language Models for enzymatic reaction prediction and characterization

Authors: Lorenzo Di Fruscia, Jana Marie Weber |

Read more

Source: ArXiv AI | 13-05-25

Combining Abstract Argumentation and Machine Learning for Efficiently Analyzing Low-Level Process Event Streams

Authors: Bettina Fazzinga, Sergio Flesca, Filippo Furfaro, Luigi Pontieri, Francesco Scala |

Read more

Source: ArXiv AI | 13-05-25

APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning

Authors: Azim Ospanov, Roozbeh Yousefzadeh |

Read more

Source: ArXiv AI | 13-05-25

ArtRAG: Retrieval-Augmented Generation with Structured Context for Visual Art Understanding

Authors: Shuai Wang, Ivona Najdenkoska, Hongyi Zhu, Stevan Rudinac, Monika Kackovic, Nachoem Wijnberg, Marcel Worring |

Read more

Source: ArXiv AI | 13-05-25

Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs

Authors: Sam Bush, Matthew DeLorenzo, Phat Tieu, Jeyavijayan Rajendran |

Read more

Source: ArXiv AI | 13-05-25

Bytedance launches Agent TARS, an open-source AI automation agent

Read more

Source: The Decoder | 12-05-25

Google recaps how its LLMs could change in-game interactions

Read more

Source: The Decoder | 12-05-25

Five major obstacles are holding back RAG systems in healthcare

Read more

Source: The Decoder | 12-05-25

Writing an LLM from scratch, part 13 – attention heads are dumb (gilesthomas.com)

Read more

Source: Hacker News | 12-05-25

Spark AI (YC W24) Is Hiring a Full Stack Engineer in San Francisco (ycombinator.com)

Read more

Source: Hacker News | 12-05-25

US Copyright Office found AI companies breach copyright. Its boss was fired (theregister.com)

Read more

Source: Hacker News | 12-05-25

Klarna changes its AI tune and again recruits humans for customer service (customerexperiencedive.com)

Read more

Source: Hacker News | 12-05-25

Avoiding AI is hard – but our freedom to opt out must be protected (theconversation.com)

Read more

Source: Hacker News | 12-05-25

Custom SIM card in Tesla Model 3 2024, Tesla Model Y 2025 and Cybertruck (olegkutkov.me)

Read more

Source: Hacker News | 12-05-25

OpenAI adds new fine-tuning options for o4-mini and GPT-4.1

Read more

Source: The Decoder | 11-05-25

Software Development Life Cycle Perspective: A Survey of Benchmarks for CodeLLMs and Agents

Authors: Kaixin Wang, Tianlin Li, Xiaoyu Zhang, Chong Wang, Weisong Sun, Yang Liu, Bin Shi |

Read more

Source: ArXiv AI | 11-05-25

T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction

Authors: Kun Peng, Chaodong Tong, Cong Cao, Hao Peng, Qian Li, Guanlin Wu, Lei Jiang, Yanbing Liu, Philip S. Yu |

Read more

Source: ArXiv AI | 11-05-25

Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning

Authors: Amir Rezaei Balef, Claire Vernade, Katharina Eggensperger |

Read more

Source: ArXiv AI | 11-05-25

Incentive-Aware Machine Learning; Robustness, Fairness, Improvement & Causality

Authors: Chara Podimata |

Read more

Source: ArXiv AI | 11-05-25

High-fidelity Grain Growth Modeling: Leveraging Deep Learning for Fast Computations

Authors: Pungponhavoan Tep, Marc Bernacki |

Read more

Source: ArXiv AI | 11-05-25

Threshold Modulation for Online Test-Time Adaptation of Spiking Neural Networks

Authors: Kejie Zhao, Wenjia Hua, Aiersi Tuerhong, Luziwei Leng, Yuxin Ma, Qinghua Guo |

Read more

Source: ArXiv AI | 11-05-25

Towards Artificial Intelligence Research Assistant for Expert-Involved Learning

Authors: Tianyu Liu, Simeng Han, Xiao Luo, Hanchen Wang, Pan Lu, Biqing Zhu, Yuge Wang, Keyi Li, Jiapeng Chen, Rihao Qu, Yufeng Liu, Xinyue Cui, Aviv Yaish, Yuhang Chen, Minsheng Hao, Chuhan Li, Kexing Li, Arman Cohan, Hua Xu, Mark Gerstein, James Zou, Hongyu Zhao |

Read more

Source: ArXiv AI | 11-05-25

StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

Authors: Haibo Wang, Bo Feng, Zhengfeng Lai, Mingze Xu, Shiyu Li, Weifeng Ge, Afshin Dehghan, Meng Cao, Ping Huang |

Read more

Source: ArXiv AI | 11-05-25

TransProQA: an LLM-based literary Translation evaluation metric with Professional Question Answering

Authors: Ran Zhang, Wei Zhao, Lieve Macken, Steffen Eger |

Read more

Source: ArXiv AI | 11-05-25

Large Language Models are Autonomous Cyber Defenders

Authors: Sebastián R. Castro, Roberto Campbell, Nancy Lau, Octavio Villalobos, Jiaqi Duan, Alvaro A. Cardenas |

Read more

Source: ArXiv AI | 11-05-25

The Promise and Limits of LLMs in Constructing Proofs and Hints for Logic Problems in Intelligent Tutoring Systems

Authors: Sutapa Dey Tithi, Arun Kumar Ramesh, Clara DiMarco, Xiaoyi Tian, Nazia Alam, Kimia Fazeli, Tiffany Barnes |

Read more

Source: ArXiv AI | 11-05-25

Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards

Authors: Jaeho Kim, Yunseok Lee, Seulki Lee |

Read more

Source: ArXiv AI | 11-05-25

Position: Epistemic Artificial Intelligence is Essential for Machine Learning Models to Know When They Do Not Know

Authors: Shireen Kudukkil Manchingal, Fabio Cuzzolin |

Read more

Source: ArXiv AI | 11-05-25

A Reputation System for Large Language Model-based Multi-agent Systems to Avoid the Tragedy of the Commons

Authors: Siyue Ren, Wanli Fu, Xinkun Zou, Chen Shen, Yi Cai, Chen Chu, Zhen Wang, Shuyue Hu |

Read more

Source: ArXiv AI | 11-05-25

Is there a half-life for the success rates of AI agents?

Authors: Toby Ord |

Read more

Source: ArXiv AI | 11-05-25

Advancing Neural Network Verification through Hierarchical Safety Abstract Interpretation

Authors: Luca Marzari, Isabella Mastroeni, Alessandro Farinelli |

Read more

Source: ArXiv AI | 11-05-25

A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods

Authors: Stefanos Gkikas |

Read more

Source: ArXiv AI | 11-05-25

ZeroSearch: Alibaba trains search assistant in AI simulation

Read more

Source: The Decoder | 11-05-25

Show HN: Code Claude Code (github.com/rvca212)

Read more

Source: Hacker News | 11-05-25

LTXVideo 13B AI video generation (ltxv.video)

Read more

Source: Hacker News | 10-05-25

ChatGPT's user base expands while established web giants lose ground

Read more

Source: The Decoder | 10-05-25

Hugging Face unveils experimental AI agent for computers

Read more

Source: The Decoder | 10-05-25

OpenAI plans "cderGPT" for the US Food and Drug Administration (FDA)

Read more

Source: The Decoder | 10-05-25

Odin, a Pragmatic C Alternative with a Go Flavour (bitshifters.cc)

Read more

Source: Hacker News | 10-05-25

Fighting Unwanted Notifications with Machine Learning in Chrome (chromium.org)

Read more

Source: Hacker News | 10-05-25

Microsoft leverages Google's open A2A protocol for interoperable AI agents

Read more

Source: The Decoder | 09-05-25

A flat pricing subscription for Claude Code (anthropic.com)

Read more

Source: Hacker News | 09-05-25

Ciro (YC S22) is hiring a software engineer to build AI agents for sales (ycombinator.com)

Read more

Source: Hacker News | 09-05-25

Notes on rolling out Cursor and Claude Code (ghiculescu.substack.com)

Read more

Source: Hacker News | 09-05-25

OpenAI launches a program to partner with governments on global AI infrastructure

Read more

Source: The Decoder | 08-05-25

EU's leading AI startup Mistral unveils Medium 3 and Le Chat Enterprise

Read more

Source: The Decoder | 08-05-25

By 2026, most firms expect to have a Chief AI Officer on staff

Read more

Source: The Decoder | 08-05-25

Web search on the Anthropic API (anthropic.com)

Read more

Source: Hacker News | 08-05-25

Create and edit images with Gemini 2.0 in preview (googleblog.com)

Read more

Source: Hacker News | 08-05-25

Mistral ships Le Chat – enterprise AI assistant that can run on prem (mistral.ai)

Read more

Source: Hacker News | 08-05-25

Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning

Authors: Isabella Caranzano, Corrado Pancotti, Cesare Rollo, Flavio Sartori, Pietro Liò, Piero Fariselli, Tiziana Sanavia |

Read more

Source: ArXiv AI | 08-05-25

Balancing Accuracy, Calibration, and Efficiency in Active Learning with Vision Transformers Under Label Noise

Authors: Moseli Mots'oehli, Hope Mogale, Kyungim Baek |

Read more

Source: ArXiv AI | 08-05-25

Multi-Granular Attention based Heterogeneous Hypergraph Neural Network

Authors: Hong Jin, Kaicheng Zhou, Jie Yin, Lan You, Zhifeng Zhou |

Read more

Source: ArXiv AI | 08-05-25

Detecting Concept Drift in Neural Networks Using Chi-squared Goodness of Fit Testing

Authors: Jacob Glenn Ayers, Buvaneswari A. Ramanan, Manzoor A. Khan |

Read more

Source: ArXiv AI | 08-05-25

OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models

Authors: Xiaoyu Xu, Minxin Du, Qingqing Ye, Haibo Hu |

Read more

Source: ArXiv AI | 08-05-25

The Aloe Family Recipe for Open and Specialized Healthcare LLMs

Authors: Dario Garcia-Gasulla, Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés |

Read more

Source: ArXiv AI | 08-05-25

"I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments

Authors: Ziyi Zhang, Zhen Sun, Zongmin Zhang, Zifan Peng, Yuemeng Zhao, Zichun Wang, Zeren Luo, Ruiting Zuo, Xinlei He |

Read more

Source: ArXiv AI | 08-05-25

Automatic Music Transcription using Convolutional Neural Networks and Constant-Q transform

Authors: Yohannis Telila, Tommaso Cucinotta, Davide Bacciu |

Read more

Source: ArXiv AI | 08-05-25

Model-Based AI planning and Execution Systems for Robotics

Authors: Or Wertheim, Ronen I. Brafman |

Read more

Source: ArXiv AI | 08-05-25

Proceedings of 1st Workshop on Advancing Artificial Intelligence through Theory of Mind

Authors: Mouad Abrini, Omri Abend, Dina Acklin, Henny Admoni, Gregor Aichinger, Nitay Alon, Zahra Ashktorab, Ashish Atreja, Moises Auron, Alexander Aufreiter, Raghav Awasthi, Soumya Banerjee, Joe M. Barnby, Rhea Basappa, Severin Bergsmann, Djallel Bouneffouf, Patrick Callaghan, Marc Cavazza, Thierry Chaminade, Sonia Chernova, Mohamed Chetouan, Moumita Choudhury, Axel Cleeremans, Jacek B. Cywinski, Fabio Cuzzolin, Hokin Deng, N'yoma Diamond, Camilla Di Pasquasio, Guillaume Dumas, Max van Duijn, Mahapatra Dwarikanath, Qingying Gao, Ashok Goel, Rebecca Goldstein, Matthew Gombolay, Gabriel Enrique Gonzalez, Amar Halilovic, Tobias Halmdienst, Mahimul Islam, Julian Jara-Ettinger, Natalie Kastel, Renana Keydar, Ashish K. Khanna, Mahdi Khoramshahi, JiHyun Kim, MiHyeon Kim, YoungBin Kim, Senka Krivic, Nikita Krasnytskyi, Arun Kumar, JuneHyoung Kwon, Eunju Lee, Shane Lee, Peter R. Lewis, Xue Li, Yijiang Li, Michal Lewandowski, Nathan Lloyd, Matthew B. Luebbers, Dezhi Luo, Haiyun Lyu, Dwarikanath Mahapatra, Kamal Maheshwari, Mallika Mainali, Piyush Mathur, Patrick Mederitsch, Shuwa Miura, Manuel Preston de Miranda, Reuth Mirsky, Shreya Mishra, Nina Moorman, Katelyn Morrison, John Muchovej, Bernhard Nessler, Felix Nessler, Hieu Minh Jord Nguyen, Abby Ortego, Francis A. Papay, Antoine Pasquali, Hamed Rahimi, Charumathi Raghu, Amanda Royka, Stefan Sarkadi, Jaelle Scheuerman, Simon Schmid, Paul Schrater, Anik Sen, Zahra Sheikhbahaee, Ke Shi, Reid Simmons, Nishant Singh, Mason O. Smith, Ramira van der Meulen, Anthia Solaki, Haoran Sun, Viktor Szolga, Matthew E. Taylor, Travis Taylor, Sanne Van Waveren, Juan David Vargas |

Read more

Source: ArXiv AI | 08-05-25

EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning

Authors: Zhenghao Xing, Xiaowei Hu, Chi-Wing Fu, Wenhai Wang, Jifeng Dai, Pheng-Ann Heng |

Read more

Source: ArXiv AI | 08-05-25

Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization

Authors: Wenjun Cao |

Read more

Source: ArXiv AI | 08-05-25

The Power of Stories: Narrative Priming Shapes How LLM Agents Collaborate and Compete

Authors: Gerrit Großmann, Larisa Ivanova, Sai Leela Poduru, Mohaddeseh Tabrizian, Islam Mesabah, David A. Selby, Sebastian J. Vollmer |

Read more

Source: ArXiv AI | 08-05-25

LogiDebrief: A Signal-Temporal Logic based Automated Debriefing Approach with Large Language Models Integration

Authors: Zirong Chen, Ziyan An, Jennifer Reynolds, Kristin Mullen, Stephen Martini, Meiyi Ma |

Read more

Source: ArXiv AI | 08-05-25

TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution

Authors: Zhikai Zhao, Chuanbo Hua, Federico Berto, Kanghoon Lee, Zihan Ma, Jiachen Li, Jinkyoo Park |

Read more

Source: ArXiv AI | 08-05-25

ChatGPT sees about 50 percent more use on weekdays than weekends

Read more

Source: The Decoder | 08-05-25

OpenAI restructures as public benefit corporation under non-profit control

Read more

Source: The Decoder | 08-05-25

Google upgrades Gemini 2.5 Pro for coding and app development

Read more

Source: The Decoder | 08-05-25

Wikidive – AI guided rabbitholes in Wikipedia (wikidive.tulv.in)

Read more

Source: Hacker News | 08-05-25

How to Average in Prolog (2017) (storytotell.org)

Read more

Source: Hacker News | 08-05-25

Detecting Quishing Attacks with Machine Learning Techniques Through QR Code Analysis

Authors: Fouad Trad, Ali Chehab |

Read more

Source: ArXiv AI | 07-05-25

An Analysis of Hyper-Parameter Optimization Methods for Retrieval Augmented Generation

Authors: Matan Orbach, Ohad Eytan, Benjamin Sznajder, Ariel Gera, Odellia Boni, Yoav Kantor, Gal Bloch, Omri Levy, Hadas Abraham, Nitzan Barzilay, Eyal Shnarch, Michael E. Factor, Shila Ofek-Koifman, Paula Ta-Shma, Assaf Toledo |

Read more

Source: ArXiv AI | 07-05-25

Blending 3D Geometry and Machine Learning for Multi-View Stereopsis

Authors: Vibhas Vats, Md. Alimoor Reza, David Crandall, Soon-heung Jung |

Read more

Source: ArXiv AI | 07-05-25

Rapid AI-based generation of coverage paths for dispensing applications

Authors: Simon Baeuerle, Ian F. Mendonca, Kristof Van Laerhoven, Ralf Mikut, Andreas Steimer |

Read more

Source: ArXiv AI | 07-05-25

LlamaFirewall: An open source guardrail system for building secure AI agents

Authors: Sahana Chennabasappa, Cyrus Nikolaidis, Daniel Song, David Molnar, Stephanie Ding, Shengye Wan, Spencer Whitman, Lauren Deason, Nicholas Doucette, Abraham Montilla, Alekhya Gampa, Beto de Paola, Dominik Gabi, James Crnkovich, Jean-Christophe Testud, Kat He, Rashnil Chaturvedi, Wu Zhou, Joshua Saxe |

Read more

Source: ArXiv AI | 07-05-25

Holmes: Automated Fact Check with Large Language Models

Authors: Haoran Ou, Gelei Deng, Xingshuo Han, Jie Zhang, Xinlei He, Han Qiu, Shangwei Guo, Tianwei Zhang |

Read more

Source: ArXiv AI | 07-05-25

Is AI currently capable of identifying wild oysters? A comparison of human annotators against the AI model, ODYSSEE

Authors: Brendan Campbell, Alan Williams, Kleio Baxevani, Alyssa Campbell, Rushabh Dhoke, Rileigh E. Hudock, Xiaomin Lin, Vivek Mange, Bernhard Neuberger, Arjun Suresh, Alhim Vera, Arthur Trembanis, Herbert G. Tanner, Edward Hale |

Read more

Source: ArXiv AI | 07-05-25

CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics

Authors: Junqi Liu, Xiaohan Lin, Jonas Bayer, Yael Dillies, Weijie Jiang, Xiaodan Liang, Roman Soletskyi, Haiming Wang, Yunzhou Xie, Beibei Xiong, Zhengfeng Yang, Jujian Zhang, Lihong Zhi, Jia Li, Zhengying Liu |

Read more

Source: ArXiv AI | 07-05-25

Capability-Driven Skill Generation with LLMs: A RAG-Based Approach for Reusing Existing Libraries and Interfaces

Authors: Luis Miguel Vieira da Silva, Aljosha Köcher, Nicolas König, Felix Gehlhoff, Alexander Fay |

Read more

Source: ArXiv AI | 07-05-25

RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation

Authors: Tiantian Gan, Qiyao Sun |

Read more

Source: ArXiv AI | 07-05-25

Validating the Effectiveness of a Large Language Model-based Approach for Identifying Children's Development across Various Free Play Settings in Kindergarten

Authors: Yuanyuan Yang, Yuan Shen, Tianchen Sun, Yangbin Xie |

Read more

Source: ArXiv AI | 07-05-25

Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents

Authors: Schaun Wheeler, Olivier Jeunen |

Read more

Source: ArXiv AI | 07-05-25

am-ELO: A Stable Framework for Arena-based LLM Evaluation

Authors: Zirui Liu, Jiatong Li, Yan Zhuang, Qi Liu, Shuanghong Shen, Jie Ouyang, Mingyue Cheng, Shijin Wang |

Read more

Source: ArXiv AI | 07-05-25

OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents

Authors: Mariya Davydova, Daniel Jeffries, Patrick Barker, Arturo Márquez Flores, Sinéad Ryan |

Read more

Source: ArXiv AI | 07-05-25

Graph Drawing for LLMs: An Empirical Evaluation

Authors: Walter Didimo, Fabrizio Montecchiani, Tommaso Piselli |

Read more

Source: ArXiv AI | 07-05-25

Accents in latent spaces: How AI hears accent strength in English (boldvoice.com)

Read more

Source: Hacker News | 07-05-25

Gemini 2.5 Pro Preview (googleblog.com)

Read more

Source: Hacker News | 07-05-25

Claude's system prompt is over 24k tokens with tools (github.com/asgeirtj)

Read more

Source: Hacker News | 07-05-25

OpenAI reaches agreement to buy Windsurf for $3B (bloomberg.com)

Read more

Source: Hacker News | 07-05-25

Show HN: Clippy – 90s UI for local LLMs (felixrieseberg.github.io)

Read more

Source: Hacker News | 07-05-25

I built an AI code review agent in a few hours, here's what I learned (sourcebot.dev)

Read more

Source: Hacker News | 07-05-25

A coherent European/non-US cloud strategy (berthub.eu)

Read more

Source: Hacker News | 07-05-25