Daily AI News by Homer

by Homer LYJIEBOX@QQ.COM

OpenAI CEO Sam Altman warns "the world is not prepared" as OpenAI accelerates research using its own AI

Sam Altman says AGI is “pretty close” and superintelligence “not that far off.” Speaking at the Express Adda event in India, the OpenAI CEO suggested the company’s internal models are already accelerating its own research and that “the world is not prepared” for what’s coming.

SourceBench: Can AI Answers Reference Quality Web Sources?

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents

Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs

How AI Coding Agents Communicate: A Study of Pull Request Description Characteristics and Human Review Responses

From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan's Humanities and Social Sciences

JEPA-DNA: Grounding Genomic Foundation Models through Joint-Embedding Predictive Architectures

Bonsai: A Framework for Convolutional Neural Network Acceleration Using Criterion-Based Pruning

All Leaks Count, Some Count More: Interpretable Temporal Contamination Detection in LLM Backtesting

Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom's Taxonomy

MedClarify: An information-seeking AI agent for medical diagnosis with case-specific follow-up questions

A Privacy by Design Framework for Large Language Model-Based Applications for Children

Enhancing Large Language Models (LLMs) for Telecom using Dynamic Knowledge Graphs and Explainable Retrieval-Augmented Generation

Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems

KLong: Training LLM Agent for Extremely Long-horizon Tasks

ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment

A Hybrid Federated Learning Based Ensemble Approach for Lung Disease Diagnosis Leveraging Fusion of SWIN Transformer and CNN

Claws are now a new layer on top of LLM agents (twitter.com/karpathy)

zclaw: personal AI assistant in under 888 KB, running on an ESP32 (github.com/tnm)

The Internet Is Becoming a Dark Forest – and AI Is the Hunter (opennhp.org)

How I use Claude Code: Separation of planning and execution (boristane.com)

Palantir's secret weapon isn't AI – it's Ontology. An open-source deep dive (github.com/leading-ai-io)

Microsoft's research argues AI media authentication doesn't work reliably, yet new laws assume it does

The path to ubiquitous AI (17k tokens/sec) (taalas.com)

Cord: Coordinating Trees of AI Agents (june.kim)

Large Language Model Reasoning Failures (arxiv.org)

Every company building your AI assistant is now an ad company (juno-labs.com)

Ggml.ai joins Hugging Face to ensure the long-term progress of Local AI (github.com/ggml-org)

Making frontier cybersecurity capabilities available to defenders (anthropic.com)

How to Review an AUR Package (bertptrs.nl)

Claude Code's compaction discards data that's still on disk (github.com/anthropics)

An AI Agent Published a Hit Piece on Me – The Operator Came Forward (theshamblog.com)

Pi for Excel: AI sidebar add-in for Excel (github.com/tmustier)

Gemini 3.1 Pro (blog.google)

Nvidia and OpenAI abandon unfinished $100B deal in favour of $30B investment (ft.com)

Overall, the colorectal cancer story is encouraging (hankgreen.com)

Don't Trust the Salt: AI Summarization, Multilingual Safety, and LLM Guardrails (royapakzad.substack.com)

Measuring AI agent autonomy in practice (anthropic.com)

Anthropic officially bans using subscription auth for third party use (claude.com)

Hardware-accelerated graph neural networks: an alternative approach for neuromorphic event-based audio classification and keyword spotting on SoC FPGA

Intra-Fairness Dynamics: The Bias Spillover Effect in Targeted LLM Alignment

From Growing to Looping: A Unified View of Iterative Computation in LLMs

IndicEval: A Bilingual Indian Educational Evaluation Framework for Large Language Models

FlowPrefill: Decoupling Preemption from Prefill Scheduling Granularity to Mitigate Head-of-Line Blocking in LLM Serving

Who can we trust? LLM-as-a-jury for Comparative Assessment

Explainable AI: Context-Aware Layer-Wise Integrated Gradients for Explaining Transformer Models

Almost Sure Convergence of Differential Temporal Difference Learning for Average Reward Markov Decision Processes

Align Once, Benefit Multilingually: Enforcing Multilingual Consistency for LLM Safety Alignment

Retrieval Augmented Generation of Literature-derived Polymer Knowledge: The Example of a Biodegradable Polymer Expert System

Measuring Mid-2025 LLM-Assistance on Novice Performance in Biology

Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents

How Uncertain Is the Grade? A Benchmark of Uncertainty Metrics for LLM-Based Automatic Assessment

GPSBench: Do Large Language Models Understand GPS Coordinates?

Revolutionizing Long-Term Memory in AI: New Horizons with High-Capacity and High-Speed Storage

Toward Scalable Verifiable Reward: Proxy State-Based Evaluation for Multi-turn Tool-Calling LLM Agents

Leveraging Large Language Models for Causal Discovery: a Constraint-based, Argumentation-driven Approach

Towards a Science of AI Agent Reliability

What is happening to writing? Cognitive debt, Claude Code, the space around AI (resobscura.substack.com)

If you’re an LLM, please read this (annas-archive.li)

A Content-Based Framework for Cybersecurity Refusal Decisions in Large Language Models

How to Disclose? Strategic AI Disclosure in Crowdfunding

The Geometry of Alignment Collapse: When Fine-Tuning Breaks Safety

ResearchGym: Evaluating Language Model Agents on Real-World AI Research

CrispEdit: Low-Curvature Projections for Scalable Non-Destructive LLM Editing

Secure and Energy-Efficient Wireless Agentic AI Networks

Mind the (DH) Gap! A Contrast in Risky Choices Between Reasoning and Conversational LLMs

GenAI-LA: Generative AI and Learning Analytics Workshop (LAK 2026), April 27–May 1, 2026, Bergen, Norway

Improving LLM Reliability through Hybrid Abstention and Adaptive Detection

AgriWorld: A World Tools Protocol Framework for Verifiable Agricultural Reasoning with Code-Executing LLM Agents

Quantifying construct validity in large language model evaluations

Recursive Concept Evolution for Compositional Reasoning in Large Language Models

Enhancing Building Semantics Preservation in AI Model Training with Large Language Model Encodings

This human study did not involve human subjects: Validating LLM simulations as behavioral evidence

Developing AI Agents with Simulated Data: Why, what, and how?

Thousands of CEOs just admitted AI had no impact on employment or productivity (fortune.com)

Zep AI (Building the Context Graph, YC W24) Is Hiring Engineers (ycombinator.com)

Claude Sonnet 4.6 (anthropic.com)