Topic

LLMs

Large language model releases, benchmarks, and capability research

Featured

Funding & Startups

Alibaba's Qwen Lead Researcher Launches AI Lab, Targets $2B Valuation

4 days ago· The Information

Funding & Startups

DeepSeek hits $45B valuation on efficient AI training model

10 days ago· TechCrunch AI

LLMs

OpenAI Upgrades ChatGPT Default to GPT-5.5 Instant

12 days ago· OpenAI

All Stories

AI for Business

Empromptu AI launches Alchemy Models for continuous fine-tuning from production workflows

Empromptu AI launched Alchemy Models, a platform that automatically captures training data from enterprise AI…

2 days ago· VentureBeat AI

Generative AI

AI IQ Launches Model Scorecard, Sparks Precision vs. Simplicity Debate

A new site called AI IQ has launched a framework for scoring frontier language models on a single intelligence…

3 days ago· VentureBeat AI

AI Agents

Frontier LLMs Silently Corrupt 25% of Documents in Iterative Workflows

Microsoft researchers developed a benchmark showing that frontier LLMs silently corrupt an average of 25% of document…

3 days ago· VentureBeat AI

AI Agents

Hermes Agent Becomes Most-Used Framework as Local AI Agents Go Mainstream

Hermes Agent, an open source agentic AI framework from Nous Research, has reached 140,000 GitHub stars in under three…

4 days ago· NVIDIA Blog (AI)

AI Agents

Sakana trains 7B model to orchestrate GPT, Claude, Gemini

Sakana AI has developed RL Conductor, a 7-billion-parameter language model trained via reinforcement learning to…

9 days ago· VentureBeat AI

Data & Training

AWS Details Verifiable Rewards Method for More Reliable LLM Training

AWS published a technical guide on reinforcement learning with verifiable rewards (RLVR), a method that addresses…

10 days ago· AWS Machine Learning Blog

Funding & Startups

Subquadratic claims 1,000x efficiency gain; researchers demand proof

Miami-based startup Subquadratic emerged from stealth claiming its SubQ 1M-Preview model achieves a 1,000x efficiency…

11 days ago· VentureBeat AI

Data & Training

Faithful Reasoning Emerges from Multi-Move Training, Not Direct Prediction

Researchers studied how reasoning develops in language models across supervised fine-tuning and reinforcement learning…

11 days ago· ArXiv (cs.AI)

AI Safety & Alignment

Safety Routing Circuits Found Across Models, Vulnerable to Encoding Attacks

Researchers have localized the policy routing mechanism in alignment-trained language models, identifying specific…

12 days ago· ArXiv (cs.AI)

Coding / Dev Tools · Trending

Cursor Keeps Its Distance From xAI Despite SpaceX Tie-Up

Despite SpaceX's $60 billion conditional takeover offer for Cursor last month, the coding startup is maintaining…

13 days ago· The Information

AI Agents

The AI scaffolding layer is collapsing. Context is the new moat.

The middleware layer that once helped developers build LLM applications, including indexing frameworks, query engines,…

13 days ago· VentureBeat AI

AI Safety & Alignment

Warmer AI Models Trade Accuracy for Empathy

Researchers at Oxford University's Internet Institute found that large language models fine-tuned to appear warmer and…

13 days ago· Ars Technica AI

AI Safety & Alignment

How OpenAI's Personality Feature Unleashed the Goblins

OpenAI's GPT-5.5 model exhibited unexpected behavior where it became obsessed with discussing goblins, gremlins, and…

16 days ago· VentureBeat AI

AI Agents

Alibaba cuts AI agent tool calls 49x with decoupled optimization

Alibaba researchers introduced Hierarchical Decoupled Policy Optimization (HDPO), a reinforcement learning framework…

16 days ago· VentureBeat AI

AI Safety & Alignment

Goodfire's Silico Brings Mechanistic Interpretability to Model Development

Goodfire, a San Francisco startup, released Silico, a tool that lets developers inspect and adjust AI model parameters…

16 days ago· MIT Technology Review

AI Agents

Aggregating Zero-Shot LLMs Beats Single Models for Financial Disclosure Analysis

A new paper demonstrates that a lightweight supervised aggregator can effectively combine outputs from multiple…

16 days ago· ArXiv (cs.AI)

Research

NanoKnow: Mapping How LLMs Encode Knowledge

Researchers have released NanoKnow, a benchmark dataset that maps questions from Natural Questions and SQuAD to whether…

16 days ago· ArXiv (cs.AI)

AI Agents · Trending

AWS Seizes OpenAI Models as Exclusive Cloud Partnerships End

AWS launched a major suite of AI capabilities on Tuesday, including OpenAI's GPT-5.4 and GPT-5.5 models on Amazon…

17 days ago· VentureBeat AI

Data & Training

Scaling Multi-Anchor Embeddings to LLMs with 40x Compression

Researchers introduce Adaptive Dictionary Embeddings (ADE), a framework that scales multi-anchor word representations…

17 days ago· ArXiv (cs.AI)

AI Agents

Xiaomi's Open-Source MiMo Models Challenge Proprietary AI on Agentic Tasks

Xiaomi released two open-source large language models, MiMo-V2.5 and MiMo-V2.5-Pro, under the MIT License, positioning…

19 days ago· VentureBeat AI

LLMs

OpenAI's AWS Arrival Meets Muted Response From Customers

Amazon has announced a deal to bring OpenAI's models to AWS through a new offering for AI agents, but the move comes as…

20 days ago· The Information

Data & Training

New Multilingual Medical AI Benchmark Reveals Language and Vision Gaps

Researchers have developed EuropeMedQA, a multilingual and multimodal medical examination dataset drawn from official…

20 days ago· ArXiv (cs.AI)

AI Safety & Alignment

Mapping Causal Reasoning in LLMs with Sparse Concept Graphs

Researchers propose Causal Concept Graphs (CCG), a method that maps how concepts interact during multi-step reasoning…

20 days ago· ArXiv (cs.AI)