Anthropic has reversed a controversial policy that would have covertly limited Claude's capabilities for researchers developing competing AI systems, following public backlash from the research community. The move came after researchers argued the restriction would have undermined open AI development and put competing projects at a disadvantage.
Researchers introduced MoCA-Agent, an AI system that improves financial and numerical question-answering by decomposing questions into atomic claims and using a market mechanism where specialist agents buy or sell those claims before synthesizing verified Python code. The approach achieved strong results across ten benchmarks, including 78.3% accuracy on FinQA and 86.9% on ESGenius, suggesting that claim-level verification reduces errors in high-stakes financial reasoning.
Google DeepMind is investing in research to understand potential dangers when millions of autonomous AI agents interact online without human oversight. The company's AGI safety chief Rohin Shah has flagged concerns about agents that can execute tasks and follow instructions from other agents, raising questions about coordination failures and uncontrolled behavior at scale.
OpenAI has alleged that China orchestrated a coordinated influence campaign aimed at shaping American public and policy attitudes toward AI data center infrastructure. The company's claims, reported by Politico, suggest efforts to sway U.S. positioning on critical AI computing resources that are central to the nation's competitive advantage in artificial intelligence development.
New York has implemented a law requiring advertisements created or significantly altered using artificial intelligence to be clearly labeled as such. The regulation aims to increase transparency and help consumers identify AI-generated content in digital marketing materials.
OpenAI released a report documenting coordinated influence operations linked to the People's Republic of China that use AI tools to shape U.S. discussions of technology policy, data centers, and tariffs, and to spread false claims about ChatGPT. The operations represent a sophisticated effort to weaponize AI in information warfare targeting American policy conversations.
Researchers introduced SciConBench, a 9,110-question benchmark with expert-validated conclusions from systematic reviews, to evaluate how well AI agents synthesize scientific information for high-stakes domains like health. Testing eight frontier models and consumer-facing AI agents (including Google AI Overview), they found that factual quality remains critically low: the best-performing agent achieved an F1 score of only 0.337 on factual accuracy. They also found that data leakage in unconstrained evaluation artificially inflates performance estimates.
Social media companies face thousands of pending lawsuits, but the BBC has identified four cases with potential to significantly impact the industry's future. These cases could establish precedents on issues ranging from algorithmic responsibility to user privacy and content moderation.
The British government is developing new policies to restrict children's social media access, months after Australia implemented a blanket ban for anyone under 16. The move reflects growing international pressure to address online safety concerns and mental health impacts on minors.
A lawsuit claims Florida law enforcement arrested a man based primarily on a facial recognition AI system showing a 93% confidence match, without conducting adequate independent investigation to verify the match. The case highlights concerns that police departments are deploying error-prone AI systems and treating algorithmic matches as investigative shortcuts rather than leads requiring human verification.
Anthropic has admitted to covertly throttling its new Claude Fable 5 model with safeguards designed to keep researchers and competitors from fully using the system. The company says it will reverse course and openly disclose when restrictions activate, even if that results in more refused queries.