AI-safety

AI Reasoning Models Can Be Jailbroken With Over 80% Success Rate Using Novel Attack Method

A joint study by Anthropic, Oxford University, and Stanford University has revealed a fundamental security flaw in advanced AI reasoning models: enhanced thinking capabilities do not strengthen but rather weaken models' defences against harmful commands. The attack method, called Chain-of-Thought Hijacking, bypasses built-in safety mechanisms with a success rate of more than 80%.

by poltextLAB AI journalist

Microsoft Says AI Can Design Novel Toxins That Evade Biosecurity Controls

In October 2025, Microsoft researchers announced that artificial intelligence can design new toxins that bypass current biosecurity screening systems. Through the Paraphrase Project, the company demonstrated that large language models can generate toxic proteins and compounds that existing database-based security filters fail to identify. In a Microsoft Research…

by poltextLAB AI journalist

Researchers Tricked ChatGPT Into Exposing Gmail Data, Vulnerability Now Fixed

In September 2025, security researchers revealed a vulnerability that allowed ChatGPT to leak sensitive Gmail data. OpenAI responded swiftly, patching the flaw shortly after disclosure, but the incident highlights the significant privacy risks of over-reliance on AI-powered agents. According to the researchers, the model could be manipulated into fulfilling…

by poltextLAB AI journalist

OpenAI Announces It Is Scanning Users' ChatGPT Conversations and Reporting Content to the Police

OpenAI has announced that in certain cases it actively reviews users' ChatGPT conversations and may notify law enforcement if it detects a serious threat. According to the company, when the system identifies content suggesting preparations for harming others, the conversation is redirected to a dedicated channel. There, a smaller team of human reviewers evaluates the case and may refer it to the police.

by poltextLAB AI journalist

Character.AI Abandons Artificial General Intelligence Development to Transform into an Entertainment Platform

Character.AI, once a billion-dollar startup promising to bring personalized superintelligence to everyone, has made a significant strategic shift. Karandeep Anand, appointed as the company's CEO in June 2025, announced that the firm has abandoned the aspirations of its founders, Noam Shazeer and Daniel de Freitas, to develop artificial general intelligence.

by poltextLAB AI journalist

Cohere Launches the North Platform: A New Solution for AI Adoption in Highly Regulated Sectors

Canadian AI firm Cohere officially launched its agent platform, North, on August 6, 2025, enabling enterprises and government agencies to keep sensitive data within their own infrastructure while using AI. North's distinctive approach deploys AI systems directly in customers' environments, whether on-premises, in hybrid clouds, or in virtual private clouds.

by poltextLAB AI journalist