Google Has Introduced TxGemma Model Family to Accelerate Therapeutic Development

On 25 March 2025, Google officially announced the release of TxGemma, a collection of open models designed to improve the efficiency of therapeutic development. Based on Gemma 2, TxGemma is available in three sizes (2B, 9B, and 27B parameters) and has been specifically trained to understand and predict the properties of therapeutic entities throughout the entire discovery process, from identifying promising targets to helping predict clinical trial outcomes. These models, fine-tuned on seven million training examples, can potentially shorten the time from lab to bedside and reduce the costs associated with traditional methods, where 90% of drug candidates fail beyond phase 1 trials.

TxGemma is available in 'predict' and 'chat' versions that serve different use cases. The predict models performed strongly across 66 therapeutic development tasks: they outperformed or roughly matched the previous state-of-the-art generalist model on 64 tasks (outperforming it on 45) and matched or beat state-of-the-art models designed for single tasks on 50. The chat models, by contrast, can explain the reasoning behind their predictions and engage in more complex discussions. TxGemma models were trained on a comprehensive dataset of small molecules, proteins, nucleic acids, diseases, and cell lines from the Therapeutic Data Commons (TDC), which enables them to perform a wide range of classification, regression, and generation tasks.

TxGemma's capabilities can be further extended through Agentic-Tx, a therapeutics-focused agent powered by Gemini 2.5. This system is equipped with 18 tools, including TxGemma, general search tools, and specific molecular and protein tools that together enable handling complex workflows. Agentic-Tx has achieved state-of-the-art results on reasoning-intensive chemistry and biology tasks on ChemBench and Humanity's Last Exam benchmarks. Google has made TxGemma models available on both Vertex AI Model Garden and Hugging Face, and released several Colab notebooks demonstrating model use for inference, fine-tuning, and agent-based workflows, helping researchers adapt the models to their own therapeutic development challenges.
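For readers who want to try a predict model from Hugging Face, the following is a minimal sketch rather than Google's reference code. It assumes the 2B predict checkpoint is published under the id `google/txgemma-2b-predict` and that prompts follow an Instructions / Context / Question / Answer layout. Both are assumptions to check against the model card and the official Colab notebooks, which supply the actual task prompt templates. The helper names `build_prompt` and `predict` are hypothetical.

```python
# Assumption: Hugging Face model id for the smallest predict model; verify on the model card.
MODEL_ID = "google/txgemma-2b-predict"


def build_prompt(instruction: str, context: str, question: str, smiles: str) -> str:
    """Assemble a single-molecule prompt.

    The field layout here is an illustrative assumption; the official
    notebooks provide the exact per-task templates.
    """
    return (
        f"Instructions: {instruction}\n"
        f"Context: {context}\n"
        f"Question: {question}\n"
        f"Drug SMILES: {smiles}\n"
        "Answer:"
    )


def predict(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 8) -> str:
    """Generate a short answer with the transformers library.

    Imported lazily so the prompt helper works without transformers installed.
    Calling this downloads the model weights, which may require accepting
    the model's terms on Hugging Face first.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Hypothetical classification question about ethanol (SMILES "CCO").
example_prompt = build_prompt(
    instruction="Answer the following question about drug properties.",
    context="Some drugs can cross the blood-brain barrier and others cannot.",
    question="Does this drug cross the blood-brain barrier? (A) no (B) yes",
    smiles="CCO",
)
```

Passing `example_prompt` to `predict` would return the model's short answer. The small token budget reflects that predict models are meant to give compact answers rather than explanations, which is what the chat versions are for.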

Sources:
1. Introducing TxGemma: Open models to improve therapeutics development - Google Developers Blog
TxGemma is a collection of open models designed to improve efficiency of therapeutic development using language models.

2. TxGemma: Efficient and Agentic LLMs for Therapeutics
Therapeutic development is a costly and high-risk endeavor that is often plagued by high failure rates. To address this, we introduce TxGemma, a suite of efficient, generalist large language models (LLMs) capable of therapeutic property prediction as well as interactive reasoning and explainability. Unlike task-specific models, TxGemma synthesizes information from diverse sources, enabling broad application across the therapeutic development pipeline. The suite includes 2B, 9B, and 27B parameter models, fine-tuned from Gemma-2 on a comprehensive dataset of small molecules, proteins, nucleic acids, diseases, and cell lines. Across 66 therapeutic development tasks, TxGemma achieved superior or comparable performance to the state-of-the-art generalist model on 64 (superior on 45), and against state-of-the-art specialist models on 50 (superior on 26). Fine-tuning TxGemma models on therapeutic downstream tasks, such as clinical trial adverse event prediction, requires less training data than fine-tuning base LLMs, making TxGemma suitable for data-limited applications. Beyond these predictive capabilities, TxGemma features conversational models that bridge the gap between general LLMs and specialized property predictors. These allow scientists to interact in natural language, provide mechanistic reasoning for predictions based on molecular structure, and engage in scientific discussions. Building on this, we further introduce Agentic-Tx, a generalist therapeutic agentic system powered by Gemini 2.5 that reasons, acts, manages diverse workflows, and acquires external domain knowledge. Agentic-Tx surpasses prior leading models on the Humanity’s Last Exam benchmark (Chemistry & Biology) with 52.3% relative improvement over o3-mini (high) and 26.7% over o3-mini (high) on GPQA (Chemistry) and excels with improvements of 6.3% (ChemBench-Preference) and 2.4% (ChemBench-Mini) over o3-mini (high).

3. TxGemma | Health AI Developer Foundations | Google for Developers