Miklós Sebők – Rebeka Kiss

Where Does Bias Come From? Exploring Dataset Imbalance, Annotation Bias, and Pre-existing Modelling Choices

Bias in artificial intelligence systems has become a critical concern as these technologies increasingly influence decision-making across domains such as healthcare, criminal justice, and employment. Bias manifests as systematic errors that lead to unfair or discriminatory outcomes, often disproportionately affecting marginalised groups. Understanding the origins of bias, whether in imbalanced datasets, subjective annotation practices, or modelling choices fixed before training begins, is essential for diagnosing and mitigating these harms.
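To make the dataset-imbalance point concrete, here is a minimal Python sketch of a pre-training audit; the toy records and field names ("group", "label") are illustrative assumptions, not a real dataset.

```python
# A minimal sketch of a dataset audit for group imbalance, using
# hypothetical toy records; the field names are assumptions.
from collections import Counter

records = [
    {"group": "A", "label": 1}, {"group": "A", "label": 1},
    {"group": "A", "label": 0}, {"group": "B", "label": 0},
    {"group": "B", "label": 0}, {"group": "B", "label": 1},
]

# Representation: how many examples does each group contribute?
counts = Counter(r["group"] for r in records)

# Label balance: what fraction of each group's examples is positive?
pos_rate = {
    g: sum(r["label"] for r in records if r["group"] == g) / n
    for g, n in counts.items()
}

print(counts)    # Counter({'A': 3, 'B': 3})
print(pos_rate)  # approx. {'A': 0.67, 'B': 0.33} -> a skew the model absorbs
```

Even when groups are equally represented, as here, a skewed positive rate between them is a label-level imbalance that a classifier will learn and reproduce.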

Generative AI and the Evolving Challenge of Deepfake Detection

Generative Artificial Intelligence (AI) has revolutionised digital media through its ability to synthesise highly realistic content, with deepfake technology standing as one of its most prominent and contentious applications. The term “deepfake,” derived from “deep learning” and “fake,” refers to synthetic media—typically videos or audio—that convincingly depict individuals saying or doing things they never said or did.

Misinformation and the Role of Generative AI Models in Its Spread

Misinformation, defined as false or misleading information disseminated regardless of intent, poses significant challenges to societal trust and democratic processes (Wardle & Derakhshan, 2017). Unlike disinformation, which involves deliberate deception, misinformation encompasses a broader spectrum, including unintentional errors, rumours, and misinterpretations. The advent of GenAI models capable of producing human-like text at scale has sharply lowered the cost of creating such content and accelerated its spread.

Types and Mechanisms of Censorship in Generative AI Systems

Content restriction in generative AI manifests as explicit or implicit censorship. Explicit censorship uses predefined rules to block content such as hate speech or illegal material, employing keyword blacklists, pattern matching, or trained classifiers (Gillespie, 2018). DeepSeek’s models, aligned with Chinese regulations, use real-time filters to block politically sensitive content, such as references to the 1989 Tiananmen Square protests.
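The explicit, rule-based layer described above can be sketched in a few lines; the blocklist terms and the regex pattern below are placeholder assumptions, not any vendor's actual rules.

```python
# A minimal sketch of explicit, rule-based content filtering.
import re

BLOCKLIST = {"banned-term-1", "banned-term-2"}  # hypothetical keywords
PATTERNS = [re.compile(r"\b(buy|sell)\s+illegal\b", re.I)]  # hypothetical

def is_blocked(text: str) -> bool:
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):   # keyword blacklist
        return True
    return any(p.search(text) for p in PATTERNS)     # pattern matching

# In a deployed system a trained classifier typically backs up these
# rules; this stub illustrates only the explicit layer.
print(is_blocked("where to buy illegal goods"))  # True
```

Implicit censorship, by contrast, is baked into the model's weights through alignment training and cannot be inspected as a rule list.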

Detecting, Evaluating, and Reducing Hallucinations

Detecting hallucinations involves distinguishing accurate outputs from those that deviate from factual or contextual grounding. One approach is consistency checking, where LLM outputs are evaluated against external knowledge bases to identify discrepancies. Manakul et al. (2023) propose SelfCheckGPT, a zero-resource method that instead exploits the model’s internal consistency: facts the model genuinely encodes tend to recur across stochastically sampled responses, whereas hallucinated claims vary from sample to sample.
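A minimal sketch of this sampling-and-agreement idea follows; `sample_llm` is a hypothetical stub, and the unigram-overlap score is a simplified stand-in for the stronger BERTScore-, QA-, and n-gram-based variants evaluated in the paper.

```python
# A minimal sketch of SelfCheckGPT-style consistency checking,
# not the authors' implementation.
def sample_llm(prompt: str, n: int) -> list[str]:
    raise NotImplementedError("call your LLM with temperature > 0 here")

def overlap(claim: str, sample: str) -> float:
    # Crude agreement proxy: fraction of the claim's words in the sample.
    c, s = set(claim.lower().split()), set(sample.lower().split())
    return len(c & s) / max(len(c), 1)

def consistency_score(claim: str, prompt: str, n: int = 5) -> float:
    # A claim the model "knows" should reappear across stochastic
    # samples; a hallucinated claim varies, giving a low average overlap.
    samples = sample_llm(prompt, n)
    return sum(overlap(claim, s) for s in samples) / n
```

A low consistency score flags a sentence for review without requiring any external knowledge base, which is what makes the method zero-resource.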

Conceptual Contrasts Between Parroting and Hallucination in Language Models

Advancements in artificial intelligence (AI), particularly in natural language processing (NLP), highlight critical distinctions between parroting and hallucination in language models. Parroting refers to AI reproducing or mimicking patterns and phrases from training data without demonstrating understanding or creativity. Hallucination involves generating factually incorrect, implausible, or fabricated outputs, often diverging from both the training data and verifiable reality.
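The contrast can be made operational: verbatim n-gram overlap with the training corpus can flag parroting, whereas hallucination must be caught by external fact-checking, since fabricated text may overlap with nothing at all. Below is a toy sketch of the parroting side; the one-sentence "corpus" is an illustrative assumption.

```python
# A toy sketch of flagging parroting via verbatim 5-gram overlap.
def ngrams(text: str, n: int = 5) -> set[tuple[str, ...]]:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

training_corpus = "the quick brown fox jumps over the lazy dog"  # toy stand-in

def parroting_rate(output: str, corpus: str = training_corpus) -> float:
    # Fraction of the output's 5-grams copied verbatim from the corpus.
    out = ngrams(output)
    return len(out & ngrams(corpus)) / max(len(out), 1)

print(parroting_rate("the quick brown fox jumps over a fence"))  # 0.5
```

Note that a hallucinated output would score near zero here, which is precisely why the two failure modes require different detectors.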

The Environmental Costs of Artificial Intelligence: A Growing Concern

The rapid integration of Artificial Intelligence (AI) into global economies has driven transformative advancements in sectors such as healthcare and agriculture. However, this technological revolution incurs significant environmental costs, particularly through substantial energy consumption and greenhouse gas (GHG) emissions. The carbon footprint of AI, stemming from energy-intensive processes such as hardware manufacturing, model training, and large-scale inference, has become a growing concern for researchers and policymakers alike.
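The scale of these costs can be illustrated with back-of-the-envelope arithmetic; every number in the sketch below is an assumed, illustrative value, not a measurement of any real system.

```python
# Illustrative energy-to-emissions arithmetic; all inputs are assumptions.
gpu_count = 512           # assumed training cluster size
gpu_power_kw = 0.4        # assumed average draw per GPU (kW)
hours = 30 * 24           # assumed one month of training
pue = 1.2                 # assumed data-centre power usage effectiveness
grid_kgco2_per_kwh = 0.4  # assumed grid carbon intensity (kg CO2e/kWh)

energy_kwh = gpu_count * gpu_power_kw * hours * pue
emissions_t = energy_kwh * grid_kgco2_per_kwh / 1000  # tonnes CO2e

print(f"{energy_kwh:,.0f} kWh -> {emissions_t:,.0f} t CO2e")
# -> 176,947 kWh -> 71 t CO2e for this hypothetical one-month run
```

The same arithmetic shows where mitigation leverage lies: lowering the PUE or moving to a lower-carbon grid scales emissions down multiplicatively.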

Cost Optimisation Strategies: Token Usage Optimisation, Batch Processing, and Prompt Compression Algorithms

Contemporary researchers face unprecedented financial barriers when engaging with state-of-the-art language models, particularly through API-based services, where costs are directly proportional to token consumption and computational resource use. The challenge is compounded by the increasing complexity of research tasks, which demand extensive prompt engineering, iterative model interactions, and large-scale data processing.
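A minimal sketch of token-level cost accounting follows; the tiktoken library and its cl100k_base encoding are real, while the per-token prices are placeholder assumptions to be replaced with a provider's current rates.

```python
# A minimal sketch of token counting, cost estimation, and batching.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
PRICE_IN, PRICE_OUT = 3e-06, 15e-06  # assumed USD per input/output token

def estimate_cost(prompt: str, max_output_tokens: int) -> float:
    # Input cost is exact once tokenised; output cost is an upper bound.
    return len(enc.encode(prompt)) * PRICE_IN + max_output_tokens * PRICE_OUT

# Batching amortises fixed instructions: one shared preamble, many items.
preamble = "Classify the sentiment of each numbered text.\n"
items = [f"{i}. {t}" for i, t in enumerate(["great!", "awful.", "fine."], 1)]
batched_prompt = preamble + "\n".join(items)
print(f"${estimate_cost(batched_prompt, 30):.6f}")
```

Prompt compression works on the same lever: any token removed from a preamble that is sent thousands of times is a direct, linear saving.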

Costs of Generative AI Applications: Hardware Costs and Resource Requirements from the Issuer's Perspective

The emergence of large language models (LLMs) and generative AI applications has ushered in a new era of artificial intelligence capabilities, fundamentally altering the landscape of computational requirements and associated costs. Generative AI systems, built upon transformer architectures and trained on vast datasets, have demonstrated remarkable scalability and adaptability across domains, but at the price of substantial hardware investment and operating expenditure for the organisations that deploy them.
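A rough sense of the issuer-side hardware burden follows from parameter-count arithmetic; the model size, precision, and headroom factor below are illustrative assumptions.

```python
# Back-of-the-envelope serving-memory arithmetic; inputs are assumptions.
params_b = 70          # assumed model size: 70 billion parameters
bytes_per_param = 2    # fp16/bf16 weights

weights_gb = params_b * 1e9 * bytes_per_param / 1e9
# Assumed ~20% headroom for activations and KV cache; the real factor
# depends heavily on batch size and context length.
serving_gb = weights_gb * 1.2

print(f"weights: {weights_gb:.0f} GB, serving estimate: {serving_gb:.0f} GB")
# -> 140 GB of weights alone, i.e. multiple 80 GB accelerators per replica
```

Multiplying the per-replica accelerator count by the replicas needed to meet traffic gives a first-order view of why serving costs dominate at scale.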

Retrieval-Augmented Generation (RAG): Architecture, Mechanisms, and Core Advantages

Retrieval-Augmented Generation (RAG) represents a paradigm shift in natural language processing (NLP), integrating large language models (LLMs) with dynamic information retrieval systems to produce responses that are both contextually enriched and factually grounded (Lewis et al., 2020). At its core, the RAG architecture couples a conventional generative model—one that produces text token by token from a learned distribution—with a retriever that fetches relevant passages from an external corpus at inference time.
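A minimal sketch of this retrieve-then-generate loop is shown below; `embed` and `generate` are hypothetical stubs standing in for an embedding model and an LLM, and the cosine-similarity retriever is the simplest possible choice.

```python
# A minimal sketch of a RAG pipeline, not a production implementation.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("plug in an embedding model here")

def generate(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM call here")

def rag_answer(query: str, docs: list[str], k: int = 3) -> str:
    # Index step: embed every document in the corpus.
    doc_vecs = np.stack([embed(d) for d in docs])
    q = embed(query)
    # Retrieval step: cosine similarity between query and documents.
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    top = [docs[i] for i in np.argsort(sims)[::-1][:k]]
    # Generation step: ground the LLM in the retrieved passages.
    context = "\n\n".join(top)
    return generate(f"Answer using only this context:\n{context}\n\nQ: {query}")
```

Because the corpus can be updated without retraining, the same loop also explains RAG's core advantage: fresher, attributable answers at generation time.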

Comparing Leading Large Language Models: Architectures, Performance, and Specialised Capabilities

Most contemporary LLMs employ a decoder-only transformer architecture, which processes sequences in parallel via self-attention. However, in a dense transformer every parameter participates in every token's computation, so compute and cost grow in step with model size. Mixture-of-experts (MoE) approaches address this by activating only a subset of parameters per token. In the Switch Transformer, MoE routing sends each token to a single expert, allowing the parameter count to grow dramatically while per-token computation stays roughly constant (Fedus et al., 2021).
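A toy numpy sketch of this top-1 routing follows; the token count, model width, and expert count are illustrative, and scaling each expert's output by its gate probability mirrors the Switch Transformer's formulation.

```python
# A toy sketch of Switch-style top-1 MoE routing; shapes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))         # 8 tokens, d_model = 16 (toy)
W_router = rng.normal(size=(16, 4))       # router weights for 4 experts
experts = [rng.normal(size=(16, 16)) for _ in range(4)]  # toy expert layers

logits = tokens @ W_router                                      # routing logits
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)  # softmax gate
choice = probs.argmax(-1)                 # top-1: one expert per token

out = np.empty_like(tokens)
for i, tok in enumerate(tokens):
    e = choice[i]
    # Only the chosen expert runs; its output is scaled by the gate
    # probability, so three quarters of expert parameters stay idle here.
    out[i] = probs[i, e] * (tok @ experts[e])
```

The sketch makes the efficiency argument visible: total parameters scale with the number of experts, but each token touches only one expert's weights.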

Small Language Models (SLMs) and Knowledge Distillation

Small Language Models (SLMs) are compact neural networks designed to perform natural language processing (NLP) tasks with significantly fewer parameters and lower computational requirements than their larger counterparts. SLMs aim to deliver robust performance in resource-constrained environments, such as mobile devices or edge computing systems, where efficiency is paramount. The dominant technique for producing capable SLMs is knowledge distillation, in which a compact student model is trained to imitate the output distribution of a larger teacher model (Hinton et al., 2015).
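A minimal PyTorch sketch of the classic distillation objective (Hinton et al., 2015) is shown below; the temperature and mixing weight are illustrative defaults rather than recommended settings.

```python
# A minimal sketch of the classic knowledge-distillation loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    # Soft targets: KL divergence between temperature-softened
    # student and teacher distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients, as in the original paper
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

The softened teacher distribution carries "dark knowledge" about inter-class similarities that one-hot labels lack, which is why a distilled student typically outperforms the same architecture trained on hard labels alone.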