Natural Language Processing (NLP) is a pivotal subfield of artificial intelligence (AI) that focuses on the interaction between computers and human language. By enabling machines to understand, interpret, and generate human language, NLP bridges the gap between human communication and computational systems. At its core, NLP combines principles from computer science, linguistics, and cognitive science to model and process human language. The foundational aim is to enable machines to parse, understand, and generate text or speech. According to Jurafsky and Martin (2025), NLP encompasses several key tasks, including tokenisation, part-of-speech tagging, syntactic parsing, semantic analysis, and discourse processing. These tasks form a pipeline that transforms raw text into structured representations suitable for computational processing. Tokenisation, the process of breaking text into individual words or phrases, serves as the initial step in most NLP systems. Subsequent steps, such as part-of-speech tagging and syntactic parsing, assign grammatical roles and hierarchical structures to these tokens, enabling machines to grasp sentence structure (Chomsky 2002). Semantic analysis further enriches this understanding by mapping words and phrases to their meanings, often using resources like WordNet (Miller 1995). More advanced tasks, such as sentiment analysis and named entity recognition, rely on these foundational steps to extract context-specific information from text.
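To make these pipeline stages concrete, the sketch below tokenises a sentence, tags parts of speech, and retrieves WordNet senses. It uses the NLTK toolkit purely for illustration; the library choice, the example sentence, and the resource names are assumptions of this sketch rather than anything prescribed by the works cited above.

```python
# Minimal illustration of early pipeline stages: tokenisation,
# part-of-speech tagging, and a WordNet sense lookup (NLTK used
# only as an example toolkit; spaCy or Stanza would serve equally).
import nltk
from nltk.corpus import wordnet

# One-off downloads of the models and lexical data NLTK needs.
# Resource names can vary slightly across NLTK versions
# (e.g. 'punkt_tab' in newer releases).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)
nltk.download("wordnet", quiet=True)

text = "The committee approved the new language policy yesterday."

tokens = nltk.word_tokenize(text)      # tokenisation
tagged = nltk.pos_tag(tokens)          # part-of-speech tagging
senses = wordnet.synsets("policy")     # semantic lookup via WordNet

print(tagged)                          # e.g. [('The', 'DT'), ('committee', 'NN'), ...]
print([s.definition() for s in senses][:2])
```

The point is only that each stage consumes the output of the previous one: tags are assigned to tokens, and semantic lookups interpret the tagged words.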
The evolution of NLP can be traced through distinct paradigms: rule-based systems, statistical models, and neural network-based approaches. Early NLP systems, as described by Winograd (1972), relied heavily on hand-crafted rules and linguistic knowledge bases. These systems, while effective for specific tasks, were brittle and lacked scalability due to the complexity and variability of human language. The advent of statistical NLP in the 1990s marked a significant shift, leveraging probabilistic models and large corpora to infer linguistic patterns (Manning & Schütze 1999). Techniques such as Hidden Markov Models and n-gram models enabled more robust language processing by modelling word sequences and their probabilities. However, these approaches struggled with capturing long-range dependencies and semantic nuances. The introduction of deep learning, particularly recurrent neural networks (RNNs) and transformers, revolutionised NLP in the 2010s. Vaswani et al. (2017) demonstrated the efficacy of transformer architectures, which rely on attention mechanisms to model relationships between words regardless of their distance in a sentence. Models like BERT (Devlin et al. 2019) and GPT (Radford et al. 2018; Brown et al. 2020) have since set benchmarks for tasks such as machine translation, text summarisation, and question answering, showcasing the power of neural NLP.
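Because the attention mechanism is the central technical novelty noted here, a compact numerical sketch may help. The function below implements the scaled dot-product attention of Vaswani et al. (2017), Attention(Q, K, V) = softmax(QKᵀ / √d_k) V, in plain NumPy; the toy shapes and random inputs are illustrative assumptions, not drawn from any cited model.

```python
# Sketch of scaled dot-product attention (Vaswani et al. 2017).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (sequence_length, dimension)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                               # toy dimensions for illustration
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))
print(scaled_dot_product_attention(Q, K, V).shape)    # (5, 8)
```

Because every query attends to every key in a single matrix product, the mechanism relates words irrespective of their distance in the sequence, which is precisely the property contrasted above with n-gram models and RNNs.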
NLP serves as a critical enabler of advanced AI functionalities, addressing both theoretical and applied objectives. Firstly, it facilitates robust human-computer interaction by processing natural language inputs to infer user intent and generate contextually appropriate outputs. This capability supports dialogue systems that model conversational pragmatics, enhancing accessibility for diverse user groups, including those with motor or sensory impairments (Hirschberg & Manning 2015). Secondly, NLP drives knowledge extraction from unstructured corpora, transforming textual data into structured representations. In biomedical informatics, NLP systems encode clinical narratives and, with specialised models such as BioBERT, parse biomedical literature to inform diagnostic reasoning and accelerate research (Friedman et al. 2004; Lee et al. 2020), while in legal informatics they synthesise case law to expedite jurisprudential analysis (Ashley 2017). These applications underscore NLP’s role in augmenting domain-specific expertise through scalable information processing. Thirdly, NLP advances cross-linguistic interoperability through machine translation and multilingual text analysis. Transformer-based models enable high-fidelity translation, facilitating global knowledge dissemination in academic, commercial, and diplomatic contexts (Vaswani et al. 2017). Finally, NLP underpins advanced reasoning in AI systems by enabling machines to perform tasks such as question answering, text summarisation, and logical inference. Transformer models, trained on diverse corpora, generate coherent and contextually grounded outputs, supporting applications in automated research synthesis and decision support systems (Devlin et al. 2019). These capabilities enhance AI’s role in augmenting human intellect, particularly in data-intensive domains requiring rapid, accurate analysis.
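As a hedged illustration of how such capabilities are commonly accessed in practice, the sketch below uses the Hugging Face transformers pipeline API for question answering and named entity recognition. The pipeline tasks are real library features, but the default models they download and the example inputs are assumptions of this sketch, not the specific systems described in the cited papers.

```python
# Illustrative use of pretrained transformer models for two of the
# application areas above, via the Hugging Face `transformers` pipelines.
# Default models are downloaded on first use; they are examples only.
from transformers import pipeline

# Question answering over a short passage (decision-support style use).
qa = pipeline("question-answering")
passage = ("BERT is a bidirectional transformer pretrained on large text "
           "corpora and fine-tuned for downstream tasks such as question answering.")
print(qa(question="What is BERT pretrained on?", context=passage))

# Named entity recognition, e.g. as a first step in knowledge extraction
# from unstructured text.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Aspirin was prescribed at Mount Sinai Hospital in New York."))
```

A domain-specific checkpoint such as BioBERT could be substituted through the pipeline’s model argument when the input is biomedical text.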
References:
1. Ashley, Kevin D. 2017. Artificial Intelligence and Legal Analytics: New Tools for Law Practice in the Digital Age. Cambridge: Cambridge University Press.
2. Brown, Tom, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. ‘Language Models Are Few-Shot Learners’. Advances in Neural Information Processing Systems 33: 1877–1901.
3. Chomsky, Noam. 2002. Syntactic Structures. 2nd ed., with an introduction by D. W. Lightfoot. Berlin – New York: Mouton de Gruyter.
4. Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. ‘BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding’. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4171–4186.
5. Friedman, Carol, Lyudmila Shagina, Yves Lussier, and George Hripcsak. 2004. ‘Automated Encoding of Clinical Documents Based on Natural Language Processing’. Journal of the American Medical Informatics Association 11 (5): 392–402.
6. Hirschberg, Julia, and Christopher D. Manning. 2015. ‘Advances in Natural Language Processing’. Science 349 (6245): 261–266.
7. Jurafsky, Daniel, and James H. Martin. 2025. Speech and Language Processing. 3rd ed., draft (January 12, 2025). https://web.stanford.edu/~jurafsky/slp3/
8. Lee, Jinhyuk, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2020. ‘BioBERT: A Pre-Trained Biomedical Language Representation Model for Biomedical Text Mining’. Bioinformatics 36 (4): 1234–1240.
9. Manning, Christopher, and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
10. Miller, George A. 1995. ‘WordNet: A Lexical Database for English’. Communications of the ACM 38 (11): 39–41.
11. Radford, Alec, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. ‘Improving Language Understanding by Generative Pre-Training’. Unpublished manuscript, OpenAI.
12. Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. ‘Attention Is All You Need’. arXiv. doi:10.48550/ARXIV.1706.03762
13. Winograd, Terry. 1972. ‘Understanding Natural Language’. Cognitive Psychology 3 (1): 1–191.