{"id":71454,"date":"2024-02-09T15:19:20","date_gmt":"2024-02-09T13:19:20","guid":{"rendered":"https:\/\/www.gebrsanders.nl\/?p=71454"},"modified":"2024-10-03T14:20:38","modified_gmt":"2024-10-03T12:20:38","slug":"your-guide-to-natural-language-processing-nlp-by","status":"publish","type":"post","link":"https:\/\/www.gebrsanders.nl\/your-guide-to-natural-language-processing-nlp-by\/","title":{"rendered":"Your Guide to Natural Language Processing NLP by Diego Lopez Yse"},"content":{"rendered":"

What is Natural Language Processing? Introduction to NLP

\"natural<\/p>\n

Finally, we discuss some of the datasets, models, and evaluation metrics available in NLP. We restricted the vocabulary to the 50,000 most frequent words, combined with all words used in the study (50,341 vocabulary words in total). These design choices ensure that the differences in brain scores observed across models cannot be explained by differences in corpora or text preprocessing. In machine translation with deep learning, a sentence is first encoded into vector representations, and the translation is then generated from those vectors.
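To make the vocabulary restriction concrete, here is a minimal sketch of how a corpus vocabulary can be capped at the 50,000 most frequent words and then extended with every word used in a study. The function and variable names are hypothetical illustrations, not the authors' code.

```python
from collections import Counter

def build_vocabulary(corpus_tokens, study_tokens, max_frequent=50_000):
    """Keep the most frequent corpus words, then add every word used in the study."""
    counts = Counter(corpus_tokens)
    vocab = {word for word, _ in counts.most_common(max_frequent)}
    vocab.update(study_tokens)  # the union can exceed 50,000 (e.g. 50,341 in the study)
    return vocab

# Usage (toy data; a real corpus would contain millions of tokens):
# corpus_tokens = ["the", "cat", "sat", "on", "the", "mat"]
# study_tokens = ["words", "presented", "to", "participants"]
# vocab = build_vocabulary(corpus_tokens, study_tokens)
```

Capping the vocabulary this way keeps the model's output layer at a fixed, manageable size while still guaranteeing that every stimulus word from the study can be represented.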

More critically, the principles that lead deep language models to generate brain-like representations remain largely unknown. Indeed, past studies only investigated a small set of pretrained language models that typically vary in dimensionality, architecture, training objective, and training corpus. The inherent correlations between these factors thus prevent identifying which of them lead algorithms to generate brain-like representations.

This progression of computations through the network is called forward propagation. The input and output layers of a deep neural network are called visible layers: the input layer is where the deep learning model ingests the data for processing, and the output layer is where the final prediction or classification is made. In total, we investigated 32 distinct architectures varying in their dimensionality (∈ {128, 256, 512}), number of layers (∈ {4, 8, 12}), attention heads (∈ {4, 8}), and training task (causal or masked language modeling). While causal language transformers are trained to predict a word from its previous context, masked language transformers predict randomly masked words from the surrounding context.
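The difference between the two training tasks can be illustrated with a small sketch of how training examples are built from a token sequence. The helper names below are hypothetical, and the masking rate is an assumed illustrative value rather than a detail taken from the study.

```python
import random

def causal_lm_examples(tokens):
    """Causal LM: predict each token from the tokens that precede it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def masked_lm_example(tokens, mask_rate=0.15, mask_token="[MASK]", seed=1):
    """Masked LM: hide a random subset of tokens, to be predicted from the full context."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            corrupted.append(mask_token)
            targets[i] = tok  # position -> original token to recover
        else:
            corrupted.append(tok)
    return corrupted, targets

sentence = "the model predicts words from their context".split()
print(causal_lm_examples(sentence)[:2])  # [(['the'], 'model'), (['the', 'model'], 'predicts')]
print(masked_lm_example(sentence))       # some tokens replaced by [MASK], with their originals as targets
```

Both snippets only construct training targets; in practice these examples feed a transformer that uses a causal attention mask for causal language modeling and full bidirectional attention for masked language modeling.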