Meet ChatGPT: soulful NLP
December 8, 2022
Luca Mazzacane
(Pavia, Italy)

Natural language processing, or NLP, is a branch of artificial intelligence in the field of computer science that works on giving computers the ability to understand natural language, spoken or written, similar to the way humans can. We’re now seeing huge advances in the development of NLP. 


ChatGPT, short for Chat Generative Pre-trained Transformer, is an NLP tool that has been trained on large amounts of text data, enabling it to generate relevant, consistent and remarkably human-like responses to the prompts it’s given. It has the potential to greatly change the way we interact with technology, in everything from customer service and language translation to writing essays and developing computer programs. 


What is ChatGPT? 


Let’s let the chatbot describe itself: “ChatGPT is a natural language processing (NLP) model developed by OpenAI. It is designed to generate human-like responses to text input, allowing users to engage in natural, conversational interactions with the model… ChatGPT works by using machine learning algorithms to analyze and understand the meaning of text input, and then generating a response based on that input. The model is trained on large amounts of text data, allowing it to learn the patterns and structures of natural language.” 


“When I receive a query,” the bot tells us, “I search through my training data to find patterns and information that match the query, and then I generate a response based on this information.” 


It’s an exciting development in the field of artificial intelligence, with the potential to revolutionize the way we interact with machines. 


How GPT relies on NLP 


To understand how ChatGPT works, it’s helpful to first understand the technology behind natural language processing (NLP). NLP is a subfield of artificial intelligence that focuses on interactions between computers and human languages. This includes parsing, generating and understanding natural languages, such as English, Spanish or Mandarin. 


NLP technology enables ChatGPT to understand the patterns and nuances of human language, which is essential for generating relevant and consistent responses. This is achieved through machine learning algorithms trained on large amounts of text data.


The chatbot explains how it’s been trained: “The data that was used to train me was sourced from a wide range of sources, including books, articles, websites, and other publicly available text data. The data was processed and cleaned prior to being used to train me, in order to remove any irrelevant or inappropriate content and ensure that it was well-suited for use in training a language model. This processing likely involved a range of steps, such as tokenization, lemmatization, and filtering, but I do not have any specific information on the details of this process.” 
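The preprocessing steps the bot names can be illustrated with a toy example. The sketch below is not OpenAI’s actual pipeline (which, as the bot says, isn’t public); the word list, suffix rules, and function names are all illustrative assumptions, showing only what tokenization, lemmatization, and filtering mean in principle.

```python
import re

# Toy text-cleaning pipeline: tokenization, naive lemmatization, and
# stopword filtering. Purely illustrative -- real training pipelines
# are far more sophisticated.

STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "of", "to"}

def tokenize(text: str) -> list[str]:
    """Split raw text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def lemmatize(token: str) -> str:
    """Crude suffix stripping, standing in for real lemmatization."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text: str) -> list[str]:
    """Tokenize, lemmatize, and filter out stopwords."""
    return [lemmatize(t) for t in tokenize(text) if t not in STOPWORDS]

print(preprocess("The models were trained on cleaned texts"))
# → ['model', 'train', 'on', 'clean', 'text']
```

Real systems replace each of these toy steps with learned components, but the shape of the pipeline (raw text in, normalized tokens out) is the same idea the bot describes.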


Overall, the technology behind ChatGPT is complex and sophisticated, but the end result is a powerful tool that enhances our interactions with technology. 


Background and Foreground 


ChatGPT was developed by OpenAI, a research institute and artificial intelligence (AI) lab founded in San Francisco in 2015. Its original backers include Sam Altman (CEO of OpenAI and former president of startup accelerator Y Combinator) and Elon Musk. 


Altman and Musk say they are motivated by both the extraordinary potential of artificial intelligence and the existential risk it poses to humankind if not handled ethically. That raises a lot of questions about trust. 


Can we trust this? 


First question: is there potential bias in the design of ChatGPT? When asked about the methods used in selecting and preparing its training data, the chatbot responded that that information is “proprietary to OpenAI.” And when asked about which individuals at OpenAI contributed to shaping the data used to train ChatGPT, the bot responded: “I do not have any information on the specific individuals who were involved in shaping my training data and therefore my output.” So the jury’s still out on that. 


Next issue: real bias in the final product. The bot acknowledged the bias inherent in large language models. But it had some difficulty when asked about the tension between OpenAI’s expressed mission of benefiting humanity, and the inherent bias in the model: “It is not clear how biased automation could be considered a benefit to humanity.” It then admitted that AI systems trained on biased data, which necessarily give biased responses, are “likely to be counterproductive and harmful.” And it finally acknowledged that it is “unlikely that there is sufficient completely unbiased data available to train large language models exclusively on unbiased data.” 


That would make ChatGPT fun, interesting, and helpful for specific applications — but not sufficiently trustworthy for general use. 


The Road Ahead 


Altman expects the technology to eventually surpass human intelligence, which is when the real risks are anticipated. Cosmologist Stephen Hawking, along with other scientists, expressed concern that sophisticated AI could reach a point where it’s continually re-designing itself at an accelerating rate, leading to what they call an “intelligence explosion” that eventually brings about human extinction. Hawking put it this way: “AI will be very good at accomplishing its goals; if humans get in the way, we could be in trouble.” 


It’s unclear whether ChatGPT carries those risks at all, or whether, for the moment, our immediate concerns should be more about the bias inherent in the training of the model, and possibly the bias of those involved in choosing the training data and creating the algorithms. 


But it should be said that the issues seen with Twitter since Elon Musk’s acquisition make one want to ask hard questions about bias and reliability in ChatGPT.



