In the Explained series of blog posts, we break down complex technologies incorporated in our AI, Aplysia. In this entry, we dive into Large Language Models (LLMs). Keep reading for an accessible overview, or jump ahead to the section showcasing how LLMs are integrated into our solution.
A Large Language Model (LLM) is a deep learning model that can recognise, interpret, and generate human language or other types of complex data. The “large” part of the name comes from LLMs training on massive datasets. Many LLMs are trained on data gathered from the Internet – thousands or millions of gigabytes’ worth of text.
How large is “large”?
The Library of Congress in the United States is one of the largest libraries in the world, with over 170 million items in its collection. It is believed that the library’s text collection alone holds 20 terabytes of data. Even this astounding amount of data can be exceeded, though, when it comes to LLMs. An LLM’s training data can exceed the Library of Congress’s text collection by several orders of magnitude, demonstrating the enormous amount of data these AI models process in the course of development.
How LLMs work
Just like the human brain is composed of neurons that connect and send signals to each other, a deep learning model uses a network of connected nodes, known as an Artificial Neural Network (ANN). Neural networks learn to recognise data patterns by adjusting the weights of connections between neurons.
These weighted connections link neurons in adjacent layers, which transmit signals from one layer to the next. The strength of these connections, represented by weights, determines how much influence one neuron’s output has on another neuron’s input. During training, the network adjusts its weights based on examples from the dataset.
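To make this concrete, here is a minimal sketch of a single artificial neuron in Python. The weight, bias, and input values are invented purely for illustration; a real network learns these values from data during training.

```python
# A minimal sketch of a single artificial neuron: it multiplies each input
# by a learned weight, sums the results, and passes the total through an
# activation function. The values here are hypothetical, for illustration.
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Sigmoid activation squashes the result into the range (0, 1)
    return 1 / (1 + math.exp(-total))

# Example: two inputs with hypothetical weights "learned" during training
print(neuron([0.5, 0.8], [0.9, -0.3], bias=0.1))
```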
LLMs use a more complex neural network architecture called the transformer, which differs from traditional neural networks in its ability to process entire sequences of data simultaneously rather than step by step. This allows transformers to capture long-range dependencies and contextual relationships more effectively.
An illustrative example
To better understand how these models work, let’s take a closer look at a step-by-step example using the sentence “The weather today is very”. It appears unfinished, but we will complete it in the final step.
Step 1: Tokenisation
The sentence is first tokenised. Tokenisation is the process of breaking down text into smaller units, often words or subwords.
- Tokens: [“The”, “weather”, “today”, “is”, “very”]
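As a simplified illustration, the snippet below splits the sentence on whitespace. Real LLM tokenisers are more sophisticated, typically using subword schemes such as Byte Pair Encoding.

```python
# A deliberately simplified tokeniser: real LLMs use subword schemes
# such as Byte Pair Encoding (BPE) rather than whitespace splitting.
def tokenise(text):
    return text.split()

print(tokenise("The weather today is very"))
# ['The', 'weather', 'today', 'is', 'very']
```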
Step 2: Word embeddings
Each token is then converted into a word embedding. Here, the process starts to get more complex. An embedding is a high-dimensional vector that represents the token in a continuous vector space, capturing its semantic and syntactic meaning, often within a specific context.
For example:
- “The” might be represented as a vector: [0.1, 0.3, …, 0.2]
- “weather” might be represented as a vector: [0.7, 0.5, …, 0.8]
- …
Each of these vectors might have hundreds or thousands of dimensions. Words with similar meanings or usage patterns are positioned near each other in the vector space, so related words end up spatially close together, as the sketch below illustrates with another set of words.
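Below is a toy illustration of this closeness using invented three-dimensional vectors and cosine similarity; real embeddings have far more dimensions and are learned by the model, not hand-picked.

```python
# Toy 3-dimensional "embeddings" (real ones have hundreds or thousands of
# dimensions); the values are invented purely for illustration.
import math

embeddings = {
    "weather": [0.7, 0.5, 0.8],
    "climate": [0.6, 0.5, 0.9],
    "banana":  [0.1, 0.9, 0.2],
}

def cosine_similarity(a, b):
    # Cosine similarity: values close to 1.0 mean the vectors point
    # in nearly the same direction, i.e. the words are related
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related words score higher than unrelated ones
print(cosine_similarity(embeddings["weather"], embeddings["climate"]))  # ~0.99
print(cosine_similarity(embeddings["weather"], embeddings["banana"]))   # ~0.62
```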
Step 3: Transformer architecture
These embeddings are fed into a transformer model, where the complexity increases further. A transformer has multiple layers of self-attention mechanisms and feed-forward neural networks. The self-attention mechanism helps the model focus on different parts of the input sentence to understand the context.
- Self-attention: Calculating the relevance of each word to every other word in the sentence. For example, it determines how much attention should be given to “weather” when considering the word “very”.
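Below is a minimal sketch of scaled dot-product self-attention, the calculation at the heart of this step. The query, key, and value matrices stand in for learned projections of our five tokens and are filled with random numbers purely for illustration.

```python
# A minimal sketch of scaled dot-product self-attention:
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
# Random values stand in for the learned query/key/value projections.
import numpy as np

rng = np.random.default_rng(0)
d = 4                              # tiny embedding dimension, for illustration
tokens = ["The", "weather", "today", "is", "very"]
Q = rng.standard_normal((5, d))    # queries, one row per token
K = rng.standard_normal((5, d))    # keys
V = rng.standard_normal((5, d))    # values

scores = Q @ K.T / np.sqrt(d)      # relevance of every token to every other
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
output = weights @ V               # context-aware representation of each token

# Row for "very": how much attention it pays to every token in the sentence
print(dict(zip(tokens, weights[tokens.index("very")].round(2))))
```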
Step 4: Contextual understanding
Let’s see how the seemingly missing part of our example sentence is addressed. As the sentence passes through the layers of the transformer, the model builds a contextual understanding: it adjusts the word embeddings based on their context in the sentence.
- For the word “very”, the context is understood based on “The weather today is”, indicating that the next word is likely an adjective describing the weather.
Step 5: Generating the next word
After processing the input, the model predicts the next word. It does this by generating a probability distribution over the vocabulary for the next token.
- Probability distribution: The model might output probabilities like: [“nice”: 0.4, “bad”: 0.2, “rainy”: 0.1, …]
The word with the highest probability is then chosen as the next word in the sentence (this simplest strategy is known as greedy decoding; models can also sample from the distribution). In this case, the selected word would be “nice.”
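The snippet below shows how raw model scores (logits) are turned into such a probability distribution with the softmax function, and how greedy decoding then picks the most likely word. The logit values are invented for illustration.

```python
# Softmax turns raw scores (logits) into probabilities that sum to 1;
# greedy decoding picks the highest one. Logit values are invented.
import math

logits = {"nice": 2.0, "bad": 1.3, "rainy": 0.6, "cold": 0.2}

total = sum(math.exp(v) for v in logits.values())
probs = {word: math.exp(v) / total for word, v in logits.items()}

next_word = max(probs, key=probs.get)
print(probs)      # 'nice' ≈ 0.52, 'bad' ≈ 0.26, 'rainy' ≈ 0.13, 'cold' ≈ 0.09
print(next_word)  # 'nice'
```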
Step 6: Completing the sentence
The selected word “nice” is added to the sentence, and the process can be repeated for further words if needed.
- Completed sentence: “The weather today is very nice.”
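Putting the steps together, generation is a loop: each predicted word is appended to the input and fed back into the model. In the sketch below, a tiny hypothetical lookup table stands in for the trained transformer.

```python
# The generation loop in miniature: each predicted word is appended to
# the prompt and fed back in until a stopping condition is reached.
def predict_next(tokens):
    # Hypothetical lookup standing in for the transformer's prediction
    continuations = {"very": "nice", "nice": "."}
    return continuations.get(tokens[-1], ".")

tokens = ["The", "weather", "today", "is", "very"]
while tokens[-1] != ".":
    tokens.append(predict_next(tokens))

print(" ".join(tokens))  # The weather today is very nice .
```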
Advantages of Large Language Models
- Versatility: LLMs can be adapted for various applications, from translation to content generation, providing flexibility across industries.
- Continuous learning: LLMs can be improved over time with additional data, allowing for ongoing refinement and performance enhancement.
Use cases of LLMs
- Content creation: Generating articles, stories, and reports.
- Translations: Adapting text from one language to another.
- Education: Assisting in learning by explaining and answering questions on various subjects.
- Help with programming: Writing and debugging code.
- Summaries: Processing long articles or documents to provide an overview.
- Chatbots and virtual assistants: Conversing with people, answering questions, providing information, and helping with tasks.
Limitations of LLMs
- Data bias: LLMs can only be as reliable as the data they ingest. If fed false information, they may reproduce it in response to user queries.
- Hallucinations: LLMs sometimes “hallucinate”; they generate plausible-sounding but fabricated information when they are unable to produce an accurate answer.
- Security: User-facing applications based on LLMs are as prone to bugs as any other application. LLMs can be manipulated via malicious inputs to provide certain types of responses over others – including dangerous or unethical responses.
- Privacy: Users may upload confidential data into LLM-based tools to increase their productivity. However, some LLMs use the inputs they receive for further training, and they are not designed to be secure vaults; they may expose confidential data in response to queries from other users.
How are LLMs integrated into our solution?
To answer this question, it is important to understand the broader capacity of our solution, which goes beyond a simple chatbot. Some of the functionalities that need to be taken into account include:
- A range of integrations with various hospitality tools, including Booking Engines, Property Management Systems, Maintenance Management Systems, CRMs, Payment and Financing Gateways, Service Automation, and more.
- Providing performance reports for both the chatbot and the hotel team.
- Organising requests from multiple channels, including a webchat, social media, instant messaging apps, and others.
The chatbot aspect of our solution is more complex than simply redirecting requests to GPT, although that shortcut is tempting when explaining it. We consume knowledge from data provided to us by our clients, and then we curate the whole process to tackle LLMs’ limitations.
How our AI, Aplysia, addresses the limitations of Large Language Models
- Pattern recognition for hallucinations: Utilising algorithms and techniques to identify text-generation patterns commonly associated with inaccuracies or fabrications, helping the model recognise when it is generating beyond its reliable knowledge base. In other words, this prevents LLMs from generating answers that are made up or inaccurate, which would be detrimental to both guests and hotel brands.
- Confidence metrics in generated content: Implementing metrics to assess the model’s confidence in its outputs, such as the likelihood of generated words or phrases, uncertainty measures, or indicators of how confident the model is about its statements. In other words, our AI self-evaluates whether what it generates is good enough to share (a minimal sketch of this idea appears after this list).
- Improved chance to answer: When Aplysia is not confident in its answer, it may still provide one. This can happen in a couple of instances:
- If the answer to the relevant Frequently Asked Question (FAQ) is empty and other FAQs have similar requests, the solution will show the top 3 most similar FAQs;
- If the answer to the relevant FAQ is not empty and Aplysia is not confident, the solution tries to answer using the legacy version of Aplysia (without LLMs). If neither approach is confident enough to respond, we show the top 3 most similar FAQs.
- Validation measures with a knowledge base: Comparing or validating the model’s output against a trusted knowledge base of the hotel. This step can help identify and correct misinformation.
- Point-of-view checker: A feature designed to ensure the alignment of generated content’s perspective with credible and established viewpoints, maintaining contextual accuracy and relevance.
- Guardrails: Avoiding code injection, “jail-breaking”, data leakage, and handling illegible or unclear content. In other words, ensuring our solution cannot be used for anything other than its intended purpose.
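As a minimal sketch of the confidence-and-fallback logic described above, the snippet below combines a simple confidence proxy (the geometric mean of token probabilities) with the FAQ fallback rules. The threshold, scoring function, and helper names are assumptions for illustration, not Aplysia’s actual implementation.

```python
# A minimal sketch of confidence-based fallback. The threshold, scoring
# function, and names are hypothetical, not Aplysia's real implementation.
import math

CONFIDENCE_THRESHOLD = 0.7  # hypothetical cut-off

def answer_confidence(token_probs):
    # One common proxy for confidence: the geometric mean of the
    # probabilities the model assigned to each generated token.
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

def respond(llm_answer, token_probs, legacy_answer, similar_faqs):
    if answer_confidence(token_probs) >= CONFIDENCE_THRESHOLD:
        return llm_answer              # confident LLM-generated answer
    if legacy_answer is not None:
        return legacy_answer           # fall back to legacy Aplysia
    return similar_faqs[:3]            # otherwise show top 3 similar FAQs

print(respond("We do not have parking on-site.", [0.9, 0.8, 0.95],
              legacy_answer=None,
              similar_faqs=["Parking options", "Airport shuttle", "Check-in time"]))
```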
Examples of how our solution controls outputs of LLMs
1. Unrelated questions: since Aplysia was designed for hospitality, it doesn’t answer questions on unrelated topics.
2. Anti-hallucination mechanisms: sometimes, the LLM generates an answer that is not aligned with what our clients provide. To deal with this, we have anti-hallucination mechanisms such as the point-of-view checker, which maintains the point of view used in the FAQ. Let’s take a look at a practical example:
- Question: “Does the hotel have parking?”
- The answer to the FAQ provided by the hotel: “We do not have parking in the building. You may contact the reception to check spaces available in a Parking nearby.”
- Answer generated: “The hotel does not have parking in the building, but you can contact the reception to check spaces available in a parking nearby.”
- Answer shown: “We do not have parking available on-site, but we would be happy to assist you in checking for nearby parking spaces. Please contact our reception desk for more information.”
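To illustrate the idea behind this correction, here is a deliberately simplified, rule-based sketch of rewriting third-person references to the hotel into the FAQ’s first-person voice; Aplysia’s real mechanism is more sophisticated than a string lookup.

```python
# A deliberately simplified, rule-based sketch of point-of-view correction:
# third-person references to the hotel are rewritten into the first-person
# voice used in the FAQ. The rewrite table is hypothetical.
REWRITES = {
    "The hotel does not": "We do not",
    "the hotel": "we",
}

def correct_point_of_view(answer):
    for third_person, first_person in REWRITES.items():
        answer = answer.replace(third_person, first_person)
    return answer

generated = "The hotel does not have parking in the building."
print(correct_point_of_view(generated))
# We do not have parking in the building.
```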
To use a specific example, here are the differences between the GPT-generated answer and the one controlled by Aplysia when asked about “Savoy Palace parking prices”:
The answer provided by HiJiffy’s Aplysia is the most accurate, as it corresponds to the information provided to the solution by the hotel. GPT’s answer might have been based on another of Savoy Signature’s properties, might correspond to parking with extra services (valet, for example), or might be a made-up value.
In another example, GPT gave a made-up answer to a query about “Savoy Palace Press and Partnerships contacts”.
If you are interested in learning more about the various technologies used in Aplysia, explore the section of our website dedicated to our artificial intelligence, follow HiJiffy on LinkedIn, and subscribe to our newsletter in the footer.
Sources
This article is based on technical contributions by Vanda Azevedo from HiJiffy’s AI Team.