Vector Embeddings

When engaging with an AI assistant such as ChatGPT, it can be difficult to grasp how it is able to understand and reply to what you are typing: it somehow knows not only the meanings of the words you enter (the prompt), but also their context within the sentence and the conversation as a whole. How is it doing what it's doing? A key to this is vector embeddings and the Generative Pre-trained Transformer (GPT) model.

Because computers work with numerical representations of data, vector embeddings assign a vector (a list of numbers) to each word, then cluster (group) words with similar meanings together. The model is then trained on millions of conversations, which helps it extract context and meaning from the sequence of words you enter via the prompt. But please note: the model doesn't truly comprehend its reply, and as you may have already experienced, its responses can read back in perfect English yet lack logic, or worse, be completely false. Although responses like this are growing increasingly rare, it is important to include a further validation step to assess accuracy and potential bias.

Here's a breakdown of the process:
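As a rough sketch of the first step, the short Python example below shows how a prompt can be turned into a sequence of vectors by looking words up in an embedding table. This is a toy illustration only: the vocabulary, the vector values, and the embed_prompt helper are invented for demonstration, whereas a real model learns its embedding table during training and works on sub-word tokens rather than whole words.

```python
import numpy as np

# Toy vocabulary with 4-dimensional vectors. Real models use tens of
# thousands of tokens and hundreds or thousands of dimensions; these
# numbers are made up purely for illustration.
embeddings = {
    "cats": np.array([0.9, 0.8, 0.1, 0.0]),
    "dogs": np.array([0.8, 0.9, 0.1, 0.1]),
    "love": np.array([0.1, 0.2, 0.9, 0.7]),
    "naps": np.array([0.2, 0.1, 0.8, 0.9]),
}

def embed_prompt(prompt: str) -> np.ndarray:
    """Turn a prompt into a sequence of vectors, one per known word."""
    words = prompt.lower().split()
    return np.stack([embeddings[w] for w in words if w in embeddings])

# The model never sees the words themselves, only this matrix of numbers.
vectors = embed_prompt("Cats love naps")
print(vectors.shape)  # (3, 4): three words, each a 4-dimensional vector
print(vectors)
```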

So, vector embeddings don't just create a random vector for each word. They aim to capture the semantic meaning and relationships between words, resulting in a representation where similar words reside close together in the vector space. This allows machines to process and understand the nuances of human language more effectively.
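To make "close together" concrete, here is a minimal sketch using cosine similarity, a standard way of measuring how near two vectors are in such a space. The word vectors below are made-up toy values, not the output of a real embedding model.

```python
import numpy as np

# Toy word vectors (values invented for illustration). In a trained model,
# words with related meanings end up with similar vectors.
vectors = {
    "cat":    np.array([0.90, 0.80, 0.10]),
    "dog":    np.array([0.85, 0.75, 0.20]),
    "kitten": np.array([0.88, 0.82, 0.15]),
    "car":    np.array([0.10, 0.20, 0.95]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Close to 1.0 means the vectors point the same way; near 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "cat"
for word, vec in vectors.items():
    if word != query:
        print(f"{query} vs {word}: {cosine_similarity(vectors[query], vec):.3f}")

# Expected pattern: 'kitten' and 'dog' score much higher than 'car',
# i.e. semantically similar words sit close together in the vector space.
```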


So, what about Embedding for Images?

The concept and approach are similar, but with some key differences. Read more...
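For a rough intuition (not how production systems actually encode images), the sketch below reduces an image, which is just a grid of pixel numbers, to a single fixed-length vector that can then be compared exactly like the word vectors above. The toy_image_encoder helper is hypothetical; real systems use a trained vision model such as a CNN or vision transformer for this step.

```python
import numpy as np

def toy_image_encoder(image: np.ndarray, dim: int = 8) -> np.ndarray:
    """Stand-in for a real image encoder.

    It simply pools pixel values into a fixed-length vector so the example
    runs; a real encoder learns features such as edges, textures and shapes.
    """
    flat = image.astype(float).ravel()
    chunks = np.array_split(flat, dim)  # crude pooling into `dim` buckets
    return np.array([chunk.mean() for chunk in chunks])

# A fake 16x16 grayscale "image": just random pixel intensities.
image = np.random.default_rng(0).integers(0, 256, size=(16, 16))
vector = toy_image_encoder(image)
print(vector.shape)  # (8,): the whole image reduced to one comparable vector
```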

Diagram courtesy of qdrant.tech/ - read their full in-depth article on Vector Embeddings