FAISS Datasource

FAISS Datasource uses the all-mpnet-base-v2 sentence transformer model to create embeddings. The model is built on MPNet, a BERT-style transformer encoder whose pre-training combines masked and permuted language modeling. It maps each sentence or short paragraph to a 768-dimensional vector, which makes it well suited to short-to-medium length text; longer inputs are truncated by the model. The embeddings are stored in a vector index built with FAISS, a library for efficient similarity search over dense vectors.

For a given set of documents, FAISS Datasource creates a vector store that holds the sentence transformer embeddings of those documents.

When a search query is posed through a specific Framework, the query is converted to a vector using the same sentence transformer model. The k vectors most similar to the query vector are then selected, and the corresponding content is retrieved from the original dataset.
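
The retrieval flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the datasource's actual implementation: the `ToyVectorStore` class and its methods are hypothetical names, a simple word-count vector stands in for the all-mpnet-base-v2 embedding, and a brute-force cosine comparison stands in for the FAISS index. Unlike the real model, the word-count stand-in matches on shared words rather than on meaning.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyVectorStore:
    """Hypothetical stand-in for the vector store (not the real datasource).

    ASSUMPTION: word-count vectors replace the sentence transformer
    embeddings, and a linear scan replaces the FAISS index.
    """

    def __init__(self, documents):
        self.documents = list(documents)
        # Build a fixed vocabulary from the indexed documents.
        vocab = sorted({tok for doc in self.documents for tok in tokenize(doc)})
        self.vocab = {tok: i for i, tok in enumerate(vocab)}
        # Embed every document once, at build time.
        self.vectors = [self.embed(doc) for doc in self.documents]

    def embed(self, text):
        """Return an L2-normalized word-count vector for the text."""
        vec = [0.0] * len(self.vocab)
        for tok in tokenize(text):
            idx = self.vocab.get(tok)
            if idx is not None:  # words outside the vocabulary are ignored
                vec[idx] += 1.0
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm else vec

    def search(self, query, k=2):
        """Embed the query the same way and return the top k (content, score) pairs."""
        q = self.embed(query)
        # For normalized vectors the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, v)), i)
            for i, v in enumerate(self.vectors)
        ]
        # Highest score first; ties are broken by document order.
        scored.sort(key=lambda s: (-s[0], s[1]))
        return [(self.documents[i], score) for score, i in scored[:k]]


store = ToyVectorStore([
    "FAISS builds an index over dense vectors",
    "Bananas are rich in potassium",
    "Sentence transformers map text to dense embeddings",
])
for content, score in store.search("index of dense vectors", k=2):
    print(f"{score:.3f}  {content}")
```

In a real deployment the `embed` step would be performed by the sentence transformer model and the `search` step by a FAISS index. The overall shape stays the same: embed the documents once, embed each query with the same model, and return the top k matches together with their original content.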

The main benefits you get from using FAISS are:

  • Semantic similarity search: the top k results are the ones closest in meaning to the query.
  • On-premises deployment, suitable for confidential and sensitive information that cannot be uploaded to the cloud.