# Record Managers

***

Record Managers keep track of your indexed documents, preventing duplicate vector embeddings in the [Vector Store](https://docs.flowiseai.com/integrations/langchain/vector-stores).

When document chunks are upserted, each chunk is hashed using the [SHA-1](https://github.com/emn178/js-sha1) algorithm, and the hashes are stored in the Record Manager. If a chunk's hash already exists there, embedding and upserting are skipped for that chunk.
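The hash check described above can be sketched roughly as follows. This is a minimal illustration, not Flowise's implementation: `chunk_hash`, `InMemoryRecordManager`, and the choice to hash the text together with its metadata are assumptions made for the example, and the real Record Manager nodes persist hashes in a database rather than in memory.

```python
import hashlib


def chunk_hash(text: str, metadata: dict) -> str:
    """Hypothetical key: SHA-1 over the chunk text plus its sorted metadata."""
    payload = text + "|" + repr(sorted(metadata.items()))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


class InMemoryRecordManager:
    """Illustrative stand-in that only remembers which hashes were seen."""

    def __init__(self) -> None:
        self.seen: set[str] = set()

    def upsert(self, chunks: list[tuple[str, dict]]) -> list[tuple[str, dict]]:
        """Return only the chunks that still need to be embedded."""
        to_embed = []
        for text, metadata in chunks:
            key = chunk_hash(text, metadata)
            if key in self.seen:
                continue  # hash already recorded: skip embedding and upserting
            self.seen.add(key)
            to_embed.append((text, metadata))
        return to_embed


manager = InMemoryRecordManager()
docs = [("Cat", {"source": "cat"}), ("Dog", {"source": "dog"})]
first = manager.upsert(docs)   # both chunks are new
second = manager.upsert(docs)  # nothing changed, so nothing is embedded again
```

In this sketch, the second call returns an empty list because both hashes are already recorded.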

In some cases, you might want to delete existing documents that are derived from the same sources as the new documents being indexed. For that, the Record Manager offers 3 cleanup modes:

{% tabs %}
{% tab title="Incremental" %}
When you are upserting multiple documents and want to keep existing documents that are not part of the current upsert, use **Incremental** Cleanup mode.

1. Set up a Record Manager with `Incremental` Cleanup and `source` as the SourceId Key:

<div align="left"><figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-6bc3f51ca7b4369b85eb5ede1eb016b92668fe26%2Fimage%20(4)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="264"><figcaption></figcaption></figure> <figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-ac1b4de9cf6bcacae3f54b629a70649d2474a946%2Fimage%20(5)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="410"><figcaption></figcaption></figure></div>

2. Prepare the following 2 documents:

| Text | Metadata         |
| ---- | ---------------- |
| Cat  | `{source:"cat"}` |
| Dog  | `{source:"dog"}` |

<div align="left"><figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-d040dbe83e5ea9c5e2e1b4bd7a1def041fb3f133%2Fimage%20(11)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="202"><figcaption></figcaption></figure> <figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-619c9b67b5c9723d3a66674160aab86415bdd72e%2Fimage%20(10)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

<div align="left"><figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-d136f640ec9900fb5de4a69bcd2ee936ef743518%2Fimage%20(2)%20(1)%20(1)%20(1)%20(1)%20(2).png?alt=media" alt="" width="231"><figcaption></figcaption></figure> <figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-15be701ecbf46d5327fca8fbf12abf0fa9051614%2Fimage%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(2).png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

3. After an upsert, we will see 2 documents that are upserted:

<figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-68b63505584eac3d8284c0610f48810f68512a3f%2Fimage%20(9)%20(1)%20(1)%20(1)%20(1)%20(2).png?alt=media" alt="" width="433"><figcaption></figcaption></figure>

4. If we then delete the **Dog** document, update **Cat** to **Cats**, and upsert again, we will see the following:

<figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-6a43418e6212eb27b15a979772b8d2a0defc1641%2Fimage%20(13)%20(2).png?alt=media" alt="" width="425"><figcaption></figcaption></figure>

* The original **Cat** document is deleted
* A new document with **Cats** is added
* The **Dog** document is left untouched
* The remaining vector embeddings in the Vector Store are **Cats** and **Dog**

<figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-556cd567062208462709811056ead6738c0c8e5c%2Fimage%20(15)%20(1)%20(1).png?alt=media" alt="" width="448"><figcaption></figcaption></figure>
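The walkthrough above can be approximated in a few lines. This is a sketch under stated assumptions, not the actual Flowise or LangChain code: `upsert_incremental`, `chunk_hash`, and the in-memory `store` dictionary are hypothetical names, and the rule shown (delete stale records only for source IDs that appear in the current run) is a simplified reading of Incremental cleanup.

```python
import hashlib


def chunk_hash(text: str, source: str) -> str:
    """Hypothetical key: SHA-1 over the source ID and the chunk text."""
    return hashlib.sha1(f"{source}|{text}".encode("utf-8")).hexdigest()


def upsert_incremental(records: dict, docs: list, source_key: str = "source") -> dict:
    """Upsert docs into records (hash -> (text, source)) with Incremental cleanup."""
    current = {}
    for text, metadata in docs:
        source = metadata[source_key]
        current[chunk_hash(text, source)] = (text, source)

    # Only sources present in this run are eligible for cleanup.
    touched_sources = {source for _, source in current.values()}
    stale = [
        key
        for key, (_, source) in records.items()
        if source in touched_sources and key not in current
    ]
    for key in stale:
        del records[key]

    added = [key for key in current if key not in records]
    records.update(current)
    return {"added": len(added), "deleted": len(stale)}


store: dict = {}
upsert_incremental(store, [("Cat", {"source": "cat"}), ("Dog", {"source": "dog"})])

# Dog is no longer supplied, and Cat has been changed to Cats.
result = upsert_incremental(store, [("Cats", {"source": "cat"})])
remaining = sorted(text for text, _ in store.values())
```

Because the second run only mentions the `cat` source, the old **Cat** record is replaced while the **Dog** record is never considered for deletion.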
{% endtab %}

{% tab title="Full" %}
When you are upserting multiple documents, **Full** Cleanup mode will automatically delete any vector embeddings that are not part of the current upserting process.

1. Set up a Record Manager with `Full` Cleanup. A SourceId Key is not needed for Full Cleanup mode.

<div align="left"><figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-6bc3f51ca7b4369b85eb5ede1eb016b92668fe26%2Fimage%20(4)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="264"><figcaption></figcaption></figure> <figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-57bcd68d64f60e8564d6528498c4d87a6bb9f5ec%2Fimage%20(17)%20(1)%20(1).png?alt=media" alt="" width="407"><figcaption></figcaption></figure></div>

2. Prepare the following 2 documents:

| Text | Metadata         |
| ---- | ---------------- |
| Cat  | `{source:"cat"}` |
| Dog  | `{source:"dog"}` |

<div align="left"><figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-d040dbe83e5ea9c5e2e1b4bd7a1def041fb3f133%2Fimage%20(11)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="202"><figcaption></figcaption></figure> <figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-619c9b67b5c9723d3a66674160aab86415bdd72e%2Fimage%20(10)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1).png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

<div align="left"><figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-d136f640ec9900fb5de4a69bcd2ee936ef743518%2Fimage%20(2)%20(1)%20(1)%20(1)%20(1)%20(2).png?alt=media" alt="" width="231"><figcaption></figcaption></figure> <figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-15be701ecbf46d5327fca8fbf12abf0fa9051614%2Fimage%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(1)%20(2).png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

3. After an upsert, we will see 2 documents that are upserted:

<figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-68b63505584eac3d8284c0610f48810f68512a3f%2Fimage%20(9)%20(1)%20(1)%20(1)%20(1)%20(2).png?alt=media" alt="" width="433"><figcaption></figcaption></figure>

4. If we then delete the **Dog** document, update **Cat** to **Cats**, and upsert again, we will see the following:

<figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-c77749e2813034fdf4bcf296f4e9f804a0721fa0%2Fimage%20(18)%20(1)%20(1).png?alt=media" alt="" width="430"><figcaption></figcaption></figure>

* The original **Cat** document is deleted
* A new document with **Cats** is added
* The **Dog** document is deleted
* The only remaining vector embedding in the Vector Store is **Cats**

<figure><img src="https://823733684-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F00tYLwhz5RyR7fJEhrWy%2Fuploads%2Fgit-blob-63b3b7ce19c1a265bfef0721599082b99f859918%2Fimage%20(19)%20(1)%20(1).png?alt=media" alt="" width="527"><figcaption></figcaption></figure>
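For comparison, here is a rough sketch of the same scenario under Full cleanup. As before, this is an illustration under assumptions rather than Flowise's implementation: `upsert_full`, `chunk_hash`, and the in-memory `store` are hypothetical, and the rule shown (delete every record whose hash is absent from the current run) is a simplified reading of the mode.

```python
import hashlib


def chunk_hash(text: str, metadata: dict) -> str:
    """Hypothetical key: SHA-1 over the chunk text plus its sorted metadata."""
    payload = text + "|" + repr(sorted(metadata.items()))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def upsert_full(records: dict, docs: list) -> dict:
    """Upsert docs into records (hash -> text) with Full cleanup."""
    current = {chunk_hash(text, metadata): text for text, metadata in docs}

    # Anything not supplied in this run is stale, whatever its source.
    stale = [key for key in records if key not in current]
    for key in stale:
        del records[key]

    added = [key for key in current if key not in records]
    records.update(current)
    return {"added": len(added), "deleted": len(stale)}


store: dict = {}
upsert_full(store, [("Cat", {"source": "cat"}), ("Dog", {"source": "dog"})])

# Dog is no longer supplied, and Cat has been changed to Cats.
result = upsert_full(store, [("Cats", {"source": "cat"})])
remaining = sorted(store.values())
```

No source ID is consulted here, which is consistent with Full Cleanup not requiring a SourceId Key.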
{% endtab %}

{% tab title="None" %}
No cleanup will be performed. Existing documents are never deleted automatically.
{% endtab %}
{% endtabs %}

Currently available Record Manager nodes are:

* SQLite
* MySQL
* PostgreSQL

## Resources

* [LangChain Indexing - How it works](https://js.langchain.com/docs/how_to/indexing/#how-it-works)
