Uploads
Learn how to upload images, audio, and other files
Flowise lets you upload images, audio, and other files from the chat. In this section, you'll learn how to enable and use these features.
Certain chat models allow you to input images. Always refer to the official documentation of the LLM to confirm if the model supports image input.
Image processing only works with certain chains/agents in Chatflow.
If you enable Allow Image Upload, you can upload images from the chat interface.
To upload images with the API:
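For example, here is a minimal Python sketch. It assumes a local Flowise instance at http://localhost:3000, a placeholder chatflow ID, and the data/type/name/mime upload fields; confirm the exact payload against the Prediction API reference.

```python
import base64
import requests

API_URL = "http://localhost:3000/api/v1/prediction/<chatflow-id>"  # placeholder chatflow ID

# Encode the image as a base64 data URI, as expected by the uploads field
with open("photo.png", "rb") as f:
    image_data = "data:image/png;base64," + base64.b64encode(f.read()).decode("utf-8")

payload = {
    "question": "Can you describe the image?",
    "uploads": [
        {
            "data": image_data,   # base64 data URI
            "type": "file",       # image uploads from the chat use the "file" type
            "name": "photo.png",
            "mime": "image/png",
        }
    ],
}

response = requests.post(API_URL, json=payload)
print(response.json())
```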
In the Chatflow Configuration, you can select a speech-to-text module. Supported integrations include:
OpenAI
AssemblyAI
When this is enabled, users can speak directly into the microphone, and their speech is transcribed into text.
To upload audio with the API:
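A similar sketch for audio, again assuming a local instance and a placeholder chatflow ID; the "audio" type value and the webm MIME type are assumptions to adapt to your recording format, so confirm the exact payload against the Prediction API reference.

```python
import base64
import requests

API_URL = "http://localhost:3000/api/v1/prediction/<chatflow-id>"  # placeholder chatflow ID

# Encode a recorded audio clip as a base64 data URI
with open("recording.webm", "rb") as f:
    audio_data = "data:audio/webm;base64," + base64.b64encode(f.read()).decode("utf-8")

payload = {
    "question": "Please answer the question asked in the recording.",
    "uploads": [
        {
            "data": audio_data,
            "type": "audio",            # assumed type value for audio uploads
            "name": "recording.webm",
            "mime": "audio/webm",
        }
    ],
}

response = requests.post(API_URL, json=payload)
print(response.json())
```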
You can upload files in two ways:
Retrieval augmented generation (RAG) file uploads
Full file uploads
When both options are on, full file uploads take precedence.
You can upsert uploaded files to the vector store on the fly. To enable file uploads, make sure you meet these prerequisites:
You must include a vector store that supports file uploads in the chatflow.
If you have multiple vector stores in a chatflow, you can only turn on file upload for one vector store at a time.
You must connect at least one document loader node to the vector store's document input.
Supported document loaders:
You can upload one or more files in the chat:
Here's how it works:
The metadata for uploaded files is updated with the chatId.
This associates the file with the chatId.
When querying, an OR filter applies:
Metadata contains flowise_chatId and its value is the current chat session ID
Metadata does not contain flowise_chatId
An example of a vector embedding upserted on Pinecone:
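For illustration only, an upserted record might look roughly like this (all values are hypothetical); the important detail is the flowise_chatId metadata key:

```python
# Hypothetical upserted record; note the flowise_chatId metadata field
record = {
    "id": "9fd1ccdb-1e9f-4a35-9d8b-0a4f3f2f8a11",
    "values": [0.012, -0.034, 0.087],  # truncated embedding vector
    "metadata": {
        "source": "report.pdf",
        "flowise_chatId": "my-session-123",  # ties the chunk to the chat session that uploaded it
        "text": "First chunk of the uploaded document...",
    },
}
```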
To do this with the API, follow these two steps:
1. Use the Vector Upsert API with formData and chatId.
2. Use the Prediction API with uploads and the chatId from step 1.
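The sketch below walks through both steps. It assumes a local Flowise instance at http://localhost:3000, the /api/v1/vector/upsert and /api/v1/prediction endpoints, a placeholder chatflow ID, and the upload field names used in the other examples; adjust them to your deployment.

```python
import base64
import requests

BASE_URL = "http://localhost:3000/api/v1"   # assumed local Flowise instance
CHATFLOW_ID = "<chatflow-id>"               # placeholder
CHAT_ID = "my-session-123"                  # session ID reused across both steps

# Step 1: upsert the file to the vector store, scoped to this chatId via form data
with open("report.pdf", "rb") as f:
    upsert_response = requests.post(
        f"{BASE_URL}/vector/upsert/{CHATFLOW_ID}",
        files={"files": ("report.pdf", f, "application/pdf")},
        data={"chatId": CHAT_ID},
    )
print(upsert_response.json())

# Step 2: query the chatflow with the same chatId so the OR metadata filter
# picks up the chunks that were just upserted
with open("report.pdf", "rb") as f:
    file_data = "data:application/pdf;base64," + base64.b64encode(f.read()).decode("utf-8")

prediction_response = requests.post(
    f"{BASE_URL}/prediction/{CHATFLOW_ID}",
    json={
        "question": "What are the key findings in the report?",
        "chatId": CHAT_ID,
        "uploads": [
            {
                "data": file_data,
                "type": "file",
                "name": "report.pdf",
                "mime": "application/pdf",
            }
        ],
    },
)
print(prediction_response.json())
```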
To enable full file uploads, go to Chatflow Configuration, open the File Upload tab, and click the switch:
Note that if your chatflow uses a Chat Prompt Template node, an input must be created from Format Prompt Values to pass the file data. The specified input name (e.g. {file}) should be included in the Human Message field.
To upload files with the API:
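A rough sketch of a full file upload, with the same assumptions as the earlier examples; the "file:full" type value is an assumption for marking a full (non-RAG) upload, so confirm it against the Prediction API reference.

```python
import base64
import requests

API_URL = "http://localhost:3000/api/v1/prediction/<chatflow-id>"  # placeholder chatflow ID

# Encode the whole file; with full file uploads its entire text is passed to the LLM
with open("notes.txt", "rb") as f:
    file_data = "data:text/plain;base64," + base64.b64encode(f.read()).decode("utf-8")

payload = {
    "question": "Summarize the attached notes.",
    "uploads": [
        {
            "data": file_data,
            "type": "file:full",   # assumed marker for full file uploads
            "name": "notes.txt",
            "mime": "text/plain",
        }
    ],
}

response = requests.post(API_URL, json=payload)
print(response.json())
```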
Full and RAG (Retrieval-Augmented Generation) file uploads serve different purposes.
Full File Upload: This method parses the entire file into a string and sends it to the LLM (Large Language Model). It's beneficial for summarizing the document or extracting key information. However, with very large files, the model might produce inaccurate results or "hallucinations" due to token limitations.
RAG File Upload: Recommended if you aim to reduce token costs by not sending the entire text to the LLM. This approach is suitable for Q&A tasks on the documents but isn't ideal for summarization since it lacks the full document context. It might also take longer because of the upsert process.
With RAG file uploads, you can't work with structured data like spreadsheets or tables, and you can't perform full summarization because the model never sees the full document context. In some cases, you might want to include all the file content directly in the prompt, especially with models like Gemini and Claude that have longer context windows. Many studies compare RAG against these longer context windows.
You can see the File Attachment button in the chat, where you can upload one or more files. Under the hood, each file is processed and converted into text.
As you can see in the examples, uploads require a base64 string, so convert each file to base64 before adding it to the request.
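A small helper like the following (the function name is just for illustration) builds the base64 data URI used in the upload payloads above:

```python
import base64
import mimetypes

def to_data_uri(path: str) -> str:
    """Read a local file and return it as a base64 data URI."""
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"

# Example: build the data URI for an image before adding it to "uploads"
print(to_data_uri("photo.png")[:80])
```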