Json Lines File
JSON Lines (JSONL) is a text format where each line is a valid JSON value. This module provides functionality to load and process JSONL files, with support for pointer-based content extraction and dynamic metadata handling.
The loader can:
Load single or multiple JSONL files
Extract specific values using JSON pointers
Handle dynamic metadata extraction
Process content with text splitters
Support base64 encoded files
Handle file storage integration
Customize metadata extraction
JSONL File: The JSONL file(s) to process (.jsonl extension)
Pointer Extraction: JSON pointer to extract content (e.g., "key" for {"key": "value"})
Text Splitter: A text splitter to process the extracted content
Additional Metadata: JSON object with additional metadata
Omit Metadata Keys: Comma-separated list of metadata keys to omit
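As an illustration of how the Additional Metadata and Omit Metadata Keys inputs might combine, here is a minimal Python sketch. The function name `build_metadata` and the order of operations (merge first, then omit) are assumptions made for this example, not the loader's confirmed behavior.

```python
import json


def build_metadata(base, additional_json="", omit_keys=""):
    """Merge extra metadata into the base metadata, then drop omitted keys.

    Assumption: merging happens before omission; the real loader may differ.
    """
    merged = dict(base)
    if additional_json.strip():
        # Additional Metadata is supplied as a JSON object
        merged.update(json.loads(additional_json))
    # Omit Metadata Keys is a comma-separated list; ignore stray whitespace
    omit = {key.strip() for key in omit_keys.split(",") if key.strip()}
    return {key: value for key, value in merged.items() if key not in omit}


meta = build_metadata(
    {"source": "data.jsonl", "line": 1},
    additional_json='{"project": "demo"}',
    omit_keys="line, pointer",
)
print(meta)  # {'source': 'data.jsonl', 'project': 'demo'}
```

Keys listed for omission that are not present (here `pointer`) are simply ignored.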
Document: Array of document objects containing metadata and pageContent
Text: Concatenated string from pageContent of documents
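The relationship between the two outputs can be sketched as follows. The helper `documents_to_text` is hypothetical, and the newline separator is an assumption, since the separator used for concatenation is not stated here.

```python
def documents_to_text(documents, separator="\n"):
    """Join the pageContent of each document into a single string.

    Assumption: documents are joined with a newline.
    """
    return separator.join(doc["pageContent"] for doc in documents)


documents = [
    {"pageContent": "value1", "metadata": {"line": 1}},
    {"pageContent": "value2", "metadata": {"line": 2}},
]
text = documents_to_text(documents)
print(text)
```

The Document output keeps the per-line metadata, while the Text output discards it.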
JSON pointer extraction
Dynamic metadata handling
Text splitting support
Base64 file support
File storage integration
Error handling
Memory-efficient processing
For JSONL content such as {"key": "value1"} on one line and {"key": "value2"} on the next, the pointer "key" extracts "value1" and "value2".
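The pointer example above can be reproduced with a small resolver. This is a sketch, not the loader's implementation: `resolve_pointer` is a hypothetical helper that follows RFC 6901 token rules and, as an assumption to match the "key" notation used here, also accepts a pointer without the leading slash.

```python
import json


def resolve_pointer(value, pointer):
    """Follow a JSON pointer such as "/a/b/0" through parsed JSON.

    Assumption: a bare "key" is treated the same as "/key".
    """
    path = pointer[1:] if pointer.startswith("/") else pointer
    if path == "":
        return value  # an empty pointer refers to the whole value
    for raw in path.split("/"):
        # RFC 6901 escapes: "~1" stands for "/", "~0" stands for "~"
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(value, list):
            value = value[int(token)]
        else:
            value = value[token]
    return value


jsonl = '{"key": "value1"}\n{"key": "value2"}\n'
extracted = [resolve_pointer(json.loads(line), "key") for line in jsonl.splitlines()]
print(extracted)  # ['value1', 'value2']

# Nested structures work the same way
print(resolve_pointer({"a": {"b": [10, 20]}}, "/a/b/1"))  # 20
```

An invalid pointer raises `KeyError`, `IndexError`, `ValueError`, or `TypeError` in this sketch, so a caller can catch those and decide how to report them.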
You can also extract values from each line as metadata using JSON pointers.
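One way to picture pointer-based metadata is a mapping from metadata field names to pointers, resolved against each parsed line. The mapping format and the helpers `lookup` and `extract_metadata` are assumptions for illustration; the loader's actual configuration syntax is not shown here.

```python
import json
from functools import reduce


def lookup(record, pointer):
    """Resolve an object-only pointer like "/author/name" (no array or escape handling)."""
    tokens = [token for token in pointer.split("/") if token]
    return reduce(lambda node, token: node[token], tokens, record)


def extract_metadata(record, metadata_pointers):
    """Build a metadata dict from a hypothetical {field: pointer} mapping."""
    metadata = {}
    for field, pointer in metadata_pointers.items():
        try:
            metadata[field] = lookup(record, pointer)
        except (KeyError, TypeError):
            continue  # the pointer does not resolve for this line; skip the field
    return metadata


record = json.loads('{"text": "hello", "author": {"name": "Ada"}, "lang": "en"}')
metadata = extract_metadata(
    record,
    {"author": "/author/name", "language": "/lang", "missing": "/nope"},
)
print(metadata)  # {'author': 'Ada', 'language': 'en'}
```

Skipping unresolved pointers keeps one malformed line from failing the whole file; raising an error instead would be an equally reasonable choice.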
Each document contains:
pageContent: Extracted content using pointer
metadata:
source: Original file path
line: Line number in file
pointer: Used JSON pointer
Additional dynamic metadata
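The document shape described above could be assembled like this. The function `make_document` is hypothetical, and 1-based line numbering is an assumption.

```python
def make_document(content, source, line, pointer, extra=None):
    """Build one document with the fields listed above.

    Assumption: `line` is 1-based; `extra` holds any dynamic metadata.
    """
    metadata = {"source": source, "line": line, "pointer": pointer}
    metadata.update(extra or {})
    return {"pageContent": content, "metadata": metadata}


doc = make_document("value1", "data.jsonl", 1, "key", {"project": "demo"})
print(doc)
```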
Direct file loading
Base64 encoded content
Multiple file support
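For base64 encoded content, decoding might look like the sketch below. The helper `decode_base64_jsonl` is hypothetical, and the optional `data:...;base64,` prefix handling is an assumption about how an upload could be encoded, not a documented format.

```python
import base64


def decode_base64_jsonl(payload):
    """Decode base64 text, accepting an optional "data:...;base64," prefix."""
    if payload.startswith("data:") and "," in payload:
        payload = payload.split(",", 1)[1]
    return base64.b64decode(payload).decode("utf-8")


encoded = base64.b64encode(b'{"key": "value1"}\n{"key": "value2"}\n').decode("ascii")

# Multiple files can be handled by decoding each payload in turn
payloads = ["data:application/jsonl;base64," + encoded, encoded]
texts = [decode_base64_jsonl(payload) for payload in payloads]
print(texts[0].splitlines())
```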
File storage system support
Organization-based storage
Chatflow-based storage
One document per JSONL line
Invalid JSON lines are skipped
Memory-efficient processing
Error handling for invalid pointers
Support for nested JSON structures
Dynamic metadata extraction
Flexible output formats
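The first notes above (one document per line, invalid lines skipped, memory-efficient processing) can be sketched with a generator that reads one line at a time. The name `iter_jsonl` is hypothetical, and skipping blank lines is an assumption.

```python
import io
import json


def iter_jsonl(stream):
    """Yield (line_number, value) lazily, skipping blank and invalid lines."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        try:
            yield line_number, json.loads(line)
        except json.JSONDecodeError:
            continue  # invalid JSON lines are skipped


sample = io.StringIO('{"key": "value1"}\nnot json\n\n{"key": "value2"}\n')
rows = list(iter_jsonl(sample))
print(rows)  # [(1, {'key': 'value1'}), (4, {'key': 'value2'})]
```

Because the generator holds only the current line, memory use stays flat regardless of file size. The reported line numbers still refer to positions in the original file, so skipped lines leave gaps.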