Microsoft Excel

Microsoft Excel is a spreadsheet program that features calculation tools, pivot tables, and a macro programming language. This module provides functionality to load and process Excel files using SheetJS.
This module provides a sophisticated Excel document loader that can:
Load multiple Excel file formats
Process multiple worksheets
Convert rows to structured documents
Handle various data types
Preserve cell formatting
Extract metadata per row
Support type inference
Inputs
Required Parameters
Excel File: The Excel file(s) to process (.xls, .xlsx, .xlsm, .xlsb)
Optional Parameters
Text Splitter: A text splitter to process the extracted content
Additional Metadata: JSON object with additional metadata
Omit Metadata Keys: Comma-separated list of metadata keys to omit
Outputs
Document: Array of document objects containing metadata and pageContent
Text: Concatenated string from pageContent of documents
Features
Multiple format support
Multi-sheet processing
Data type preservation
Metadata extraction
Type inference
Error handling
Memory-efficient processing
Supported Formats
Excel Binary (.xls)
Excel Workbook (.xlsx)
Excel Macro-Enabled (.xlsm)
Excel Binary Workbook (.xlsb)
Data Type Handling
Supported Types
Text (string)
Numbers (number)
Dates (date)
Booleans (boolean)
Formulas (calculated values)
Empty cells (null)
Document Structure
Each document contains:
pageContent: Formatted row content as key-value pairs
metadata:
worksheet: Sheet name
rowNum: Row index
Original column values
Additional custom metadata
Row Processing
Each row is converted to a document with:
Key-value pairs for each cell
Preserved column headers
Type information
Row position
Metadata Attributes
Default attributes include:
worksheet: Sheet or Worksheet Name (string)
rowNum: Row index (number)
Dynamic attributes based on column headers
Notes
Uses SheetJS for parsing
Preserves data types
Handles multiple sheets
Infers column types
Memory-efficient processing
Error handling for invalid files
Flexible output formats
Column type inference
Last updated