Google Drive

Google Drive is a cloud storage and file synchronization service. This module loads and processes files from Google Drive, including common file formats and Google Workspace documents.

The Google Drive document loader can:

  • Load multiple file types

  • Process Google Workspace documents

  • Handle folder-based loading

  • Support shared drives

  • Process files recursively

  • Customize file type filtering

  • Handle OAuth2 authentication

Required Parameters

  • Connect Credential: Google Drive OAuth2 credentials. Refer to the Google Drive credential setup documentation

  • Select Files or Folder ID: Choose specific files or provide a folder ID

Optional Parameters

  • File Types: Types of files to load:

    • Google Docs

    • Google Sheets

    • Google Slides

    • PDF Files

    • Text Files

    • Word Documents

    • PowerPoint

    • Excel Files

  • Include Subfolders: Process files in subfolders

  • Include Shared Drives: Access files from shared drives

  • Max Files: Maximum number of files to load (default: 50)

  • Text Splitter: A text splitter to process the extracted content

  • Additional Metadata: JSON object with additional metadata

  • Omit Metadata Keys: Comma-separated list of metadata keys to omit
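The last two options both shape the metadata attached to each document. The sketch below illustrates one plausible way they could combine; it is not taken from the loader's source. The function name `apply_metadata_options`, the sample values, and the order of operations (merge first, then omit) are all assumptions made for illustration.

```python
import json


def apply_metadata_options(metadata, additional_metadata_json="", omit_keys=""):
    """Merge extra metadata into a document's metadata, then drop omitted keys.

    Hypothetical helper: the merge-then-omit order is an assumption.
    """
    merged = dict(metadata)
    if additional_metadata_json.strip():
        extra = json.loads(additional_metadata_json)
        if not isinstance(extra, dict):
            raise ValueError("Additional Metadata must be a JSON object")
        merged.update(extra)
    # Split the comma-separated list, ignoring blanks and surrounding spaces.
    omitted = {key.strip() for key in omit_keys.split(",") if key.strip()}
    return {key: value for key, value in merged.items() if key not in omitted}


# Placeholder values, not real Drive identifiers.
base = {
    "fileName": "report.pdf",
    "fileType": "application/pdf",
    "fileId": "FILE_ID_PLACEHOLDER",
    "source": "SOURCE_PLACEHOLDER",
}
result = apply_metadata_options(base, '{"team": "finance"}', "fileId, source")
print(result)
```

Here Additional Metadata is given as a JSON object string and Omit Metadata Keys as a comma-separated string, matching the parameter descriptions above.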

Outputs

  • Document: Array of document objects containing metadata and pageContent

  • Text: Concatenated string from pageContent of documents
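To make the relationship between the two outputs concrete, here is a minimal sketch that derives a Text value from a Document array. The sample documents and the newline separator are assumptions; the loader's actual separator is not stated here.

```python
# Sample Document output: each entry has pageContent and metadata.
documents = [
    {"pageContent": "First file text.", "metadata": {"fileName": "a.txt"}},
    {"pageContent": "Second file text.", "metadata": {"fileName": "b.txt"}},
]


def to_text(docs, separator="\n"):
    """Concatenate pageContent from each document (separator is an assumption)."""
    return separator.join(doc["pageContent"] for doc in docs)


text = to_text(documents)
print(text)
```

The Text output drops the metadata entirely, so choose Document when downstream nodes need file names or IDs.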

Supported File Types

Google Workspace

  • Google Docs (application/vnd.google-apps.document)

  • Google Sheets (application/vnd.google-apps.spreadsheet)

  • Google Slides (application/vnd.google-apps.presentation)

Microsoft Office

  • Word (.docx)

  • Excel (.xlsx)

  • PowerPoint (.pptx)

Other Formats

  • PDF (.pdf)

  • Text Files (.txt)
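File type filtering ultimately comes down to MIME types. The sketch below maps the File Types option labels to their standard MIME types and builds a search string in the Google Drive API v3 query syntax. Whether the loader builds its filter this way is an assumption; `build_query` and `FOLDER_ID` are illustrative names.

```python
# Option labels from "File Types" mapped to standard MIME types.
MIME_TYPES = {
    "Google Docs": "application/vnd.google-apps.document",
    "Google Sheets": "application/vnd.google-apps.spreadsheet",
    "Google Slides": "application/vnd.google-apps.presentation",
    "PDF Files": "application/pdf",
    "Text Files": "text/plain",
    "Word Documents": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "PowerPoint": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "Excel Files": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def build_query(folder_id, file_types):
    """Build a Drive v3 search query for files of the given types in a folder."""
    query = f"'{folder_id}' in parents and trashed = false"
    clauses = [f"mimeType = '{MIME_TYPES[name]}'" for name in file_types]
    if clauses:
        query += " and (" + " or ".join(clauses) + ")"
    return query


print(build_query("FOLDER_ID", ["PDF Files", "Text Files"]))
```

A string like this can be passed as the `q` parameter of the Drive API `files.list` method.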

Features

  • OAuth2 authentication

  • Multiple file type support

  • Folder processing

  • Shared drive access

  • File type filtering

  • Text splitting support

  • Metadata customization

  • Error handling

Loading Methods

File Selection Mode

  • Direct file selection

  • Multiple file support

  • File type filtering

  • Metadata preservation

Folder Mode

  • Recursive folder processing

  • Subfolder support

  • File type filtering

  • Batch processing
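Folder Mode can be pictured as a traversal that descends into subfolders, keeps only files of the allowed types, and stops at the Max Files limit. The sketch below runs against a hypothetical in-memory folder tree (`FAKE_DRIVE`) instead of the real API, and uses a queue in place of recursion; the traversal order and the function `collect_files` are assumptions, not the loader's confirmed behavior.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"  # Drive's MIME type for folders

# Hypothetical stand-in for Drive: folder ID -> list of child entries.
FAKE_DRIVE = {
    "root": [
        {"id": "f1", "name": "notes.txt", "mimeType": "text/plain"},
        {"id": "sub", "name": "Reports", "mimeType": FOLDER_MIME},
        {"id": "f2", "name": "photo.png", "mimeType": "image/png"},
    ],
    "sub": [
        {"id": "f3", "name": "q1.pdf", "mimeType": "application/pdf"},
        {"id": "f4", "name": "q2.pdf", "mimeType": "application/pdf"},
    ],
}


def collect_files(folder_id, allowed_mime_types, include_subfolders=True,
                  max_files=50, list_children=FAKE_DRIVE.get):
    """Collect matching files under a folder, up to max_files."""
    found = []
    pending = [folder_id]  # folders still to visit
    while pending and len(found) < max_files:
        current = pending.pop(0)
        for item in list_children(current) or []:
            if item["mimeType"] == FOLDER_MIME:
                # Only descend when "Include Subfolders" is enabled.
                if include_subfolders:
                    pending.append(item["id"])
            elif item["mimeType"] in allowed_mime_types:
                found.append(item)
                if len(found) >= max_files:
                    break
    return found


allowed = {"text/plain", "application/pdf"}
all_ids = [f["id"] for f in collect_files("root", allowed)]
top_level_ids = [f["id"] for f in collect_files("root", allowed, include_subfolders=False)]
print(all_ids, top_level_ids)
```

In a real integration, `list_children` would be replaced by a function that pages through Drive API results for the given folder.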

Document Structure

Each document contains:

  • pageContent: Extracted content from the file

  • metadata:

    • fileName: Original file name

    • fileType: MIME type

    • fileId: Google Drive file ID

    • source: File path/URL

    • Additional custom metadata
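As an illustration of this structure, the sketch below models one document and checks that the standard metadata keys are present. The `Document` dataclass, the helper `missing_metadata_keys`, and every sample value are placeholders chosen for the example, not output captured from the loader.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

STANDARD_KEYS = ("fileName", "fileType", "fileId", "source")


@dataclass
class Document:
    """Illustrative model of one loaded document."""
    pageContent: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def missing_metadata_keys(doc: Document) -> List[str]:
    """Return the standard metadata keys that a document lacks."""
    return [key for key in STANDARD_KEYS if key not in doc.metadata]


doc = Document(
    pageContent="Quarterly revenue grew in all regions.",
    metadata={
        "fileName": "q1-report.pdf",
        "fileType": "application/pdf",
        "fileId": "FILE_ID_PLACEHOLDER",
        "source": "SOURCE_PLACEHOLDER",
        "team": "finance",  # additional custom metadata
    },
)
print(missing_metadata_keys(doc))
```

A check like this can be useful after applying Omit Metadata Keys, since omitted keys will no longer be available to downstream steps.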

Notes

  • Requires OAuth2 authentication

  • Handles rate limiting

  • Supports large files

  • Manages temporary files

  • Processes files memory-efficiently

  • Handles errors for invalid files

  • Refreshes tokens automatically
