@bali0019 commented Oct 6, 2025

This example demonstrates incremental document processing using:

  • ai_parse_document for extracting structured data from PDFs/images
  • ai_query for LLM-based content analysis
  • Lakeflow Declarative Pipelines for streaming incremental processing

Key features:

  • Incremental streaming pipeline with 3 stages
  • Visual debugging notebook with interactive bounding boxes
  • Error handling and variant support

The example also ships a production-ready DAB (Databricks Asset Bundle) configuration.

This example contains only SQL transformations and YAML configs; there is no Python package to install.
Table names are hardcoded in the SQL files because DAB variables are not substituted in SQL source files.
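
For orientation, here is a minimal sketch of what a three-stage incremental pipeline of this kind could look like in Lakeflow Declarative Pipelines SQL. It is not the code from this PR: the catalog, schema, volume path, table names, stage boundaries, prompt, and serving endpoint name are all illustrative assumptions, and the sketch does not rely on any particular field layout in the `ai_parse_document` output.

```sql
-- Hypothetical names throughout; table names are written out in full
-- because DAB variables are not substituted in SQL source files.

-- Stage 1 (assumed): pick up new files incrementally and parse them.
CREATE OR REFRESH STREAMING TABLE main.docs_demo.parsed_documents AS
SELECT
  path,
  ai_parse_document(content) AS parsed  -- result is a VARIANT
FROM STREAM read_files(
  '/Volumes/main/docs_demo/raw_documents/',  -- hypothetical volume path
  format => 'binaryFile'
);

-- Stage 2 (assumed): serialize the parsed VARIANT to a JSON string
-- so it can be passed to the model as text.
CREATE OR REFRESH STREAMING TABLE main.docs_demo.document_text AS
SELECT
  path,
  to_json(parsed) AS document_json
FROM STREAM(main.docs_demo.parsed_documents);

-- Stage 3 (assumed): LLM-based analysis of each new document.
CREATE OR REFRESH STREAMING TABLE main.docs_demo.document_analysis AS
SELECT
  path,
  ai_query(
    'databricks-meta-llama-3-3-70b-instruct',  -- placeholder endpoint name
    concat('Summarize the key fields of this document: ', document_json)
  ) AS analysis
FROM STREAM(main.docs_demo.document_text);
```

Because each stage is a streaming table that reads from the previous one with `STREAM`, only newly arrived files should flow through parsing and analysis on each pipeline update, which is the point of the incremental design.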
@bali0019 force-pushed the add-ai-document-pipeline branch from 585cb40 to 7f2cd8e on October 7, 2025 16:38