Content Extractor with Vision LLM
Author: Electrical-Two9833
App Description
An open-source Python tool that extracts content from documents (PDF, DOCX, PPTX), describes embedded images using Vision Language Models, and saves the results in clean Markdown files.
Project Overview
Content Extractor with Vision LLM extracts text and images from PDF, DOCX, and PPTX documents, generates a description of each embedded image using a Vision Language Model, and writes the results to Markdown. The project is modular and extensible, and it ships with a CLI. There is room for further development and community contributions.
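The extract, describe, and write-to-Markdown flow described above can be pictured with a minimal sketch. This is not the project's actual code or API: `ExtractedDocument`, `ExtractedImage`, `render_markdown`, and `placeholder_describer` are hypothetical names chosen for illustration, and the describer is a stub standing in for a real Vision Language Model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ExtractedImage:
    """Hypothetical container for one image pulled out of a document."""
    name: str
    data: bytes


@dataclass
class ExtractedDocument:
    """Hypothetical container for the text and images of one document."""
    title: str
    text_blocks: List[str] = field(default_factory=list)
    images: List[ExtractedImage] = field(default_factory=list)


def render_markdown(
    doc: ExtractedDocument,
    describe_image: Callable[[ExtractedImage], str],
) -> str:
    """Combine extracted text and image descriptions into one Markdown string."""
    lines = [f"# {doc.title}", ""]
    for block in doc.text_blocks:
        lines.append(block.strip())
        lines.append("")
    if doc.images:
        lines.append("## Images")
        lines.append("")
        for image in doc.images:
            # The describer is injected, so any vision model client can be swapped in.
            lines.append(f"- **{image.name}**: {describe_image(image)}")
    return "\n".join(lines).rstrip() + "\n"


def placeholder_describer(image: ExtractedImage) -> str:
    """Stub standing in for a Vision Language Model call (assumption)."""
    return f"(description pending, {len(image.data)} bytes)"


if __name__ == "__main__":
    doc = ExtractedDocument(
        title="Report",
        text_blocks=["Hello world. "],
        images=[ExtractedImage("fig1.png", b"abc")],
    )
    print(render_markdown(doc, placeholder_describer))
```

Passing the describer in as a function keeps the Markdown rendering independent of whichever vision model is used, which is one way a modular design like the one described could be arranged.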
Links
Website: https://github.com/MDGrey33/content-extractor-with-vision
Features & Benefits
✅ Multi-format support
✅ Advanced image description
✅ Two PDF processing modes
✅ Markdown outputs
✅ CLI interface
✅ Modular & extensible
✅ Detailed logging