PolyGlot Gemini lens for PDFs

Created by team PolyGlot Gemini on December 27, 2023

Problem Statement:
1) Over 70% of PDFs contain critical data in images such as charts and tables, especially research articles.
2) Gemini is currently available in English only.

Can we build a solution for:
1) Answering natural language questions based on images in PDFs?
2) Making Gemini accessible to non-English speakers?

By leveraging Spire, OpenAI GPT-3.5, Gemini Pro Vision, and TruLens, I have built an application that solves both problems:
- Spire for image extraction
- OpenAI for translation to English (optional)
- Gemini Pro Vision for the answer
- TruLens for monitoring
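The flow described above (extract images, optionally translate the question, ask the vision model, log the interaction) can be sketched as a small pipeline. This is only an illustration of how the stages fit together, not the team's actual code: `PdfQaPipeline` and its four callables are hypothetical names, and the real Spire, OpenAI, Gemini Pro Vision, and TruLens calls would be supplied by the caller rather than shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PdfQaPipeline:
    """Hypothetical wiring of the four stages; each stage is injected as a callable."""

    # Assumed wrapper around the image extractor (Spire in the project).
    extract_images: Callable[[str], List[bytes]]
    # Assumed wrapper around the vision model (Gemini Pro Vision in the project).
    answer_with_vision: Callable[[str, List[bytes]], str]
    # Optional translation step (OpenAI GPT-3.5 in the project).
    translate_to_english: Optional[Callable[[str], str]] = None
    # Optional monitoring hook (TruLens in the project).
    record: Optional[Callable[[dict], None]] = None

    def ask(self, pdf_path: str, question: str) -> str:
        # Stage 1: pull the embedded images (charts, tables) out of the PDF.
        images = self.extract_images(pdf_path)
        if not images:
            raise ValueError(f"no images found in {pdf_path}")

        # Stage 2: translate the question only when a translator is configured.
        english_question = (
            self.translate_to_english(question)
            if self.translate_to_english is not None
            else question
        )

        # Stage 3: ask the vision model about the extracted images.
        answer = self.answer_with_vision(english_question, images)

        # Stage 4: hand the interaction to the monitoring hook, if any.
        if self.record is not None:
            self.record(
                {
                    "pdf_path": pdf_path,
                    "question": question,
                    "english_question": english_question,
                    "image_count": len(images),
                    "answer": answer,
                }
            )
        return answer
```

Keeping each stage behind a plain callable means the translation step stays optional, as in the description, and any stage can be replaced with a stub for testing without touching the others.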

"excellent work. amazing and very useful idea"

Walaa Nasr Elghitany

Data scientist and doctor

"Great use of Gemini to make PDFs and images more accessible + use of trulens to make sure it's safe. Areas of improvement:
- A narrower use case can often be more impactful than a general one, and bring a lot of value! Focus on selling to your first customers, not the whole market.
- It would have been nice to see evaluations that validated the core capabilities of the app in addition to the harmlessness evaluations you completed."

Josh Reini

DevRel