Software Engineer and Volunteer Trainer
2 years of experience
Our goal is to develop a system that takes documents in any language and lets users around the world interact with those documents in their own languages, helping knowledge spread across language barriers. We are first testing it on Quranic text, which is in Arabic, by building a ChatGPT-style question-answering bot. In the future, we would like to extend it to additional documents in any language. We first embed the text using Cohere's multilingual embedding model and store those embeddings in a vector database, Pinecone in our case. We then embed each user query with the same model and run a similarity search against the stored embeddings to return the passages most relevant to the query. The app is currently deployed and publicly available. Streamlit app link: https://taqihaider7-tafsir-quran-sementic-search-llm-app-fwh2if.streamlit.app/
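The embed, store, and search flow described above can be illustrated with a minimal Python sketch. It is not the deployed app's code: to stay runnable without API keys, it swaps Cohere's model for a toy character-trigram embedder and Pinecone for a brute-force in-memory index. The names toy_embed and InMemoryIndex, the vector size, and the sample passages are all hypothetical placeholders.

```python
import math
import zlib
from typing import Dict, List, Tuple


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Stand-in for a real embedding model (NOT Cohere's API).

    Counts hashed character trigrams and L2-normalizes the result,
    so a plain dot product between two vectors is their cosine similarity.
    """
    vec = [0.0] * dim
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        trigram = lowered[i:i + 3].encode("utf-8")
        vec[zlib.crc32(trigram) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


class InMemoryIndex:
    """Stand-in for a vector database (NOT Pinecone): brute-force search."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[List[float], dict]] = {}

    def upsert(self, item_id: str, vector: List[float], metadata: dict) -> None:
        # Insert a new vector or overwrite the one stored under item_id.
        self._items[item_id] = (vector, metadata)

    def query(self, vector: List[float], top_k: int = 3) -> List[Tuple[str, float, dict]]:
        # Score every stored vector against the query, highest score first.
        scored = [
            (item_id, sum(a * b for a, b in zip(vector, stored)), meta)
            for item_id, (stored, meta) in self._items.items()
        ]
        scored.sort(key=lambda hit: hit[1], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    # Placeholder passages for illustration only, not the project's data.
    passages = {
        "p1": "A passage about mercy and compassion",
        "p2": "A passage about rules of inheritance",
        "p3": "A passage about patience in hardship",
    }
    index = InMemoryIndex()
    for pid, text in passages.items():
        index.upsert(pid, toy_embed(text), {"text": text})

    # Embed the user query the same way, then retrieve the closest passages.
    for pid, score, meta in index.query(toy_embed("compassion and mercy"), top_k=2):
        print(pid, round(score, 3), meta["text"])
```

The trigram embedder only matches surface spelling, so unlike a multilingual embedding model it cannot match a query in one language to a passage in another. It is here only to show the shape of the pipeline: documents and queries go through the same embedding function, and retrieval is a nearest-neighbor lookup over the stored vectors.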