Student
India
3 years of experience
A student by profession, but a blogger and tech enthusiast by passion. I have a keen interest in coding, AI/ML, cybersecurity, and everything related to tech. I enjoy learning through hands-on experiments and projects, which helps me develop new skills. I keep my tasks and goals organized and well planned to maximize productivity and minimize errors and backlogs. I leave flexibility in my plans to accommodate last-minute changes and solve problems efficiently.
Our website makes online content more accessible to visually impaired users through a cost-effective text-to-speech service built on contemporary AI tools. Current market solutions lack necessary features and are costly.

How the website works:
> Once the website loads, the user enters the URL of the page to be analyzed.
> The page is parsed with Beautiful Soup to gather its meaningful text content.
> This content is passed to the OpenAI text-davinci-003 model as a prompt, and a summary is generated.
> The summary is read out to the user using Azure in a natural human tone.
> Next, the page is parsed again with Beautiful Soup, this time to download its relevant images.
> These images are analyzed with the Google Cloud Vision API, which generates feature labels describing the prominent objects and contents of each image.
> The labels are passed as a prompt to the OpenAI text-davinci-003 model, which generates a meaningful sentence describing the images.
> The prompt already includes a set of sample labels and outputs that the model can use to understand the format of the desired output.
> The generated image description is then read aloud using Azure.

For Redis: Redis caches the results for each URL for up to 3 hours. If the URL exists in the cache, the stored output is displayed and read aloud; otherwise, the website is processed to produce new output. Results are removed after 3 hours because the page content may have changed. Redis allows fast data access, which makes it suitable for high-performance use cases.

For voice control:
> By pressing the space bar, the user can ask the available chatbot questions about the summary.
> The spoken query is converted to text with Python's speech recognition library.
> This text and the summary are given to the OpenAI text-davinci-003 model as a prompt, and the query is answered.
> The result is spoken aloud; if the speech is not recognized, an error message asking the user to retry is read aloud.
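The Redis caching step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the key naming scheme, the `get_or_process` helper, and the `process` callable (standing in for the full parse, summarize, and describe pipeline) are all assumptions. The `InMemoryCache` class is a hypothetical stand-in so the sketch runs without a Redis server; it mirrors only the two redis-py methods used here, `get` and `setex`.

```python
import hashlib
import time

CACHE_TTL_SECONDS = 3 * 60 * 60  # three hours, as stated in the description


class InMemoryCache:
    """Hypothetical stand-in exposing the two redis-py calls used below."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._store = {}  # key -> (expiry timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: behave like a Redis key whose TTL has elapsed.
            del self._store[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (self._clock() + ttl_seconds, value)


def cache_key(url):
    """Hash the URL to get a fixed-length key (the naming scheme is an assumption)."""
    return "summary:" + hashlib.sha256(url.encode("utf-8")).hexdigest()


def get_or_process(cache, url, process):
    """Return (result, from_cache); `process` is a hypothetical pipeline callable."""
    key = cache_key(url)
    cached = cache.get(key)
    if cached is not None:
        # redis-py returns bytes unless the client was created with decode_responses=True.
        if isinstance(cached, bytes):
            cached = cached.decode("utf-8")
        return cached, True
    result = process(url)
    # Store the fresh result with a 3-hour expiry so stale pages get reprocessed.
    cache.setex(key, CACHE_TTL_SECONDS, result)
    return result, False
```

With a running Redis server, passing a `redis.Redis()` client in place of `InMemoryCache()` should work the same way, since both expose `get` and `setex`. Letting the key expire on its own keeps the "remove after 3 hours" rule inside Redis, so no separate cleanup job is needed.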