Chat with PDF: Your Go-to Website for Smarter Exam Prep with PDF Chat Support


Authors : Madhav Thigale; Aditya Kumar; Chetna Girme; Apurva Gargote

Volume/Issue : Volume 10 - 2025, Issue 4 - April


Google Scholar : https://tinyurl.com/bdzf43w7

Scribd : https://tinyurl.com/tu6xvnzk

DOI : https://doi.org/10.38124/ijisrt/25apr956



Abstract : This project builds an interactive application in which users upload multiple PDF documents and ask questions about them, offering a dynamic way to explore and retrieve information from large texts. The system extracts the text of each PDF, splits it into smaller chunks, and converts the chunks into numerical embeddings using a language model. The embeddings are stored in a FAISS vector store, enabling efficient similarity search and fast retrieval of passages relevant to a user query. Streamlit serves as the frontend framework for a user-friendly web app in which users upload PDFs, interact with the system, and receive chatbot responses. A language model accessed via API powers the conversational AI; its responses draw on the text embeddings stored in FAISS and queried by similarity search. LangChain orchestrates the interactions between the AI model, memory, and retrieval systems, while PyPDF2 extracts text from the PDFs and dotenv manages environment variables. The chatbot uses OpenAI embeddings for text conversion and ConversationBufferMemory to maintain context throughout user interactions.

Keywords : PDF Interaction, Conversational AI, NLP, Text Extraction, Semantic Search, Intelligent Search, PyPDF2, User-Friendly Interface, Document Analysis, Information Retrieval.
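
The chunk-embed-retrieve flow described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names (`chunk_text`, `embed`, `cosine`, `top_k`) are hypothetical, a bag-of-words count stands in for the OpenAI embeddings, and a linear scan stands in for the FAISS index. The chunk size and overlap values are likewise assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a model embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query (linear scan, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    notes = [
        "photosynthesis converts light into chemical energy",
        "the french revolution began in 1789",
        "mitochondria produce energy for the cell",
    ]
    print(top_k("when did the french revolution begin", notes, k=1))
```

In a full system along the lines the abstract describes, the retrieved chunks would be passed to the language model together with the user's question and the stored conversation history; the sketch stops at retrieval.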


