Skip to content

Latest commit

 

History

History
35 lines (24 loc) · 2.66 KB

README.md

File metadata and controls

35 lines (24 loc) · 2.66 KB

DocQuery

doc-query is a sophisticated AI-powered document query system designed to facilitate seamless interaction with PDF documents. Users can upload PDF files and ask questions related to the content of those documents. Leveraging state-of-the-art technologies, the system provides real-time responses to user queries, ensuring an intuitive and efficient user experience.

How It Works

  1. Upload Document: Users start by uploading a PDF document to the system. The document is stored securely in an Amazon S3 bucket.
  2. Indexing with OpenAI API: Upon upload, the system generates an index vector for the document using the OpenAI API. This index vector is a representation of the document's content and is crucial for accurate query responses.
  3. Storage in Pinecone Vector Database: The index vector is stored in the Pinecone vector database, ensuring efficient retrieval and comparison during query processing.
  4. User Query: Once the document is indexed, users can pose questions based on the content of the PDF. These queries are processed in real-time.
  5. Query Processing: To answer the user's query, the system first identifies similarities between the query and the indexed documents stored in the Pinecone database. This narrowed-down set of documents serves as the context for the subsequent step.
  6. AI Response Generation: The system utilizes a GPT model, taking into account the identified context (similar documents), previous conversations, and the original user query. The model generates a response tailored to the user's query and context.
  7. Storage and Presentation: The generated response from the AI model is stored in the database for future reference and is promptly displayed to the user. This ensures a seamless and efficient interaction flow.

Features

  • Real-time AI chatbot functionality for querying PDF documents.
  • Utilization of advanced AI technologies for accurate and context-aware responses.
  • Modern user interface for an enhanced user experience.
  • Secure document storage using Amazon S3.
  • Efficient indexing and retrieval using Pinecone vector database.

Usage

  1. Upload: Upload your PDF document through the provided interface.
  2. Query: Ask questions related to the uploaded document.
  3. Receive Response: Get real-time responses generated by the AI model based on the document's content and context.

Feedback

We welcome any feedback or suggestions for improving the doc-query system. Feel free to open an issue on GitHub with your thoughts and ideas.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.