A basic chatbot trained on Indian Army details from Wikipedia.
This repository contains code for processing a text file of information about the Indian Army. The text is tokenized into sentences and words for further analysis.
numpy
nltk
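string (Python standard library, no installation needed)
random (Python standard library, no installation needed)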
Clone this repository to your local machine: git clone https://github.com/YOUR-USERNAME/Indian-Army-Text-Processing.git
Open the repository using your preferred Python IDE
Run the cells in order, from top to bottom
The first cell imports the necessary libraries
The second cell opens the text file 'Indian Army.txt'
The third cell reads the content of the text file
The fourth cell pre-processes the text: it converts everything to lowercase, downloads the required NLTK packages, and tokenizes the text into sentences and words
The final cell prints the first 5 sentence tokens as a quick check
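
The cells described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (read_corpus, preprocess, nltk_tokenizers) are hypothetical, and the UTF-8 encoding, the errors="ignore" setting, and the choice of the Punkt resources are assumptions. The tokenizers are passed in as arguments so that the helper can be tried out even before the NLTK data has been downloaded.

```python
from typing import Callable, List, Tuple

Tokenizer = Callable[[str], List[str]]


def read_corpus(path: str) -> str:
    """Open the text file and return its full content as one string."""
    # Assumption: the file is UTF-8; bytes that cannot be decoded are skipped.
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        return f.read()


def preprocess(
    raw: str, sent_tok: Tokenizer, word_tok: Tokenizer
) -> Tuple[List[str], List[str]]:
    """Lowercase the text, then split it into sentence and word tokens."""
    text = raw.lower()
    return sent_tok(text), word_tok(text)


def nltk_tokenizers() -> Tuple[Tokenizer, Tokenizer]:
    """Return NLTK's sentence and word tokenizers, fetching Punkt data first."""
    import nltk  # imported here so the helpers above work without NLTK

    # Assumption: these are the packages the notebook downloads.
    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)  # looked up by newer NLTK releases
    return nltk.sent_tokenize, nltk.word_tokenize


# Typical use (needs 'Indian Army.txt' and network access for the download):
#   sent_tok, word_tok = nltk_tokenizers()
#   sent_tokens, word_tokens = preprocess(
#       read_corpus("Indian Army.txt"), sent_tok, word_tok
#   )
#   print(sent_tokens[:5])  # first 5 sentence tokens, as in the final cell
```

Lowercasing before tokenizing keeps later word matching case-insensitive, which is likely why the notebook does it first.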
Contributions are welcome. Feel free to open an issue or make a pull request.