Data analysis of the Wine dataset, focusing on data preprocessing to build an optimal classifier using K-means.
First we will look in depth at the basic information about the dataset. Sometimes a dataset comes with useful information about itself, which saves us time. Then we will perform a statistical analysis of the data and its general characteristics, so that we can treat it with the appropriate methods: cleaning out bad features, bad examples, and redundant information, all of which can have a big impact on the performance of our classifier.
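As a minimal sketch of what checking for "redundant information" could look like, the snippet below flags pairs of numeric features whose absolute Pearson correlation exceeds a threshold. The function name `highly_correlated_pairs`, the 0.9 threshold, and the toy data are assumptions made for illustration; they are not taken from the analysis itself.

```python
import pandas as pd


def highly_correlated_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Return (feature_a, feature_b, |r|) for every pair of numeric columns
    whose absolute Pearson correlation is greater than `threshold`."""
    corr = df.select_dtypes(include="number").corr().abs()
    cols = corr.columns
    pairs = []
    # Walk the upper triangle only, so each pair is reported once.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if r > threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


# Toy data, made up for illustration only: "b" is an exact multiple of "a".
toy = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.0],
        "c": [4.0, 1.0, 3.0, 2.0],
    }
)
pairs = highly_correlated_pairs(toy)
print(pairs)
```

A pair flagged this way is only a candidate for removal; whether to drop one of the two features still depends on what they mean in the dataset.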
We will do this analysis using Python and Jupyter notebooks, with the most commonly used libraries: NumPy, Matplotlib, pandas, and seaborn.
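A first look at the basic information of the dataset might be sketched as follows. This assumes the "Wine dataset" is the UCI Wine data bundled with scikit-learn and loads it through `sklearn.datasets.load_wine`; if the data comes from another source (for example a CSV file), only the loading step would change.

```python
import pandas as pd
from sklearn.datasets import load_wine

# Assumption: the Wine dataset is the copy shipped with scikit-learn.
wine = load_wine(as_frame=True)
df = wine.frame  # feature columns plus a "target" column with the class label

print(df.shape)                      # number of examples and columns
df.info()                            # column names, dtypes, non-null counts
print(df.describe().T)               # per-feature summary statistics
print(df.isna().sum().sum())         # total number of missing values
print(df["target"].value_counts())   # class balance
```

The dataset's own description is available as `wine.DESCR`, which is the kind of built-in information that can save time before any analysis starts.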