ml-advanced-classifiers/naive_bayes_vs_logistic_regression at master · autistic-symposium/ml-advanced-classifiers

This repository has been archived by the owner on Sep 19, 2024. It is now read-only.

Files

OUTPUT.txt
README.txt
lr.py
nb.py
run_problem_final.py
test.data
test.label
train.data
train.label

README.txt

The data for this problem is drawn from the 20 Newsgroups data
set. The training and test sets each contain 200 documents, 100 from
comp.sys.ibm.pc.hardware (label 0) and 100 from comp.sys.mac.hardware
(label 1). Each document is represented as a vector of word
counts.

The data consists of four files: train.data, train.label, test.data
and test.label. The .data files store sparse word count matrices whose
rows correspond to document_ids and whose columns correspond to
word_ids. Each line of a .data file gives the number of times one
word appeared in one document, in the following three-column
format:

<document_id> <word_id> <count>
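
As a minimal sketch of how the three-column format can be read, the
Python snippet below groups the triplets by document. The function name
parse_triplets and the sample lines are hypothetical and not part of
this repository. The snippet assumes whitespace-separated integer
fields and sums the counts if a (document_id, word_id) pair appears
more than once.

```python
from collections import defaultdict


def parse_triplets(lines):
    """Group '<document_id> <word_id> <count>' lines by document.

    Returns {document_id: {word_id: count}}. Hypothetical helper, not
    taken from the repository.
    """
    docs = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc_id, word_id, count = (int(field) for field in line.split())
        # Sum counts if the same (document, word) pair repeats.
        docs[doc_id][word_id] = docs[doc_id].get(word_id, 0) + count
    return dict(docs)


# Made-up sample lines, for illustration only.
sample = ["1 3 2", "1 7 1", "2 3 5"]
counts = parse_triplets(sample)
print(counts[1])  # word counts for document 1
```

Keeping only the nonzero counts per document mirrors the sparse layout
of the .data files, where words that do not occur in a document have no
line at all.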

The .label files simply list the class label for each document in
order. I.e., the first entry of train.label is the label for the first
document in train.data.

In Matlab, you can load the data set using the following code:

loaded_file = load('train.data');
training_data = spconvert(loaded_file);
training_labels = load('train.label');
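
A rough Python counterpart to the Matlab snippet, using only the
standard library, might look like the sketch below. load_dataset is a
hypothetical helper and not code from this repository (lr.py and nb.py
may load the data differently). It assumes integer labels, assumes that
document and word ids start at 1 (Matlab's spconvert requires positive
indices), and builds a dense list-of-lists matrix instead of a sparse
one.

```python
def load_dataset(data_path, label_path):
    """Return (matrix, labels) from a .data / .label file pair.

    matrix[i][j] is the count of word j + 1 in document i + 1.
    Hypothetical helper; assumes 1-based ids in the .data file.
    """
    with open(label_path) as f:
        labels = [int(token) for token in f.read().split()]

    triplets = []
    with open(data_path) as f:
        for line in f:
            if line.strip():
                doc_id, word_id, count = (int(x) for x in line.split())
                triplets.append((doc_id, word_id, count))

    n_docs = len(labels)  # one row per label
    if any(doc_id > n_docs for doc_id, _, _ in triplets):
        raise ValueError("a document_id has no matching label")

    # Column count comes from the largest word_id seen in this file.
    n_words = max((word_id for _, word_id, _ in triplets), default=0)
    matrix = [[0] * n_words for _ in range(n_docs)]
    for doc_id, word_id, count in triplets:
        # Shift 1-based ids to 0-based indices; sum repeated entries.
        matrix[doc_id - 1][word_id - 1] += count
    return matrix, labels
```

A call such as load_dataset("train.data", "train.label") would then
return the count matrix together with its labels. Because the column
count is taken from the largest word_id in each file, the training and
test matrices may differ in width and may need padding to a common
vocabulary size before use.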
