A_Priori

This repository contains a Python implementation of the A-Priori algorithm proposed by Agrawal and Srikant (1994).

A-Priori Algorithm

Several applications (studying consumer behavior, plagiarism detection, etc.) require finding frequent item sets, that is, sets of items that appear together in at least a given number of baskets. One way to do this is to apply the algorithm of Agrawal and Srikant (1994).

The baskets of items are stored in a CSV file, one basket per line. Each line of such a file has the following structure:

item_1, item_2, ..., item_n

For example, one such file can be:

Cat, and, dog, bites
Yahoo, news, claims, a, cat, mated, with, a, dog, and, produced, viable, offspring
Cat, killer, likely, is, a, big, dog
Professional, free, advice, on, dog, training, puppy, training
Cat, and, kitten, training, and, behavior
Dog, &, Cat, provides, dog, training, in Eugene, Oregon
"Dog, and, cat", is, a, slang, term, used, by, police, officers, for, a, male-female, relationship
Shop, for, your, show, dog, grooming, and, pet, supplies
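
The sample above can be parsed with Python's standard csv module. The following is a minimal sketch, not the repository's actual code: the function name read_baskets and its numeric flag are hypothetical. It also assumes that string items are lower-cased and that a quoted field such as "Dog, and, cat" is kept as a single item, which is what the sample output further down suggests.

```python
import csv
import io


def read_baskets(lines, numeric=False):
    """Parse CSV lines into a list of baskets (sets of items).

    `lines` is any iterable of strings, e.g. an open file.
    Assumptions: string items are lower-cased, numeric items are integers.
    """
    # skipinitialspace=True drops the blank after each comma and lets
    # a quoted field that follows ", " be recognised as quoted.
    reader = csv.reader(lines, skipinitialspace=True)
    baskets = []
    for row in reader:
        items = [field.strip() for field in row if field.strip()]
        if numeric:
            basket = {int(item) for item in items}
        else:
            basket = {item.lower() for item in items}
        if basket:  # ignore empty lines
            baskets.append(basket)
    return baskets


sample = io.StringIO(
    "Cat, and, dog, bites\n"
    '"Dog, and, cat", is, a, slang, term\n'
)
baskets = read_baskets(sample)
```

Here the second basket contains the single item 'dog, and, cat' rather than three separate items, because the field is quoted.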

This program reads a file like the one above, finds the frequent item sets according to the support threshold given by the user, and writes the results in CSV format.
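
To illustrate the idea of level-wise frequent item set search, here is a minimal sketch of the A-Priori approach. It is not the code in src: the function name apriori and the data in example_baskets are made up for illustration. It relies on the property that every subset of a frequent item set must itself be frequent, so candidates of size k are only built from frequent sets of size k - 1.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return one dict per item-set size, mapping item set (tuple) -> count.

    `min_support` is an absolute number of baskets.
    Items must be mutually comparable (all numbers or all strings).
    """
    baskets = [frozenset(b) for b in baskets]

    # Pass 1: count single items.
    counts = {}
    for basket in baskets:
        for item in basket:
            counts[(item,)] = counts.get((item,), 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}

    levels = []
    k = 1
    while frequent:
        levels.append(dict(sorted(frequent.items())))
        k += 1
        previous = set(frequent)
        items = sorted({item for itemset in previous for item in itemset})
        # Keep a k-item candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in combinations(items, k)
            if all(sub in previous for sub in combinations(c, k - 1))
        ]
        # Pass k: count how many baskets contain each candidate.
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for c in candidates:
                if basket.issuperset(c):
                    counts[c] += 1
        frequent = {c: n for c, n in counts.items() if n >= min_support}
    return levels


# Illustrative data only, not necessarily the contents of any example file.
example_baskets = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
levels = apriori(example_baskets, min_support=2)
for level in levels:
    print(level)
```

With a support of 2, item 4 is dropped after the first pass, so no candidate containing it is ever counted; this pruning is what keeps the later passes small.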

The script can be called like this:

python a_priori.py [-n] [-p] [-o OUTPUT] support filename

The parameter -n is optional. If given, the program treats the items as numbers rather than strings.
The parameter -p is optional. If given, the program interprets the minimum support, given through the support argument, as the percentage of baskets in which an item set must appear in order to be considered frequent.
The parameter -o OUTPUT is optional. If given, the program saves the results in the file OUTPUT. Otherwise, the results are printed to the screen.
The parameter support is required and is the minimum support that an item set must reach for the script to consider it frequent.
The parameter filename is required and is the name of the input file.
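
The command-line interface described above could be declared with argparse roughly as follows. This is a sketch under assumptions, not the script's actual parser: the names build_parser, min_count and the dest names are hypothetical, and the percentage is assumed to be given on a 0 to 100 scale and rounded up, which the description does not specify.

```python
import argparse
import math


def build_parser():
    """Build a parser matching: a_priori.py [-n] [-p] [-o OUTPUT] support filename."""
    parser = argparse.ArgumentParser(prog="a_priori.py")
    parser.add_argument("-n", dest="numeric", action="store_true",
                        help="treat items as numbers rather than strings")
    parser.add_argument("-p", dest="percentage", action="store_true",
                        help="interpret support as a percentage of the baskets")
    parser.add_argument("-o", dest="output", metavar="OUTPUT",
                        help="write results to OUTPUT instead of the screen")
    parser.add_argument("support", type=float, help="minimum support")
    parser.add_argument("filename", help="input CSV file")
    return parser


def min_count(support, n_baskets, percentage):
    """Convert the support argument to an absolute basket count.

    Assumption: with -p, support is a percentage between 0 and 100.
    """
    if percentage:
        return math.ceil(support * n_baskets / 100)
    return math.ceil(support)


args = build_parser().parse_args(["-n", "2", "a_priori_example_2.csv"])
```

Under these assumptions, a support of 50 with -p and 8 baskets corresponds to an absolute threshold of 4 baskets.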

For example, the program can be called in the following way:

python a_priori.py -n 2 a_priori_example_2.csv

Sample input files and the corresponding output of the script:

example_1.csv, support: 3, will produce:

('a',):3;('and',):4;('cat',):5;('dog',):6;('training',):3
('and', 'cat'):3;('and', 'dog'):3;('cat', 'dog'):4

example_2.csv, support: 2 (treating the items as numeric), will produce:

(1,):2;(2,):3;(3,):3;(5,):3
(1, 3):2;(2, 3):2;(2, 5):3;(3, 5):2
(2, 3, 5):2

example_3.csv, support: 2 (treating the items as numeric), will produce:

(1,):3;(2,):6;(3,):4;(4,):5
(1, 2):3;(1, 4):2;(2, 3):3;(2, 4):4;(3, 4):3
(1, 2, 4):2;(2, 3, 4):2
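
Judging from the outputs above, each line holds the frequent item sets of one size, written as a tuple followed by a colon and the count, with entries separated by semicolons. The sketch below reproduces that layout; the function name format_level is hypothetical and the script itself may build its output differently.

```python
def format_level(itemset_counts):
    """Format one level of results as 'itemset:count' entries joined by ';'.

    `itemset_counts` maps tuples of items to their basket counts.
    Entries are sorted by item set, as in the sample outputs.
    """
    return ";".join(
        f"{itemset!r}:{count}"
        for itemset, count in sorted(itemset_counts.items())
    )


pairs = {("cat", "dog"): 4, ("and", "cat"): 3, ("and", "dog"): 3}
line = format_level(pairs)
```

Using repr on the tuple is what yields the trailing comma in one-item sets such as (1,) and the quotes around string items.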

The full program's description (in Greek) can be found here.

nikiforosbotis/A_Priori
