Data Reader

fast-mRMR uses its own binary, columnar format to simplify subsequent processing. To convert datasets from CSV to this format, we provide a Data Reader program. It reads a CSV file (with a header) and writes a binary file called "data.mrmr" (an example CSV file and an example mRMR file are included in the utils folder). Only the CPU and GPU versions need this format; the Spark version can read any dataset that Spark can read.
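To illustrate what "binary and columnar" means in practice, here is a minimal Python sketch of a row-oriented CSV being rewritten feature by feature. The layout used here (two little-endian 32-bit counts followed by one byte per value) is an assumption made for illustration, not the documented "data.mrmr" layout, and the function name `csv_to_columnar` is hypothetical. Use mrmr-reader itself to produce files for fast-mRMR.

```python
import csv
import io
import struct


def csv_to_columnar(csv_text):
    """Convert CSV text with a header row into a column-major byte buffer.

    Hypothetical layout (NOT the actual data.mrmr format): two little-endian
    uint32 values (sample count, feature count), then one byte per value,
    stored feature by feature.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    n_samples, n_features = len(samples), len(header)

    out = bytearray(struct.pack("<II", n_samples, n_features))
    for col in range(n_features):
        for row in samples:
            # Assumes every value is an integer category in the range 0..255.
            out.append(int(row[col]))
    return bytes(out)


demo = "a,b\n1,2\n3,4\n5,6\n"
buf = csv_to_columnar(demo)
print(list(buf[8:]))  # column-major order: [1, 3, 5, 2, 4, 6]
```

Storing all values of one feature contiguously lets a feature-selection pass scan a single feature without touching the others, which is the usual motivation for a columnar layout.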

Compilation

To compile the reader, use the example Makefile included in the same folder. It generates a binary called mrmr-reader (an example binary is also included).

cd fast-mRMR/utils/data-reader && make && chmod +x mrmr-reader

Example

The usage is as follows:

$ ./mrmr-reader
Usage: <inputfilename>

When the input file is passed as an argument, the program prints the following:

$ ./mrmr-reader data.csv 
...0 / 48
Readed Samples: 48
Last 15 samples ignored.
