odanoburu/logica-tpc

Writing the queries of the TPC-H benchmark using Logica. See the blog post for background and more information. Do report any bugs (even documentation bugs), and contribute improvements!

DISCLAIMER: this is not a TPC benchmark, but a benchmark derived from it.

The sql directory has the SQL queries derived from the TPC-H benchmark, while the logica directory has the corresponding Logica translations. The src/benchmark.py script contains the code used to run the benchmark, and the schemas directory has the table schemas for the TPC-H test database (so far, only a SQLite version).

Reproducing

  1. Download the TPC-H tools archive and unpack it in the root directory of this repository, or else generate or download the data some other way.
    • The tpch-dbgen directory should be at the root directory of this repository.
    • Copy the makefile.suite file there to a file named makefile, and adjust it for your platform and database choice.
  2. To benchmark Logica with SQLite, run make TPC-H.db to generate the SQLite database file.
  3. Run make all to generate the SQL queries from the Logica files and then benchmark those against the original SQL queries from the TPC-H benchmark, generating build/benchmark.csv with the data.
  4. Run the results.awk script on the output CSV to generate the ratio statistic we reported in our blog post.
    awk -f results.awk build/benchmark.csv
        

Adding a new query

  1. Add the original SQL file to the sql directory.
  2. Add a corresponding Logica file (with .l extension) to the logica directory. Its top-level predicate should be named Query so that we can generate its SQL output automatically.
  3. Add the query to the Makefile under the QUERIES variable when it is ready to be benchmarked.
  4. Follow the steps for reproducing the benchmark in this README.
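
Before adding a query to the QUERIES variable, it can help to confirm that each SQL file has a Logica counterpart. The following is a minimal sketch, not part of this repository: the helper name unpaired_queries is hypothetical, and it assumes that paired files share the same base name (for example q-02.sql and q-02.l), which you should adapt if your files are named differently.

```python
from pathlib import Path


def unpaired_queries(sql_dir, logica_dir):
    """Return (sql_only, logica_only) base names.

    Assumption: a SQL query and its Logica translation share the same
    file stem, e.g. q-02.sql and q-02.l.
    """
    sql_names = {p.stem for p in Path(sql_dir).glob("*.sql")}
    logica_names = {p.stem for p in Path(logica_dir).glob("*.l")}
    return sorted(sql_names - logica_names), sorted(logica_names - sql_names)


if __name__ == "__main__":
    # Run from the root directory of the repository.
    sql_only, logica_only = unpaired_queries("sql", "logica")
    print("SQL files without a Logica file:", sql_only)
    print("Logica files without a SQL file:", logica_only)
```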

Benchmarking manually

Logica

To compile a Logica query to SQL, install Logica (see its repository for that) and run:

logica <query-file> print <query predicate>

You can also use Logica to run queries directly from the command line, or as a Python library.

By default, the main query predicate in this project should be named Query, so that we can build all Logica queries automatically with make.
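
If you prefer to compile the queries without make, the command above can be wrapped in a short script. This is a sketch under stated assumptions, not how the Makefile does it: the function names and the output directory are hypothetical, and it assumes that the logica executable is on your PATH and writes the generated SQL to standard output.

```python
import shutil
import subprocess
from pathlib import Path


def logica_print_command(query_file, predicate="Query"):
    """Build the `logica <query-file> print <query predicate>` command."""
    return ["logica", str(query_file), "print", predicate]


def compile_all(logica_dir, out_dir, predicate="Query"):
    """Compile every .l file in logica_dir to a .sql file in out_dir.

    Assumption: `logica ... print` writes the SQL to standard output.
    Returns the list of files written.
    """
    if shutil.which("logica") is None:
        raise RuntimeError("the logica executable was not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(Path(logica_dir).glob("*.l")):
        result = subprocess.run(
            logica_print_command(source, predicate),
            check=True,           # stop at the first query that fails
            capture_output=True,
            text=True,
        )
        target = out_dir / (source.stem + ".sql")
        target.write_text(result.stdout)
        written.append(target)
    return written
```

Calling compile_all("logica", "generated-sql") from the root directory would then write one SQL file per Logica file; the directory name generated-sql is only a placeholder.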

Benchmarking code

To benchmark the Logica-generated queries against the original SQL ones, use the Python module at src/benchmark.py. You can try:

python src/benchmark.py --check-equivalent --db-address ../TPC-H.db --dir-a ../queries/ --dir-b ../gen-queries/ -d sqlite3 q-02.sql

To see more options, run python src/benchmark.py --help.
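
To illustrate the idea behind checking equivalence and timing two queries, here is a self-contained sketch using Python's sqlite3 module. It is not the code in src/benchmark.py: the helper names, the tiny in-memory table, and the two example queries are assumptions made for illustration only.

```python
import sqlite3
import time
from collections import Counter


def run_timed(connection, sql):
    """Run one query and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = connection.execute(sql).fetchall()
    return rows, time.perf_counter() - start


def equivalent(rows_a, rows_b):
    """Compare two result sets as multisets, ignoring row order."""
    return Counter(rows_a) == Counter(rows_b)


# Tiny in-memory stand-in for the TPC-H database (illustrative data only).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE part (p_partkey INTEGER, p_size INTEGER)")
connection.executemany(
    "INSERT INTO part VALUES (?, ?)", [(1, 10), (2, 20), (3, 10)]
)

# Two differently written queries that should return the same rows.
query_a = "SELECT p_partkey FROM part WHERE p_size = 10"
query_b = (
    "SELECT p_partkey FROM part WHERE p_partkey IN "
    "(SELECT p_partkey FROM part WHERE p_size = 10) "
    "ORDER BY p_partkey DESC"
)

rows_a, seconds_a = run_timed(connection, query_a)
rows_b, seconds_b = run_timed(connection, query_b)
print("equivalent:", equivalent(rows_a, rows_b))
print("time ratio (b / a):", seconds_b / seconds_a if seconds_a else None)
```

Comparing results as multisets ignores row order, which is a simplification: for queries whose specification includes an ORDER BY clause, you may want to compare the row lists directly instead.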