
The corpus_search module

Mark Fullmer edited this page Oct 10, 2020 · 1 revision

The corpus_search module is the workhorse for indexing and displaying results based on user queries. It was developed as a custom search indexing approach after initial versions, built on Drupal's search_api and the ApacheSolr backend, proved insufficient for the tokenizing, result counts, and search queries required by the corpus interface.

This page summarizes the business logic of that module.

TextMetadata (src/TextMetadata)

The constant facetIDs defines which Drupal node field names, corresponding to tables in the database, should be available for querying and filtering. In each entry, the key is the Drupal field machine name without the field_ prefix, and the value is the table alias used when querying the database.

Making a new facet queryable and displayable in the front end simply requires adding it to the $facetIDs array.

The constant corpusSourceBundle defines the Drupal node type from which to query data. This defaults to text.
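
As a rough illustration of the shape described above, the two definitions might look something like the following. This is a sketch, not the module's actual source: the facet names and aliases (course, institution, genre, and their two-letter aliases) are hypothetical, and declaring them as static properties rather than class constants is an assumption based on the $facetIDs spelling used on this page.

```php
<?php

// Illustrative sketch only; not the module's actual source.
class TextMetadata {

  // Key: Drupal field machine name without the "field_" prefix.
  // Value: table alias used when querying the database.
  // All entries below are hypothetical examples.
  public static $facetIDs = [
    'course' => 'co',
    'institution' => 'it',
    // Exposing a new facet means adding one entry here;
    // "genre" stands in for a hypothetical new field.
    'genre' => 'ge',
  ];

  // Node type (bundle) from which data is queried; defaults to "text".
  public static $corpusSourceBundle = 'text';

}
```

Keeping the facet list in a single keyed array means the query and display code can iterate over it, which is why adding an entry is the only step needed to expose a new facet.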