A repo that demonstrates how to build a BlackLab corpus with Docker and Nginx.

Taiwan-Social-Media-Corpus/blacklab-demo

blacklab-demo

This repository demonstrates how to build a BlackLab corpus with Docker and Nginx.

Overview of the Architecture

flowchart LR
subgraph Internet
    D[Client]
end

subgraph DOCKER [Docker]

  D --> N{"Load Balancer <br/> (Nginx)"}

  N <==> B["/corpus-frontend"]
  N <==> C["/blacklab-server"]


  C <==> I
  B <==> C


  subgraph Indexes
        I[("<div style='padding: 0rem 0.5rem;'>Indexes  <br/>(by Indexer)</div>  ")]
  end


end

Documentation

1. Installation

Clone the repository and change into the project directory:

git clone git@github.com:PTT-Corpus/blacklab-demo.git && cd blacklab-demo

2. Create indexes for the BlackLab server

To index your data, add your XML files to the data folder (./indexer/data).

deployment\
 |-- ...
indexer\
 |-- formats\                  # custom blacklab index format
 |-- data\
 |   |-- dcard_mock_data.xml   # dcard mock data
 |   |-- ptt_mock_data.xml     # ptt mock data
 |-- ...
server\
 |-- ...

We assume here that you are familiar with the BlackLab indexing process; see the BlackLab documentation on indexing ("Indexing with BlackLab") to learn more.
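Before building, it can help to confirm which XML files the indexer will see. The following is a minimal sketch, not part of this repository: `list_xml` is a hypothetical helper name, and `DATA_DIR` is an assumed override that defaults to the ./indexer/data folder described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): list the XML files
# currently present in the indexer's data folder.
# DATA_DIR is an assumed override; it defaults to ./indexer/data.
list_xml() {
    data_dir="${DATA_DIR:-./indexer/data}"
    # Only regular files ending in .xml, sorted for stable output.
    find "$data_dir" -type f -name '*.xml' | sort
}
```

Running `list_xml` from the project root should include the two mock files shown in the tree above, along with any XML files you have added.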

3. Build the server

To build the server for the first time:

docker compose up

Your index should now be accessible at http://localhost/corpus-frontend/.
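The containers may take a while to start, so the page might not respond immediately. As a rough sketch, assuming `curl` is installed on the host, a small polling function can wait until the URL answers. `wait_for_url` is a hypothetical name, and the retry count and 2-second delay are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it returns a successful
# HTTP status, or give up after a number of attempts (default 30).
wait_for_url() {
    url="$1"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f: treat HTTP errors as failures, -s: silent,
        # -o /dev/null: discard the response body.
        if curl -fs -o /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "gave up waiting for $url" >&2
    return 1
}
```

For example, `wait_for_url http://localhost/corpus-frontend/` waits for the frontend. The architecture diagram suggests the server is proxied under /blacklab-server, so the same check could presumably be pointed at that path, although the exact URL is an assumption.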

Once the build succeeds, a blacklab-indexes folder is generated and used by the BlackLab server (that is, by its Docker container).

After that, if you want the BlackLab server to pick up new indexes, first stop it with:

docker compose down

Then add your new xml files to the folder ./indexer/data, and run:

docker compose up
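The stop, add files, restart cycle above can be wrapped in one function. This is only a sketch under stated assumptions: `reindex_with` is a hypothetical name, and `DOCKER` and `DATA_DIR` are assumed overrides (not settings defined by this repository) that make a dry run possible.

```shell
#!/bin/sh
# Hypothetical wrapper around the cycle described above:
# stop the stack, copy new XML files in, start the stack again.
# DOCKER and DATA_DIR are assumed overrides, e.g. DOCKER=echo for a dry run.
reindex_with() {
    if [ "$#" -eq 0 ]; then
        echo "usage: reindex_with FILE.xml [FILE.xml ...]" >&2
        return 2
    fi
    docker_cmd="${DOCKER:-docker}"
    data_dir="${DATA_DIR:-./indexer/data}"
    "$docker_cmd" compose down || return 1   # stop the running containers
    cp "$@" "$data_dir/" || return 1         # add the new XML files
    "$docker_cmd" compose up                 # start again (runs in the foreground)
}
```

Setting `DOCKER=echo` before calling `reindex_with new_data.xml` prints the compose commands instead of running them, which is a cheap way to check the file copy first (`new_data.xml` is a placeholder file name).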

Contact Me

If you have any suggestions or questions, please do not hesitate to email me at philcoke35@gmail.com.
