Ahwar/NER-NLP-with-ONNX-Java

Run a Hugging Face NER (NLP) Model in Java Using ONNX Runtime and DJL

A Natural Language Processing (NLP) Java application that detects names, organizations, and locations in text by running Hugging Face's RoBERTa NER model with ONNX Runtime and the Deep Java Library (DJL).

Installation

Open the project folder in a Java IDE with Gradle support (IntelliJ IDEA Community is recommended), then build the project.

Requirements

  1. Java Development Kit (JDK) version 17
  2. Gradle version 8.9
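
To confirm which JDK will actually run the project, a quick check like the one below can help. This is a minimal sketch and not part of the repository: the class name `JdkCheck` is hypothetical, and treating 17 as a minimum rather than an exact requirement is an assumption, since the list above only names version 17.

```java
// Hypothetical helper, not part of the repository: reports the running JDK version.
final class JdkCheck {
    /** Returns true when the feature version is at least 17 (assumed to be a minimum). */
    static boolean meetsMinimum(int featureVersion) {
        return featureVersion >= 17;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() yields the major release number, e.g. 17 for JDK 17.0.x.
        int feature = Runtime.version().feature();
        System.out.println("Running on Java " + feature
                + (meetsMinimum(feature) ? " (OK)" : " (JDK 17 is required)"));
    }
}
```

Saved as `JdkCheck.java`, it can be launched directly with `java JdkCheck.java`. The Gradle version can be checked separately with `gradle --version`.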

Download Files

These files are required to run the project:

  1. ONNX model
  2. tokenizer.json file

Convert the ONNX Model

To convert the Hugging Face NER model to ONNX, open this Google Colaboratory notebook and run the code cells in order, as shown in the image below, following every step.

[Image: running the Colab code cell]

(The same code is also available in the Jupyter notebook file convert Huggingface model to ONNX.ipynb, which you can run locally with Jupyter Notebook.)

After running either notebook, the ONNX model is saved in the onnx/ folder.

Download tokenizer.json

The tokenizer.json file comes from this Hugging Face repository. Download tokenizer.json from this link.

Move Files

Copy the ONNX model and the tokenizer.json file obtained in the steps above into the raw-files directory, as shown in the image below.

[Image: raw-files path]
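
Before building, it may be worth verifying that both files landed in the right place. The sketch below is an illustration only: `RawFilesCheck` is a hypothetical class, and `model.onnx` is a placeholder file name, since the actual name of the exported model depends on the conversion step.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the repository.
final class RawFilesCheck {
    /** Returns the names from `required` that are not regular files inside `dir`. */
    static List<String> missing(Path dir, List<String> required) {
        List<String> absent = new ArrayList<>();
        for (String name : required) {
            if (!Files.isRegularFile(dir.resolve(name))) {
                absent.add(name);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        // "model.onnx" is a placeholder; use the file name your conversion produced.
        List<String> absent = missing(Path.of("raw-files"),
                List.of("model.onnx", "tokenizer.json"));
        System.out.println(absent.isEmpty() ? "All files present" : "Missing: " + absent);
    }
}
```

Run it from the project root so that the relative `raw-files` path resolves correctly.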

Building the Project

Build the project using the button shown below.

[Image: how to build the project]

Run the Code

Open the Main.java file and click the play button highlighted by the red box in the image below.

[Image: how to run the project]
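
For readers curious about how raw model output becomes names, organizations, and locations, the sketch below shows one common post-processing approach for token-classification models. It is not taken from the repository's Main.java. It assumes BIO-style labels such as `B-PER`, `I-ORG`, and `O` (the real label set comes from the model's configuration) and word-level tokens, whereas a RoBERTa tokenizer produces subword pieces that would need merging first. The class name `BioDecoder` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of NER post-processing; labels and tokens are assumptions.
final class BioDecoder {
    /** Index of the largest score in one token's row of logits. */
    static int argmax(float[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Groups per-token BIO labels into "TYPE: text" strings. */
    static List<String> decode(String[] tokens, String[] labels) {
        List<String> entities = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        String type = null; // type of the entity currently being built, if any
        for (int i = 0; i < tokens.length; i++) {
            String label = labels[i];
            if (type != null && label.equals("I-" + type)) {
                text.append(' ').append(tokens[i]); // extend the open entity
                continue;
            }
            if (type != null) {
                entities.add(type + ": " + text); // close the open entity
            }
            if (label.equals("O")) {
                type = null;
            } else {
                type = label.substring(2); // strip the "B-" or "I-" prefix
                text = new StringBuilder(tokens[i]);
            }
        }
        if (type != null) {
            entities.add(type + ": " + text);
        }
        return entities;
    }
}
```

In this scheme, `argmax` picks the most likely label index for each token, the index is mapped to a label string, and `decode` merges consecutive tokens that belong to the same entity. A stray `I-` label with no open entity of that type is treated as the start of a new entity, which is a design choice of this sketch.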