[!138][RELEASE] Direct models for Simultaneous Speech Translation and Automatic Subtitling (IWSLT 2023)

# Which work do we release?
Models and inference code for FBK's participation in the IWSLT 2023 SimulST and Subtitling tasks.

# What changes does this release refer to?
Commit 3d1408f0affffd9e898689623120228fe020d9fd
sarapapi authored and mgaido91 committed Sep 27, 2023
1 parent 8cee29f commit 1300af2
Showing 5 changed files with 78 additions and 6 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -8,6 +8,7 @@ Dedicated README for each work can be found in the `fbk_works` directory.
- [[INTERSPEECH 2023] **AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation**](fbk_works/ALIGNATT_SIMULST_AGENT_INTERSPEECH2023.md)
- [[INTERSPEECH 2023] **Joint Speech Translation and Named Entity Recognition**](fbk_works/JOINT_ST_NER2023.md)
- [[ACL 2023] **Attention as a Guide for Simultaneous Speech Translation**](fbk_works/EDATT_SIMULST_AGENT_ACL2023.md)
- [[IWSLT 2023] **Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023**](fbk_works/IWSLT_2023.md)
- [**Reproducibility is Nothing Without Correctness: The Importance of Testing Code in NLP**](fbk_works/BUGFREE_CONFORMER.md)

### 2022
2 changes: 1 addition & 1 deletion fbk_works/ALIGNATT_SIMULST_AGENT_INTERSPEECH2023.md
@@ -28,7 +28,7 @@ The output will be saved in `--output`.

```bash
simuleval \
- --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/simul_offline_alignatt.py \
+ --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/v1_0/simul_offline_alignatt.py \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
4 changes: 2 additions & 2 deletions fbk_works/EDATT_SIMULST_AGENT_ACL2023.md
@@ -2,7 +2,7 @@
Code for the paper: ["Attention as a Guide for Simultaneous Speech Translation"](https://arxiv.org/pdf/2212.07850.pdf) published at ACL 2023.

## 📎 Requirements
- To run the agent, please make sure that [SimulEval v1.0.2](https://github.com/facebookresearch/SimulEval) is installed
+ To run the agent, please make sure that [SimulEval v1.0.2](https://github.com/facebookresearch/SimulEval) (commit [d1a8b2f](https://github.com/facebookresearch/SimulEval/commit/d1a8b2f0b13fe5204f3dcb4935cae9c73dbfc285)) is installed
and set `--port` accordingly.

## 📌 Pre-trained offline models
@@ -20,7 +20,7 @@ The output will be saved in `--output`.

```bash
simuleval \
- --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/simul_offline_edatt.py \
+ --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/v1_0/simul_offline_edatt.py \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
71 changes: 71 additions & 0 deletions fbk_works/IWSLT_2023.md
@@ -0,0 +1,71 @@
# Direct Models for Simultaneous Translation and Automatic Subtitling (IWSLT2023)
Models and inference scripts for the paper: [Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023](https://aclanthology.org/2023.iwslt-1.11/).

## 💬 Simultaneous Speech Translation

We release the offline ST model used in FBK's participation in the Simultaneous Speech Translation task: [**model folder**](https://fbk-my.sharepoint.com/:f:/g/personal/spapi_fbk_eu/EnnwDZFnXJdNjlhrKPqtNm8BHPz2d0E316Pp-yBy-dBpTg?e=Vhdvaw).

### 🤖 Inference with AlignAtt and EDAtt
Please install [SimulEval v1.1.0](https://github.com/facebookresearch/SimulEval/) (commit [3c19e1c](https://github.com/facebookresearch/SimulEval/commit/3c19e1c5e5deee043ab938d9b51996d5578b626c)) to run the evaluation.
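
One possible way to pin the installation to that exact commit is pip's VCS requirement syntax (`git+<url>@<commit>`). The sketch below only builds the requirement string; the helper name `simuleval_requirement` is hypothetical, and the install line itself is left commented out because it needs network access and `git`.

```bash
# Repository and full commit hash, taken from the links above.
SIMULEVAL_REPO="https://github.com/facebookresearch/SimulEval.git"
SIMULEVAL_COMMIT="3c19e1c5e5deee043ab938d9b51996d5578b626c"

# Hypothetical helper: build a pip VCS requirement string from a repo URL and a commit.
simuleval_requirement() {
  printf 'git+%s@%s\n' "$1" "$2"
}

# Show what would be installed.
simuleval_requirement "$SIMULEVAL_REPO" "$SIMULEVAL_COMMIT"

# Uncomment to install (requires network access and git):
# pip install "$(simuleval_requirement "$SIMULEVAL_REPO" "$SIMULEVAL_COMMIT")"
```

Pinning the full hash rather than a branch name keeps the environment reproducible even if the upstream repository moves on.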

#### 📌 AlignAtt
Set the parameters as described in the [AlignAtt README](fbk_works/ALIGNATT_SIMULST_AGENT_INTERSPEECH2023.md) and
run the following command:
```bash
simuleval \
--agent-class examples.speech_to_text.simultaneous_translation.agents.v1_1.simul_offline_alignatt.AlignAttSTAgent \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
--config config_simul.yaml \
--model-path ${ST_SAVE_DIR}/avg7.pt --prefix-size 1 --prefix-token "nomt" \
--extract-attn-from-layer 3 --frame-num $FRAMES \
--source-segment-size 1000 \
--device cuda:0 \
--quality-metrics BLEU --latency-metrics LAAL AL ATD --computation-aware \
--output ${OUT_DIR}
```
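
The command above relies on several shell variables (`SRC_LIST_OF_AUDIO`, `TGT_FILE`, `DATA_ROOT`, `ST_SAVE_DIR`, `FRAMES`, `OUT_DIR`). A minimal sketch of setting and checking them before launching the evaluation is given here; every value is a placeholder to be replaced with your own, and `require_vars` is an illustrative helper, not part of the repository.

```bash
# Hypothetical helper: return non-zero if any of the named variables is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Placeholder values (assumptions); replace them with your own paths and settings.
SRC_LIST_OF_AUDIO="data/src_audio_list.txt"
TGT_FILE="data/reference.txt"
DATA_ROOT="data"
ST_SAVE_DIR="checkpoints"
FRAMES=2
OUT_DIR="out/alignatt_f${FRAMES}"

require_vars SRC_LIST_OF_AUDIO TGT_FILE DATA_ROOT ST_SAVE_DIR FRAMES OUT_DIR \
  && echo "all variables set"
```

Including the `--frame-num` value in the output directory name is one way to keep results from different latency settings apart.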

#### 📌 EDAtt
Set the parameters as described in the [EDAtt README](fbk_works/EDATT_SIMULST_AGENT_ACL2023.md) and
run the following command:
```bash
simuleval \
--agent-class examples.speech_to_text.simultaneous_translation.agents.v1_1.simul_offline_edatt.EDAttSTAgent \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
--config config_simul.yaml \
--model-path ${ST_SAVE_DIR}/avg7.pt --prefix-size 1 --prefix-token "nomt" \
--extract-attn-from-layer 3 --frame-num 2 --attn-threshold ${ALPHA} \
--source-segment-size 1000 \
--device cuda:0 \
--quality-metrics BLEU --latency-metrics LAAL AL ATD --computation-aware \
--output ${OUT_DIR}
```
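
Since `--attn-threshold` is passed as `${ALPHA}`, the command above would presumably be run once per threshold to obtain different latency regimes. The loop below is a sketch of such a sweep under stated assumptions: the threshold values, the `OUT_ROOT` variable, and the `out_dir_for_alpha` helper are all hypothetical, and the loop only prints the planned runs instead of invoking `simuleval`.

```bash
# Hypothetical sweep values for --attn-threshold (not taken from the paper).
ALPHAS="0.6 0.4 0.2"

# Hypothetical helper: give each threshold its own output directory
# so that successive runs do not overwrite each other.
out_dir_for_alpha() {
  printf '%s/edatt_alpha%s\n' "$1" "$2"
}

for ALPHA in $ALPHAS; do
  OUT_DIR="$(out_dir_for_alpha "${OUT_ROOT:-out}" "$ALPHA")"
  echo "alpha=${ALPHA} -> ${OUT_DIR}"
  # Run the simuleval command shown above at this point,
  # with ${ALPHA} and ${OUT_DIR} set as in this loop.
done
```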

## 📺 Automatic Subtitling

We release the models used in FBK's participation in the Automatic Subtitling task:
- [**en-de model folder**](https://fbk-my.sharepoint.com/:f:/g/personal/spapi_fbk_eu/Es7feuTJ0phEqt450DN7clYBa_GdFfoZxpL5rBf-ix4ubQ?e=fxb01K)
- [**en-es model folder**](https://fbk-my.sharepoint.com/:f:/g/personal/spapi_fbk_eu/Emn1YEgB2iBIq2LhMY4lNUcBnriFPTaUmHgWEXtJmM89xQ?e=UePzIQ)

For usage instructions, please refer to the [Direct Speech Translation for Automatic Subtitling README](fbk_works/DIRECT_SUBTITLING.md).

## 📍 Citation
```bibtex
@inproceedings{papi-etal-2023-direct,
title = "Direct Models for Simultaneous Translation and Automatic Subtitling: {FBK}@{IWSLT}2023",
author = "Papi, Sara and
Gaido, Marco and
Negri, Matteo",
booktitle = "Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)",
month = jul,
year = "2023",
address = "Toronto, Canada (in-person and online)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.iwslt-1.11",
doi = "10.18653/v1/2023.iwslt-1.11",
pages = "159--168",
}
```
6 changes: 3 additions & 3 deletions fbk_works/SIMULTANEOUS_OFFLINE_ST.md
@@ -2,7 +2,7 @@

Agent for the paper: [Does Simultaneous Speech Translation need Simultaneous Models?](https://arxiv.org/abs/2204.03783)

- To run the agent, please make sure that [SimulEval](https://github.com/facebookresearch/SimulEval) is installed and set `--port` accordingly.
+ To run the agent, please make sure that [SimulEval 1.0.2](https://github.com/facebookresearch/SimulEval) (commit [d1a8b2f](https://github.com/facebookresearch/SimulEval/commit/d1a8b2f0b13fe5204f3dcb4935cae9c73dbfc285)) is installed and set `--port` accordingly.

Set `--source`, `--target`, and `--config` as described in the [Fairseq Simultaneous Translation repository](https://github.com/facebookresearch/fairseq/blob/main/examples/speech_to_text/docs/simulst_mustc_example.md#inference--evaluation).
`--model-path` is the offline ST model checkpoint,
@@ -12,7 +12,7 @@ The simultaneous output will be saved in `--output`.
## Fixed Word Detection
```bash
simuleval \
- --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/simul_offline_waitk.py \
+ --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/v1_0/simul_offline_waitk.py \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
@@ -28,7 +28,7 @@ simuleval \
## Adaptive Word Detection
```bash
simuleval \
- --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/simul_offline_waitk.py \
+ --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/v1_0/simul_offline_waitk.py \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \