[!88][DEV] Refactor to isolate libraries specific for speech translation
Since people may need to work on projects unrelated to speech translation,
we want to keep dependencies that are needed only for speech translation
out of the main setup file.
At the same time, we do not want to make installation harder for speech projects.

The patch creates a dedicated requirements file (`speech_requirements.txt`) that contains the dependencies specific to ST projects and
moves the speech-only libraries to the new requirements file.

Existing CI
mgaido91 committed Jul 10, 2023
1 parent 6dc8ee1 commit adeebdd
Showing 7 changed files with 20 additions and 8 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -37,7 +37,7 @@ jobs:
 git submodule update --init --recursive
 python setup.py build_ext --inplace
 python -m pip install --editable .
-pip install torchaudio
+pip install -r speech_requirements.txt
 curdir=$(pwd) && cd ..
 git clone https://github.com/facebookresearch/SimulEval.git/
 cd SimulEval
2 changes: 1 addition & 1 deletion .gitlab-ci.yml
@@ -19,7 +19,7 @@ before_script:
   - virtualenv venv
   - source venv/bin/activate
   - pip install -e .
-  - pip install torchaudio
+  - pip install -r speech_requirements.txt
   - curdir=$(pwd) && cd ..
   - rm -rf SimulEval
   - git clone https://github.com/facebookresearch/SimulEval.git/
2 changes: 2 additions & 0 deletions FBK_HOW_TO_WORK.md
@@ -13,6 +13,8 @@ To start working on this code, first download the repository with `git clone`.
 The master branch containing the up-to-date FBK MT Fairseq internal version is `internal_master`,
 so you can access it entering into the cloned folder and running the command `git checkout internal_master`.
 To install the repository, run `pip install -e .`.
+If you plan to work on speech translation, complete the setup of your environment
+by installing the required dependencies with `pip install -r speech_requirements.txt`.
 We recommend installing the repository in a dedicated python virtual environment,
 which you can create with PyCharm when importing the project or on command line.
 In alternative, you can create a dedicated Anaconda environment.
8 changes: 8 additions & 0 deletions README.md
@@ -25,6 +25,14 @@ Dedicated README for each work can be found in the `fbk_works` directory.
 If using this repository, please acknowledge the related paper(s) citing them.
 Bibtex citations are available for each work in the dedicated README file.

+To install the repository, do:
+
+```
+pip install -e .
+pip install -r speech_requirements.txt # required for speech translation
+```
+
+
 Below, there is the original Fairseq README file.

 --------------------------------------------------------------------------------
9 changes: 5 additions & 4 deletions examples/speech_to_text/scripts/from_srt_to_blocks.py
@@ -13,9 +13,9 @@
 # limitations under the License

 try:
-    import pysrt
+    import srt
 except ImportError:
-    print("Please install pysrt 'pip install pysrt'")
+    print("Please install srt 'pip install srt'")
     raise ImportError
 import re
 import sys
@@ -47,11 +47,12 @@ def main():
     and each newline inside that block will be substituted by an <eol>.
     """
     srt_path = sys.argv[1]
-    subs = pysrt.open(srt_path)
+    with open(srt_path) as f:
+        subs = list(srt.parse(f))

     with open(srt_path + ".blocks", 'w') as fp:
         for sub in subs:
-            fp.write("%s\n" % add_eol_eob(sub.text))
+            fp.write("%s\n" % add_eol_eob(sub.content))


 if __name__ == "__main__":
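
The body of `add_eol_eob` is not part of this diff, so the following is only a rough sketch of the transformation its docstring describes, not the repository's implementation. The function name `add_eol_eob_sketch`, the trailing `<eob>` marker (inferred from the function name), and the exact spacing around the markers are all assumptions made for illustration.

```python
def add_eol_eob_sketch(text):
    """Hypothetical stand-in for add_eol_eob (the real body is not shown in the diff).

    Replaces every newline inside a subtitle block with " <eol> " and
    appends " <eob>" to mark the end of the block.
    """
    # Strip surrounding whitespace, then trim each line of the block.
    lines = [line.strip() for line in text.strip().split("\n")]
    # Join the lines with the end-of-line marker and close the block.
    return " <eol> ".join(lines) + " <eob>"


if __name__ == "__main__":
    # A two-line subtitle block, as it could appear in `sub.content`.
    print(add_eol_eob_sketch("Hello there,\nhow are you?"))
```

With the `srt` library, the string passed in would be the `content` attribute of each parsed subtitle, which takes the place of the `text` attribute used with `pysrt`.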
2 changes: 0 additions & 2 deletions setup.py
@@ -192,8 +192,6 @@ def do_setup(package_data):
     "sacrebleu>=1.4.12",
     "torch",
     "tqdm",
-    "ctc_segmentation",
-    "pysrt"
 ],
 dependency_links=dependency_links,
 packages=find_packages(
3 changes: 3 additions & 0 deletions speech_requirements.txt
@@ -0,0 +1,3 @@
+torchaudio
+ctc_segmentation
+srt

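Since the speech libraries are no longer installed by `pip install -e .`, a script that needs them may want to check for them up front. The snippet below is a minimal sketch of such a check and is not part of this commit: the helper name `missing_modules` and the `SPEECH_MODULES` tuple are hypothetical, and it assumes that the import name of each package matches its name in `speech_requirements.txt`.

```python
from importlib.util import find_spec

# Import names of the packages listed in speech_requirements.txt.
# Assumption: for these three, the pip name and the import name coincide.
SPEECH_MODULES = ("torchaudio", "ctc_segmentation", "srt")


def missing_modules(modules=SPEECH_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing speech dependencies: %s" % ", ".join(missing))
        print("Install them with 'pip install -r speech_requirements.txt'")
    else:
        print("All speech dependencies are available")
```

Unlike the `try`/`except ImportError` guard in `from_srt_to_blocks.py`, this check does not import the modules, so it reports every missing package at once instead of stopping at the first failure.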