hitblast/avro.py

avro.py

A modern Pythonic implementation of the popular Bengali phonetic-typing software Avro Phonetic.

⚡ Overview

avro.py provides a full-featured, batteries-included text parser that can convert English (Roman-script) text into its phonetic Bengali equivalent in Unicode, reverse Bengali text back into Roman script, and convert between Unicode and Bijoy Keyboard formats. At its core, it implements an extensively modified version of the Avro Phonetic Dictionary Search Library by Mehdi Hasan Khan.

The original project (pyAvroPhonetic) only works on Python versions up to 2.7 and lacks proper support for Python 3. Note that Python 2 has officially reached end of life, and its use is discouraged.

✨ Inspirations

This package is inspired by Rifat Nabi's jsAvroPhonetic library and derives from Kaustav Das Modak's pyAvroPhonetic.


🔨 Installation

This package requires Python 3.8 or higher in your development environment.

# Install / upgrade.
$ pip install avro.py

📦 ...or you can try the CLI!

avnie is a newly developed CLI tool that uses avro.py under the hood. You can install it using:

# Install / upgrade avnie.
$ pip install avnie

🔖 Usage Guide

This short tour shows how you can use avro.py to operate (cutlery!) on Bengali text, back and forth. You can also check the examples directory to see this whole snippet in action, along with other use cases.

1. parse()

Let's say we want to convert some English (Roman-script) text, "ami banglay gan gai.", to Bengali. We can use this snippet:

# Import the package.
import avro

# Our dummy text.
dummy = 'ami banglay gan gai.'

# Parsing the text.
avro_output = avro.parse(dummy)
print(avro_output)  # Output: আমি বাংলায় গান গাই।

2. parse(bijoy=True)

Alternatively, we can produce the output in Bijoy Keyboard format:

# Parsing in Bijoy.
bijoy_output = avro.parse(dummy, bijoy=True)  # Output: Avwg evsjvh় Mvb MvB।

3. to_bijoy()

Or, we can take the previous avro_output and convert it to Bijoy if we want to, like this:

# Converting to Bijoy.
bijoy_text = avro.to_bijoy(avro_output)  # Output: Avwg evsjvh় Mvb MvB।

4. to_unicode()

Conversely, we can take the Bijoy text we just got and convert it back to Unicode Bengali:

# Converting back!
unicode_text = avro.to_unicode(bijoy_text)  # Output: আমি বাংলায় গান গাই।

5. reverse()

Finally, we can just reverse back to the original text we passed as input in the first place:

# Reversing back!
reversed_text = avro.reverse(unicode_text)  # Output: ami banglay gan gai.
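
The five steps above chain naturally, so it can help to see them as a single pipeline. The following is only a minimal sketch: `run_pipeline` and the `Converter` alias are hypothetical names that are not part of avro.py, and the conversion functions are passed in as arguments so the helper itself runs without the package installed.

```python
from typing import Callable, Dict

# A converter takes one string and returns the converted string.
Converter = Callable[[str], str]


def run_pipeline(
    text: str,
    parse: Converter,
    to_bijoy: Converter,
    to_unicode: Converter,
    reverse: Converter,
) -> Dict[str, str]:
    """Chain the four conversions and keep every intermediate result.

    NOTE: `run_pipeline` is a hypothetical helper, not part of avro.py.
    """
    bengali = parse(text)         # Roman script -> Unicode Bengali
    bijoy = to_bijoy(bengali)     # Unicode Bengali -> Bijoy
    restored = to_unicode(bijoy)  # Bijoy -> Unicode Bengali
    roman = reverse(restored)     # Unicode Bengali -> Roman script
    return {
        "bengali": bengali,
        "bijoy": bijoy,
        "restored": restored,
        "roman": roman,
    }


# With avro.py installed, a call might look like this:
#
#   import avro
#   steps = run_pipeline(
#       "ami banglay gan gai.",
#       avro.parse, avro.to_bijoy, avro.to_unicode, avro.reverse,
#   )
#   for name, value in steps.items():
#       print(f"{name}: {value}")
```

Keeping every intermediate value makes it easy to spot which step changes the text if the final result differs from the input.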

🛠️ Contributing

Fork -> Do your changes -> Send a Pull Request, it's that easy!


Additional Developer Notes

In short, avro.py doesn't depend on any third-party libraries. However, if you'd like to contribute to the project, you'll need a handful of development tools.

Poetry is used to manage the project's dependencies and virtual environment. You can install it by following the instructions here. The dependencies are configured in the pyproject.toml file and don't require manual installation. Simply set up your developer environment using the following commands:

# Set up virtual environment and activate it.
$ python3 -m venv venv && source venv/bin/activate

# Setup project using Poetry.
$ make install  # same as `poetry install --sync --no-interaction`

# Perform updates on lockfile.
$ poetry update

Later, you can run the tests provided with the project and build its distribution packages using the following commands. The test option has already been configured in the "Testing" panel if you're using Visual Studio Code as your primary IDE.

# Run unit tests.
$ make test  # same as `poetry run pytest .`

# Build sdist and wheel packages for distributing the project.
$ make build  # same as `poetry build --verbose --no-interaction`

🐛 We're looking for bug hunters, by the way!

If you come across any kind of bug or wanna request a feature, please let us know by opening an issue here. We do need more ideas to keep the project alive and running, don't we? :P



👑 Acknowledgements

  • Mehdi Hasan Khan for originally developing and maintaining Avro Phonetic.
  • Rifat Nabi for porting it to JavaScript.
  • Sarim Khan for writing ibus-avro which helped to clarify my concepts further.
  • Kaustav Das Modak for porting Rifat Nabi's JavaScript iteration to Python 2.
  • Md Enzam Hossain for helping him understand the ins and outs of the Avro dictionary and the way it works.

📋 License

Licensed under the MIT License.