Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output word timings #1974

Merged
merged 14 commits into from
Mar 30, 2019
Merged

Output word timings #1974

merged 14 commits into from
Mar 30, 2019

Conversation

dabinat
Copy link
Collaborator

@dabinat dabinat commented Mar 21, 2019

Letter timings are exposed on the API and the client then converts these into word-level timings. The client outputs CSV when used with the -e command-line argument.

@ghost
Copy link

ghost commented Mar 21, 2019

No Taskcluster jobs started for this pull request
The `allowPullRequests` configuration for this repository (in `.taskcluster.yml` on the
default branch) does not allow starting tasks for this pull request.

@lissyx
Copy link
Collaborator

lissyx commented Mar 21, 2019

@dabinat @reuben I triggered it on taskcluster:

$ git fetch upstream pull/1974/head:pr-1974
$ git push origin pr-1974

Then open PR and you can close it after.

@dabinat
Copy link
Collaborator Author

dabinat commented Mar 21, 2019

Ok, done: #1976

@lissyx
Copy link
Collaborator

lissyx commented Mar 21, 2019

Ok, done: #1976

Sorry, I was just documenting for posterity, not asking you to do it :)

native_client/client.cc Outdated Show resolved Hide resolved
native_client/client.cc Outdated Show resolved Hide resolved
native_client/client.cc Outdated Show resolved Hide resolved
native_client/client.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@reuben reuben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

native_client/deepspeech.h Outdated Show resolved Hide resolved
@dabinat
Copy link
Collaborator Author

dabinat commented Mar 22, 2019

I still can't get my branch to rebase so let me know when it's ready to be committed to master and I'll create a new branch and PR.

@reuben
Copy link
Contributor

reuben commented Mar 22, 2019

No need to rebase, extra commits fixing review comments are fine. We just need @kdavis-mozilla's review now.

native_client/client.cc Outdated Show resolved Hide resolved
native_client/client.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@kdavis-mozilla kdavis-mozilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've several questions/comments. See the inline comments.

native_client/client.cc Outdated Show resolved Hide resolved
{
std::vector<meta_word> word_list;

std::string word = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a UTF-8 string? If so it should start with a u8.

If not, then I'm wondering how general non-ASCII strings are represented in word.

word_list.push_back(w);

// Reset
word = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here about UTF-8 vs non-UTF-8 string.

native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Show resolved Hide resolved
{
vector<Output> out = ModelState::decode_raw(logits);

return strdup(alphabet->LabelsToString(out[0].tokens).c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know strdup() was used previously, but I worry that passing this over the ABI boundary and asking the client to free it will cause crashes if the client uses a different C runtime library.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this requires more changes to the API, so could it be left for a follow up PR? I filed #1979.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

native_client/deepspeech.cc Outdated Show resolved Hide resolved
native_client/client.cc Show resolved Hide resolved
native_client/client.cc Show resolved Hide resolved
@dabinat
Copy link
Collaborator Author

dabinat commented Mar 23, 2019

I've made the changes requested. Let me know if you find anything else.

Copy link
Contributor

@kdavis-mozilla kdavis-mozilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lissyx
Copy link
Collaborator

lissyx commented Mar 30, 2019

Running PR with the latest branch

@lissyx
Copy link
Collaborator

lissyx commented Mar 30, 2019

@dabinat I know you got issues with rebasing and squashing in the past, is that still the case on this PR ? 12 commits, it makes it a bit complicated, if you could rebase into one or two commits it'd be perfect.

native_client/client.cc Outdated Show resolved Hide resolved
native_client/deepspeech.cc Outdated Show resolved Hide resolved
Copy link
Collaborator

@lissyx lissyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the remainings nits are fixed, and if possible, rebased and squashed. I'll ensure PR is green on TaskCluster.

@reuben
Copy link
Contributor

reuben commented Mar 30, 2019

I told them there's no need to rebase as it's insignificant to our workflow, and we can always use GitHub's squash and merge/rebase and merge functionality in the PR anyway.

@dabinat
Copy link
Collaborator Author

dabinat commented Mar 30, 2019

I fixed those two nits. If rebasing is a problem I could create a new branch and PR, but I would want to be certain there aren't any extra changes needed before I do so.

@lissyx
Copy link
Collaborator

lissyx commented Mar 30, 2019

No worries, I'll run a last taskcluster check and merge 😊

@lissyx
Copy link
Collaborator

lissyx commented Mar 30, 2019

All green, thanks for the hard work @dabinat ! Now that this gets merged, you can improve our existing bindings (Python, JavaScript, Java/Android) as well as the rust ones https://github.com/RustAudio/deepspeech-rs and go ones https://github.com/asticode/go-astideepspeech by exposing the new feature :)

@lissyx lissyx merged commit a009361 into mozilla:master Mar 30, 2019
@dabinat
Copy link
Collaborator Author

dabinat commented Mar 30, 2019

It depends how quickly you need them. I can take on Python and JavaScript but won’t be able to for a couple of weeks.

@lissyx
Copy link
Collaborator

lissyx commented Mar 30, 2019

No deadline, just an idea I was sharing :)

@lock
Copy link

lock bot commented Apr 29, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Apr 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants