Allow inference on multiple individual videos via sleap-track #1777

Closed
11 tasks
emdavis02 opened this issue May 18, 2024 Discussed in #1439 · 3 comments
Labels: enhancement (New feature or request)

Comments

@emdavis02
Contributor

emdavis02 commented May 18, 2024

Discussed in #1439

Problem Description

Originally posted by roomrys August 4, 2023
Currently, to run inference on multiple videos via the sleap-track command, users need to call this command many times either manually or in a script. It might be a nice feature to allow multiple videos/inputs.

Feature Proposal

  • Add an option to pass a folder of files in the CLI instead of only an individual file path
  • Later, expand this to also accept a CSV of file paths

Implementation Details

Currently, sleap-track takes a data_path argument from the command line, which is the file path to a .slp file. This means the user must run the command once for every video they wish to run inference on, either manually or through a script. We would like data_path to also accept a path to a folder of .slp files and run inference on each of these files. All of this will be accomplished in sleap/nn/inference.py.

1. Add an optional flag to the CLI to specify whether data_path is a folder

  • Add this flag in the _make_cli_parser() function
  • Also add a description to the help docs

    sleap/sleap/nn/inference.py

    Lines 5051 to 5068 in 43a4f13

    def _make_cli_parser() -> argparse.ArgumentParser:
        """Create argument parser for CLI.
        Returns:
            The `argparse.ArgumentParser` that defines the CLI options.
        """
        parser = argparse.ArgumentParser()
        parser.add_argument(
            "data_path",
            type=str,
            nargs="?",
            default="",
            help=(
                "Path to data to predict on. This can be a labels (.slp) file or any "
                "supported video format."
            ),
        )
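
A minimal sketch of step 1, assuming a hypothetical flag name `--data_path_is_folder`. The issue does not name the flag, and this is not an existing sleap-track option. The sketch only shows how a boolean flag could sit next to the existing positional data_path argument:

```python
import argparse


def make_parser_sketch() -> argparse.ArgumentParser:
    """Build a stripped-down parser with only the arguments relevant to step 1."""
    parser = argparse.ArgumentParser()
    # Same positional argument as in _make_cli_parser(), with updated help text.
    parser.add_argument(
        "data_path",
        type=str,
        nargs="?",
        default="",
        help=(
            "Path to data to predict on. This can be a labels (.slp) file, any "
            "supported video format, or a folder of such files."
        ),
    )
    # Hypothetical flag name, chosen for illustration only.
    parser.add_argument(
        "--data_path_is_folder",
        action="store_true",
        help="Treat data_path as a folder and run on every compatible file in it.",
    )
    return parser


args = make_parser_sketch().parse_args(["videos/", "--data_path_is_folder"])
print(args.data_path, args.data_path_is_folder)
```

Because the flag uses `store_true`, it defaults to False, so existing single-file invocations would parse exactly as before.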

2. Make data_path a list to enable iteration

  • Check whether the flag was passed in args, which indicates that args.data_path is a folder
  • If it was, enter the folder and add all compatible files to data_path
  • Otherwise, data_path = [args.data_path]

Note: The "labels" CLI argument has been deprecated and will not need to be edited to accommodate this new functionality.

sleap/sleap/nn/inference.py

Lines 5292 to 5296 in 43a4f13

labels_path = getattr(args, "labels", None)
if labels_path is not None:
    data_path = labels_path
else:
    data_path = args.data_path
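
Step 2 could look roughly like the sketch below. The helper name `collect_data_paths`, the `is_folder` parameter, and the extension list are assumptions made for illustration; the set of formats SLEAP actually accepts should come from the library itself.

```python
from pathlib import Path
from typing import List

# Assumed extension list for illustration only; not taken from SLEAP.
COMPATIBLE_EXTENSIONS = (".slp", ".mp4", ".avi", ".h5")


def collect_data_paths(data_path: str, is_folder: bool) -> List[str]:
    """Return a list of inputs so the caller can always iterate."""
    if not is_folder:
        # Single input: wrap it so downstream code handles one case only.
        return [data_path]
    folder = Path(data_path)
    # Sort for a deterministic processing order across platforms.
    return sorted(
        p.as_posix()
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in COMPATIBLE_EXTENSIONS
    )
```

Returning a list in both cases keeps the rest of main() free of special cases for the single-file path.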

3. Add a loop to file loading lines

  • Iterate through data_path. The loop will encompass the entire code section shown below.
  • Change provider to a list to store a value for each item in data_path

    sleap/sleap/nn/inference.py

    Lines 5304 to 5328 in 43a4f13

    if data_path.endswith(".slp"):
        labels = sleap.load_file(data_path)
        if args.only_labeled_frames:
            provider = LabelsReader.from_user_labeled_frames(labels)
        elif args.only_suggested_frames:
            provider = LabelsReader.from_unlabeled_suggestions(labels)
        elif getattr(args, "video.index") != "":
            provider = VideoReader(
                video=labels.videos[int(getattr(args, "video.index"))],
                example_indices=frame_list(args.frames),
            )
        else:
            provider = LabelsReader(labels)
    else:
        print(f"Video: {data_path}")
        # TODO: Clean this up.
        video_kwargs = dict(
            dataset=vars(args).get("video.dataset"),
            input_format=vars(args).get("video.input_format"),
        )
        provider = VideoReader.from_filepath(
            filename=data_path, example_indices=frame_list(args.frames), **video_kwargs
        )
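
One way to picture step 3 is a loop that builds one provider per input and collects them in a list. To stay runnable without SLEAP installed, the sketch takes the two construction paths as callables; `build_providers`, `from_labels`, and `from_video` are hypothetical names standing in for the LabelsReader and VideoReader branches above.

```python
from typing import Any, Callable, List


def build_providers(
    data_paths: List[str],
    from_labels: Callable[[str], Any],
    from_video: Callable[[str], Any],
) -> List[Any]:
    """Create one provider per input path, in the same order as data_paths."""
    providers = []
    for data_path in data_paths:
        if data_path.endswith(".slp"):
            # Labels file: stands in for the .slp branch of the existing code.
            providers.append(from_labels(data_path))
        else:
            # Anything else is treated as a video file path.
            providers.append(from_video(data_path))
    return providers


# Stand-in factories that only tag the path, to show the dispatch.
providers = build_providers(
    ["session.slp", "clip.mp4"],
    from_labels=lambda p: ("labels", p),
    from_video=lambda p: ("video", p),
)
print(providers)
```

Keeping the providers in the same order as data_path lets later steps pair each provider with its input by index.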

4. Add a loop to main() for running inference and tracking

  • Iterate through data_path in the section of main shown below. The loop will start at line 5476, before we run the inference but after the predictor.tracker is set.

    sleap/sleap/nn/inference.py

    Lines 5473 to 5485 in 43a4f13

    if args.models is not None:
        # Setup models.
        predictor = _make_predictor_from_cli(args)
        predictor.tracker = tracker
        # Run inference!
        labels_pr = predictor.predict(provider)
        if output_path is None:
            output_path = data_path + ".predictions.slp"
        labels_pr.provenance["model_paths"] = predictor.model_paths
        labels_pr.provenance["predictor"] = type(predictor).__name__

  • Transplant the following lines of code into the above loop. They will need to run for each item in data_path.

    sleap/sleap/nn/inference.py

    Lines 5510 to 5541 in 43a4f13

    if args.no_empty_frames:
        # Clear empty frames if specified.
        labels_pr.remove_empty_frames()
    finish_timestamp = str(datetime.now())
    total_elapsed = time() - t0
    print("Finished inference at:", finish_timestamp)
    print(f"Total runtime: {total_elapsed} secs")
    print(f"Predicted frames: {len(labels_pr)}/{len(provider)}")
    # Add provenance metadata to predictions.
    labels_pr.provenance["sleap_version"] = sleap.__version__
    labels_pr.provenance["platform"] = platform.platform()
    labels_pr.provenance["command"] = " ".join(sys.argv)
    labels_pr.provenance["data_path"] = data_path
    labels_pr.provenance["output_path"] = output_path
    labels_pr.provenance["total_elapsed"] = total_elapsed
    labels_pr.provenance["start_timestamp"] = start_timestamp
    labels_pr.provenance["finish_timestamp"] = finish_timestamp
    print("Provenance:")
    pprint(labels_pr.provenance)
    print()
    labels_pr.provenance["args"] = vars(args)
    # Save results.
    labels_pr.save(output_path)
    print("Saved output:", output_path)
    if args.open_in_gui:
        subprocess.call(["sleap-label", output_path])
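
One detail to watch when moving this code into a loop: `output_path` is only defaulted when it is None, so after the first iteration it would keep pointing at the first file's output. A hedged sketch of resolving the output per input follows. `resolve_output_path` is a hypothetical helper, and treating a user-supplied output value as a directory when several inputs are given is an assumption about the intended behavior, not something the code above does.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(
    data_path: str, output: Optional[str], multiple_inputs: bool
) -> str:
    """Pick an output file for a single input without mutating shared state."""
    if output is None:
        # Same default as the existing code: next to the input file.
        return data_path + ".predictions.slp"
    if multiple_inputs:
        # Assumption: with several inputs, the output option names a directory.
        default_name = Path(data_path).name + ".predictions.slp"
        return (Path(output) / default_name).as_posix()
    # Single input: use the given filename unchanged.
    return output


for path in ["videos/a.mp4", "videos/b.mp4"]:
    print(resolve_output_path(path, None, multiple_inputs=True))
```

Computing the path from the current item on every iteration, rather than reassigning a shared variable, avoids one video's predictions overwriting another's.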

5. Add an additional loop to main() for just running tracking

  • Iterate through data_path for the following code. The loop will start after the elif and contain the rest of the attached lines.

    sleap/sleap/nn/inference.py

    Lines 5487 to 5500 in 43a4f13

    elif getattr(args, "tracking.tracker") is not None:
        # Load predictions
        print("Loading predictions...")
        labels_pr = sleap.load_file(args.data_path)
        frames = sorted(labels_pr.labeled_frames, key=lambda lf: lf.frame_idx)
        print("Starting tracker...")
        frames = run_tracker(frames=frames, tracker=tracker)
        tracker.final_pass(frames)
        labels_pr = Labels(labeled_frames=frames)
        if output_path is None:
            output_path = f"{data_path}.{tracker.get_name()}.slp"

  • Again, we will have to transplant the following lines of code into the loop.

    sleap/sleap/nn/inference.py

    Lines 5510 to 5541 in 43a4f13

    if args.no_empty_frames:
        # Clear empty frames if specified.
        labels_pr.remove_empty_frames()
    finish_timestamp = str(datetime.now())
    total_elapsed = time() - t0
    print("Finished inference at:", finish_timestamp)
    print(f"Total runtime: {total_elapsed} secs")
    print(f"Predicted frames: {len(labels_pr)}/{len(provider)}")
    # Add provenance metadata to predictions.
    labels_pr.provenance["sleap_version"] = sleap.__version__
    labels_pr.provenance["platform"] = platform.platform()
    labels_pr.provenance["command"] = " ".join(sys.argv)
    labels_pr.provenance["data_path"] = data_path
    labels_pr.provenance["output_path"] = output_path
    labels_pr.provenance["total_elapsed"] = total_elapsed
    labels_pr.provenance["start_timestamp"] = start_timestamp
    labels_pr.provenance["finish_timestamp"] = finish_timestamp
    print("Provenance:")
    pprint(labels_pr.provenance)
    print()
    labels_pr.provenance["args"] = vars(args)
    # Save results.
    labels_pr.save(output_path)
    print("Saved output:", output_path)
    if args.open_in_gui:
        subprocess.call(["sleap-label", output_path])
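
The tracking-only loop could be sketched as below. `track_only`, `load_frames`, and `run_tracker_on` are hypothetical stand-ins for sleap.load_file and run_tracker so that the example runs without SLEAP, and frames are plain dicts rather than labeled frame objects. Only the sorting by frame index and the default output name mirror the code above.

```python
from typing import Callable, Dict, List


def track_only(
    data_paths: List[str],
    load_frames: Callable[[str], List[dict]],
    run_tracker_on: Callable[[List[dict]], List[dict]],
    tracker_name: str,
) -> Dict[str, List[dict]]:
    """Run tracking on each predictions file; return frames keyed by output path."""
    results = {}
    for data_path in data_paths:
        # Sort by frame index, as the existing code does before tracking.
        frames = sorted(load_frames(data_path), key=lambda lf: lf["frame_idx"])
        frames = run_tracker_on(frames)
        # Derive the output name from the current item, not a shared variable.
        results[f"{data_path}.{tracker_name}.slp"] = frames
    return results


def fake_load(path: str) -> List[dict]:
    """Stand-in loader: two frames, deliberately out of order."""
    return [{"frame_idx": 2}, {"frame_idx": 0}]


def fake_tracker(frames: List[dict]) -> List[dict]:
    """Stand-in tracker that returns its input unchanged."""
    return frames


tracked = track_only(["a.slp", "b.slp"], fake_load, fake_tracker, "mytracker")
print(list(tracked))
```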

Documentation Changes

Changes will be made to the sleap-track section of the documentation:

positional arguments:
data_path Path to data to predict on. This can be one of the following:

  • A .slp file containing labeled data.
  • A folder containing multiple video files in supported formats.
  • An individual video file in a supported format.

optional arguments:
...
-o OUTPUT, --output OUTPUT The output filename or directory path to use for the predicted data. If not provided, defaults to '[data_path].predictions.slp'.

@eberrigan
Contributor

eberrigan commented May 19, 2024

Great job @emdavis02!

  • We could try it with and without the additional argument to determine whether the input is a directory. It might be sufficient to use Path.is_dir() and Path.is_file() from pathlib.
  • There is either inference alone, or inference with tracking.
  • Make sure our current implementation of the CLI works so that these changes are backwards-compatible.
  • Please add examples of intended use cases and test that the new implementation behaves as expected. These examples will be added to the CLI documentation.
  • Add necessary tests for inference.py to make sure changes are covered.
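
The flag-free variant suggested in the first bullet might look like this. `expand_input` is a hypothetical helper; it relies only on pathlib's `Path.is_dir()` and `Path.is_file()`, and raising on a missing path is an assumption about the desired behavior.

```python
from pathlib import Path
from typing import List


def expand_input(data_path: str) -> List[Path]:
    """Return the files to process, inferring folder vs. file from the path."""
    path = Path(data_path)
    if path.is_dir():
        # Folder: take every regular file directly inside it, in sorted order.
        return sorted(p for p in path.iterdir() if p.is_file())
    if path.is_file():
        # Single file: same behavior as the current CLI.
        return [path]
    raise FileNotFoundError(f"data_path does not exist: {data_path}")
```

Since a single file still yields a one-element list, existing invocations would keep working, which is the backwards-compatibility point raised above.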

@talmo
Collaborator

talmo commented May 21, 2024

  • We could try it with and without the additional argument to determine if the input is a directory. It might be sufficient to use os.path.isdir() and os.path.isfile() documented here.

Just jumping in to say: please use pathlib instead of os.path APIs!

@eberrigan
Contributor

Thanks! The pathlib documentation lists the correspondence between os.path functions and Path methods in its "Correspondence to tools in the os module" section.
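
For reference, a few of the os.path to pathlib correspondences mentioned above, shown side by side. The file names are made up for the example:

```python
import os.path
from pathlib import Path

name = "videos/clip.mp4"
path = Path(name)

# os.path.isdir / os.path.isfile  ->  Path.is_dir / Path.is_file
assert os.path.isdir(name) == path.is_dir()
assert os.path.isfile(name) == path.is_file()

# os.path.basename  ->  Path.name
assert os.path.basename(name) == path.name

# os.path.splitext  ->  Path.stem and Path.suffix
clip = Path("clip.mp4")
assert os.path.splitext("clip.mp4") == (clip.stem, clip.suffix)

# os.path.join  ->  the / operator
assert Path(os.path.join("videos", "clip.mp4")) == Path("videos") / "clip.mp4"
```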

@eberrigan eberrigan added the enhancement New feature or request label Jun 7, 2024
talmo pushed a commit that referenced this issue Jul 19, 2024
* implementing proposed code changes from issue #1777

* comments

* configuring output_path to support multiple video inputs

* fixing errors from preexisting test cases

* Test case / code fixes

* extending test cases for mp4 folders

* test case for output directory

* black and code rabbit fixes

* code rabbit fixes

* as_posix errors resolved

* syntax error

* adding test data

* black

* output error resolved

* edited for push to dev branch

* black

* errors fixed, test cases implemented

* invalid output test and invalid input test

* deleting debugging statements

* deleting print statements

* black

* deleting unnecessary test case

* implemented tmpdir

* deleting extraneous file

* fixing broken test case

* fixing test_sleap_track_invalid_output

* removing support for multiple slp files

* implementing talmo's comments

* adding comments