Official code repository of the paper: Gullal S. Cheema, Judi Arafat, Chiao-I Tseng, John A. Bateman, Ralph Ewerth, and Eric Müller-Budack. 2024. Identification of Speaker Roles and Situation Types in News Videos. ICMR 2024

TIBHannover/SRR_NSR_News_Videos


Identification of Speaker Roles and Situation Types in News Videos

This is the official GitHub repository for the paper:

Gullal S. Cheema, Judi Arafat, Chiao-I Tseng, John A. Bateman, Ralph Ewerth, and Eric Müller-Budack. 2024. Identification of Speaker Roles and Situation Types in News Videos. In Proceedings of the 2024 International Conference on Multimedia Retrieval (ICMR '24). Association for Computing Machinery, New York, NY, USA, 506–514. https://doi.org/10.1145/3652583.3658101

Environment Setup

TODO

Dataset

  • Splits, features and speaker turn clip names used in the paper are available in dataset/
  • Email the authors to request access to the speaker turn clips and full videos
  • Structure of feature pickle files (bild_features_segmentbased.pkl or bild_features_windowbased.pkl):
    • Dictionary with keys as clip names (e.g., '20220105_Corona_Regeln_Unsere_Freiheit_gerät_0RM8KQi3Muk_1')
      • For speaker role data, each clip name maps to a dictionary with the following keys:
        • 'feature': Array containing the feature vector
        • 'label_0': Speaker role label for level 0 (anchor, reporter, external)
        • 'label_1': Speaker role label for level 1 (anchor, reporter, expert, politician, layperson, other)
        • 'start': Start time of the speaker turn in seconds
        • 'end': End time of the speaker turn in seconds
      • For situation type data, each clip name maps to a dictionary with the following keys:
        • 'feature': Array containing the feature vector
        • 'label': News situation label (talking-head, voiceover, interview, commenting, speech)
        • 'start': Start time of the speaker turn in seconds
        • 'end': End time of the speaker turn in seconds
      • Example:
        {
          '20220105_Corona_Regeln_Unsere_Freiheit_gerät_0RM8KQi3Muk_1': {
            'feature': array([7.32914681e-02, 6.23861488e-03, ...]),  # Feature vector
            'label_0': 2,
            'label_1': 3,
            'start': 1.579,
            'end': 29.7
          },
          '20220120_Omikron_Welle_Diese_Impfpflicht_ist_pWZDF3rJ744_231': {
            ...
          },
          ...
        }
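
The structure described above can be inspected with a few lines of Python. This is a minimal sketch, not code from the repository: the helper names load_features and summarize are hypothetical, and the path in the usage comment assumes the pickle sits directly in dataset/. If the stored feature vectors are NumPy arrays, NumPy must be installed for unpickling to succeed.

```python
import pickle
from collections import Counter


def load_features(path):
    """Load a feature pickle into a dictionary keyed by clip name."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(features, label_key="label_0"):
    """Count clips per label and sum the speaker turn durations in seconds.

    label_key is 'label_0' or 'label_1' for speaker role entries and
    'label' for situation type entries, following the key names in the README.
    """
    counts = Counter()
    total_duration = 0.0
    for clip_name, entry in features.items():
        counts[int(entry[label_key])] += 1
        # 'start' and 'end' are given in seconds
        total_duration += float(entry["end"]) - float(entry["start"])
    return counts, total_duration


# Usage (assumed file location, adjust to your checkout):
# features = load_features("dataset/bild_features_segmentbased.pkl")
# counts, seconds = summarize(features, label_key="label_0")
# print(len(features), "clips;", counts, "; total", round(seconds, 1), "s")
```

Labels are kept as the integers stored in the file, because the mapping from integer IDs to role or situation names is not specified here.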
        

To-Dos

  • Features and splits from the paper
  • Raw videos to be shared via private link
  • Feature extraction code
  • Training and Evaluation code
  • Comparison methods code
  • Analysis plots
