Panel_discussion_summarization

My Approach

The input file is received in .mp3 format

The file is converted to .wav format
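
A minimal sketch of the conversion step, assuming the `ffmpeg` command-line tool is installed (the original does not say which converter was used). The function names, the mono channel setting, and the 16 kHz sample rate are illustrative assumptions, not taken from the project.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path, sample_rate=16000):
    """Build the ffmpeg argument list that converts an .mp3 file to .wav.

    Assumption: mono audio at 16 kHz, a common choice for speech models.
    """
    wav_path = str(Path(mp3_path).with_suffix(".wav"))
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-i", str(mp3_path),       # input file
        "-ac", "1",                # one audio channel (mono)
        "-ar", str(sample_rate),   # output sample rate in Hz
        wav_path,
    ]


def convert_to_wav(mp3_path, sample_rate=16000):
    """Run ffmpeg and return the path of the generated .wav file."""
    command = build_ffmpeg_command(mp3_path, sample_rate)
    subprocess.run(command, check=True)  # raises if ffmpeg fails
    return command[-1]
```

Calling `convert_to_wav("panel.mp3")` would write `panel.wav` next to the input file.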

An .rttm file is generated with the 'pyannote/speaker-diarization@2.1' pipeline from Hugging Face, which gives the start and end timestamps of each speaker's turns
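
The diarization step might look roughly like the sketch below. It assumes `pyannote.audio` 2.1 is installed and that a Hugging Face access token with access to the gated model is available; `YOUR_HF_TOKEN`, `rttm_path_for`, and `diarize` are placeholder names, not part of the project. The library is imported inside the function so that the helper can be used without it.

```python
from pathlib import Path


def rttm_path_for(wav_path):
    """Return the .rttm path that sits next to the given .wav file."""
    return str(Path(wav_path).with_suffix(".rttm"))


def diarize(wav_path, hf_token="YOUR_HF_TOKEN"):
    """Run speaker diarization and write the result as an .rttm file."""
    # Imported lazily: pyannote.audio is a heavy, optional dependency.
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization@2.1",
        use_auth_token=hf_token,  # placeholder, replace with a real token
    )
    diarization = pipeline(wav_path)

    out_path = rttm_path_for(wav_path)
    with open(out_path, "w") as rttm_file:
        diarization.write_rttm(rttm_file)
    return out_path
```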

This is then converted to CSV format
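
As an illustration of the RTTM-to-CSV step, the sketch below uses only the standard library. In an RTTM `SPEAKER` line the fourth field is the turn onset in seconds, the fifth is its duration, and the eighth is the speaker label. The column names `start`, `end`, and `speaker` are an assumption; the project's actual CSV layout is not shown in the original.

```python
import csv
import io


def parse_rttm(text):
    """Return a list of (start, end, speaker) tuples from RTTM text."""
    turns = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any record that is not a speaker turn.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        start = float(fields[3])
        duration = float(fields[4])
        turns.append((start, round(start + duration, 3), fields[7]))
    return turns


def turns_to_csv(turns):
    """Serialize the turns as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", "speaker"])
    writer.writerows(turns)
    return buffer.getvalue()
```

The CSV text returned by `turns_to_csv` can be written to a file with a plain `open(path, "w")`.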

Then a separate audio file is generated for each of those timestamped segments
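
One way to cut a segment out of the .wav file is sketched below with Python's built-in `wave` module. This is an assumption about how the step could be done, not the project's actual code; `extract_segment` is a hypothetical helper that takes the start and end times in seconds.

```python
import wave


def extract_segment(src_path, dst_path, start, end):
    """Copy the audio between start and end (in seconds) to a new .wav file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        first = int(start * rate)                  # first frame to copy
        count = max(0, int(end * rate) - first)    # number of frames to copy
        src.setpos(min(first, src.getnframes()))   # clamp to the file length
        frames = src.readframes(count)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)      # same channels, sample width, and rate
        dst.writeframes(frames)    # the header frame count is fixed on close
```

Looping over the CSV rows and calling `extract_segment` once per row would produce one file per speaker turn.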

The text is extracted from each audio file, speaker-wise

The overall summary of the file is then obtained using Hugging Face's "knkarthick/MEETING_SUMMARY" model
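
A rough sketch of the summarization step, assuming the `transformers` library is installed. `build_transcript` and `summarize` are hypothetical helpers, and joining the turns as "speaker: text" lines is an assumption about how the speaker-wise text is combined before summarizing.

```python
def build_transcript(turns):
    """Join (speaker, text) pairs into one transcript, skipping empty turns."""
    return "\n".join(
        f"{speaker}: {text.strip()}" for speaker, text in turns if text.strip()
    )


def summarize(transcript):
    """Summarize the transcript with the knkarthick/MEETING_SUMMARY model."""
    # Imported lazily: transformers is a heavy, optional dependency.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="knkarthick/MEETING_SUMMARY")
    # truncation=True cuts inputs that exceed the model's maximum length,
    # so very long transcripts are only partly summarized.
    result = summarizer(transcript, truncation=True)
    return result[0]["summary_text"]
```

For long panel discussions, splitting the transcript into chunks and summarizing each chunk may work better than relying on truncation.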

The code is easy to run in Google Colab: upload the .mp3 file and run all the cells

On a CPU, it took 45 minutes to run the entire code for the uploaded audio

On a GPU, it took less than a minute to run the entire code for the uploaded audio