Automatic audio transcription of plenary sessions and committee meetings

Transcription and translation

Brazil - Senate

Use case ID: 071

Author: Federal Senate of Brazil

Date: 15 October 2024

Objective:

Automatically provide audio transcriptions of legislative events, such as plenary sessions and committee meetings, using artificial intelligence (AI) models. The audio may also be subject to a process of diarization and, subsequently, speaker classification.
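
As an illustration of how transcription, diarization and speaker classification outputs might be combined, the sketch below labels each transcribed segment with the speaker whose diarization turn overlaps it most. The Segment and Turn types, the "SPEAKER_00" style labels and the names mapping are assumptions made for this example; the source does not describe the data structures used by the Senate's system.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of audio (hypothetical ASR output)."""
    start: float  # seconds
    end: float
    text: str


@dataclass
class Turn:
    """One diarization turn with an anonymous speaker label (hypothetical)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, names=None):
    """Attach a speaker to each segment using the turn with the largest overlap.

    `names` maps diarization labels to identities produced by a speaker
    classifier; labels without a mapping are kept as they are.
    """
    names = names or {}
    labelled = []
    for seg in segments:
        best = max(
            turns,
            key=lambda t: overlap(seg.start, seg.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(seg.start, seg.end, best.start, best.end) == 0:
            speaker = "UNKNOWN"
        else:
            speaker = names.get(best.speaker, best.speaker)
        labelled.append((speaker, seg.text))
    return labelled
```

A segment that straddles two turns is assigned to whichever turn covers more of it, which is a simplification; a production system might split such segments instead.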

Actors:

Parliamentary record and editorial analysts
Escriba: an information system for managing the daily activities of the Parliamentary Record and Editorial Secretariat, such as audio transcription and text proofreading tasks

Prerequisites:

Integration with Escriba
AI models for automatic speech recognition (ASR)
AI models for audio diarization
AI models for speaker classification, trained on audio recordings from the 81 senators in office
AI robots that collect audio recordings and control the transcription, diarization and speaker classification processes

Scenario:

AI robot side:
1. A set of audio recordings is placed in a remote directory following a predefined hierarchy.
2. Each AI robot monitoring that remote directory collects an audio recording and begins the transcription process.
3. After transcription, the robot may also perform diarization and speaker classification, depending on the system configuration set by an administrator.
4. The resulting transcript is formatted as a JSON file and placed in a separate remote directory.
5. When requested by the user, Escriba reads the JSON files from that remote directory.

User side:
1. A user accesses Escriba.
2. The user views the audio trails associated with them.
3. The user initiates the process of listening to and transcribing the audio recordings.
4. The user selects the option to use the audio transcripts produced previously by the AI robots.
5. The user views and, if necessary, adjusts the transcripts returned by the AI robots.
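
The robot-side steps of the scenario could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the Senate's implementation: the ".wav" suffix, the ".processing" and ".done" markers, the JSON layout and the transcribe placeholder are all hypothetical, and claiming a file by renaming it is only reliable where the remote file system makes renames atomic.

```python
import json
from pathlib import Path


def transcribe(audio_path):
    """Placeholder for the ASR model call (hypothetical); returns a list of segments."""
    raise NotImplementedError("plug in an ASR model here")


def claim_next(inbox, suffix=".wav"):
    """Claim one pending recording by renaming it, so two robots do not process the same file."""
    for path in sorted(inbox.rglob(f"*{suffix}")):
        claimed = path.with_name(path.name + ".processing")
        try:
            path.rename(claimed)
        except FileNotFoundError:
            continue  # another robot claimed this file first
        return claimed
    return None


def process_one(inbox, outbox, asr=transcribe):
    """Collect one recording, transcribe it and write the result as JSON.

    Returns the path of the JSON file, or None when nothing is pending.
    """
    claimed = claim_next(inbox)
    if claimed is None:
        return None
    segments = asr(claimed)

    # Mirror the inbox hierarchy in the outbox: a/b/x.wav.processing -> a/b/x.json
    relative = claimed.relative_to(inbox).with_suffix("")  # drops ".processing"
    out_path = (outbox / relative).with_suffix(".json")
    out_path.parent.mkdir(parents=True, exist_ok=True)

    payload = {"audio": relative.as_posix(), "segments": segments}
    # ensure_ascii=False keeps accented Portuguese characters readable in the file
    out_path.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    claimed.rename(claimed.with_suffix(".done"))  # x.wav.processing -> x.wav.done
    return out_path
```

Each robot would call process_one in a polling loop, sleeping briefly whenever it returns None. If the ASR call fails, the file keeps its ".processing" marker, so a real deployment would also need a way to retry or release stalled files.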

Alternate flows:

1. A user accesses Escriba to view their associated audio trails.
2. The user initiates the process of listening to and transcribing the audio recordings.
3. The user selects the option to use the audio transcripts produced previously by the AI robots.
4. The user notices that no audio transcripts are available, or that the results are of low quality.
5. The user discards the transcripts produced by the AI model and performs the transcription task as usual.

Expected results:

Acceleration of the overall transcription process performed by legislative analysts
Efficient transcription, diarization and speaker classification, providing timely results for legislative analysts to conduct their daily tasks
Effective transcription, diarization and speaker classification, with high-quality results

Potential challenges:

State-of-the-art ASR and diarization models require substantial computing infrastructure (RAM and GPUs) to deliver efficient and effective results.
AI models for ASR may be imprecise in some circumstances. For instance, they may transcribe a senator’s name incorrectly.
AI models for ASR, especially multimodal generative ones, may return a different transcription of the same audio each time they are prompted, so results lack consistency.
AI models for ASR, especially multimodal generative ones, may block the returned results if they detect harmful content in the submitted audio.
Multimodal generative AI models for ASR require carefully written prompts to produce good transcription results.
It is challenging to detect events of interest in the audio recording, such as background noises and bell rings.
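
One possible mitigation for mistranscribed names, which the source does not describe, is to compare capitalised words in the transcript against a list of known names and replace near misses. The sketch below uses Python's standard difflib module; the function name, the cutoff value and the example surnames are assumptions chosen for illustration.

```python
import difflib
import re


def correct_names(text, known_names, cutoff=0.85):
    """Replace capitalised words that closely resemble a known name.

    `cutoff` is the minimum difflib similarity ratio (0 to 1) for a replacement;
    the default here is an illustrative guess, not a tuned value.
    """

    def fix(match):
        word = match.group(0)
        if not word[0].isupper():
            return word  # only capitalised words are treated as candidate names
        close = difflib.get_close_matches(word, known_names, n=1, cutoff=cutoff)
        return close[0] if close else word

    # [^\W\d_]+ matches runs of letters, including accented ones
    return re.sub(r"[^\W\d_]+", fix, text)
```

For example, with the hypothetical list ["Albuquerque", "Figueiredo"], the word "Albuquerqui" would be replaced by "Albuquerque". A low cutoff risks rewriting ordinary capitalised words, so any such correction would still need review by the analyst.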

Data requirements:

Audio samples from all members of the Federal Senate

Integrations with other systems:

Escriba

Success metrics:

Transcription quality as measured by the word error rate (WER) metric
Average transcription time
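
WER is conventionally defined as (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words needed to turn the reference transcript into the model's output, and N is the number of words in the reference. The sketch below computes it with a word-level edit distance; it splits on whitespace only, so any text normalisation (case, punctuation) is an assumption left to the caller.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate of `hypothesis` against `reference`, via word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion (reference word missing)
                    curr[j - 1] + 1,     # insertion (extra hypothesis word)
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)
```

For the reference "the session is now open" and the output "the session is open", one word is deleted out of five, giving a WER of 0.2. Because insertions are counted, WER can exceed 1 when the output is much longer than the reference.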

The Use cases for AI in parliaments collection is published by the IPU’s Centre for Innovation in Parliament as part of the Parliamentary Data Science Hub’s project to create guidelines for AI governance in parliaments.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence. It may be freely shared and reused with acknowledgement of the author and the IPU.

A use case describes how a system should work. It is used to plan, develop and measure implementation. A use case is not the same as a case study, which is a descriptive text of an actual project’s implementation. Please note that this use case is provided “as is” and neither the IPU nor the author accepts any responsibility for its use.

For more information about the IPU’s work on artificial intelligence, please visit www.ipu.org/AI or contact [email protected].
