IBC2024 Tech Papers: AI for audio description: A natural voice for accessibility

This study describes the development and implementation of an AI-based natural voice synthesis and automated mixing workflow for audio description (AD) in Brazilian television drama content, demonstrated through a successful real-world case.

The evolving landscape of media consumption underscores the crucial need for inclusivity, particularly for those with visual impairments. Audio description (AD) plays an indispensable role in making media accessible, providing a verbal representation of visual content that allows visually impaired individuals to experience films, television, and live performances in meaningful ways. As described by J. Snyder: “Audio Description provides narration of the visual elements - action, costumes, settings, and the like - of theatre, television/film, museum exhibitions, and other events. The technique allows patrons who are blind or have low vision the opportunity to experience arts events more completely - the visual is made verbal. AD is a kind of literary art form, a type of poetry. Using words that are succinct, vivid, and imaginative, describers try to convey the visual image to people who are blind or have low vision” (J. Snyder).

However, traditional methods of producing audio descriptions are fraught with challenges, including high production costs and significant time demands, which have historically limited the availability and timeliness of such services. According to 2010 data from the Brazilian Institute of Geography and Statistics (IBGE) (MEC), there are approximately 6.5 million people in Brazil with significant or severe visual impairments. This statistic is supported by findings from the 2019 National Health Survey (PNS) (IBGE), which indicate that 3.4% of the population, or around 3.978 million people, experience some form of visual impairment. It is crucial to recognize that audio description benefits not only those who are completely blind but also those with partial and severe vision loss. Additionally, other groups, including individuals with intellectual disabilities and learning disorders, can benefit greatly from audio description, as it serves as an alternative sensory channel that aids quicker and more effective comprehension of visual content.

This paper offers...
