in Words
ADLAB Audio Description guidelines
edited by A. Remael, N. Reviers and G. Vercauteren


The guidelines presented in this document are the result of a three-year (2011-2014) research project on audio description (AD) for the blind and visually impaired, financed by the European Union under the Lifelong Learning Programme (LLP). The basic motivation for the launching of the project was the need to define and create a series of reliable and consistent, research-based guidelines for making arts and media products accessible to the blind and visually impaired through the provision of AD.

These guidelines are intended for AD professionals and students to help them create quality services, but they also consider those people who have come into contact with AD in their personal or professional lives and who wish to better understand the challenges of the practice. The guidelines can be read in their entirety, but their structure also allows you to pinpoint a specific issue and browse the relevant chapter. The chapters have been grouped in three sections. Section 1 is an introduction to AD and some related concepts. Section 2, AD scriptwriting, consists of the guidelines for writing audio descriptions for recorded AD, more specifically for cinema and television. Section 3, Information on the AD process and its variants, provides insight into the various steps involved in the production of a finalised audio-described product. The chapters in section 3 are informative and have been designed to give you a concise overview of the whole AD production process. To conclude, the guidelines also have a number of appendices: an example of an AD script and an Audio Introduction, a glossary with key terms and their definitions and finally a section with suggestions for further reading.

These guidelines have been edited by Aline Remael, Gert Vercauteren and Nina Reviers (University of Antwerp) and include contributions from the following authors: Iwona Mazur, Uniwersytet im. Adama Mickiewicza w Poznaniu, Gert Vercauteren, Aline Remael, University of Antwerp, Anna Maszerowska, Universitat Autònoma de Barcelona, Elisa Perego, Università di Trieste, Agnieszka Chmiel, Uniwersytet im. Adama Mickiewicza w Poznaniu, Anna Matamala, Pilar Orero, Universitat Autònoma de Barcelona, Chris Taylor, Università di Trieste, Bernd Benecke and Haide Völz, Bayerischer Rundfunk, Nina Reviers, University of Antwerp, Josélia Neves, Instituto Politécnico de Leiria.

We wish to take the opportunity here to thank all people involved in the ADLAB project, in particular: Manuela Francisco (Instituto Politécnico de Leiria) for her work on the e-book version of these guidelines, the members of the advisory board for their constructive suggestions, Alex Varley (Media Access Australia) and Jan-Louis Kruger (Macquarie University, Australia) and all the user associations, organisations, museums, providers and companies from all the partner countries that were so kind as to contribute to the project, and everyone else who contributed in some way.

Additionally, we would like to thank Pathé UK for letting us use the AD script of Slumdog Millionaire, and Katrien Lievois, as we are indebted to her research on intertextuality in AD ('Audio-describing cinematographic allusions', paper presented at the International Media for All Conference, Dubrovnik, September 2013). Contributions by Iwona Mazur and Agnieszka Chmiel have been partially supported by a research grant of the Polish Ministry of Science and Higher Education for the years 2012-2014, awarded for the purposes of implementing a co-financed international project (agreement number 2494/ERASMUS/2012/2).


Aline Remael, Nina Reviers, Gert Vercauteren, University of Antwerp

This introduction provides the necessary conceptual framework regarding audio description that you will need to make optimal use of these guidelines. It first gives a short definition of AD and its production process. Then it explains how stories are told in audiovisual materials, and introduces the main topics that are explained in greater detail in section 2, AD scriptwriting. Next a description of the target audience is provided, followed by a discussion of two thorny issues in AD: equivalence and objectivity. Finally, there is a section on how to use these guidelines.

What is Audio Description: A definition

AD is a service for the blind and visually impaired that renders Visual Arts and Media accessible to this target group. In brief, it offers a verbal description of the relevant (visual) components of a work of art or media product, so that blind and visually impaired patrons can fully grasp its form and content. AD is offered with different types of arts and media content, and, accordingly, has to fulfil different requirements. Descriptions of "static" visual art, such as paintings and sculptures, are used to make a museum or exhibition accessible to the blind and visually impaired. These descriptions can be offered live, as part of a guided tour for instance, or they can be made available in recorded form, as part of an audio guide. AD of "dynamic" arts and media services has slightly different requirements. The descriptions of essential visual elements of films, TV series, opera, theatre, musical and dance performances or sports events, have to be inserted into the "natural pauses" in the original soundtrack of the production. It is only in combination with the original sounds, music and dialogues that the AD constitutes a coherent and meaningful whole, or "text". AD for dynamic products can be recorded and added to the original soundtrack (as is usually the case for film and TV), or it can be performed live (as is the case for live stage performances).

Depending on the nature of a production additional elements may be required to render it fully accessible. In the case of subtitled films, the subtitles need to be voiced and turned into what are called Audio Subtitles (AST). Some films or theatre productions require an introduction (called Audio Introductions, AI) for various reasons. In the case of museum exhibitions, descriptions may be combined with touch tours or other tactile information. In all cases, websites can be used to provide additional information about a production or exhibition, provided they are accessible too.

Overview of the process from start to end

The creation and distribution of ADs is a complex process that requires the collaboration of multiple professionals from different fields: audio describers, voice talents or voice actors, sound technicians and users. Even if each AD provider has its own best practice, the production process for film and TV series usually includes the following steps:

Figure 1: The AD production process

  • (1) Writing the AD script:
    • Viewing and analysing the source material (henceforth called “source text”, ST). This can include a blind viewing.
    • Writing the descriptions in what is called the AD script (or “target text”, TT) and timing them so as not to cause overlap with the other channels on the soundtrack, especially the dialogues.
    • Reviewing the AD script while viewing the film. This can be done together with a blind or visually impaired collaborator.
  • (2) Rehearsing the descriptions with the voice talents and making final changes where appropriate. Sometimes the writer of the AD script and the voice talent(s) are one and the same person.
  • (3) Recording the AD with voice talents or synthetic voices.
  • (4) Mixing the AD with the original soundtrack in the appropriate format (different for DVD, cinema, festivals, etc.).

The main focus of these guidelines is on the AD scriptwriting phase (step 1). This process is more or less the same for all types of AD, from film to theatre and visual arts. But the production process for live AD obviously does not include steps 3 and 4. The recording process for audio guides with AD is also quite different from that for film.
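The timing constraint in step 1 (descriptions must not overlap the dialogues) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the dialogue timings are taken to be available as start and end times in seconds (for instance from a subtitle file), music and sound effects are ignored, and the names `Interval`, `find_gaps` and the `min_gap` threshold are hypothetical and not part of the ADLAB workflow.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float  # seconds from the beginning of the programme
    end: float


def find_gaps(dialogue, programme_end, min_gap=1.0):
    """Return the stretches without dialogue that last at least min_gap seconds.

    min_gap is an assumed threshold, not a value taken from the guidelines.
    """
    gaps = []
    cursor = 0.0  # end of the latest dialogue cue seen so far
    for cue in sorted(dialogue, key=lambda c: c.start):
        if cue.start - cursor >= min_gap:
            gaps.append(Interval(cursor, cue.start))
        cursor = max(cursor, cue.end)  # also handles overlapping cues
    if programme_end - cursor >= min_gap:
        gaps.append(Interval(cursor, programme_end))
    return gaps


# Invented dialogue timings for a 25-second clip.
dialogue = [Interval(2.0, 5.0), Interval(5.5, 9.0), Interval(14.0, 20.0)]
gaps = find_gaps(dialogue, programme_end=25.0)
for gap in gaps:
    print(f"{gap.start:.1f}s to {gap.end:.1f}s ({gap.end - gap.start:.1f}s free)")
```

In practice a describer would still review each candidate gap by ear, since a pause in the dialogue may carry meaningful music or sound effects that should not be covered.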

What is a story and how is a story told

One of the basic theoretical principles underlying the approach taken in these guidelines is that many films, TV programmes, theatre plays, etc. want to offer their audiences an experience that is driven by a story or narrative. Given their inherently multimodal nature, the stories told in these and other audiovisual products will only be partly accessible to blind and partially sighted audiences and it will be the audio describer’s task to provide the information to which these audiences do not have access, so that they can reconstruct the story told in the ST in the fullest possible way. This task consists of two parts that are also reflected in the structure of the different chapters in these guidelines: first describers must analyse their ST to identify what story filmmakers want to tell and what principles and techniques they use to tell their story. Then describers must decide what narrative elements to include in the TT or AD script and how to formulate the description.

The better audio describers know how filmmakers tell stories and how audiences reconstruct them, the better they will be equipped to create their AD. Therefore the following paragraphs will briefly introduce the main principles of story creation as performed by the filmmaker and of story reconstruction as performed by the audience.

Story-creation by an author or filmmaker

It is clearly beyond the scope of this section to discuss all the narratological principles underlying story creation at length, but broadly speaking story-creation is a three-stage process in which authors or filmmakers combine elements from two main narrative building blocks. In the first stage the author decides 1) what characters to include in the story and what actions they perform or undergo, and 2) in what spatial and temporal settings these actions will take place. In the second stage, the author will decide how exactly the story will be told:

  • the author decides on the order in which actions will be presented. He can either present them chronologically or decide to change their order for various reasons: he can use a flashback to explain a specific character trait, or just to delay the action going on and hence create suspense. He can decide to present two different lines of action separately or interweave them to show parallels and differences between them. He also has to decide on the frequency of the actions: he can decide to show actions more than once (maybe from the point-of-view of different characters) or not show them at all, for example to raise questions in the audience. Finally, he has to decide on their duration, i.e. whether the actions are presented at their normal speed or in fast-forward (e.g. to omit unimportant parts of the narration or to create a comical effect) or in slow-motion (e.g. to reflect a character’s mental reaction to an action, or to slow down the narration to create suspense);
  • the author decides on the specificities of the characters: their physical characteristics, their mental properties and their behaviour;
  • the author finally has to fill in the spatio-temporal frame and its details. In other words, he has to decide in what time and place the actions will be set.

In the first two stages, the story is in fact an abstract construct and it is only in the third stage that the author decides how he will present it concretely. In the case of films and other recorded audiovisual products, this implies using different film techniques to decide what is shown (e.g. a close-up to depict a character’s emotional reaction to an event, or a panoramic shot to present a majestic landscape), how it is shown (e.g. a specific camera angle to represent the superiority of one character over another, a scene presented in black and white to signal this is a dream sequence) and what the relations between different shots are (e.g. a flashback to explain the movement of a character from one setting to another, etc.). One very important aspect with regard to these relations is that the author or filmmaker has to maintain continuity, i.e. that he must ensure that the techniques he uses to combine the various shots and scenes create a coherent and consistent whole. Not only do these techniques serve a narrative function, they are also used to determine the style of the ST and they ensure its cohesion.

In other words: story-creation is a highly complex process involving various stages and offering virtually endless possibilities when it comes to selecting and combining different elements. Only after a careful analysis of both content and style of the ST can the describer create a description that mirrors this ST.

Story-reconstruction by the audience

Stories are never created in a vacuum. They are made to be read, listened to or watched by an audience that follows a path that is opposite to the one followed by the author, described in the previous section. In other words: audiences are presented with the concrete narrative, and they have to interpret and process it to arrive at the original, abstract (chronological) construct that the author started from. In part this is an individual process, dependent on the individual audience member’s knowledge and background. To a considerable extent, however, this story-reconstruction by the audience is a more or less universal process that we have all mastered. Audiences recreate stories according to general principles and better insight into these principles can help describers decide what to include in their descriptions and how to formulate them in order to make the story-recreation process easier for the blind and visually impaired.

Basically, audiences reconstruct stories by creating mental models of them, i.e. mental constructs of who did what with and/or to whom, where, when and why. Again this introduction does not allow us to fully elaborate on the construction of mental models by audiences, but they can be presented schematically as follows:

Figure 2: mental models in story-reconstruction

Central in this representation are the actions that drive the story forward. Those occupy an essential position in the model, to which all other aspects are related. In other words, when audiences process and interpret a story, they will look at the actions that are being performed and combine them with other information from the story: the characters that cause and undergo them and the spatio-temporal settings in which they take place. In addition, they will process the temporal relations between the different actions shown, in order to reconstruct their chronological order. As the story progresses, audiences continuously update their mental model of the story, adding new information to it, confirming what was already there or what they inferred, and changing existing information or assumptions based on information they have received later.

Just like story-creation, this story-reconstruction is a highly complex process that can, in very generalising terms, be seen as comprising two levels. On the first level, audiences create frames that serve as a context for every event in the story. In these frames, information on the characters that are present and the spatial and temporal circumstances in which the event takes place takes the form of general labels, e.g. “John”, “London”, “1997”. On the second level, additional, more detailed information is added to these labels: John is a dark-haired man in his forties, he lives in a flat in a specific neighbourhood. It is a sunny day in the summer of 1997. An example of a “frame” could be a character’s office, and an example of a representation, a detailed description of a particular office chair. Such a seemingly secondary element in the mise-en-scène can be important if it has a symbolic function.

When a new story event is presented, the audience will check whether it can be attributed to an already existing frame, whether an existing frame has to be updated (for example because the spatio-temporal setting has remained the same but characters leave the setting or new characters enter it) or whether a new frame has to be created (a new location, for instance). These frames form the basis for the comprehension of the story: when audiences cannot link an event (e.g. a murder) to a frame (e.g. a future inheritance) or cannot create a new frame for a certain event (e.g. an inexplicable change in relations between characters), they will no longer be able to follow the story (temporarily or permanently).
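The three-way check described above (existing frame, updated frame, new frame) can be sketched as a small data model. This is a deliberately simplified illustration, not a cognitive model: it assumes a frame is fully identified by its place and time labels, and the class and function names, as well as the labels “Mary” and “Paris”, are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    place: str
    time: str
    characters: set = field(default_factory=set)


@dataclass
class Event:
    place: str
    time: str
    characters: set


def assign(frames, event):
    """Attach an event to the list of frames and report what had to be done."""
    for frame in frames:
        if (frame.place, frame.time) == (event.place, event.time):
            if frame.characters == event.characters:
                return "existing"  # the event fits a frame as it stands
            frame.characters = set(event.characters)
            return "updated"       # same setting, characters entered or left
    frames.append(Frame(event.place, event.time, set(event.characters)))
    return "new"                   # new spatio-temporal setting


frames = []
results = [
    assign(frames, Event("London", "1997", {"John"})),
    assign(frames, Event("London", "1997", {"John"})),
    assign(frames, Event("London", "1997", {"John", "Mary"})),
    assign(frames, Event("Paris", "1997", {"John"})),
]
print(results)
```

For the describer, the cases that return "updated" or "new" are the moments at which the audience needs a cue, because those are the points where the mental model changes.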

Translated to the context of AD, describers first and foremost have to make sure that their AD (in combination with the original ST) contains all the necessary cues to allow the blind and partially sighted audience to create a context for every event taking place in the story. In a second stage they select the more detailed information or "entity representations" that fit that particular context.

Audio description: From visual to verbal narration

The mixed target audience

The primary target audience of AD consists of blind and partially sighted viewers. However, this audience is very diverse. Some people were born blind (a minority), some became blind early on or later in life (e.g. as a result of an illness). Others have different degrees and different types of visual impairment. In brief, the target group is composed of subgroups that are all composed of individuals with different visual experiences and a different knowledge of the world. For economic and practical reasons, AD today aims to cater for all of these, which means that part of the challenge is to find a golden mean that will make the ST accessible to all.

In addition, AD is being used by an increasingly large group of sighted viewers for an equally varied number of reasons. Immigrants may use AD to learn the language of their host country, children may use AD as they are acquiring languages, people with ADHD may use the information provided by AD to help them focus on the programme. All this means that some users will still rely on the visual information to some extent, whereas others might use the AD as a talking book. This has important implications for synchrony between AD, sounds/dialogues and images.

Story-reconstruction, equivalence and objectivity

Audio describers are also viewers. This means that the filmic story as told by audio describers will always be their own interpretation of the film. Different audio describers will produce (somewhat) different audio descriptions. In this sense, AD is similar to other forms of translation.

In Translation Studies (TS), equivalence is a term that is still used to refer to the relation between ST and TT but that is also regarded as problematic because such "equivalence" is very difficult to define and is never absolute. Any translation will have deletions, additions, reformulations etc. when compared with its ST. In the case of AD the concept is even more problematic: when is the verbal rendering of an audio-visual production "equivalent" to its aural/visual ST? No watertight definition is possible. On the other hand, AD does strive to give its target audience an experience that is comparable to that of the sighted target audience, which is itself composed of individuals who will all see the film differently.

However, watching films and TV series is also a social given: experiencing a film gives the blind and visually impaired an inroad into the social world of the sighted. Consequently, the AD is expected to respect the ST genre and the specific story it tells, allowing the aural filmic channels to contribute. In addition, watching a film is often a leisurely activity. This means that the AD does not merely focus on giving the information that is deemed to be missing, it also aims to create a pleasant experience for its users without overburdening their information-processing capacities.

Although "objectivity" is an AD aim that recurs in many of the earlier AD guidelines, no one ever sees the same film, as you will know from discussions with your friends. This is no different for the blind and visually impaired audience since it is just as heterogeneous as the sighted one. AD too is always subjective to some extent since it is based on the interpretation of the audio describer. Moreover, rendering images into words often entails making a visual piece of information more or less explicit, depending on the needs of the story, which always results in minor shifts in meaning. This is inevitable. It is also inevitable that the AD will "guide" its blind and visually impaired users to some extent. Finding a balance between a personal interpretation and personal phrasing (subjectivity) and more text-based interpretation and phrasing (objectivity) that leaves room for further interpretation by the blind and visually impaired users is part of mastering the AD decision-making process and writing skills discussed in the various chapters of the present guidelines.

How to use these guidelines

The previous sections have illustrated the importance of a detailed analysis of the text and the context in which it is produced in order to create a professional AD that is clear and engaging. This analysis consists of a close examination of the source material, including background research about the text and its production. This then feeds into the decisions that you as a describer need to make about what to include in the TT and how, keeping the specific but heterogeneous target audience in mind. All the decisions you make regarding the AD will always be co-determined by the particular context in which a given narrative event (e.g. the introduction of a character) occurs and often there will be more than one option regarding how to describe it. The purpose of section 2, AD scriptwriting, is to help you make your own decisions and identify appropriate strategies.

Each chapter in this section is dedicated to one particularly thorny issue regarding AD scriptwriting (Characters and action, Spatio-temporal settings, Genre, Film language, Sound effects and music, Text on screen, Intertextual references, Wording and style, Cohesion) and follows the same basic structure. First it defines its topic. Next a section called "Source Text analysis" helps you to ask the right questions about this topic. These questions will allow you to analyse what the productions communicate to their users and how they do it. In particular, they will help you to better understand how the film medium guides the audience’s attention to the core elements of the narrative. Such insights will help you distinguish those core elements from secondary elements of minor relevance; an essential skill since time for description is limited in AD. Finally, the section "Target Text creation" will offer possible strategies for deciding how to translate these findings into an AD script.

Then, you will be able to develop your own decision-making process that generally includes the following steps:

  • (1) Decide what narrative elements must ideally be included in the AD;
  • (2) Locate the "silent gaps" in the ST, determine how long they are and how much AD can be added;
  • (3) Decide what elements are also conveyed through other channels besides the visual, e.g. sound or dialogue;
  • (4) Based on steps 1, 2 and 3, decide whether a given narrative element will be omitted (no time or redundancy with other channels) or whether it will be described. If in doubt, make a note that testing with a blind or visually impaired collaborator may be required;
  • (5) Decide on an appropriate strategy for those elements that need description in terms of when to include the AD and how to formulate it.
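Steps 1 to 4 above can be sketched as a simple selection procedure for a single silent gap. The sketch rests on several assumptions that are not part of the guidelines: the describer has already ranked the elements (step 1), measured the gap (step 2) and flagged what other channels convey (step 3); the greedy "most important first" rule and all names and figures are hypothetical. Step 5, the actual wording, is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Element:
    label: str
    priority: int           # step 1: 1 = most important (assumed scale)
    seconds_needed: float   # estimated time needed to voice the description
    in_other_channel: bool  # step 3: already conveyed by sound or dialogue


def decide(elements, gap_seconds):
    """Step 4: split the elements for one gap into described and omitted."""
    described, omitted = [], []
    remaining = gap_seconds
    for el in sorted(elements, key=lambda e: e.priority):
        if el.in_other_channel:
            omitted.append((el.label, "redundant"))
        elif el.seconds_needed > remaining:
            omitted.append((el.label, "no time"))
        else:
            described.append(el.label)
            remaining -= el.seconds_needed
    return described, omitted


# Invented elements competing for a 4-second gap.
elements = [
    Element("character enters the room", 1, 2.0, False),
    Element("phone rings", 2, 1.5, True),
    Element("crowded street outside", 3, 3.0, False),
    Element("red scarf", 4, 1.0, False),
]
described, omitted = decide(elements, gap_seconds=4.0)
print("Describe:", described)
print("Omit:", omitted)
```

A real decision is rarely this mechanical: an element omitted here for lack of time might be moved to an earlier or later gap, and doubtful cases are better tested with a blind or visually impaired collaborator, as step 4 notes.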

There is a range of possible strategies for describing a narrative event, with different gradations in the explicitness of description, which are explained and illustrated in each of the following chapters. Generally speaking, though, they imply a choice between "objectively" describing what you see on screen (a strategy located at one end of the scale), naming what can be seen more accurately (located somewhere in the middle of the scale) or explaining what the visual element means (located at the other end of the scale). An example:

“A flashback” versus “Back in 1930”
“Her eyes open wide” versus “She is amazed”
Or a combination of these:
“Her eyes open wide in amazement.”

The purpose of section 3 “Information on the AD process and its variants” is to better understand your role as an audio describer in the entire AD process. It will help you understand the decisions made by specialists regarding the other stages of the AD process, for instance, regarding how to edit or what voices to choose for the recording of the script. As it is good and even desirable to be aware of the process as a whole, this section includes chapters on technical issues such as preparing the script for recording or the supplying of audio subtitles. This will help you identify your place as the AD scriptwriter in the bigger picture and will get you acquainted with technical issues, the challenges of AD for multilingual productions and variants on recorded AD. In addition, section 3 contains chapters of an informative nature on the most important AD variants (Audio Introductions, Combining AD with audio subtitles, Audio describing theatre performances and Descriptive guides for museums, cultural venues and heritage sites). The appendices in section 4 provide you with a detailed glossary and reading list for further study.

Narratological building blocks

Characters and action

Iwona Mazur, Uniwersytet im. Adama Mickiewicza w Poznaniu
What are characters?

Characters and their actions and reactions are an essential part of a film narrative, moving the story forward. Characters have a physical body, but they also have traits, such as skills, attitudes, habits or tastes. If a character has only a few traits, then they are said to be one-dimensional; if they have many traits (sometimes contradictory ones), they are three-dimensional. In film, traits of characters are usually revealed quickly and in a straightforward manner.

Source Text analysis

We get to know characters through their physical appearance, actions, and reactions (manifested, for example, by means of gestures and facial expressions), as well as through what they say and how they say it. For instance, in The Devil Wears Prada (Frankel, 2006) the way Andrea dresses (and the metamorphosis she undergoes) is an important part of the narrative. In Inglourious Basterds (Tarantino, 2009) Col. Landa’s meticulousness is depicted through his actions: the way he neatly arranges writing materials on the table when interrogating LaPadite, the way he eats strudel at a Paris café when he talks to Shosanna. In All About Steve (Traill, 2009), on the other hand, Mary’s emotional reactions reflected in her extensive grimacing and fidgeting are an important part of her characterisation. And lastly, in Annie Hall (Allen, 1977) the neurotic nature of Alvy Singer is mainly manifested through what he says and how he says it. But characters can also be revealed to us by the way others react to them as well as by their environment (see chapter 2.1.2 on spatio-temporal settings) or by means of film techniques. For example, in Away from Her (Polley, 2006), the main character’s developing Alzheimer’s is reflected by the "patchy" editing (for more information on film techniques see chapter 2.2.1).

When analysing your ST you can use the following checklist to identify the nature and role of the characters.

  • In a narrative there is usually at least one focal character (protagonist, antagonist). Focal characters push the story forward to the greatest extent. In addition, there may be supporting characters, whose role in a narrative is secondary, though their significance for the film, or a given scene, can be substantial. There are usually some background characters (played by the so-called extras). War films and epic films especially tend to employ a large number of background actors;
  • Characters can be new or known. If a character is new, they appear in the film for the first time; if they are known, they appear again, looking either the same (or similar) or different. For example, in Tootsie (Pollack, 1982) Michael Dorsey, the main character played by Dustin Hoffman, starts to dress as a woman and transforms into "Dorothy Michaels";
  • Characters may be used for the purposes of temporal orchestration (see chapter 2.1.2). For example, their changed looks can signify the lapse of time (e.g. Benjamin Button getting younger in The Curious Case of Benjamin Button (Fincher, 2008)) or be used to signal a flashback or a flashforward. For example, in Slumdog Millionaire (Boyle & Tandan, 2008) Jamal Malik is 18 years old during the game show, but is presented as a 5-year-old, 10-year-old, etc. during the flashbacks, which do not correspond chronologically to Jamal’s life, so the story switches between different periods of Jamal’s life (childhood, adolescence);
  • Characters can be related or linked to each other. For instance, in The Wedding Planner (Shankman, 2001) the titular wedding planner Mary meets and falls for a man called Steve, who later turns out to be her client’s fiancé “Eddie”;
  • Characters can be authentic or fictional. While fictional characters have been made up by the screenplay writer, authentic ones refer to actual persons. They can be played by actors (e.g. Virginia Woolf played by Nicole Kidman in The Hours (Daldry, 2002), or Julia Child by Meryl Streep in Julie and Julia (Ephron, 2009)) or they can appear as themselves (the so-called cameo appearance). For example, in The Devil Wears Prada (Frankel, 2006) there is a cameo appearance of the fashion designer Valentino Garavani. Authentic characters can be either familiar or unfamiliar to the audience. With familiar authentic characters the assumption is that they are known to the majority of the target audience, while unfamiliar authentic characters may be unknown to them. This can be highly culture- or country-specific. For example, we can assume that the American chef and television personality Julia Child will be familiar to most of the American audience, but not necessarily so outside of the U.S. (also see chapter 2.2.4 on intertextual references);
  • Characters can be realistic or unrealistic. The latter are especially popular in science-fiction, fantasy and children’s films, for example E.T. in E.T. the Extra-Terrestrial (Spielberg, 1982), the hobbits and Gollum in The Lord of the Rings (Jackson, 2001-2003) or the Tin Man in The Wizard of Oz (Fleming, 1939);
  • Characters can have a symbolic function: they may represent a certain group of people, social class, profession, a stereotype or an idea. For example, in Rock of Ages (Shankman, 2012) Stacee Jaxx is a prototypical rock star, both in terms of looks and behaviour.

Target Text creation

Having analysed the types of characters in a film and the functions they perform, you can now proceed to create your description.

  • Determine focal characters as well as any relevant supporting characters. These will most probably be the focus of your description. If there are any symbolic characters, decide what feature(s) make(s) them represent a given group or idea;
  • Determine through what means characters and their traits manifest themselves most (e.g. physical appearance, actions, reactions, dialogues). This should give you an indication as to what to prioritise in your description. However, if there is sufficient time between dialogues, insert a description of a character’s looks even if it is not their most determining factor. This will help the blind visualise the story better;
  • Determine whether there are any authentic characters in the film. If so, decide whether they may be familiar or unfamiliar to the target audience (see examples below). See if there are any cameo appearances relevant to the story;
  • Identify any unrealistic characters and determine their most prominent features (see examples below). Decide whether to use any similes or comparisons to describe them, e.g. rabbit-like ears;
  • Determine any relations or links between and among characters. Decide whether to name the links explicitly or let the audience infer them on their own, based on the dialogues or plot;
  • Determine whether a character is new or known;
  • If the character is new, decide whether to name them right away or wait until they are actually named in the film. When deciding, consider the moment they are named in the film or whether the character’s identity needs to be kept secret. If you decide not to name the character right away, use a short consistent description to identify them (see below). If the character is known, decide whether their looks have changed in a way that is relevant to the story or its temporal orchestration, signifying the lapse of time. If the character is hard to recognise at first, decide whether to say explicitly who they are or describe their changed looks and let the viewers infer their identity on their own, based on the context or dialogues;
  • If the character is new, decide how to describe their looks. Determine the features that are most distinctive about the character, such as a scar or a white beard. You can then use those features to consistently identify characters that are not named right away (see above), for example "a man with a white beard";
  • You may also decide to describe a character gradually, adding a feature or two when the character reappears on the screen. This may be necessary due to time constraints, or you may not want to overload the audience with an overly long and detailed description in one go, which could make their concentration lapse;
  • Determine what actions and reactions of a character move the story forward to the greatest extent. Decide what words will most succinctly and vividly convey the character’s actions (see chapter 2.3.1 on wording and style). Identify gestures and facial expressions that best reflect their reactions and decide which of them to describe and which to leave out (see examples below);
  • Determine what a character’s environment, as well as reactions of others towards them or the use of specific film techniques tell us about the character. For example, a character’s pedantic nature can be emphasised by describing how all items in their apartment are meticulously arranged. Decide which elements to describe and how (see chapter 2.1.2 on spatio-temporal settings). As for the reactions, instead of saying that a woman is beautiful, you could describe how men respond to her with awe and admiration. And finally, with film techniques (see the Away from Her (Polley, 2006) example above), decide whether and how to render them in your description (for more information see chapter 2.2.1 on film language).

An example of authentic characters from The Hours (Daldry, 2002):

  • “Virginia Woolf” (name)
  • “The English writer Virginia Woolf” (gloss + name)
  • “A middle-aged woman with a slightly hooked nose and hair pulled back in a bun” (describe)
  • “A middle-aged woman with a slightly hooked nose and hair pulled back in a bun, Virginia Woolf” (describe and name)

An example of unrealistic characters from The Lord of the Rings (Jackson, 2001-2003):

  • “A hobbit” (name)
  • “A short human-like creature with slightly pointed ears and large fur-covered feet” (describe)
  • “A hobbit: a short human-like creature with slightly pointed ears and large fur-covered feet” (name and describe)

An example of gestures from Inglourious Basterds (Tarantino, 2009):

  • “He makes the characteristic Italian hand gesture” (gloss)
  • “He makes the ‘What do you want’ gesture” Or: “He makes the ‘annoyance’ gesture” (name the gesture)
  • “His fingertips pressed together, turned towards his face” (describe)
  • “He makes the Italian ‘What do you want’ gesture: his fingertips pressed together, turned towards his face” (name and describe)

An example of facial expressions from Inglourious Basterds (Tarantino, 2009):

  • “Shosanna’s eyes are wide open. She’s gasping for breath” (describe)
  • “Shosanna is petrified” (name the emotion)
  • “Shosanna’s eyes are wide open in terror” (describe and name the emotion)

Spatio-temporal settings and their continuity

Aline Remael, Gert Vercauteren, University of Antwerp
What are spatio-temporal settings?

All stories take place in particular spatio-temporal settings (in the remainder of this section referred to as "settings"), which comprise both a temporal and a spatial dimension. These settings are intrinsically linked to the characters and their actions (see chapter 2.1.1) as they take place in the story (i.e. there can be no actions without a setting). Settings are therefore one of the basic narrative building blocks (see chapter 1 introduction) and as such require specific attention in the description. In addition, the importance and function of time and setting may change in the course of the story. These changes are signalled in the text by cues, i.e. various film techniques (see chapter 2.2.1), which describers must pick up when analysing the ST.

The different settings of a filmic story are also linked to each other through editing (see chapter 2.2.1) and the way they are linked can reflect different temporal relations between them. They can follow each other either chronologically or in flashback or flashforward. The time period that has elapsed between scenes will vary. We will refer to this time factor as that of temporal orchestration.

Source Text analysis

When analysing your ST for spatio-temporal settings and the connections between them, you can use the following checklist of the major spatio-temporal features to identify their precise nature.

  • Settings can be global (e.g. a mountain range, as in the opening scene of Brokeback Mountain (Lee, 2005)) or local (e.g. an interrogation room, as in the opening scene of Slumdog Millionaire (Boyle & Tandan, 2008)). During the film, and even within scenes, these settings can move from global to local, or the other way around, as in the opening scene of The Forgotten (Ruben, 2004), starting with a global bird’s eye view of a city, and moving to a more local or specific location, namely a playground in a park;
  • Settings can be background to the action or serve a narrative/symbolic function. A police station, for instance, may simply be the backdrop for part of the action of a crime film. A castle, by contrast, may symbolise the power of its inhabitant, whereas a dilapidated castle may indicate the downfall of that person. An island, or an asylum on an island, may suggest confinement, as in Shutter Island (Scorsese, 2010);
  • Settings can be real or imagined. A setting that is real is experienced as such by the characters in the fictional world of the story, whereas a setting that is imagined may exist only in the mind, memory or dreams of one of the characters (e.g. in the film versions of Alice in Wonderland (Burton, 2010));
  • Settings can be new or known. A new setting is one that is introduced for the first time. A known spatio-temporal setting is one to which a character or the action returns in the course of the story. This known setting can be precisely the same as when it was shown first, or it may have changed (e.g. in Ransom (Howard, 1996), the penthouse where the protagonist lives is a warm, welcoming place in the opening scene, but becomes a much colder environment after his son is kidnapped and the struggle to get him back intensifies);
  • Settings can be presented explicitly or implicitly. In the opening scene of Slumdog Millionaire (Boyle & Tandan, 2008) the text on screen "Mumbai, 2006" tells us explicitly when and where the story is taking place, while in Girl with a Pearl Earring (Webber, 2003), the 17th century setting is presented implicitly by means of the buildings and the characters’ clothing.
  • Settings can be well-known or unfamiliar. How they are identified will depend on the background knowledge of the viewer, including the describer: St Paul’s Cathedral in London will be recognised as St Paul’s by some, and as a large church or cathedral by others. However, the location may be mentioned explicitly in the film (see previous item);
  • Settings are related to each other and/or to characters and their actions. In The Hours (Daldry, 2002), for example, the three main characters are each linked to one specific setting: Virginia Woolf to London, 1923; Laura to Los Angeles, 1951 and Clarissa to New York, 2001. As such, a setting will sometimes be identified by a character, because it is known that this character is connected with that setting.
  • All the different functions identified above can be signalled by film editing, film techniques, dialogue and/or sound effects/music, that is, by the different communication channels of the film (see chapters 2.2.1 and 2.2.2 respectively).

Target Text creation

Having analysed a given setting and its relations to the preceding one(s) and to the other narrative elements in it (such as the characters, see example The Hours (Daldry, 2002)), you proceed to create your description. First decide what must be included. For more information on how to describe, see chapter 2.3.1 on wording and style as well as chapter 2.3.2 on cohesion.

  • Determine through what channel the spatio-temporal information is provided (i.e. is it presented visually and does it have to be described, or can it be derived from the dialogue or a sound effect). Decide whether to describe it or not. However, if there is sufficient time between dialogues, include the information in your description. Information that is repeated through different channels may help your target audience identify it;
  • Determine whether a setting is new or already known;
  • If a setting is new, determine whether it is global or local, real or imagined, serves as a background or has a narrative-symbolic function. Decide what and how much to describe: backgrounds need minimal description that allows the audience to identify a place. Settings that have more symbolic functions need more detailed descriptions. Global settings will be described in more general terms, descriptions of local settings will include more detail. The difference between real and imagined settings can be indicated, for example, by describing how a character stares at a photograph, while his mind wanders (in Nights in Rodanthe (Wolfe, 2008), the protagonist looks at a photo and then his mind wanders to a scene from his past). Identify which aspects of the setting are most important for the story and should get priority in the description. Describe features that are typical and that you will be able to use later for reference when a setting returns (think of the user’s mental model of the story discussed in chapter 1);
  • If a setting is known, determine if it is identical (repetition), if something has been added (accumulation) or if the setting has undergone a complete transformation. Decide how to signal this: identical settings can be described briefly with one or two of the features used to describe it before (e.g. Back at the office); if something has been added this new information needs to be described (e.g. The office. The painting has disappeared); if the setting has undergone a complete transformation, describe what has changed (e.g. from a castle to a dilapidated castle);
  • Determine the relations between settings and the characters in them and/or their actions. If a setting is new, make sure that you identify these relations (e.g. “A single-storey home in a Los Angeles suburb, 1951. Laura is reading Mrs. Dalloway”, from The Hours (Daldry, 2002)). If the setting is repeated you can simply refer to one of its features (e.g. Back in Los Angeles; Back in 1951; Back with Laura) and the rest of the setting will also be "activated" in the user’s mind;
  • Determine whether a spatio-temporal setting is presented explicitly (e.g. through text-on-screen, see chapter 2.2.3, indicating a specific place or through a secondary element, like a clock indicating time) or implicitly, and whether it is familiar or unfamiliar (e.g. Eiffel Tower vs. Charles river). In the case of an explicit reference, your description can just mention it (e.g. London, 1923). In the case of an implicit reference, decide if you want to keep the reference implicit, describe it and name it, or only name it, depending on the (recurring) function of the setting in the story and the time available for the AD.

  • “A large white spiral-shaped building” (implicit reference/description)
  • “The white spiral-shaped building of the Guggenheim Museum in Manhattan” (describe and name)
  • “The Guggenheim museum in Manhattan” (name)


Genre

Anna Maszerowska, Universitat Autònoma de Barcelona
What is genre?

Genre is a way of classifying films, of identifying them according to specific repetitive formal, aesthetic or narrative features. There are many different genres in cinema: comedy, melodrama, action, thriller, western, etc. The label of a particular genre can help the audience formulate their general expectations of a film: in a musical, (part of) the dialogues will be expressed (and/or replaced) by songs; in a horror movie, there will be threatening music, startling scary moments, and ambiguous focalisation; a drama will most likely feature a confused and torn character. Nowadays, however, it is more and more difficult to ascribe a given film to a particular genre. Most films mix elements belonging to different genres, thereby creating new definitions and hybrid categories (e.g. romantic comedies, science fiction horrors).

One genre that is still clearly recognisable and usually differs considerably from other, more narrative genres or fiction films, is the documentary genre, usually considered to be non-fiction. Even documentaries have subgenres, but, generally speaking, they tend to be more informative and to include more or less objective accounts of facts, historic events, social issues or natural phenomena. They often have an entertaining dimension too, but this is usually secondary. From a formal point of view, documentaries more often rely on off-screen narration and interviews.

Source Text analysis

When analysing your ST from the genre-related point of view, you can use the following checklist:

  • Genre is visible on many narrative levels. As you watch the film, pay attention to elements of iconography. Symbolic objects or settings and iconic items can be used to strengthen the character of the image. For example, a science-fiction film is bound to feature a space ship or a time capsule (Back to the Future (Zemeckis, 1985)), in a war film military uniforms and equipment will abound (Saving Private Ryan (Spielberg, 1998)), and in a horror movie there will be a blood-stained axe or saw (Cold Creek Manor (Figgis, 2003)). Other props may be indicative of the epoch or historical period in which the action of a film is set. This applies above all to costume films, e.g. Pride and Prejudice (Langton, 1995), where pieces of clothing or furniture situate the film in a more or less specific time period;
  • The film’s visual style or film language may be an important indication (see chapter 2.2.1 on film language). Pay attention to lighting, editing, mise-en-scène, and mise-en-shot. They are important when creating coherent descriptions that match the visual feel of the image to the narrative genre. For example, a comedy will often be shot in high-key lighting to add to the light-hearted atmosphere (Hitch (Tennant, 2005)), an action film can feature rapid shot changes and short average shot lengths to heighten the pace of the events (e.g. Déjà Vu (Scott, 2006)), a horror movie will make extensive use of off-screen space and indefinite and/or ambiguous focalisation to convey suspense and tension (e.g. The Ring (Verbinski, 2002)), and a documentary will feature many close-ups and medium shots to better focus on the specialists sitting in front of the camera during an interview (The Imposter (Layton, 2012)), or long shots to render the vastness of a landscape;
  • Some "pure" genre films will also depict certain prototypical or even stereotypical character types, who represent specific personalities and patterns of behaviour. In a crime film there will be a villain and a good person (e.g. a police officer as in The Brave One (Jordan, 2007)), a romance will feature a couple in love (The Notebook (Cassavetes, 2004)), and an adventure movie can star, for example, a group of friends setting out on a journey together (The Lord of the Rings: The Fellowship of the Ring (Jackson, 2001)).

Target Text creation

In terms of preparing the actual AD script, the ability to classify a given film as representative of a particular genre is of relative importance. Genre will be more important for determining global strategies rather than local ones and explicit references to genre in AD are rare, except in cases of intertextuality (see chapter 2.2.4). Nevertheless, establishing that your source film belongs to a specific genre can help you to set priorities. When creating your AD, you may want to consider the following checklist:

  • Determine what elements of iconography appear on the screen and observe whether they are only presented visually or are also referred to in the dialogue, off-screen narration, voice-over, etc. Decide whether you will describe the element or not (see also chapter 1 Introduction), and what strategy you will use: a general description of the element that implicitly refers to the genre (e.g. a blood-stained axe, a futuristic spaceship), or a more explicit and explanatory one if there is time. If the element is already referred to verbally, you may wish to omit it from the AD except if the item is particularly relevant for the genre of the film and you want to render it more explicit;
  • Determine what genre-specific props are used and if you know their names. What seems at first sight to be a secondary element can be an important prop in a specific genre. Decide if you want to describe these props and what strategy you will use. You can, again, simply name the object, give a generalising description, or name and explain the item. The last option allows you to provide additional information on objects that could otherwise remain ambiguous;
  • Determine what elements of the visual style or film language of the film (see chapter 2.2.1) refer to its genre. For example, a lot of the action of a crime film will take place at night, on the streets, perhaps in a bad part of town (The Brave One (Jordan, 2007)). A specific mood can be rendered through the use of lively or bleak colours (Contagion (Soderbergh, 2011)). If these features are crucial for the story, use words that faithfully render the feel of such scenes (see also chapter 2.3.1 on wording and style);
  • Determine the way characters are portrayed and see if there are any direct relations to the film’s genre. Often, villains are shot in dim lighting, surrounded by dingy settings, with their faces obscured by the shadows (The Lovely Bones (Jackson, 2009)). Decide whether you express those qualities directly, e.g. by describing the character’s movements, gestures, or facial expressions, or whether you render the relation to the genre indirectly, for example by describing the grim setting (see also chapter 2.1.1 on characters and chapter 2.1.2 on spatio-temporal settings);
  • Determine whether there are specific aspects of focalisation and shot representation relating to the genre. Unclear focalisation in a crime film may leave the audience in doubt as to who is seeing what, thereby creating suspense. Decide if and how this will be described, for example, indicating that whoever is watching can only see part of the scene;
  • For the AD of documentaries, additional decisions may have to be taken. First, you will have to determine if and where description is still needed, as much of the visual information is probably already contained in the off-screen narration or interviews. It may suffice to use text-to-speech AST, given the informative nature of the verbal part of the ST (see chapter 3.3 on AST and 3.1 on technical issues for information relating to choosing and recording voices).

Film techniques

Film language

Elisa Perego, Università di Trieste
What is film language?

The accepted systems, methods, or conventions through which a film’s story comes to the audience, are known as film language. Film language is flexible and is based on the more or less conventional quality, form and combination of shots. It serves to communicate with the audience, to guide their expectations, to shape their emotions, etc. Film language also gives a film its distinctive shape and character, i.e. its style and its aesthetic value.

Film language is the sum of a combination of various film techniques that are all used simultaneously and that can be grouped into three broad categories: mise-en-scène, cinematography and editing. Mise-en-scène refers to what is being filmed in a shot and includes setting, costume and makeup, and staging. Cinematography deals with how shots are filmed and comprises their photographic qualities, framing and duration. Editing refers to the relations between different shots, which include a graphic, rhythmic, spatial and temporal dimension. In other words, film language determines the form in which the story is told.

Film techniques can serve four different functions: a denotative function (showing what is important for the narrative), an expressive function (rendering a character’s emotions or eliciting a mood or emotion in the audience), a symbolic function or a purely aesthetic function.

Source Text analysis

Film techniques usually coexist and a careful analysis is needed to identify and isolate them and their respective meanings. Not only do film techniques show the audience what is important in an image, they can also guide or confound the viewer's expectations depending on how clearly, consistently, coherently and conventionally they are used. They can be used to generate suspense or surprise and to elicit more longstanding moods in the audience. In other words, they determine both what is told and how it is told, and are therefore just as important as the actual narrative building blocks of the film. When analysing the film language of your ST you can use the following checklist to determine what film techniques are used and what their specific meaning is.

  • Film techniques can determine what is presented to the audience in a specific shot, how it is presented and what the relations between successive shots are;
  • When a film technique determines what is presented, i.e. when it belongs to the broader category of the mise-en-scène, there are three basic possibilities. It can deal with the setting of a specific shot, i.e. how the different elements in the shot are organised. An analysis of this composition may indicate what is more important (usually central elements in the shot) and what is, both literally and figuratively, peripheral. Second, it can deal with costume and makeup. Again, these can be used to guide the audience’s attention to the most significant elements in the image. Costume, in particular, can also be used to indicate the general time period in which a film is set. Finally, mise-en-scène deals with staging, i.e. the movement and the performance of the actors (see chapter 2.1.1 on characters and action for more information on this aspect);
  • When a film technique determines how the information is presented, i.e. when it belongs to the broader category of cinematography, you will have to pay attention to three main issues, each encapsulating several meaning-making practices. First, there are the shot’s photographic qualities which comprise colour, speed of motion, lighting, camera angle and focus. In Women in Love (Russell, 1969) for instance, the bright colours of the opening scene give way to the pale and softer hues of the film's middle portion and then to the predominantly black-and-white scheme of its last section, which represents the characters' cooled ardour. In Déjà Vu (Scott, 2006) slow motion is used to render the main character’s mental reaction to the dramatic aftermath of a terrorist attack. On a more general level, focus is another technique that is used to show the audience what is most important in the image. Second, there is the framing of the shot, i.e. the technique that determines what is presented within the film frame, in other words what you see and how you see it. Framing is intrinsically linked to the various types of shots, ranging from extreme long shots to extreme close-ups. In The Shining (Kubrick, 1980), a helicopter shot opens the scene, thus emphasising the contrast between the majesty of the landscape and the insignificance of the protagonist's car. On the other hand, in Inglourious Basterds (Tarantino, 2009) Shosanna's eyes fill the entire screen in an extreme close-up, which isolates this character from the ongoing action and emphasises the emotional intensity of the moment. Finally, cinematography deals with a shot’s duration. A long take can be used to allow the audience to appreciate a certain landscape, or to reflect boredom. Short shot lengths, on the other hand, can be used to create suspense or to reflect a character’s restlessness;
  • When a film technique determines the relations between different shots, i.e. when it belongs to the broader category of editing, you have to pay attention to four different types of relations. First, the relation between successive shots can be graphic. In Memoirs of a Geisha (Marshall, 2005), a shot showing cherry blossoms being carried away by the wind slowly gives way to a following shot in which the cherry blossoms are graphically matched to snowflakes, indicating a leap forward in time. Successive shots can also be related rhythmically, when different shot lengths in a scene are combined to form a specific pattern. In action scenes for example, successive shots will become shorter to reflect the increasing suspense and to arouse tension in the audience. Third, there are so-called spatial relations between successive shots: a filmmaker can start by showing a general space by means of a general establishing shot and zoom in on a detail within this space in the next shot. Finally, shots are also related temporally: two successive shots can either follow each other chronologically, or they can constitute a flashback or flashforward.

When you have analysed what film techniques are used in a certain shot/scene you can proceed to determine their function. First of all, a technique can have a denotative function, i.e. it can be used to guide the audience’s attention to the most important elements in the frame (e.g. a woman in a white dress surrounded by men in black tuxedos). Film techniques can also have an expressive function. Specific colours can be used to reflect the mood of the characters (cf. the Women in Love (Russell, 1969) example above) or they can be used to generate a certain emotion or mood in the audience (e.g. fast editing to create suspense). Film techniques can also have a symbolic function. In Away from Her (Polley, 2006) discontinuous editing is used to symbolise the protagonist’s advancing Alzheimer’s disease. Finally, film techniques can serve an aesthetic function, for example when particular colour schemes are used because they are pleasing to the eye.

Target text creation

Having analysed the film language and the film techniques used in a given shot or scene, you proceed to create your description. However, keep in mind that most cuts from one shot to another are left undescribed in ADs, especially when scene changes do not have a particular added meaning. In a short excerpt from The Hours (Daldry, 2002) which includes five scene changes, none is made explicit in the AD, which simply juxtaposes the descriptions of the different scenes: "As the woman’s head sinks beneath the water, the man drops the letter to the floor and runs towards the back door. The woman’s body, face down, is carried by the swift current through swaying reeds along the murky river bed, her gold wedding band glinting on her finger, a shoe slipping off her foot".

First determine what category the techniques you encountered in a shot or scene belong to:

  • if a technique belongs to the category of mise-en-scène, determine whether it deals with the setting of that shot or scene, with costume and makeup or with the staging;
  • if a technique belongs to the category of cinematography, determine whether it deals with the shot’s or scene’s photographic qualities, with the framing or with the duration of the shot;
  • if a technique belongs to the category of editing, determine whether it organises the graphic, rhythmic, temporal or spatial relations between two shots or scenes.

Next determine the function the techniques serve. It is important to realise that a technique can never be dissociated from the function it serves and that this function will determine to a large extent if and how you will describe the technique:

  • if a technique serves a denotative function, i.e. when it wants to draw attention to narratively significant information, decide whether the information it highlights needs to be described, or whether this is already known from previous scenes or other channels. Costumes, for example, can indicate that a scene is set in a specific century, but this can also be signalled through a text on screen. In this last case, the describer can decide not to describe the costumes and give priority to other information, such as actions;
  • if a technique serves an expressive function, determine whether it expresses an intra-diegetic emotion or mood of one or more of the characters, or whether it wants to generate an emotion or a mood in the audience. In the case of an intra-diegetic expressive function, decide whether the emotion it wants to render can be derived from other information, such as a line of dialogue. If so, you may decide not to repeat the emotion and prioritise other information. If not, the emotion can be expressed in the description. If the technique wants to create a certain mood in the audience, the decision-making process will be somewhat different: as the technique does not give any narrative information, you will not have to decide what you will describe (e.g. “the dark colours want you to feel sad”). Rather, you will have to determine whether the mood is also generated through other channels, such as the music, and decide if you want to repeat it in your description and how: creating a certain mood in the audience through an AD can for example be achieved by using a specific type of language or by voicing (see chapter 3.1 Technical issues) the description in a way that reflects the mood created by the filmmaker;
  • if a technique serves a symbolic function, determine what it symbolises and whether the information symbolised can be derived from other channels or earlier descriptions. If it is already clear, decide whether to repeat it or give priority to other information. If it is not clear, decide if and how to include the symbolic information in the description. Again, this will imply deciding how to render this information: you can decide to explain the symbolic meaning, i.e. describe it in an explicit way, or render it in an implicit way and leave it to the audience to extract the symbolic meaning from your description;
  • if a technique serves an aesthetic function, the decision you have to make is again if and how you will render it in your TT. You can decide to focus exclusively on narrative content and leave the aesthetic function aside, or you can render the narrative content by using a specific language (see chapter 2.3.1 on wording and style) or by voicing the description in a way that reflects the aesthetic function of the technique used.

Finally, decide how you will describe the technique. Basically you can decide to name the technique (“now in close-up”), to name it and describe its function (“a close-up reflects the fear in her eyes”) or only describe the function or meaning of the close-up (“fear is reflected in her eyes”). The decision of when and how to describe a technique will also depend on the film’s (director’s, genre’s, studio’s) style. If the technique is not significant, you can decide not to describe it. If, on the other hand, a technique is very significant, occurs frequently or contributes greatly to the style, you might want to make sure that you convey that in your AD. If you need to mention the same technique more than once, use the same linguistic formulation throughout the AD text. Coherence and cohesion (see chapter 2.3 The language of AD) are important and can also be maintained in AD through a consistent use of cinematic language.


An example of cinematography from Déjà Vu (Scott, 2006) rendering ATF agent Doug Carlin’s reaction when he sees the body bags on the quay after an explosion on a ferry kills dozens of people:

  • "Now in slow motion. Doug walks past the body bags lined up on the quay." (name the technique)
  • "Now in slow motion. Taking in the emotional scene, Doug walks past the body bags lined up on the quay." (name the technique + describe its meaning/function)
  • "Taking in the emotional scene, Doug walks past the body bags lined up on the quay." (describe the function/meaning of the technique)

Another example of cinematography from The Lady Vanishes (Hitchcock, 1938): the properties of the shot, which determine the style of the film, could be described in various ways.

  • "Now, in black and white footage, a mountain top view looks down over a village nestled in foothills" (name the technique).
  • "In a 30s movie, in black and white footage, a mountain top view looks down over a village nestled in foothills" (describe the function/meaning of the technique + name it).
  • "In a 30s movie, a mountain top view looks down over a village nestled in foothills" (describe the function/meaning of the technique)

An example of editing from Nights in Rodanthe (Wolfe, 2008): a man thinks about the day that drastically changed his life. He is lying on a bed and looks at a photograph that triggers different memories. The flashback could be described in various ways:

  • "Lying on his bed, Paul studies an old photograph. A flashback" (dialogue) (name the technique)
  • "Lying on his bed, Paul studies an old photograph and he thinks back to that last surgery. A flashback" (describe the function/meaning of the technique + name it)
  • "Lying on his bed, Paul studies an old photograph as his mind starts wandering." (describe the meaning of the technique).


Sound effects and music

Agnieszka Chmiel, Uniwersytet im. Adama Mickiewicza w Poznaniu
What is film sound?

Sound in film comprises speech (in the form of film dialogues, voice-over narration and lyrics), sound effects and music. Sound effects and music may be used in film to create a mood, indicate a temporal or local setting (see chapter 2.1.2), enhance or diminish realism, create suspense or define a point of view (by manipulating the sound volume). Sound can guide the viewers’ attention and co-create the narration by highlighting specific visual elements that might otherwise seem secondary. The AD will become part of the soundtrack and rely on the information conveyed by its different components. It is important to ensure cohesion between these components (see also chapter 2.3.2 on cohesion).

Source Text analysis

In your analysis, it may be advisable to (also) listen to your film without watching the images to identify sounds that might otherwise escape you. When analysing your ST for sound to determine its usefulness for your AD, you can use the following checklist.

  • Sounds can be diegetic (i.e. belong to the reality depicted in the film) or non-diegetic (i.e. be external to it). A diegetic sound is, for instance, the whir of a car engine in the opening scene of Inglourious Basterds (Tarantino, 2009) when a military column approaches a cottage. A non-diegetic sound is usually music or an off-screen voice (e.g. of a narrator, as in Vicky Cristina Barcelona (Allen, 2008));
  • Diegetic sounds can be internal (e.g. the protagonist’s thoughts audible to the viewer) or external (audible to other protagonists, as well);
  • Sounds can be on-screen, when their source is visible on the screen, or off-screen, when the source is not visible. Steps are an example of the former when a walking protagonist is visible in the scene or of the latter when instead we see another protagonist’s reaction to the as yet invisible approaching person;
  • Sounds can be individual (e.g. the thud of an object falling) or form a soundscape (e.g. sounds of sirens, engines and walkie talkies in The Girl with the Dragon Tattoo (Fincher, 2011) when emergency services are in attendance at the scene of a car accident);
  • Sounds can be simultaneous or not simultaneous with the visuals. A sound flashback is an example of the latter, when we see a protagonist in the present and hear a conversation from the past that the protagonist is remembering;
  • Sounds can have an illustrative or symbolic function. Illustrative sounds enhance realism and are congruent with the depicted settings or objects. Symbolic sounds may, for instance, symbolise the protagonist’s dreams. In the Polish TV series Londyńczycy [The Londoners] (Zglinski, 2008), a 10-year-old boy, Stas, is visiting his mother, who works in London. He skips language classes one day and goes to see Wembley Stadium. The place is empty but we can hear football match sounds (national anthem sung by football supporters, cheering crowds) that symbolise Stas’s dreams of becoming a successful player;
  • Music in film can be instrumental or with vocals. The lyrics of the latter are usually meaningful for the narrative, especially in musicals.
Target Text creation

When analysing the sounds in your film, you can use the following checklist to determine whether and when they have to be mentioned in the AD, then how (much) to describe.

  • Regarding dialogues or speeches with very short pauses, determine if the visuals are important enough to be audio described. If not, the dialogue takes priority. Sometimes, however, the AD will be given priority, as in The 39 Steps (Hitchcock, 1935). During a public speech by the main protagonist, a woman reports him to agents who are waiting for him to finish in order to arrest him. The description cannot fit in the pauses and has to cover the speech, as the visual information is more important for the story than the words the protagonist is speaking;
  • Determine if the sound effect is easy to identify (either aurally or through a reference in a dialogue) or not. If it is not, the best option may be to name the sound or its source in the AD;
  • Determine if it is important to identify the source of the easily identifiable sound (e.g. who punches whom) or whether it is important to know just what produces the sound (e.g. a squeaking door or a squeaking bird), and decide if AD rendering the source explicit is required;
  • Determine whether the sound is diegetic or non-diegetic. If it is non-diegetic (e.g. off-screen voices), determine whether it is clear who is speaking and whether it is important to know that, and in that case decide whether a reference in the AD is required. If the sound is diegetic, determine whether it is external or internal. If it is external, use the guidelines above; if it is internal, determine if it requires a special mention in the AD, and if in doubt, identify the sound;
  • If the sound is individual and requires AD (because it is difficult to identify unequivocally: see above), determine when to describe it. Sometimes there will be no choice because a pause will be available only before or only after the sound. According to some, it is better to offer AD after the sound in order to maintain a dramatic effect. According to others, it is better to offer AD before the sound because it will help the audience to recognise the sound more quickly and reduce the effort required. Decide in each instance, on the basis of the context, what works best (see also chapter 2.3.2 on cohesion);
  • Determine the function of the sound. If it is illustrative, decide if it must be mentioned in AD explicitly and how much description it requires (e.g. in the case of the sounds characteristic of a restaurant, the AD of the spatial setting of the scene might be sufficient, such as "in a restaurant". Alternatively, you may decide to add more details, "in a restaurant, people laughing and talking"). If the sound is symbolic and not congruent with the visuals, decide how much AD is needed to highlight the incongruence;
  • Determine if the sound is non-simultaneous and decide if it requires an explicit mention in the AD. If you decide to mention a sound flashback in AD, you can do it either directly (e.g. "a flashback" or "in a flashback") or indirectly (for instance by describing it as the protagonist reminiscing);
  • If dealing with music with vocals, determine how meaningful the lyrics are and whether describing the visuals is important for the narrative. If the lyrics are not meaningful and the visuals not important, decide how much description is needed. If the lyrics are not meaningful and the visuals are important, describe what is happening during the song. If the lyrics are meaningful and the visuals important, decide to what extent the AD may overlap with the lyrics. Use either synchronous description (i.e. describe the visuals as they appear on the screen) or asynchronous description (i.e. describe them earlier or later so as to leave the lyrics of the first verse and the first chorus audible).


  • “Charles rolls the door closed.” (describe the source of a non-recognisable sound)
  • “Andy chuckles at the similarity of the two blue belts.” (name the sound and source required in a group scene)
  • “Shouts off-screen.” (identify off-screen sounds)
  • “Jamie’s thoughts.” (identify an internal diegetic sound)
  • “Mary looks at a photo of her son.” (identify an internal diegetic sound: the protagonist’s thoughts about her son)
  • “She looks through the window, deep in thought. A flashback.” (directly refer to a sound flashback)
  • “She frowns as she remembers an earlier conversation with John.” (indirectly refer to a sound flashback)

Text on screen

Anna Matamala and Pilar Orero, Universitat Autònoma de Barcelona
What is text on screen?

Text on screen refers to any type of written text that appears on the screen. Text on screen includes opening credits and end credits, titles, intertitles, and other superimposed titles. Other non-diegetic elements such as logos, and diegetic elements which are part of the scene (a letter, a text message or a wall poster, for instance), may also contain written language. Subtitles can also be considered as text on screen: they can either appear as part of the original film (especially in multilingual productions) or they can be a translation of original dialogues for other audiences (see chapter 3.3 on AST).

Source Text analysis

When analysing your ST, you can use the following checklist to identify the function and relevance of each text on screen.

  • Text on screen can belong to various categories and fulfil different functions. Determine what type of text on screen you are dealing with (opening or end credits, title, intertitles, other superimposed titles or inserts, logos with text, text on other diegetic supports, subtitles, other) and its function and relevance within the narrative. For instance, opening credits may be used not only to list the main film crew members but also to convey additional intertextual meanings (see chapter 2.2.4 on intertextual references) through a specific typeface. The opening credits of Sin City (Miller et al., 2005), for instance, copy the typical typeface of comic books. Superimposed titles may be used to identify the name of a character on screen;
  • Text on screen can be part of the action (diegetic) or may have been added later, during the editing process (non-diegetic);
  • Text on screen can bring in information which is unique or redundant. For instance, a superimposed title may be referring to a spatio-temporal setting (see chapter 2.1.2) that a dialogue or an off-screen narrator already makes explicit, hence providing the same information via an oral and a written channel;
  • Inserts or superimposed titles can help identify many elements such as the spatio-temporal settings or a character (see chapter 2.1.1). For instance, in the film Inglourious Basterds (Tarantino, 2009), an intertitle reads "Once upon a time… in Nazi occupied France" and is followed by a caption that says "1941". In the same film, another caption helps the audience identify a character when she is shown as an adult for the first time ("Shosanna Dreyfus. Four years after the massacre of her family").
Target Text creation

Having analysed a given text on screen, you proceed to create your description and may consider the following elements.

  • Determine what sort of visuals and meaningful sounds and/or lyrics appear simultaneously, if any, and determine if the information provided by the text on screen is already offered by other means (for instance, dialogue). Based on the previous elements, decide whether or not it is necessary to render the text on screen orally in your AD and, if so, decide what strategy you will use to render it, e.g. literal rendering, or a (condensed) paraphrase. Additionally, establish whether you will render the text on screen synchronically, before it actually appears on the screen or afterwards, taking into account the available silent gaps (see also chapter 2.3.2 on cohesion);
  • Determine whether or not the typographical features used have a narrative function or contain an intertextual reference (see chapter 2.2.4), and decide whether they should be included in the AD;
  • Decide how you will indicate that text on screen appears. Possible strategies include adding a word or an explanation before reading it ("A caption reads"), changing the intonation, using another voice (male/female), using earcons (sound indicators) or integrating the content in the AD (for instance, "In 1941, a man…").

The description of subtitles or AST is discussed in much more detail in chapter 3.3.

Finally, keep in mind that certain countries have laws and regulations concerning the use of credits and logos. This is particularly important when a certain text on screen, such as credits, is left undescribed or is paraphrased.


An example of a logo from The History Boys (Hytner, 2006)

  • “Against a background of the hills and street lights of Los Angeles, searchlights illuminate a huge art-deco sign of sculptured letters spelling out Fox Searchlight Pictures, a News Corporation Company”. (a literal rendering with description and announcing that it is text on screen)
  • "Against a background of the hills and street lights of Los Angeles, the logo of Fox Searchlight Pictures, a News Corporation Company appears" (literal rendering and announcing that it is text on screen)

An example of a title from A Lot Like Love (Cole, 2005)

  • “White and yellow titles appear against a black background. A Lot Like Love” (a literal rendering with description and announcing that it is text on screen)
  • "A Lot Like Love" (simple literal rendering)
  • "Text on screen: A Lot Like Love" (name and announce text on screen)

Examples of opening/end credits from The Ladykillers (Coen & Coen, 2004) and Finding Neverland (Forster, 2004)

  • “Touchstone Pictures presents Tom Hanks, Irma P. Hall, Marlon Wayans, J.K. Simmons, Tzi Ma, Ryan Hurst, George Wallace, Diane Delano, Stephen Root, Greg Grunberg in The Ladykillers” (a non-synchronic literal rendering)
  • “Miramax Films presents a Film Colony production. Johnny Depp, Kate Winslet, Julie Christie, Radha Mitchell and Dustin Hoffman. Based upon the play The Man Who Was Peter Pan, by Allan Knee. And inspired by true events”. (a non-synchronic condensed rendering)

An example of superimposed titles, from Finding Neverland (Forster, 2004)

  • “The curtain rises, it is London, 1903” (content of the caption integrated in the AD)
  • “A caption (reads): London, 1903” (explicit reference to text on screen)
  • “London, 1903” (literal rendering, possibly with different intonation)

An example of a diegetic text on screen from The Counselor (Scott, 2013)

  • “A motorcycle passes a road sign (reading): EL PASO. City Limit. Pop. 665568” (explicit reference to text on screen)
  • “At a very high speed, a motorcycle enters the city of El Paso.” (integration in the AD)
  • “EL PASO. City Limit. Pop. 665568” (literal rendering, possibly with different intonation)


Intertextual references

Chris Taylor, Università di Trieste
What are intertextual references?

Intertextual reference refers to the fact that practically all texts, including films or television programmes, contain elements that can be traced to other texts. Readers’ or viewers’ understanding of one text therefore not only depends on the world knowledge that they bring to it, but also on their knowledge of how texts work and on their knowledge of specific texts. Text producers often include more or less explicit references to previous texts deliberately to generate an additional layer of meaning, which is activated by the readers or viewers who recognise the link. For the blind and visually impaired audience, these links may not be immediately accessible.

In the case of screen products, intertextuality is to be found in both aural and visual form, sometimes in a combination of the two. In all cases a relation is established between the marker, a textual allusion or reference in the text being read or viewed, and an element or elements "alluded to" or "referred to", the marked, in another text or series of texts. Film viewers who spot the marker of the allusion will relate it to their knowledge of other texts, and, more specifically, to the text with the marked element. Noticing the reference gives viewers a form of "intellectual" pleasure. The importance for audio describers is that they may need to enhance such connections.

Source Text analysis

A thorough analysis of your ST may reveal the presence of aural (e.g. music, lyrics, dialogue) and/or visual (e.g. shots, mise-en-scène, film techniques) intertextual markers. References may be to an extra-filmic item (e.g. a book), a film genre, or a specific film. You will have to decide how important the links are and whether or not you want to assist the visually impaired audience in recognising them. For suggestions on how to deal with references to extra-filmic space and time, see also chapter 2.1.2.

Aural references

Verbal references, for instance in the dialogue, may be accessible to visually impaired audiences and may therefore not have to be described. For example, verbal intertextuality with reference to another film and book can be seen in the famous line "The name is Bond, James Bond", which appears in all the 007 books and films. Another example involving famous sayings appears in Col. Landa’s comment to Aldo Raine in Tarantino’s Inglourious Basterds (Tarantino, 2009): "Lt. Raine, I presume", referring to Henry Morton Stanley’s "Dr. Livingstone, I presume?". This is a well-known line with which the audience is presumably familiar; however, that will not always be the case.

Musical allusions may also be important. Some musical accompaniments re-occur in a series of films, e.g. the James Bond theme, and do not present problems, but the connection may also be between two films of different types, separated in time and harder to spot. Even so, the challenge facing the sighted audience is similar in such cases, except if their interpretation is supported by visual markers.

Visual references

When dealing with visual references, it is important to determine what the visual marker is and how it refers to the marked. For example, visual intertextuality can be seen in parodies. In the film Love Actually (Curtis, 2003), Hugh Grant is clearly meant to represent Tony Blair as his car pulls up outside No. 10 Downing Street and his clothes and mannerisms ape the erstwhile British Prime Minister. This is an extra-filmic reference to "real life". However, references to other film genres and to scenes from previous movies are also common. For example, in the cartoon series Family Guy, the aeroplane scene from Hitchcock’s North by Northwest (Hitchcock, 1959) is re-evoked in animated form. In addition, specific visual elements can also have an intertextual function, for instance, in a feature film that places "real" characters in a comic book setting, such as Sin City (Miller et al., 2005).

Visual/aural references

In the case of intertextuality involving both aural (verbal) and visual referents, there is always a form of interaction between the two modes. An example of visual-verbal intertextuality with reference to an entire film genre occurs, for instance, in an episode of the comedy series Friends (episode 283, 1995), when Chandler finds his pal Joey dressed in a cowboy outfit, and greets him with "Howdy", harking back to hundreds of classic westerns.

When analysing your ST for any of these forms of intertextuality you can use the following checklist to determine the nature and role of the intertextual reference in order to decide whether it is desirable to make the link between marker and marked more explicit in your AD:

  • determine the reference (the marker and marked);
  • determine the type of reference (aural, visual, aural-visual);
  • determine whether the marker (or all of the marker) is accessible for your audience (e.g. is visual information also repeated in the dialogue?);
  • determine how important the reference is for your film, in terms of genre (e.g. is the film a comedy that relies on parody?) or for the understanding of the narrative;
  • determine how well-known the reference is (e.g. a "classic" Hitchcock film);
  • determine if there is time for additional information between the dialogues.


Target Text creation

Limited research indicates that AD tends to explicate the marker in order to enhance its recognisability rather than to make the link between marker and marked explicit unless the film or scene becomes incomprehensible without it. However, having determined the types of intertextuality that occur in your film, you can now make a number of different context-based decisions regarding the best-suited strategy.

Aural references
  • Decide whether the marker needs to be enhanced or not, since purely aural references rely on a channel that is also accessible to your audience.
  • Decide whether or not a brief reference to the speaker or genre referred to ("Speaking like a cowboy") is a good option, if a reference involving a more or less literal citation or register is important for your film.
  • Decide if it may be desirable to render an aural reference more explicit for didactic purposes. For instance, (visual)-verbal intertextuality linking literary texts to films can be seen in the title of the Coen Brothers’ film No Country for Old Men (Coen & Coen, 2007), which comes from a poem by W. B. Yeats ("Sailing to Byzantium"). It may be possible to add a reference to the poem during the reading of the film credits. In doing this, you will, however, be giving the blind audience an edge over the sighted audience and depriving them of the pleasure of finding out for themselves.
  • Decide whether the language(s) used in your film constitute an extra challenge. Foreign language films or multilingual films may require a special treatment if the aural reference is rendered not through specific and recognisable phrasing ("Lt. Raine, I presume") but through accent, and the dialogues of your film are rendered through AST, read by a voice actor (see chapter 3.3 on AST). If, in that case, the film actor’s speech (marker) mimics, let’s say, a famous politician (marked), you may opt to specify: "Speaking like Obama", followed by the audio subtitle.
Visual references and aural-visual references

If you have determined that the marker in your ST cannot be retrieved aurally and the other conditions (cf. above) are fulfilled, you may decide to render the intertextual link more explicit. In cases of verbal-visual intertextuality (involving dialogue) or aural-visual intertextuality (involving music) make sure your AD covers the information from the visual channel that is lost to your target audience. The following checklist may be useful for TT creation.

  • Decide whether you want to preserve your target audience’s intellectual pleasure in detecting the reference: enhance only the marker;
  • Decide whether you want to give priority to rendering meaning more explicit: link marker and marked;
  • Decide whether you want to reckon with the didactic function of AD: link marker and marked.

Different degrees of explicitness are always possible. The one you choose is linked to your interpretation of the ST analysis checklist above.


An example of verbal intertextuality:

  • "Medicare is no longer working!" (leave it as it is)
  • "Medicare is no longer working!" with an enhanced AD: "Speaking like an American politician."
  • "Medicare is no longer working!" Linking marker and marked in the AD: "Speaking like Obama."

Less explanation may be required if a fictional character says "Yes, we can!", since this is a more explicit reference to Obama.

Another example of verbal intertextuality from Inglourious Basterds (Tarantino, 2009):

  • "Lt. Raine, I presume" (leave it as it is)
  • "Lt. Raine, I presume" with enhanced AD: "Mimicking colonial British intonation"
  • "Lt. Raine, I presume" Linking marker and marked in the AD: "Mimicking African explorer Stanley’s address ‘Dr. Livingstone, I presume?’"

Examples of visual intertextuality from Stand Up Guys (Stevens, 2012):

  • “Al Pacino and Christopher Walken come out with their guns ready” (simple description)
  • “Al Pacino and Christopher Walken come out with their guns ready to face hundreds of armed law officers” (enhanced AD)
  • “Al Pacino and Christopher Walken come out with their guns ready to face hundreds of armed law officers, as in the final scene from Butch Cassidy and The Sundance Kid”. (AD linking marker and marked)

Examples of verbal/visual intertextuality from Hitchcock (Gervasi, 2012)

  • "Unfortunately I find myself once again bereft of all inspiration. I do hope something comes along soon." (A BIRD LANDS ON HIS ARM)
  • “A bird lands on his arm” (simple description assuming audience sees the connection with Hitchcock’s film The Birds (Hitchcock, 1963))
  • “A bird lands on his arm, as a sign of things to come” (enhanced AD)
  • “A bird lands on his arm, alluding to his next film The Birds” (AD linking marker and marked)


The language of AD

Chris Taylor, Università di Trieste

Wording and style

What is wording and style?

Wording refers to the ability to choose the right words in the right places and to use them in the appropriate style in a given context. It is connected to how an author produces the most appropriate turn of phrase or finds the best way of "putting it". Style is the result of the word choice of authors, along with their choice of sentence structure and the appropriate use of figurative and idiomatic language. AD requires attention to both wording and style in order to fulfil its aim of making the visual both verbal and understandable to a blind or visually impaired audience.

Source Text analysis

An AD is a text type with its own features regarding wording and style that distinguish it from other texts. These features are determined by the following characteristics:

  • The wording and style of an AD will depend on the time constraints imposed by the dialogue, musical background and other possible sounds and on their connections with the AD. This is discussed in chapter 2.3.2 on cohesion;
  • Even if the AD script is prepared in written form, it is meant to be spoken and listened to. Such texts have different requirements than written communication in terms of sentence length, structure and vocabulary.

The particular features to be found in a particular AD, however, will depend on the contextual nature of the ST. So, when analysing your ST, use the following checklist to determine what needs to be taken into account.

  • Determine which genre the ST belongs to. Genres are defined in part by their typical, even obligatory, features. For more see chapter 2.1.3 on genre;
  • Determine whether the time period and the place in which the film is set require a particular style or, indeed, whether the year the film was made requires such considerations (should a remake of an old Hercule Poirot movie, for instance, be described in the same style as crime series like N.C.I.S.?);
  • Determine whether the film maker has imprinted a certain style on the work and how this is expressed. Some directors adopt a particular visual style, such as Quentin Tarantino. Others are known for their verbal idiosyncrasies (think of Woody Allen);
  • Determine who your audience is likely to be. Audiences of long-running TV series have different backgrounds and expectations than audiences for an art-house film, particularly a foreign film also requiring audio subtitles (see chapter 3.3 on AST). Children are a special case, as well.
Target Text creation

After you have identified the contextual features of your ST, decide on an adequate wording and style. First, keep in mind the following general features of AD regarding lexis, grammar and syntax, that form the framework around which an AD is worded.

  • Clear language and concrete vocabulary, unencumbered by jargon, unnecessary pomp and obscure vocabulary help with information processing and visualisation;
  • Precision and detail can be expressed by the use of colourful adjectives and adverbs or adverbial phrases (see example from Hero (Zhang, 2002), below);
  • A vivid language engages the listener and can be expressed, for instance, in verb variation (chat, gossip, confer rather than just talk, depending on the context);
  • The visual nature of a film can be reflected in the use of verbs of movement and simile, metaphor or other figures of speech (see example below);
  • Descriptions are usually written in the present tense; past tenses are limited to referring back to previous descriptions (as in Memoirs of a Geisha (Marshall, 2005): “She stops in front of the shrine and drops in the coins the man had given her as an offering”);
  • Descriptions predominantly use third person pronouns, as they reflect the voice of an omniscient narrator. Second person pronouns occur, for example, in indirect speech for AST (see chapter 3.3 on AST) or descriptions of gestures (as in the AD of the Dutch film Aanrijding in Moskou (Van Rompaey, 2008): “Johnny gestures to Werner, go ahead, Werner gestures, no, you go first.”);
  • Time limitations and the need for an intelligible style promote the use of short sentences; a pace of 160 words/minute is a good starting point. Simple sentences, for instance, are more frequent in AD than subordination. Simple noun phrases with no verb at all are also common for describing time and spatial settings ("Later that night", "London, 1887") or to pinpoint objects ("A military ambulance", "A full shopping basket"). If time permits, however, more variation in sentence structure can be pleasant and engaging. Subordination in AD is often expressed through conjunctions of time like "as" and "while", or through the use of non-finite clauses ("Dressed in blue overalls…", "Walking with a limp…");
  • The order of information in the sentence influences how easily it can be processed. Starting with known information in sentence-initial position provides the best framework for what follows, but starting with new information in sentence-initial position, on the other hand, can draw attention to a specific element (“With a knife in his hand, he walks up to her”). For more on information order see also chapter 2.3.2 on cohesion;
  • Consider the spatio-temporal logic of the image. Settings and characters, for instance, are usually described from the general to the specific, from distant to near and from left to right, but specific settings or film techniques might require alternative approaches (such as the zooming out in the opening sequence of It’s Complicated (Meyers, 2009): “The red tiled roof of a large Mediterranean-style house with white walls. Palm trees grow around the house, which sits by the ocean.”).

Then, on top of these basic elements regarding lexis, grammar and syntax, decide on wording and a style that suit the nature of your specific ST:

  • Decide what additional lexical choices are imposed by the genre, time and place of the story. Different genres mandate a specific jargon, such as Westerns, crime series or sci-fi films. Some ADs adopt creative strategies, such as East is East (O’Donnell, 1999) and Sexy Beast (Glazer, 2000), which adopt the Northern variant of British English that is also spoken by the characters;
  • Decide how the style of the film maker can be reflected in the style and wording decisions in the AD. Strategies can include the incorporation of film language (see chapter 2.2.1 on film language), the use of an Audio Introduction to describe filmic features or explain terminology (see chapter 3.2 on AIs), the use of erudite or literary language to reflect the aesthetics of the image (as in the example of Hero (Zhang, 2002) below);
  • Decide if and how wording and style can be adapted to the specific target audience. In the case of long-running TV series, a simple factual description may suffice whereas you may decide to use a higher register for more serious products. Children are a special case and require short, simple sentences with undemanding, though colourful, vocabulary.


Example from The English Patient (Minghella, 1996) of frequent AD features (simple sentences, present tense, third person pronouns):

  • “A young French-Canadian nurse, Hana, adjusts the belt of her uniform. She walks into a carriage where wounded soldiers lie. She stops beside a young man. She bends over him. She moves on between the bunks. She joins her colleagues.”

Example from The Hours (Daldry, 2002) of a more complex style (subordinate structures and higher register for literary film):

  • “Later, the little girl, Angelica, fairy wings tied to her dress lays the dead sparrow into the nest her mother and two brothers have made at the base of a large tree. Virginia sits down beside her with a bunch of freshly cut yellow roses in her hands.”

Example from Spy Kids (Rodriguez, 2001) of succinct and precise language where time is limited and, immediately afterwards a more thorough description where time permits:

  • “He presses a button. A keyboard flips up.”
  • “The shelf unit over his desk slides away to reveal a large screen. At her dressing table Ingrid presses some buttons, converting the mirror to a computer screen.”

Example from Hero (Zhang, 2002) of visual storytelling, simile and metaphor and complex sentence structures:

  • “Nameless vaults onto the roof of the school building. Snow notices and spins round, the white sleeves of her robe spreading like wings, levitating her on the roof beside him. Indoors the calligraphy master and his students carry on writing. In his room, Broken Sword dips a long-handled brush into a pail of red ink and begins to paint the pictogram he’s been practicing. On the roof, side by side, in a wild, swirling ballet, Nameless and Snow strike out with sword and billowing sleeves and the King’s arrows fall harmlessly around them. Holding the brush in both hands, Broken Sword swirls it across a huge white banner on the floor, his feet dancing, his long black hair flying. As the onslaught continues, Snow and Nameless spin and leap across the arrow-strewn roof in continuous motion.”


Chris Taylor, Università di Trieste
What is cohesion?

Cohesion is a textual property that helps the receiver of a message to understand it with reasonable ease and find continuity of sense in it. It refers to the implicit and explicit links that hold together the different parts of the text. In the case of films and series, these links exist between words and sentences, but also between the spoken dialogue, the visual elements and the sounds and musical score. In some texts cohesion is abundant, in others more sparsely arranged. Without cohesive links, a text is difficult to follow and text receivers have to rely more on their background knowledge and inferences based on other textual cues to make sense of it. The importance of cohesion for audio describers is that they need to recreate these textual links in their description and must make decisions as to where to insert their descriptions to further support cohesion.

Source Text analysis

When analysing your ST to identify relevant cohesive links, you can use the following checklist of features to identify their type and nature.

  • Links can be intramodal, when they exist between signs of the same semiotic mode: between visual elements, between sound elements or between verbal elements. The opening lines of American Beauty (Mendes, 1999) are an example of verbal cohesive links with personal pronouns: "My name is Lester Burnham. This is my neighbourhood; this is my street; this is my life. I am 42 years old. In less than a year I will be dead. Of course I don't know that yet, and in a way, I am dead already";
  • Links can be intermodal, when they exist between signs belonging to different semiotic modes. Film dialogues, for example, are linked to on-screen characters and settings as in the example from American Beauty (Mendes, 1999), where the repeated pronoun "this" refers to the on-screen houses and tree-lined streets. Dialogues frequently refer to sounds or link up with facial expressions and gestures (also in American Beauty (Mendes, 1999), Lester Burnham’s neighbour at one point yells "Hush Bitsy", pointing an accusing finger at his barking spaniel). Films are also filled with realistic sounds produced by visual cues and actions, such as the ringing of an alarm clock on a bedside table. Another example is music underscoring on-screen action (in In the Name of the Father (Sheridan, 1993) a chase through the streets of Belfast is accompanied by heavy Jimi Hendrix rock music) (see also chapter 2.2.2 on sound effects and music);
  • Intra- and intermodal links can be explicit, such as the repeated use of pronouns and demonstratives in the American Beauty (Mendes, 1999) example. Film techniques are also used for explicit visual linking, such as the split screen in a telephone conversation in Down with Love (Reed, 2003) (for more see chapter 2.2.1 on film language). The co-occurrence of certain (illustrative) sounds and images is another example, such as a character climbing a staircase combined with the sound of footsteps (see chapter 2.2.2 for more on sound);
  • Intra- and intermodal links can also be implicit. Two subsequent sentences imply that one happens after the other ("Mary hurries into the bedroom. She stops and listens for a sound. She looks concerned"). The mere co-occurrence of two non-verbal signs in film often implies a link, too. In Bride Flight (Sombogaart, 2008), for instance, an older man in a hat stands in front of a picture of a younger man, in the same position, with a similar hat. This composition suggests that the man in the picture is a younger version of the old man. The identification of such implicit links is highly dependent on the receivers’ interpretation and contextual knowledge;
  • Cohesive links can be congruent or incongruent. Contradicting signs, for instance, are frequent for comedy or suspense. In Grown Ups (Dugan, 2012), a group of women starts laughing when a young stud with a six-pack starts speaking in an unexpectedly squeaky voice. Symbolic sounds or sound flashbacks can create incongruence with the image as well (as in the example from chapter 2.2.2, where cheering crowds are heard in an empty stadium).

In your analysis, focus on those links that need to be recreated in the AD to ensure that the "continuity of sense" is maintained. Intermodal links can be translated into links between description and sound or dialogue. Intramodal visual links can be translated into cohesive links between blocks of description. Exclusively aural links (between dialogues and sounds, or dialogue and music) are likely to be directly accessible to your target audience.

Target text creation

After you have identified the relevant cohesive links in your ST, determine whether they have to be recreated in the description or not (see also chapter 1 introduction on this decision-making process). Keep in mind that an essential part of a cohesive description is identifying speakers and sources of unclear/contradicting sounds (see chapter 2.2.2 on sound effects and music).

Next, determine whether the link in the ST is implicit or explicit. If it is an implicit link, decide whether it has to be rendered more explicit in the description or not. In some cases explicitation can be an appropriate strategy, particularly for intramodal visual links that would otherwise demand too much inference from the target audience (in the example above from Bride Flight (Sombogaart, 2008) the Dutch AD renders the link between the man and the picture explicit: "(…) the man stands in front of the poster with his much younger image on it"). But keep in mind that explicitation can become patronising and risks giving away too much information. Next, proceed to formulate your description. The following checklist can help you decide on an appropriate strategy.

  • Decide if your description will precede or follow the dialogue line, sound or music event it refers to, keeping in mind that the closer two connected elements are, the clearer their cohesive link is for the audience. Preceding descriptions set up events in the audience’s mind and make it easier to make the link with the ensuing sound, music or dialogue. Descriptions that follow their referent, however, contribute to suspense, surprise and comic effect;
  • Decide whether to describe synchronously or not. It is preferable to describe as synchronously with the images as possible, as partially sighted audiences will use specific visual cues (such as colour or brightness) to complement the description. In some cases, however, it will be more appropriate in terms of cohesion to anticipate a visual element whereas in others it will be more appropriate to describe it when it has already disappeared from the screen. In Loft (Van Looy, 2008), for instance, the sound of police sirens is described right before it gradually becomes louder and as such becomes narratively important, rather than when it appears for the first time in the background. The hat of a man can be described when he tips it and greets someone, rather than when the man (and his hat) first appears on screen.
  • Words and expressions from the same semantic field, that comply with the genre, visual style and the spatio-temporal setting of the film (such as synonyms, hyponyms, hyperonyms and metaphors, see also chapter 2.3.1 on wording and style) can help audiences identify links through inference more easily. The words "peak hour", "urban" and "skyscraper" disambiguate the sound of car horns, hasty footsteps and cell phones, so that these sounds do not necessarily require a (detailed) description;
  • Elements that re-occur regularly throughout the film are more easily identified when their description is consistent (think of the users’ mental model of the story discussed in chapter 1). Decide on a few characteristic formulations, keeping in mind that repetition is more cohesively tight than synonymy. Noun phrases to refer to locations are a case in point: "Back to the bedroom", "In the bedroom again."
  • Personal and demonstrative pronouns are a common cohesive device. In the case of AD they are appropriate within an uninterrupted block of description, but can cause confusion when the description is interrupted by sound (including long silences), music or dialogue. In the example below, from The English Patient (Minghella, 1996), the name "Hana" is used to identify the woman in question on her first appearance, to be replaced by "she" in the next sentence, but the name is repeated in the third sentence following a piece of dialogue that has interrupted the reference chain: "The young nurse, Hana, now in army fatigues, is making her way towards a hospital tent. Later she walks between the rows of wounded with her American friend, Jan. A hole has been ripped in the library wall. (DIALOGUE) Hana stumbles over piles of books scattered underfoot";
  • Visual elements or sounds that are referred to in dialogues by pronouns can best be included in the AD if they cannot otherwise be identified (e.g. "this" in the first example above from American Beauty (Mendes, 1999)).
  • As cohesion is strongest when the description remains close to the sound, the dialogues or the speaker it refers to, carefully consider the information order in the sentence. The subject (usually at the beginning of a sentence) provides the context of the sentence, so it is the preferred position for framing information, such as speakers or sources of sounds ("Paul reads the Spanish medical textbook" from Nights in Rodanthe (Wolfe, 2008), or "A police van speeds along a city street" from Ransom (Howard, 1996)). However, sometimes prominent information must appear at the end of the sentence, so as to be close to the ensuing dialogue line or sound event or because it is the most important information that must remain in the listener’s mind ("He's somersaulted away from the pavilion, spiralling down into the water" [WATER SPLASHING], from Hero (Zhang, 2002)). Sometimes you may have to resort to alternative phrasings, such as passive constructions to move parts of sentences to the back and enhance cohesion ("Jane's eye is caught by a sexy young brunette woman" from It’s Complicated (Meyers, 2009)), or sentences opening with an adverbial or prepositional phrase ("With a wave, Jane walks off").
  • Conjunction is another common cohesive device and in AD temporal cohesion is essential for the narrative. Frequent devices are:
    • Parataxis, i.e. the succession of simple clauses for a series of actions, especially when time is a factor. As in: "Mary hurries into the bedroom. She stops and listens for a sound. She looks concerned. The room suddenly and spookily grows dark and Mary sits on the bed."
    • To make your description more varied, however, you can resort to conjunctions and adverbs of time to link phrases ("In the projection room reel four plays on, next to the dead bodies of Shoshanna and Zoller. On the screen, Zoller keeps on shooting. Hitler is overjoyed when Zoller makes three more victims. Then, Zoller looks into the camera", from the unpublished AD of Inglourious Basterds (Tarantino, 2009)).
    • To indicate simultaneity of action, English describers often resort to the conjunctions "as" and "while" or the present continuous ("Snow notices and spins round, the white sleeves of her robe spreading like wings, levitating her on the roof beside him", from Hero (Zhang, 2002)).

Finally, cohesion is all about striking the right balance. Synchronising the description with actions, settings and sounds, and at the same time avoiding overlap with the film dialogue and the soundtrack, is often tricky. But this is the key to effective AD.

Information on the AD process and its variants

Technical issues

Bernd Benecke and Haide Völz, Bayerischer Rundfunk

What technical issues are relevant?

The purpose of this chapter is to give some insight into the importance of technology-based steps in the AD process, which is useful for AD scriptwriters. It is not meant to allow the audio describer to carry out all these tasks, but to make them aware of the whole process, which may influence the text and improve the product. Technical issues in the AD production process (see chapter 1 for an overview of the process) are:

  • Formal things that have to be kept in mind when writing the AD script;
  • Formal things that have to be kept in mind for the editor of the AD script;
  • Technical aspects that are important for voicing and recording the AD;
  • Technical aspects that are important for the sound editing of the recorded AD;
  • Technical aspects that are important when the recorded AD is mixed with the original soundtrack.

In this chapter we will focus on the writing of the AD script and the issues that play a role during the process of transforming the written AD script to an audible AD soundtrack.

Target Text creation

As you may write AD scripts for different clients, make sure that in every case your deliverable contains all the information the client wants (in addition to the AD script) in the client’s preferred format. The client may also tell you in which mode of operation to write your text: alone or in a team with sighted and blind or visually impaired colleagues.

In some cases you will be expected to work with specialised AD software, but often subtitling software is used, and in some cases you will simply use the time codes (usually consisting of four pairs of digits showing the hour, the minutes, the seconds and the frame number of the actual image) generated by the software you use to watch the film. Be aware that some popular video players, such as VLC, may jump when you fast-forward or rewind, so that the same image shows a different time code when you return to it.
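As an illustration of the four-pair time code described above, the following minimal Python sketch splits a time code into its parts and converts it to seconds. The colon-separated layout, the function names and the frame rate of 25 frames per second are assumptions made for this example only; check the actual format and frame rate with your client or your software.

```python
import re

# Assumption for this sketch: 25 frames per second. The guidelines do not
# prescribe a frame rate, so adapt this value to your source material.
ASSUMED_FPS = 25


def parse_timecode(tc: str) -> tuple:
    """Split an 'HH:MM:SS:FF' time code into (hours, minutes, seconds, frames)."""
    match = re.fullmatch(r"(\d{2}):(\d{2}):(\d{2}):(\d{2})", tc)
    if match is None:
        raise ValueError(f"not a four-pair time code: {tc!r}")
    return tuple(int(group) for group in match.groups())


def timecode_to_seconds(tc: str, fps: int = ASSUMED_FPS) -> float:
    """Convert a time code to seconds from the start of the film."""
    hours, minutes, seconds, frames = parse_timecode(tc)
    if minutes >= 60 or seconds >= 60 or frames >= fps:
        raise ValueError(f"time code out of range: {tc!r}")
    return hours * 3600 + minutes * 60 + seconds + frames / fps


if __name__ == "__main__":
    # 1 hour, 14 minutes, 12 seconds, frame 0
    print(timecode_to_seconds("01:14:12:00"))
```

Comparing the converted values for the same image before and after you jump around in a film is one simple way to notice a player that reports shifting time codes.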

Your AD script has to fit in between the dialogues, main music and sound effects. To make the recording efficient, the writer of the AD script is required to measure the time available for the AD. This is called "spotting" (you may measure the length of the gap between the dialogues and effects with a stopwatch or simply test whether your sentence fits by reading it out loud). In some cases you may want to go over a sentence of the dialogue or some music or sound effects (which is possible for mixed AD, but more problematic for AD in cinema or live AD in theatre or opera, as the sound volume cannot always be adapted in the latter cases).
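A rough first check during spotting can be sketched in a few lines of Python. The sketch below assumes the pace of 160 words per minute mentioned as a starting point in chapter 2.3.1 and counts whitespace-separated words; the function names are hypothetical, and the estimate does not replace reading the sentence out loud against the actual gap.

```python
# Starting-point pace taken from the wording-and-style chapter; the real pace
# depends on the voice talent, the language and the style of the film.
WORDS_PER_MINUTE = 160


def estimated_duration(text: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate the reading time of a description in seconds."""
    word_count = len(text.split())
    return word_count * 60 / wpm


def fits_gap(text: str, gap_seconds: float, wpm: int = WORDS_PER_MINUTE) -> bool:
    """Return True if the estimated reading time fits into the measured gap."""
    return estimated_duration(text, wpm) <= gap_seconds


if __name__ == "__main__":
    description = "He presses a button. A keyboard flips up."
    print(estimated_duration(description))   # eight words at 160 words/minute
    print(fits_gap(description, gap_seconds=2.5))
```

If the estimate exceeds the gap, you can shorten the description, mark the block for faster delivery or, where the mix allows it, let the AD run over some music or sound effects.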

An AD script that has been spotted usually looks like a long series of short paragraphs, or even single sentences, which may contain the following elements (see the example of an AD script in appendix 4.1):

  • Time code: The time code is used to indicate when the voice talent reading your AD must start speaking. You may ignore the frame number, but make a note of the minutes and seconds. For a longer film you may have to write down the hours too. Some of your clients may want the outtakes written down in the AD script as well;
  • Some clients like to work with keywords. The purpose of a keyword is to indicate when the voice talent reading your AD must start speaking. It can be the last sentence said in the film dialogue or music or a sound effect that is your starting point for the AD text. If there is no dialogue or sound effect preceding the AD, the keyword will be left out;
  • You may have a legend at the beginning of your script where you define your instructions, e.g. an F for "fast" in front of a block of description, when the voice talent has to speed up a little;
  • If you want the voice talent to steer clear of a sound effect or some music or short dialogue, in order to leave it audible, this can also be indicated in the AD script, in brackets, for instance. And note, on the other hand, where the voice talent has to go over the dialogue or some music or sound effects;
  • Sometimes names of characters or places have a special pronunciation: you may write down in the AD script (again in brackets) the time code where the pronunciation is given in the soundtrack, e.g. "She leans over to Gasparow (pronounced in dialogue at TC 01:14:12) and kisses him";
  • You may give some advice for the intonation depending on the source material: more empathetically for an emotional fiction film, more newscaster style for a documentary, or adapted for text on screen (see chapter 2.2.3);
  • Subtitles may have to be read in between the AD. For more information on this consult chapter 3.3 on AD with AST.
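The elements listed above can be pictured as one small record per block of description. The Python sketch below is purely illustrative: the class name, the field names, the layout produced by render and the sample time code 01:14:20 are all assumptions, not a format required by any client or software.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScriptEntry:
    """One spotted block of description in an AD script (hypothetical layout)."""
    timecode: str                      # when the voice talent starts speaking
    text: str                          # the description, including bracketed notes
    keyword: Optional[str] = None      # preceding dialogue line or sound, if any
    instructions: List[str] = field(default_factory=list)  # legend codes, e.g. "F"

    def render(self) -> str:
        """Format the entry as plain text for the voice talent."""
        header = self.timecode
        if self.instructions:
            header += " [" + ",".join(self.instructions) + "]"
        lines = [header]
        if self.keyword:
            lines.append("KEYWORD: " + self.keyword)
        lines.append(self.text)
        return "\n".join(lines)


if __name__ == "__main__":
    entry = ScriptEntry(
        timecode="01:14:20",  # hypothetical starting point
        text="She leans over to Gasparow (pronounced in dialogue at "
             "TC 01:14:12) and kisses him.",
        instructions=["F"],   # "F" for "fast", as defined in the script legend
    )
    print(entry.render())
```

Keeping the keyword optional mirrors the rule that it is left out when no dialogue or sound effect precedes the description.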

From Target Text to soundtrack

In the next phase, your AD script is transformed from a written text into an oral one: you or your client will choose a voice talent whose voice qualities match the film's genre and style. There is little or no research on which voices fit which film genre best, but often a voice talent will be chosen whose voice contrasts with the voices of the dialogues (e.g. in a film with many male roles you may want a female AD voice) or is thought to fit the genre and style of the source material. If subtitles or text on screen must be rendered as AST as well, one or more voice talents may be called upon to read them out loud (see chapter 3.3 on AST). There are experiments with synthetic AD voices under way and the results vary depending on the software used and the language of the AD. There are proponents and opponents of synthetic voices and according to some they may be good for documentaries but not for fiction films.

The recording is often done at a recording studio or sound studio with a recording booth. Besides the voice talent, a sound director and a sound designer (or sound technician) are present. The sound director (this might also be a blind or visually impaired colleague) listens to the voice talent reading the script and decides if the AD is presented in a correct way with regard to intonation, speed and so on. The sound designer or technician is responsible for the correct technical handling of the recording, e.g. avoiding any disturbing noise or overmodulation.

The sound designer will then clean the recorded AD takes of any disturbing sounds (e.g. lip smacking and throat clearing by the voice talent, or the rustling of paper sheets) and place them exactly in the dialogue gaps of the source material.

The cleaning of the recording is the last stage of the process in the case of AD for the cinema. The sound designer will create a file containing only the AD takes, which can be played in the cinema through headphones, synchronised with the original soundtrack coming from the loudspeakers. The AD may be transmitted through a wired system in the cinema or through a special AD application on the smartphones of the blind or visually impaired audience.

For AD on TV or on DVD another step is required: mixing the AD takes with the original soundtrack of the film. Where there is AD, the technician may lower the volume of the original soundtrack and, together with the sound director, find a balance between a clearly audible AD and an original soundtrack that remains present. The result is a whole new soundtrack that plays in parallel with the film: on DVD you select it as one of the audio options, and on TV it is carried on the second audio channel of your TV set.

Audio Introductions

Nina Reviers, University of Antwerp

What is an Audio Introduction?

An Audio Introduction (AI) is a continuous piece of prose, providing factual and visual information about an audiovisual product, such as a film or theatre performance, that serves as a framework for blind and visually impaired patrons to (better) understand and appreciate a given ST. It can be created to enhance the AD of that ST, or it can be made to stand alone. The AI can be recorded and made available well before the viewing of the ST (on CD, via a website, etc.) or it can be delivered live, as is often the case in the theatre. The introduction can be spoken by a single voice or it can be a combination of voices and sound bites.

Source Text analysis

Before analysing your ST in view of writing an AI, it is important to decide the following:

  • Will the AI be used in combination with an AD or will it be a stand-alone AI?
  • If it is used with an AD, has the AD already been drafted?
  • Will the AI be recorded and made available beforehand or will it be delivered live, close to the performance or screening?

Next watch the audiovisual product in its entirety and note down all the relevant information that needs to be included in the introduction. If the AD of the product has already been drafted, use this in your analysis. If necessary, look up background information about the production or, if possible, contact a member of the crew/theatre group/AD team for extra information.

As a framework for deciding what is relevant, consider the following possible functions of an AI:

  • Informative function. An AI usually contains information that corresponds broadly with the informational framework that the sighted viewers have, based on the programme booklet, for instance, in the case of theatre or opera, or the information available online or in reviews for film, i.e. directors, cast and credits, genre, author, synopsis, running time, etc.
  • Foreshadowing function. This function aims to provide content about characters, locations or other general features that cannot be included in the AD because it is a stand-alone AI or because there is no time to include it in the AD delivered during the production. This function includes:
    • more detailed information about characters (dress, physique) and locations (global and local) that will allow you to be more succinct during the AD, or allow the visually impaired to identify these aspects more easily without AD (based on sound and dialogue);
    • aspects that need explanation, such as intertextual references (see chapter 2.2.4);
    This foreshadowing function will only work, however, if the information load of the AI is well balanced, and therefore only includes essential information that would otherwise confuse or disturb the recipients.
  • Expressive-aesthetic function. This function provides more information about the style and nature of the material. In the case of theatre, for instance (see chapter 3.4.1 on theatre AD), it might be necessary to explain in greater detail how certain theatrical techniques are used. This function is also highly relevant for film, as it might be necessary to provide information about the visual style of the film, which might otherwise be lost in translation. Examples are films that are shot in black-and-white or films that combine black-and-white and colour (such as Sin City (Miller et al., 2005)), films that employ uncommon visual techniques, such as Inglourious Basterds (Tarantino, 2009) (e.g. the use of on-screen arrows to indicate who a character is) or films that use visual techniques to indicate non-chronological storytelling, like The Curious Case of Benjamin Button (Fincher, 2008), or Slumdog Millionaire (Boyle & Tandan, 2008) (see also chapter 2.2.1 on film language).
  • Instructive function. This function informs users on any technical/practical issues, such as the functioning of the AD and AI. This can include announcing how certain characters will be addressed in the AD (if they have several names or nicknames in the film or performance), how a prop or location will be referred to, what a recurring word means, etc. In the case of theatre or cinema, for instance, this might include explaining how and when to increase the volume of the headset in scenes where there is loud music, asking audience members to switch off cell phones or even explaining the number and location of bathrooms, the price of drinks in the bar, the location of the ticket counter, etc.

The relative weight of each function depends on the type of product (film, theatre), the genre (see chapter 2.1.3 on genre) of the audiovisual product (complex historic films might have a stronger informative function) and on the type of AI (stand-alone AIs usually have a stronger foreshadowing function).

Target Text creation

There is no template for creating an effective AI: how you put together the types of information identified during the ST analysis depends on a whole range of factors, most importantly whether or not there is an AD and whether it has already been drafted. When the AD has not been drafted yet, make sure to finalise the AI after the AD has been finished, so you know what it does or does not include. Keep the following issues in mind when writing the introduction.


Order the information in the most logical way, depending on the genre and nature of the production. Try to find a narrative thread to centre the information and to make the sections follow each other smoothly. The following elements can guide you:

  • Begin with a welcome word, present yourself/the speaker and say the running time of the introduction;
  • Put factual details at the start and more descriptive elements at the core of the introduction;
  • Combine the description of the plot and characters, as it helps to remember characters when they are placed in a narrative;
  • Consider using musical extracts to separate sections, in the case of recorded AD;
  • Finish with the instructive function, if there is one.
  • In some cases, especially in theatre, the introduction is read out twice; the second reading being a shorter version of the first, repeating the most essential information.
  • When providing background information, consider the possibility of adding quotations from cast and crew, reviewers or even sound bites or pieces of interviews, in the case of recorded AD;
  • Strike the right balance: give enough information to create a framework for the viewing of the production, but be sure not to overload your audience, as essential information will be lost. For live AIs, for theatre performances for instance, try to limit the introduction to 10 minutes. For recorded AIs provided in advance, this can go up to 15 minutes.
  • Avoid revealing too much of the plot or giving away surprising or humorous elements.

AIs are written for the ear, just like ADs. Make sure your text is engaging and vivid and holds your audience’s attention till the end. Keep in mind that AIs are dense texts that contain a lot of information to process and remember. Therefore, your audience will appreciate a clear and straightforward writing style, with simple sentences, clear conjunctions and specific vocabulary. See chapter 2.3.1 on wording and style for more information.


For an example of an AI, see appendix 4.2.

Combining Audio Description with audio subtitling

Aline Remael, University of Antwerp

What is audio subtitling?

Audio subtitling (AST) is the spoken rendering of the written (projected) subtitles or surtitles with a filmed or live performance. It makes productions that are not dubbed and in which foreign languages are spoken accessible to the blind and visually impaired. In the case of recorded AD for film or television, the spoken subtitles are mixed into the sound track with the AD. The original on-screen subtitles can be read by a computerised voice or by a voice talent or voice actor. If there is more than one person who speaks a foreign language in the production, two or more voices may be used for the AST to help the target audience differentiate between speakers. In the case of low-budget films, TV series or documentaries, producers may opt for only one voice reading the subtitles but using a different intonation for each speaker. The original subtitles are sometimes read as they are, sometimes they are expanded and/or adapted to resemble spoken language more closely and to include information from the dialogues that had been left out in the subtitling process. This is sometimes required for good cohesion between AD and AST. However, in some countries subtitles are protected by copyright and cannot be changed. The two most common ways for recording AST are voice-over and a form of dubbing. In the case of voice-over the AST starts a few seconds after the original dialogue, which remains audible in the background. This allows the target audience to identify speakers. In the dubbed mode, the AST replaces the original dialogues completely. This mode often involves more "acting" on the part of the voice talent.

Source Text analysis

The following checklist helps you to determine the possible scenarios for AST:

  • (1) The film that you are audio describing is monolingual and the language it speaks is also the language of your audience. In this case AST is not required;
  • (2) The film that you are audio describing is almost entirely monolingual and its dominant language is the language of your audience but it contains a few short exchanges in one or more other languages that do not take up more than two to four turns of speech. In this case AST may not be required but is an option that you may need to discuss with your client. The alternative, which will be cheaper and simpler in terms of the sound mix, is that you incorporate what the characters say into the AD: analyse the exchanges using a foreign language for their length and complexity;
  • (3) The film that you are audio describing does not speak the language of your target audience and will not be dubbed or voiced-over into their language;
  • (4) The film that you are audio describing is multilingual and contains one or several scenes with dialogue spoken in a language or languages that are not the language(s) of your target audience and these will not be dubbed or voiced-over into their language.

In both cases 3) and 4), AST is required. The producer will probably decide what form the AST should take (see definition).

Today, the way in which AST is provided with audio-described films is not regulated and will vary from country to country. In other words, if you have received no instructions or translation brief for the AST and you decide that AST is required, you need to obtain information about the way the production team of which you are part has been organised.

Use the following checklist to determine what the production set-up is.

  • If the film you have been given to audio describe already has interlingual subtitles in all the foreign language scenes, determine whether these subtitles will be used as they are for the AST, or whether they will be adapted for the spoken version, and if so, whether you are expected to do this.
  • If the film you have been given has no subtitles for the foreign language scenes, determine whether you are also expected to provide subtitles or whether a subtitler is going to create them, and also whether these subtitles will then be used as they are for the AST or whether they will be adapted for the spoken version, and if so, whether you are expected to do this.
  • If the film you have been given has already been audio described and you are to write new subtitles or adapt existing ones, determine whether a written AD script is available. You will probably not be able to adapt the AD to interact with the subtitles, but must ascertain what exactly is covered by the existing one and what information from the dialogues, or even from the visuals (e.g. who is speaking) must be included in the AST.

Target text creation

In the case of situation 2), you are going to integrate the AST into your AD:

Rewrite the dialogue as narration, fitting it into your AD, and indicate the speaker. Make sure you incorporate all information required for your target audience to reconstruct the scene(s). For example: "Mark sits down at the table opposite his lawyer and they exchange greetings" ("they exchange greetings" replaces the dialogue "Bonjour!" and "Comment allez-vous?" ["Good morning" and "How do you do?"]).

In the case of situations 3) and 4), you are the subtitler: determine your strategy based on the translation brief that you have been given, and time and subtitle the dialogues accordingly:

  • If standard subtitling is required, follow the subtitling instructions of your client: subtitling instructions will vary greatly. When writing your AD ensure that you incorporate all information that has been deleted from the subtitles in your AD script if it is information that is otherwise inaccessible to your target audience. Such subtitles are often recorded as voice-over so the audience will probably still have access to the sound of the original dialogues for character identification. If there is no time to indicate who is speaking this may not be a problem in that case.
  • If expanded subtitling is required, made-to-measure for the AST, translate the ST dialogues as dialogue without applying the condensation that is usual for subtitling and write your AD accordingly. If the expanded subtitling is going to be recorded dubbing style, make sure that your AD identifies who is speaking: the original voices will no longer be audible and all the characters will be speaking the same dubbed language. The fewer voices are used for the recording, the more attention may need to be paid to character identification in the AD.
  • If you are not the subtitler and you are given standard subtitling with which to combine your AD: do the same as under "standard subtitling" above.
  • If you are not the subtitler and you are given expanded subtitling with which to combine your AD: do the same as under "expanded subtitling" above.
  • If you are the subtitler but your film has already been audio described: make sure that the AST contains all the information required to interact with the existing AD and that together, they cover the information that the sighted audience would get from combining visuals and dialogue.

Introduction to other forms of AD

Audio describing theatre performances

Nina Reviers, University of Antwerp
What is live Audio Description for the theatre?

AD for the theatre resembles AD for film and television since theatre performances also tell stories and theatre audiences create story worlds in their minds, based on cues from the performance about its content, characters and spatio-temporal setting (see chapter 2.1). The main differences with film and television are:

  • Theatre has its own range of theatrical techniques that serve as a framework for constructing the narrative, some of which differ considerably from film techniques (see chapter 2.2);
  • AD for theatre is most often delivered live, based on a pre-prepared script. Consequently, the audio describer is usually the person voicing the AD during the performance as well;
  • Live AD for the theatre is often, but not always, combined with an AI (see chapter 3.2 on AIs).
Source Text analysis

When analysing your ST, keep in mind the features described in previous chapters for film AD: 2.1.1 characters and action, 2.1.2 spatio-temporal setting, 2.2.2 sound effects and music, 2.1.3 Genre, 2.2.4 intertextual references, 2.3.1 wording and style and 2.3.2 cohesion, but use the following features typical of theatrical productions as a framework for adapting the strategies wherever necessary:

  • Theatre performances do not mimic reality as directly as film, but they represent it. As a result, theatrical signs (costume design, sound, props and scenery) can be particularly minimalistic, static and even artificial, also due to the obvious time and space constraints of the stage. For instance, four wooden chairs and a gesture signalling the opening of a door can be used to represent a car;
  • Just like film, theatre is highly semiotised, in that every sign is charged with denotative and connotative meaning. In theatre, however, these signs more often have several meanings. An object or even a character can represent several things/people at different moments during the performance (the chairs that represent a car in one scene, can be used in another scene as a park bench). Likewise, an object can have additional connotative meanings revealing something about the style and emotions related to it (an antique Louis XIV, gold-trimmed chair, says something about the style and status of its owners). It is even quite common for the connotative meaning of a sign to outweigh its denotative meaning so that the sign becomes an abstract and symbolic one;
  • As a result, theatre strongly relies on inference by the audience, because the relation between a sign and its meaning can be very implicit at times. What is more, many (necessary) story elements are implied and only exist in the minds of the audience, but are never explicitly visualised (they are simply assumed). This is often the case for the global setting of the story, for instance when only one or two rooms of a house are visible on stage. The rest of the house, the garden, the street, the city - often referred to in the dialogues - only exist in the minds of the audience. In those cases, background knowledge and/or a close reading of the dialogues is necessary for the interpretation.

The use of sound and lighting provides good illustrations of the specificities of theatre. Whereas in film, light and sound are often diegetic, the result of realistic elements within the story on the screen (a lamp, the sun, a car passing by), sound and light in theatre are more frequently extradiegetic, that is, used in isolation to represent elements that are not otherwise visualised on stage. Brown and green light beams on an empty stage, for instance, can be used to represent a forest, whereas the projection of a white square on a wall may suggest a door. Following the same logic, the chirping of birds alone can be used to indicate that characters on an otherwise empty stage are in a forest. Lighting in theatre also has a specific technical function, comparable to the shot change in film. A short spell of darkness can indicate the transition from daytime to nighttime, but it is frequently used simply to indicate a change in setting (and create time to switch the scenery). Sound, on the other hand, is typically artificial, as compared to film, since sounds do not necessarily resemble what they represent: the sound of approaching footsteps on a wooden floor might actually represent a knight approaching a castle on his horse.

Target Text creation

When drafting your description, take into account the strategies proposed in the other chapters in these guidelines. Additionally, consider the general tendencies specific to theatre to adapt the said strategies accordingly where necessary. Also take into account all necessary contextual information as mentioned in chapter 1:

  • Determine (in deliberation with the client and/or theatre group) whether the AD needs to be combined with an AI and decide what form it will take (see chapter 3.2).
  • Determine which narrative building blocks (as discussed in chapter 1) are clearly represented on stage (through scenery, costumes, props, sound or dialogue) and which ones are assumed. In some cases, the rooms and settings of the story are recreated on stage quite realistically; in abstract theatre, on the other hand, actors can enact a story on a nearly empty stage, leaving the setting to the audience's own imagination.
    • When items are assumed to be present in the audience’s mind, decide whether they can be inferred by the blind and visually impaired, for example based on dialogue or background knowledge, or whether they need to be made explicit (in an Audio Introduction, for instance).
    • When the items in question are represented on stage, determine how they are represented and decide whether and how to describe them. For each of the signs, keep in mind the following questions typical of theatre:
      • Does a sign represent one element or is it used to refer to several elements? If so, decide how to make this clear in the AD (e.g. "Four kitchen chairs, arranged as car seats", "Four aligned chairs, a park bench")
      • Does a sign have a connotative meaning and how important is it? If so, decide whether it requires explicitation or whether its meaning can be inferred. Accordingly, decide on a strategy and degree of explicitness (e.g. include it in the introduction, describe the effect, see also chapter 1 on this decision-making process). For instance, a red velvet and golden King's robe indicate wealth and power, but groping wooden hands attached to the hem of his robe may symbolise the jealous forces seeking his downfall, which may be less obvious.
  • Pay special attention to the source of sound, light and dialogue and decide how to describe it. The source of daylight is usually not the sun, but a spotlight; a sound can originate backstage, even if it is related to an on-stage action (e.g. a knocking sound when an actor pretends to knock on an invisible door); and actors can use microphones, so the sound of their voices may come from the right side (where the stereo is located) even if the actor is standing to the far left. Decide how to describe this without causing confusion or breaking the narrative flow (e.g. in the AI, by explaining the possible incongruence: "Backstage sounds of wind and birds fill the stage"; or by helping the audience infer what is going on: "a man knocks on an invisible door" followed by a knock-knock sound).

Next, proceed to write your descriptions. It is useful to include them in the script of the play in between the relevant blocks of dialogue and to write down relevant sound or musical events in the margin for reference during live reading. The fact that a performance and its AD are delivered live has a great influence on the scriptwriting phase. Time constraints are extremely tight in theatre. Moreover, improvisation and unannounced changes are quite common, which means that descriptions cannot be timed accurately in advance.

Keep the following in mind:

  • Live description makes it harder to use short gaps of a few seconds in between sound and dialogue. Theatre AD has a tendency to group information into blocks of several sentences. Longer descriptions are also easier to process for the audience, as they have to switch between two types of sound quality (through a headphone and live on stage).
  • Because of possible improvisation by the actors, it is safer to describe actions after they have happened. Only describe something before or as it happens, when you are absolutely sure it will happen when and exactly how you have prepared it.
  • In scenes where time is scarce, or where actors have the tendency to speed up, adapting the writing style can be useful. Short sentences that put essential information first ensure that if you run out of time, this essential information can still be rendered even if the second half of the sentence has to be omitted. However, keep longer descriptions at hand: the pace of the performance might just as well slow down, and long silences with no AD are to be avoided because your patrons may suspect a technical failure is to blame.

The Dutch play Van de Velde: J’aimerai mieux de bouche vous le dire (sic) by Belgian theatre group Olympique Dramatique is a good illustration of the nature of theatre from an AD perspective.

The scenery consists of a simple glass cage in a black iron frame on an otherwise empty stage. This minimalistic setting never changes and represents several locations, even if it does not (realistically) resemble any of these. At different times in the play it represents a psychiatric institution, a boxing ring, a prison cell, a bar. The design and material of the cage suggest a cold, clinical, solitary environment. Changes from one location to another are implicit and must be deduced based on context, background knowledge and dialogue. At one point, a boxing duel takes place in a large stadium, even though the stage and cage remain empty, one immobile, silent figure excepted. The actual match in the stadium only takes place in the minds of the audience, supported by dialogue and sound effects (cheering and sports commentary in voice over).

Descriptive guides: Access to museums, cultural venues and heritage sites

Josélia Neves
What is a Descriptive Guide?

Descriptive guides (DG) comprise a variety of texts that may be rendered in writing or (oral) speech, presented in digital format on equipment such as audio guides, or provided by human guides during visits or tours to museums, cultural venues and/or heritage sites, among others. Like other forms of AD, DGs are only a (small) part of a multisensory experience. A DG is an extra that has to fit in with the rest of the visit or event in such a way that it almost goes unnoticed. The DG cannot be the experience itself because people visit places to engage with what the place has to offer and not with the mediators/mediation technology.

Descriptive guides can be broadly organised into the following categories:

  • Description of open spaces (cities, countryside, parks and gardens, zoos, playgrounds, heritage sites, etc.)
  • Description of architecture (buildings, rooms, indoor spaces, etc.)
  • Description of exhibitions (museums, galleries, collections, etc.)
  • Description of objects and artefacts (that cannot be touched);
  • Description of paintings;
  • Description of photographs;
  • Describing “how-to-…”;
  • Describing how to operate and use audio guide equipment;
  • Describing how to circulate (way-finding and navigation);
  • Describing how to “see” through touch.

Main differences between AD for cinema and theatre and DG

Descriptive guides differ from other types of AD (i.e. AD for films or theatre) for a number of reasons:

  • Unlike AD in films, which is framed by an audiovisual text that is perfectly contextualised and self-contained, and where you are taken from the beginning to the end in a set sequence, descriptive guides are used in contexts where boundaries (of time, space and text) are not closed and are often changeable;
  • The AD in films, the theatre or other live performances is determined by a narrative and the amount of information to be given is decided on the basis of narrative relevance and the time available (to be delivered between speech and other important acoustic information). In DG, the information given is the narrative itself and the time issue is different in nature and is related to people’s attention span;
  • In film, relevance is seen in terms of the actual action and the time available to deliver the AD. In DG there is no “original text” to go by because the DG is the original text. There is however an original non-verbal text that will live as a co-text with the DG and that will determine the nature and structure of the DG. Thus, with DG relevance is seen in terms of a variety of open co-texts that require contextualization and interpretation and, above all, selection. This will probably be one of the major issues with DG: there is less concern with “when” to say, and a great emphasis on “how” and “what” to say about “what”;
  • When watching a film or a live performance, the viewers will be seated and focused and very little will be asked of them other than to take in and process the audiovisual stimuli that are available. They basically need to sit back and enjoy the film/play without having to do much other than listen. When using DG people might be standing or moving; they will be finding their way around, actively negotiating with space, the people around them and, very often, a piece of equipment or the exhibit itself. In other words, their attention might be required for different forms of engagement: they might be required to take action and make decisions (on what to do or where to go); or they may be expected to do things while they are listening (move, manipulate or explore through touch).
  • Issues such as safety, comfort and even proprioception may be so important that listening to a description may be a problem rather than an asset.

Before creating DG

Descriptive guides tend to combine factual information with descriptions that need to be simultaneously accurate, clear and entertaining. Very often, there is no clear-cut ST as such (as happens with film) and the DG has to work within contexts that are multi-layered, that can be extensive (e.g. a castle and grounds) and changeable (e.g. gardens); encompassing and atmospheric (e.g. a temple); or minute and intricate (e.g. a work of art). And at times, the DG will lead to “seeing” through positioning, movement or touch. The diversity of contexts and possibilities makes it difficult to arrive at a set of guidelines which will cover all the possibilities; there are, however, basic elements that need to be addressed in all cases. Answers to these initial basic questions will provide the framework on which to build a DG.

  • (1) What sort of guide is in place?
    • Live (human) guide, that allows for clarification, interaction and personalisation;
    • A recorded DG to be used on a personal or public gadget.
  • (2) What is to be described?
    • Landscapes and open spaces, such as a city, the countryside, a park or garden, a zoo, a playground, a heritage site;
    • Architecture, such as a building, a room, an indoor space;
    • An exhibition, such as a museum, a gallery, a collection, a cabinet;
    • A (3D) object, such as a sculpture or an installation;
    • A (2D) object, such as a painting, a photograph or a panel;
    • How to use something, such as the audio guide, a piece of equipment;
    • How to explore something, such as a raised map, a replica, an object;
    • How to circulate, for instance way-finding.
  • (3) Where is the listener in relation to what is being described?
    • “In” the space that is being described, such as in a park, inside a church;
    • Near/far from what is being described;
    • Positioned in the presence of and in a particular place/position/angle in reference to what is being described, for instance in front of a building.
  • (4) What context is there to what is being featured?
    • No specific context. A city, for instance, will be contextualised in a very broad sense;
    • A wide context. The (natural/built) environment around a building or park, for instance;
    • A closed context. Within a specific space or room, in a particular showcase;
  • (5) What are the specificities of what is being featured?
    • What is to be described. For instance, identity (basic factual information – naming, date, origin, size, use, etc.);
    • Reasons for being exhibited. What makes it special or unique?
    • Relevant features?
    • How stable or changeable are the features? Change with time of day, season.
  • (6) What will the listener do with what is heard?
    • Gain an overview of location, space, size, context and co-text(s);
    • Create a general visual image;
    • Visualise specific details;
    • Find the way round/direction;
    • Learn how to manipulate/explore.
  • (7) What is the (linguistic/style) approach?
    • Simple, objective (factual) description;
    • Narrative approach, where facts and description work towards a “story” about what is being presented;
    • Interpretative approach/ Deconstructing and recreating through suggestive language, sound effects, music.

All of the above parameters influence the content and style of the DG and must be taken into account. Usually, a DG starts with an overview of the place, object, building etc. including a few facts, followed by a description of what makes it special or unique. Then it goes on to fill in specific information related to the object/space, exhibition, etc. Facts come first and description brings the facts to life but in some cases it may work out better if facts and description are intertwined.

Other useful hints include: go from the general to the specific, in other words, highlight the main features of the object or space; keep your language clear, simple and direct but vivid and diverse.

This chapter is just a very brief introduction to a very complex form of AD. Interested in learning more? Check out the reading list and go to appendix 4.3 for more details.


Example of a timed AD script

An excerpt of the AD script of the German version of Inglourious Basterds


(end of music, chopping sound)

In the warm light of the autumn sun. At the edge of a large meadow on top of a hill stands a little stone cottage.


(1 loud chopping sound)

Nearby a brown-haired man is chopping wood. Next to him a young woman hangs out laundry. She pauses, listens... (faint car sound) and spots approaching vehicles. (music)



The man stares at the vehicles. He wears poor clothes. Two girls hurry out of the cottage. To them:


„Geht zurück ins Haus und schließt die Tür!“
To the young woman:
He sits down on the (chopping) block.


(guitar music)

The vehicles come closer: three motorbikes escort a car. Hurriedly Julie runs to a water pump and fills a bowl. Her father takes a dirty handkerchief and closes his eyes for a moment. Then he glares at the vehicles.


(piano and guitar)

Julie puts the bowl down in front of a cottage window.


“Hier, Papa”

Her father wipes his face. With a tired look he gets up and ambles to the cottage.


„Danke, mein Schatz. Jetzt geh zu deinen Schwestern.“
The axe is stuck in the block. Julie turns to the door.
“Nicht rennen”
He glances in the window, then resolutely pours water over his face and his dirty, ripped shirt. The vehicles stop at a distance behind him next to some cows.



fast: A Colonel gets out of the car.


“wie sie wünschen, Herr Oberst”

The Colonel crosses the meadow.


„Ich bin Perrier LaPadite.“
The Colonel reaches out his hand.
„Natürlich. Nach ihnen.“
He makes an inviting gesture. (door) In the cottage. LaPadite and Landa step through the low door. The girls are looking at them seriously.

Example of an Audio Introduction

Audio Introduction for Inglourious Basterds, written by Louise Fryer and Pablo Romero Fresco

Welcome to this audio introduction to Inglourious Basterds, a film written and directed by Quentin Tarantino and released by Universal Studios in 2009. This AI lasts about 9 minutes. The film itself has a running time of 2 hours and 27 minutes.

It was nominated for eight Oscars, including Best Picture, Best Director, Best Supporting Actor for Christoph Waltz, and Best Original Screenplay. The film is rated an 18, meaning it is suitable only for persons aged 18 years and over, and contains what the British censor describes as "strong, bloody violence". The DVD is not currently available in the UK with AD.

Inglourious Basterds – the words misspelt I.n.g.l.o.u.r.i.o.u.s. B.a.s.t.e.r.d.s. - is set during the 2nd World War. But while the locations are realistic and characters such as Goebbels, Hitler and Winston Churchill resemble their real-life selves, one of the lead actors, Christoph Waltz, has described the film as "a piece of art. Not a history lesson." It’s brutal but darkly funny and Tarantino includes plenty of anachronisms. The music includes Morricone’s Spaghetti Western-style themes lifted from other movies and the flourish of an electric guitar accompanies a character’s name as it flashes up onscreen, in bold, cartoon-style lettering. Other characters are identified, at various points in the film, as their name is scribbled in chalk over the shot with an arrow pointing them out. In contrast to Tarantino’s other films, like Pulp Fiction and Kill Bill, much of this movie is shot in unobtrusive, classic Hollywood style, and as close as he could get to glorious technicolour. This makes the moments where the camerawork deliberately draws attention to itself all the more remarkable. A doomed character walks forward in slow motion, crisply picked out as the background blurs. Or the camera closes in and lingers on a glass of milk, or a bowl of cream, bringing it to our attention. In a long tracking shot, the camera arcs around a table during a conversation, revealing the faces of those sitting round it from behind the head of each in turn. At one point, as a German Officer and a farmer talk in a farmhouse kitchen, they are shown from above, the camera slowly narrowing its view like an ever-tightening rope. Further into the conversation, the camera tracks down the farmer’s leg and continues on to reveal the area beneath the floorboards, as though the house were a doll’s house, open to view. When things turn violent, the camera doesn’t flinch or turn away but keeps a steady focus on the brutalities inflicted.

The film is divided into chapters – each announced in white letters on a black background in the manner of a silent movie. The chapters are self-contained, each with its own look and focus, and allow the plot to skip in time and place. There is also an occasional diversion within a chapter – Tarantino taking time out from the plot to insert a short information film, giving us the biography of a particular character, or technical details about, for example, the dangers of nitrate film, in a parody of a newsreel documentary, complete with authoritative voice-over, provided by Samuel L. Jackson.

The words “Chapter one” are followed by an opening phrase that sets the tone of this fairytale yarn: “Once upon a time…in Nazi-occupied France”. It’s 1941. The scene? A sweep of French countryside and an isolated farmhouse on the brow of a hill. It’s home to Perrier LaPadite, a farmer in vigorous middle age. He's swarthy and stubbled, his clothes sweat-stained. LaPadite’s 3 teenage daughters are shy young women in simple cotton dresses. They talk in French, their words subtitled. When the German officer - Colonel Hans Landa - comes to call, the conversation turns to English. Landa is clean-cut, almost dapper in his officer’s peaked cap, grey uniform and gleaming jack boots. His short hair is parted on one side, brown with a hint of grey. Although Landa’s narrow lips often part in a charming smile, his bright eyes miss nothing. A teenage girl, Shosanna Dreyfus, dirty and painfully thin, makes her escape across the fields. We meet her again a couple of years later in Paris. But first we make the acquaintance of the Inglourious Basterds – a band of Jewish-American guerilla soldiers. Brad Pitt plays their hillbilly leader who hails from the mountains of Tennessee: Lieutenant Aldo Raine. Nicknamed Aldo the Apache, he’s out to collect Nazi scalps. When we first meet him, Aldo wears a khaki uniform, but in France adopts a rough tweed jacket and peaked, flat cap. 
Aldo’s about 40, with short brown hair swept back from his forehead and he sports a trim moustache. His movements are unhurried, he chews gum and sniffs tobacco. He speaks in a slow, southern drawl, a wry look in his clear blue eyes. There’s a rough red mark around Aldo’s neck, like a rope burn, and the name “Inglourious Basterds” is carved into the butt of his rifle.

Aldo recruits a band of 8 men – Jews who’ve fled the Third Reich. Among them is Wicki, an Austrian, tall and dark haired who acts as a translator; Stiglitz, who’s younger and stockier, with a craggy face, eyes narrow, hair razored to his scalp; and Donny Donowitz, nicknamed the Bear Jew – broad-shouldered, muscled and fiendish with a baseball bat. We first meet them in action in a wooded ravine, amongst the brick remains of an old German fort, with crumbling arches half submerged in undergrowth.

We catch up again with Shosanna in Paris in 1944. Under the pseudonym Emmanuelle Mimieux, she’s running Le Gamaar cinema. Shosanna is now in her 20s, gamine and slim, with large features set in a finely-boned face. She has shoulder-length blonde hair which she sometimes wears up under a cap and she chooses boyish clothes for work. But she can look stunning, dressed up for an occasion, in a tailored red dress, and small black hat with a veil. Gold pillars flank the entrance to the cinema, and above it letters attached to a magnetic strip across an illuminated sign spell out the title of the latest film. Three sets of double doors lead into the terracotta-coloured foyer. Staircases either side sweep up to a balcony that overlooks the patterned marble floor below. Shosanna is helped at the cinema by Marcel, a softly-spoken black man in his 30s. Marcel wears a cotton shirt with the collar undone and the sleeves rolled up, revealing his powerful biceps and muscular physique.

Shosanna meets a German war hero, Fredrick Zoller – a good looking, brown-haired young man in a Nazi uniform. Pinned to his breast is an Iron Cross. At a smart Parisian restaurant with wood-panelled walls and small tables with elegant place settings, Zoller introduces Shosanna to important Nazi officers, including Goebbels – a small and rather mincing man with suspiciously black hair – and Hellstrom, who’s a major in the Gestapo. Hellstrom’s in his 30s, his face a little fleshy, his dark hair slicked back from his broad forehead. He wears a Gestapo officer’s long, black leather coat.

In London, Lieutenant Archie Hickox – an urbane, lean, uniformed officer with sharp cheekbones and a narrow moustache, a green beret pulled over his short, dark hair – is shown into the Prime Minister’s office. It’s a vast dark-panelled room with a polished wood parquet floor, and a grand piano on a rug in the far corner. Churchill sits on the piano stool, a large, balding, jowly man with a lugubrious expression. He’s smoking a cigar and looks on, largely mute, as a General with a walrus moustache gives Hickox his orders.

Hickox meets up with the Basterds at an inn in a French village. The lath and plaster ceiling of the basement is giving way but in the bar, the stone walls are solid enough, and the roof is supported on heavy oak beams. Mismatched lamps hanging from the beams give off a dull glow. There are scrubbed pine tables and a spiral metal staircase leads upstairs. The landlord is assisted by a pretty young waitress, Mathilde, who has shoulder-length dark hair. There’s one other woman in the bar, Bridget von Hammersmark – a glamorous actress in her 30s. She has finely sculpted features and flawless skin, her lips and nails painted a sultry red. Bridget wears a tailored brown check suit, with a high-necked blouse and a matching trilby perched at a jaunty angle on her curled blonde hair. She smokes cigarettes in a tortoiseshell cigarette holder, with a studied pose, fully aware of the effect she makes on the men around her.


The main characters are:

  • Shosanna Dreyfus, played by Mélanie Laurent
  • Marcel, who helps Shosanna at the cinema, is played by Jacky Ido
  • Lt. Aldo Raine, played by Brad Pitt, heads the Inglourious Basterds that include:
    • Sgt. Donny Donowitz – Eli Roth
    • Sgt. Hugo Stiglitz – Til Schweiger
    • Cpl. Wilhelm Wicki – Gedeon Burkhard
    • Private first class (Pfc) Smithson Utivich – B. J. Novak
    • Pfc. Omar Ulmer – Omar Doom
  • Col. Hans Landa is played by Christoph Waltz
  • Fredrick Zoller – the young war hero – by Daniel Brühl
  • Major Hellstrom – August Diehl
  • Joseph Goebbels – Sylvester Groth
  • Lt. Archie Hickox is played by Michael Fassbender
  • The actress Bridget von Hammersmark by Diane Kruger
  • And the farmer Perrier LaPadite by Denis Ménochet

The film is written and directed by Quentin Tarantino, who finds himself a couple of small, non-speaking parts, including the first scalped Nazi.

Additional hints for designing Descriptive Guides

Below are some initial hints for designing different types of Descriptive Guides.

Description of open spaces (cities, countryside, parks and gardens, zoos, playgrounds, heritage sites,…)

  • Give an overview of the place you are going to describe with a few facts (size, number of inhabitants, features, age, important dates,…);
  • Highlight what makes the place special or unique;
  • Clarify how the place is to be described/explored: in a set route, highlighting specific features (buildings,…), exploring specific themes;
  • Break down the whole into stops and deal with each part separately while linking each part to other parts and to the whole (see types below);
  • At each stop, offer information on the listener’s position (if they are on the move) and, if appropriate, on how to get to the next stop.

Description of architecture (buildings, rooms, indoor spaces,…)

  • Give an overview of the building/space you are going to describe with a few facts (age, size, architectural features, historical/social importance);
  • Highlight what makes it special or unique;
  • Provide context (in terms of its immediate surroundings);
  • Clarify how the place is to be described/explored by providing the listener with a specific (physical or virtual) viewpoint;
  • Break down the whole into parts in a specific sequence (general to specific; left to right or moving round in a clockwise sequence; from top to bottom or bottom to top; from far to near or near to features that are further away). Whichever the direction to be taken, it is important to offer a sequence that will serve to highlight the main features in the specific place;
  • Highlight important features by adding extra information about interesting/relevant details.

Description of exhibitions (museums, galleries, collections,…)

  • Give an overview of the collection you are going to describe with a few facts (theme, age, provenance, type of exhibits,…);
  • Highlight what makes it special or unique;
  • Provide context (in terms of its position within the whole);
  • Clarify how the collection is to be described/explored by providing the listener with a specific (physical or virtual) viewpoint;
  • Break down the whole into parts in a specific sequence. “Take” the listener around a gallery, a collection, a showcase by sectioning or breaking down the collection into logical groupings;
  • Highlight important features by adding extra information about interesting/relevant details.

Description of objects and artefacts (that cannot be touched)

  • Present the piece you are going to describe with a few facts (identification, age, provenance, use,…);
  • Highlight what makes it special or unique;
  • Describe important and interesting features and highlight details;
  • Relate the piece to other pieces in the exhibition and, if possible, to common/well-known objects.

Description of paintings

  • Present the painting you are going to describe with a few facts (identification, artist, date, style, technique…);
  • Highlight what makes it special or unique;
  • Give a general impression of the whole picture and then take the listener through a “journey” that may build a narrative or simply go through the elements that make up the painting;
  • Describe important and interesting features and highlight details by relating technique, colour, stroke and other technical features to the effect that is produced (be careful not to be over technical and minute);
  • Relate the painting to other pieces in the exhibition or the work of the artist and, if possible, attract the listener to other related works.

Description of photographs

  • Present the photograph you are going to describe with a few facts (identification, date, technique…);
  • Highlight what makes it special or unique;
  • Give a general impression of the whole picture and then take the listener through a “journey” that may build a narrative or simply go through the elements that make up the photo. Deconstruct the photograph in “layers” to help understand a viewpoint, perspective or other compositional features;
  • Describe interesting features and highlight important details;
  • Relate the photograph to reality and mention how it captures or recreates a moment in life/space.

Describing how to “see” through touch – 3D objects

  • State clearly what is to be touched: a real piece, a replica, a model;
  • Present the object with a few facts (identification, date, provenance,…);
  • Highlight what makes it special or unique;
  • “Position” the person and the piece so that the person can explore the piece while listening to the description and clarify where the hand(s) are to be placed before the actual exploration is to begin;
  • If possible, invite the person to get a general (free) impression of the whole by mentioning size, form, and overall aspect;
  • Direct people’s hands through a systematic and logical exploration of the piece while calling attention to forms and textures;
  • Describe interesting features and highlight important details;
  • Relate the piece to reality and “bring the piece to life” by mentioning how it was/is used.

Describing how to “see” through touch – 2 ½D objects (raised drawings)

  • State clearly that the raised drawing is a simplified tactile version of a particular piece;
  • Relate the 2 ½D version to the actual piece to be described;
  • Follow the description of the original (see above) and relate it to the raised drawing;
  • Direct people’s hands through a systematic and logical exploration of the drawing while calling attention to forms and textures.

Circulate (way-finding and navigation)

  • Before beginning, state clearly what technique is to be followed (positioning, counting steps, identifying specific sensory features such as sounds, textures,…);
  • Position the person in relation to the surroundings identifying location clearly;
  • Position the person in terms of the direction to be taken and give clear distinctive landmarks to identify their position;
  • Keep directions to the minimum – counting steps, mentioning distances (2 meters away), mentioning the time that the move is expected to take may help;
  • Provide clear “locators” where people will need to change direction (turn left or right might not be enough);
  • Offer elements for reinforcement/confirmation of position (on your left you will find a big door,…);
  • Keep attention spans as short as possible (3-4 indications at a time).

Describing how to operate and use audio guide equipment

  • Present the equipment in general terms;
  • Describe the layout of the keyboard;
  • Identify the keys: number pad, back and forward, rewind, jump and select;
  • Explain how to select and activate content, how to pause, repeat and change volume;
  • Explain how the information on the audio guide relates to what is to be visited;
  • Explain how the map/visit plan and the audio guide work together;
  • Explain how the information is organised (sequential, stops, layers,…);
  • State the expected time span for the guided visit.

Final (transversal) hints

  • Facts come first. Description brings facts to life;
  • Intertwine description with facts;
  • Go from the general to the specific;
  • Highlight only the main features;
  • Language needs to be clear, simple, direct but vivid and diverse;
  • Pieces (stops) must be short (1-2 minutes);
  • Prepare different DGs with different lengths, degrees of detail and layers of information, to be used at will.



Glossary

allusion
A figure of speech that refers indirectly to an object or circumstance in the immediate context.
asynchronous description
Audio description that is presented earlier or later than the moment at which what it describes appears on the screen.
average shot length
The average time of a shot (usually measured in seconds).
cameo appearance
A brief appearance of a known person in a film, typically unnamed or appearing as themselves.
close-up
A camera shot that tightly frames a person or an object from up close; its focus is on the fine details of the object or person rather than on the surroundings.
closing credits, end credits
Superimposed text at the end of a film that lists the crew members and additional elements, typically rolling across a blank screen.
diegetic (also intradiegetic)
Belonging to the reality depicted in the film.
earcon
A brief sound indicator. For instance, it can be used so that users can easily identify, by a distinctive sound alone, that a programme contains audio description. It could also be used to indicate other aspects of audio description briefly.
editing
The process of combining separate film shots and shot sequences into a complete motion picture.
flashback
In film, a scene that takes the narrative back in time from the current point of the story.
focalisation
The selection of information that is presented to the audience, in relation to what the narrator or the characters in the narrative experience and know.
frame number
The time code on a film print identifies precisely every image of a movie by giving the hour, the minute, the second and finally the number of the image or frame in that second. Films on (European) TV and DVD have 25 images or frames per second, films in cinemas and on Blu-ray 24. Frame numbers therefore start with 00 and end, depending on the material, with 24 (at 25 frames per second) or 23 (at 24 frames per second).
gloss
An explanation of a word or phrase.
iconography
A set of symbols and/or symbolic objects (icons) characteristic of and likely to appear in a given film genre.
intermodal
A link, often a transfer or translation, as in “intermodal translation”, between two different semiotic modes (verbal-visual, visual-verbal, aural-visual, …).
keyword
In the world of audio description, keyword means the last sentence (or just the end of it) or a music or sound effect that indicates where the voice talent has to start reading the description.
long shot
A camera shot that typically shows the entire object or human figure within, or in relation to, a broader context (a landscape, specific surroundings, etc.).
marked
An item in one (audiovisual) text to which reference is made in another. The target of an intertextual link.
marker
An item in one (audiovisual) text making reference to an item in another. The originator of an intertextual link.
medium shot
A camera shot in which the object or human figure is in the middle distance (e.g. shown from the waist up), permitting some of the background to be seen.
mise-en-scène
French for “placing on stage”. When applied to the cinema, mise-en-scène refers to everything that appears before the camera and its arrangement: composition, sets, props, actors, costumes, and lighting.
mise-en-shot
The process and the way of shooting the elements of mise-en-scène (camera angle, camera position, camera movement, etc.).
The concrete presentation of a chain of events, occurring in time and space, that are caused or experienced by characters.
non-diegetic (also extradiegetic)
Not belonging to the reality depicted in the film (usually music or off-screen narration).
opening credits
Superimposed text at the beginning of a film that lists the most important production and cast members and may include other features such as dedications, prizes won, epigraphs, censorship indications and further remarks (for instance, a statement that a film is based on real facts). They are part of the title or opening sequence. They can be superimposed on a blank screen, on still images or on moving images, and may be accompanied by audible elements such as music and dialogue.
outtake
A part of a work (usually a film or music recording) that is removed in the editing process and not included in the work's final, publicly released version.
over modulation
In sound recording, the level of the recorded material is measured on a scale with the loudest signals at 100 per cent modulation. If a signal is recorded too loud, it exceeds 100 per cent (it over modulates), which causes problems: the sound is no longer clear and may suffer from technical interference and distortion.
point of view (also perspective)
Refers to the position from which the events in a story are narrated/shown. Events can for example be narrated/shown from the point of view of a character or from a more neutral point of view.
props
Objects/items used in a film by the protagonists or static elements of set design (can be indicative of an epoch, social class, etc.).
secondary element
A visual element that is not of primary importance for the story, an element that illustrates and enriches the properties of narrative characters and settings, such as posters on the wall in the protagonist’s bedroom.
semantic field
A field comprising items or words that all fall within the same area of meaning (countryside, field, trees, cows, etc.).
semiotic mode
Semiotic mode is understood in these guidelines as a modality in an (audiovisual) text that creates meaning through the use of one sign system (words, images, sounds, music, lighting, perspective, film techniques such as flashbacks, flashforwards, etc.).
setting
The temporal and spatial context of an event that is presented in the narrative.
sound designer
A sound designer is a person working in a recording studio or sound studio responsible for manipulating or generating audio elements previously composed or recorded to create a desired effect or mood, such as sound effects and dialogue.
sound effect
In films a sound effect is a sound recorded and presented to make a specific storytelling or creative point without the use of dialogue or music.
sound flashback
Sounds from past events in the narrative that are combined with images of present events in the narrative.
sound studio (also recording studio)
A recording studio or sound studio is a facility for sound recording and mixing. Rooms are normally acoustically isolated to achieve optimum results (e.g. by absorption of reflected sound that could otherwise interfere with the sound heard by the listener). Persons working there are sound designers and sound directors.
soundscape
A combination of sounds that creates a natural acoustic environment (e.g. clinking cutlery and plates combined with conversations as a restaurant soundscape).
synchronous description
An audio description that is presented concurrently with the described visuals.
synonym
A lexical item with the same or similar meaning to another lexical item, e.g. start/begin.
synthetic voice
A computer voice, an artificial reproduction of the human voice created through speech synthesis.
temporal orchestration
The manner in which events presented in a film are sequenced in time.
text-to-speech (TTS)
Text-to-speech (TTS) is a type of speech synthesis application that is used to create a spoken sound version of a written text in a computer document, such as a help file or a Web page. It is also used for audio-subtitles.
time code
In video production and filmmaking, time codes are used extensively for synchronization, and for logging and identifying material in recorded media. The common format "01:04:07:22" indicates that a given shot appears on screen at one hour, four minutes and seven seconds into the film. The figure 22 indicates the frame number within the second.
translation brief
A term from (Functionalist) Translation Studies referring to the set of instructions that must ideally accompany any translation request or order.
voice talent
Voice acting is the art of providing voices in films or radio plays as well as doing voice-overs and dubbing. The performers are voice actors/actresses, voice artists or simply voice talents.
voice-over narration, off-screen narration, off-camera narration
A voice that can be heard by the audience but not by the film characters, that is not part of the diegesis and comes from an unseen, off-screen narrator (who can also be a character).
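
The "frame number" and "time code" entries above can be illustrated with a short sketch. The Python below is not part of the ADLAB guidelines; the function name parse_time_code and the returned field names are hypothetical, and it assumes a plain "HH:MM:SS:FF" time code at a fixed frame rate (25 for European TV and DVD, 24 for cinema and Blu-ray, as stated in the glossary).

```python
def parse_time_code(time_code: str, fps: int = 25) -> dict:
    """Split an "HH:MM:SS:FF" time code into its parts (illustrative helper).

    fps is the assumed frame rate: 25 for (European) TV and DVD,
    24 for cinema and Blu-ray.
    """
    parts = time_code.split(":")
    if len(parts) != 4:
        raise ValueError("expected a time code of the form HH:MM:SS:FF")
    hours, minutes, seconds, frame = (int(part) for part in parts)

    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must lie between 00 and 59")
    # Frame numbers start at 00 and end at fps - 1 (24 at 25 fps, 23 at 24 fps).
    if not 0 <= frame < fps:
        raise ValueError(f"frame number must lie between 00 and {fps - 1:02d}")

    whole_seconds = hours * 3600 + minutes * 60 + seconds
    return {
        "hours": hours,
        "minutes": minutes,
        "seconds": seconds,
        "frame": frame,
        # Position counted in frames from the start of the film.
        "total_frames": whole_seconds * fps + frame,
        # Position in seconds, including the fraction given by the frame number.
        "position_seconds": whole_seconds + frame / fps,
    }


if __name__ == "__main__":
    # The glossary example: one hour, four minutes, seven seconds, frame 22.
    print(parse_time_code("01:04:07:22", fps=25))
```

With the glossary example "01:04:07:22" at 25 frames per second, the helper reports frame 22 of the second that begins 3,847 seconds into the film. A frame number of 24 is accepted at 25 frames per second but rejected at 24, which mirrors the range given in the "frame number" entry.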


Suggested Reading

Agger, G. (1999). Intertextuality revisited: Dialogues and negotiations in media studies. Canadian Journal of Aesthetics, 4.

Altman, R. (2000). Film genre. London: British Film Institute.

Art Beyond Sight.

Bal, M. (1997). Narratology: An introduction to the theory of narrative. Toronto, ON: The University of Toronto Press.

Baldry, A. (2005). A multimodal approach to text studies in English: The role of MCA in multimodal concordancing and multimodal corpus linguistics. Campobasso: Palladino Editore.

Barsam, R. (2007). Looking at movies: An introduction to film. New York, NY: W. W. Norton & Company.

Benecke, B. (2014). Audiodeskription als partielle Translation: Modell und Methode. mitSprache Band 4. Berlin: Lit.

Bordwell, D. (1985). Narration in the fiction film. Madison, WI: The University of Wisconsin Press.

Bordwell, D., & Thompson, K. (2010). Film art: An introduction. New York, NY: McGraw-Hill.

Braun, S. (2008). Audio description research: State of the art and beyond. Translation Studies in the New Millennium, 6, 14–30.

Braun, S. (2011). Creating coherence in audio description. Meta, 56(3), 645–662.

Braun, S., & Orero, P. (2010). Audio description with audio subtitling: An emergent modality of audiovisual localisation. Perspectives: Studies in Translatology, 18(3), 173–188.

Chion, M. (1990). Audio-vision: Sound on screen. New York, NY: Columbia University Press.

Chmiel, A., & Mazur, I. (2011). Overcoming barriers: The pioneering years of audio description in Poland. In A. Serban, A. Matamala & J. Lavaur (Eds.), Audiovisual translation in close-up: Practical and theoretical approaches (pp. 279–296). Bern: Peter Lang.

Corrigan, T., & White, P. (2009). The film experience: An introduction (2nd ed.). Boston, MA: Bedford/St. Martin’s.

Crook, T. (1999). Radio drama: Theory and practice. London: Routledge.

Dancyger, K. (2011). The technique of film and video editing: History, theory and practice. Burlington, MA: Elsevier.

Davila-Montes, J., & Orero, P. (in press). Audio description washes brighter?: A study in brand names and advertising. Cultus.

De Beaugrande, R., & Dressler, W. (1981). Introduction to text linguistics. London: Longman.

Díaz-Cintas, J., Orero, P., & Remael, A. (Eds.). (2007). Media for all: Subtitling for the deaf, audio description, and sign language. Amsterdam: Rodopi.

Díaz-Cintas, J., Matamala, A., & Neves, J. (Eds.). (2010). Media for all 2: New insights into audiovisual translation and media accessibility. Amsterdam: Rodopi.

Elam, K. (1980). The semiotics of theatre and drama. London: Methuen.

Encelle, B., Ollagnier-Beldeme, M., Pouchot, S., & Prié, Y. (2011). Annotation-based video enrichment for blind people: A pilot study on the use of earcons and speech synthesis. ASSETS’11. The proceedings of the 13th international ACM SIGACCESS Conference on Computers and Accessibility, 123–130. Retrieved from:

Fix, U. (Ed.). (2005). Hörfilm: Bildkompensation durch Sprache. Berlin: Erich Schmidt.

Flückiger, B. (2001). Sound design: Die virtuelle Klangwelt des Films. Marburg: Schüren.

Fresno, N. (2014). Is a picture worth a thousand words?: The role of memory in audio description. Across Languages and Cultures, 15(1), 111–129.

Fryer, L. (2013). An ecological approach to audio description. The Psychologist, 26(6), 458–460.

Fryer, L., & Freeman, J. (2012). Cinematic language and the description of film: Keeping AD users in the frame. Perspectives: Studies in Translatology, 21(3), 412–426.

Fryer, L. (2010). Audio description as audio drama: A practitioner’s point of view. Perspectives: Studies in Translatology, 18(3), 205–213.

Di Giovanni, E., & Bruti, S. (Eds.). (2012). Audiovisual translation across Europe: An ever changing landscape. Bern: Peter Lang.

Grant, B. (2007). Film genre: From iconography to ideology. London: Wallflower.

Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Longman.

Halliday, M. A. K., & Hasan, R. (1989). Language, context and text. Oxford: Oxford University Press.

Halliday, M. A. K., & Webster, J. (2004). Continuum companion to systemic functional linguistics. New York, NY: Continuum.

Hernández-Bartolomé, A., & Mendiluce-Cabrera, G. (2004). Audesc: Translating images into words for Spanish visually impaired people. Meta, 49(2), 264–277.

Hirvonen, M. (2012). Contrasting visual and verbal cueing of space: Strategies and devices in the audio description of film. New Voices in Translation Studies, 8, 21–43.

Holland, A. (1999). Audiodescription from the point of view of the describer. Viewpoint, 53, 73–75.

Igareda, P. (2012). Lyrics against images: Music and audio description. MonTI, 4, 233–254.

Ionides, J., & Howell, P. (2005). Another eyesight: Multi-sensory design in context. Ludlow: The Dog Rose Press.

ITC Guidance on Standards for Audio Description. 2000. Retrieved from:

Jiménez Hurtado, C. (Ed.). (2007). Traducción y accesibilidad. Frankfurt: Peter Lang.

Jiménez Hurtado, C., & Seibel, C. (2012). Multisemiotic and multimodal corpus analysis in audio description: TRACCE. In A. Remael, P. Orero & M. Carroll (Eds.), Media for all 3: Audiovisual translation and media accessibility at the crossroads (pp. 409–426). Amsterdam: Rodopi.

Jiménez Hurtado, C., & Soler Gallego, S. (2013). Multimodality, translation and accessibility: A corpus-based study of audio description. Perspectives: Studies in Translatology, 21(4), 577–594.

Jones, C., & Ventola, E. (2008). From language to multimodality. London: Equinox.

Kristeva, J. (1980). Desire in language: A semiotic approach to literature and art. New York, NY: Columbia University Press.

Kruger, J. (2010). Audio narration: Re-narrativising film. Perspectives: Studies in Translatology, 18(3), 231–249.

Kruger, J. (2012). Making meaning in AVT: Eye tracking and viewer construction of narrative. Perspectives: Studies in Translatology, 20(1), 67–86.

Kuhn, A., & Westwell, G. (2012). Oxford dictionary of film studies. Oxford: Oxford University Press.

Langford, B. (2005). Film genre: Hollywood and beyond. Edinburgh: Edinburgh University Press.

Lensing, J. (2009). Sound-Design, Sound-Montage, Soundtrack-Komposition – Über die Gestaltung von Filmton. Berlin: Schiele und Schön.

Magny, J. (2004). Le point de vue: De la vision du cinéaste au regard du spectateur. Paris: Cahiers du Cinéma.

Maszerowska, A. (2012). Casting the light on cinema: How luminance and contrast patterns create meaning. MonTI, 4, 65–85.

Maszerowska, A., Matamala, A., & Orero, P. (Eds.). (in press). Audio description: New perspectives illustrated. Amsterdam: John Benjamins.

Matamala, A., & Remael, A. (in press). Audio-description reloaded: An analysis of visual scenes in 2012 and Hero. Translation Studies.

Matamala, A., & Orero, P. (2011). Opening credit sequences: Audio describing films within films. International Journal of Translation, 23(2), 35–58.

Matamala, A., & Rami, N. (2009). Análisis comparativo de las audiodescripciones española y alemana de 'Good-bye Lenin'. Hermeneus, 11, 249–266.

Mazur, I., & Chmiel, A. (2012a). Audio description made to measure: Reflections on interpretation in AD based on the Pear Tree Project data. In A. Remael, P. Orero & M. Carroll (Eds.), Media for all 3: Audiovisual translation and media accessibility at the crossroads (pp. 173–188). Amsterdam: Rodopi.

Mazur, I., & Chmiel, A. (2012b). Towards common European audio description guidelines: Results of the Pear Tree Project. Perspectives: Studies in Translatology, 20(1), 5–23.

Mazur, I. (in press). Audio description crisis points: The idea of common European audio description guidelines revisited. In K. Nikolić & J. Díaz-Cintas (Eds.), Media for All 4.

Metz, C. (1974). Film language: A semiotics of the cinema. Chicago, IL: The University of Chicago Press.

Monaco, J. (2009). How to read a film: Movie, media and beyond (4th ed). Oxford: Oxford University Press.

Neves, J. (2012). Multi-sensory approaches to (audio)describing visual art. MonTI, 4, 277–294.

Orero, P. (2005). Audio description: Professional recognition, practice and standards in Spain. Translation Watch Quarterly, 1, 7–18.

Orero, P. (2011). The audio description of spoken, tactile and written languages in Be with me. In A. Serban, A. Matamala & J.M. Lavaur (Eds.), Audiovisual translation in close-up: Practical and theoretical approaches (pp. 239–256). Bern: Peter Lang.

Orero, P. (2011). Audio description for children: Once upon a time there was a different audio description for characters. In E. di Giovanni (Ed.), Entre texto y receptor: Accesibilidad, doblaje y traducción (pp. 169–184). Frankfurt: Peter Lang.

Orero, P. (2012). Film reading for writing audio descriptions: A word is worth a thousand images? In E. Perego (Ed.), Emerging topics in translation: Audio description (pp. 13–28). Trieste: EUT Edizioni Università di Trieste.

Orero, P., & Wharton, S. (2007). The audio description of a Spanish phenomenon: Torrente. Jostrans, 7, 164–178.

Orero, P., & Vilaró, A. (2012). Eye-tracking analysis of minor details in films for audio description. MonTI, 4, 295–319.

Palomo López, A. (2010). The benefits of audio description for blind children. In J. Díaz-Cintas, A. Matamala & J. Neves (Eds.), Media for All 2: New insights into audiovisual translation and media accessibility (pp. 213–226). Amsterdam: Rodopi.

Pasco, A. H. (2002). Allusion: A literary graft. Charlottesville, VA: Rookwood Press.

Pavis, P. (1976). Problèmes de sémiologie théâtrale. Montréal, QC: Les Presses de l’Université du Québec.

Piety, P. (2004). The language system of audio description: An investigation as a discursive process. Journal of Visual Impairment and Blindness, 98(8), 453–469.

Płażewski, J. (1982). Język filmu. Warsaw: Wydawnictwa Artystyczne i Filmowe.

Puigdomènech, L., Matamala, A., & Orero, P. (2010). Audio description of films: State of the art and a protocol proposal. In L. Bogucki & K. Kredens (Eds.), Perspectives on audiovisual translation (pp. 27–44). Frankfurt: Peter Lang.

Remael, A. (2012a). Audio description with audio subtitling for Dutch multilingual films: Manipulating textual cohesion on different levels. Meta, 57(2), 385–407.

Remael, A. (2012b). Media Accessibility. In Y. Gambier & L. Van Doorslaer (Eds.), Handbook of translation studies 3 (pp. 95–101). Amsterdam: Benjamins.

Remael, A. (2012c). For the use of sound: Film sound analysis for audio description: Some key issues. MonTI, 4, 255–276.

Remael, A., Orero, P., & Carroll, M. (Eds.). (2012). Media for all 3: Audiovisual translation and media accessibility at the crossroads. Amsterdam: Rodopi.

Remael, A., & Vercauteren, G. (2007). Audio describing the exposition phase of films: Teaching students what to choose. Trans, 11, 73–94.

Reviers, N. (2012). Audio description and translation studies: A functional text type analysis of the audio described Dutch play Wintervögelchen. In E. Di Giovanni & S. Bruti (Eds.), Audiovisual translation across Europe: An ever changing landscape (pp. 193–207). Bern: Peter Lang.

Ryan, M., & Lenos, M. (2012). An introduction to film analysis: Technique and meaning in narrative film. New York, NY: The Continuum International Publishing Group.

Salway, A. (2007). A corpus-based analysis of the language of audio description. In J. Díaz-Cintas, P. Orero & A. Remael, (Eds.), Media for All: Subtitling for the deaf, audio description, and sign language (pp. 151–174). Amsterdam: Rodopi.

Schmeidler, E., & Kirchner, C. (2001). Adding audio description: Does it make a difference? Journal of Visual Impairment and Blindness, 95(4), 197–212.

Snyder, J. (2014). The visual made verbal: A comprehensive manual and guide to the history and applications of audio description. Ludlow: The Dog Rose Press.

Sonnenschein, D. (2001). Sound design: The expressive power of music, voice and sound effects in cinema. Studio City, CA: Michael Wiese Productions.

Szymańska, B., & Strzymiński, T. (2010). Standardy tworzenia audiodeskrypcji do produkcji audiowizualnych. Retrieved from:

Szarkowska, A. (2011). Text-to-speech audio description: Towards wider availability of AD. JoSTrans: The Journal of Specialised Translation, 15, 142–162.

Szarkowska, A. (2013). Auteur description: From the director’s creative vision to audio description. Journal of Visual Impairment and Blindness, 107(5), 383–387.

Taylor, C., & Mauro, G. (2012). The Pear Tree Project: A geographico-statistical and linguistic analysis. Perspectives, 20(1), 25–42.

Taylor, C. (1998). Language to language. Cambridge: Cambridge University Press.

Taylor, C. (2012). From pre-script to post-script: Strategies in screenplay writing and audio description for the blind. In F. Dalziel, S. Gesuato & M. T. Musacchio (Eds.), A lifetime of English studies: Essays in honour of Carol Taylor Torsello (pp. 481–492). Padova: Il Poligrafo.

Thom, R. (1999). Designing a movie for sound. Retrieved from:

Van den Dries, L. (2001). Omtrent de opvoering: Heiner Müller en drie decennia theater in Vlaanderen. Ghent: Koninklijke Academie voor Taal- en Letterkunde.

Van Sijll, J. (2005). Cinematic storytelling: The 100 most powerful film conventions every filmmaker must know. Studio City, CA: Michael Wiese Productions.

Vandaele, J. (2012). What meets the eye: Cognitive narratology for audio description. Perspectives, 20(1), 87–102.

Vercauteren, G. (2007). Towards a European guideline for audio description. In J. Díaz-Cintas, P. Orero & A. Remael (Eds.), Media for All: Subtitling for the Deaf, Audio Description, and Sign Language (pp. 139–149). Amsterdam: Rodopi.

Vercauteren, G. (2012). A narratological approach to content selection in audio description: Towards a strategy for the description of narratological time. MonTI, 4, 207–231.

Vercauteren, G., & Orero, P. (2013). Describing facial expressions: Much more than meets the eye. Quaderns, 20, 187–199.

Vereisten voor de productie van kwaliteitsvolle live audiodescriptie (2011). Retrieved from:

Vilaró, A., & Orero, P. (2013). Leitmotif in audio description: Anchoring information to optimise retrieval. International Journal of Humanities and Social Science, 3(5), 56–64.

York, G. (2007). Verdi made visible: Audio-introduction for opera and ballet. In J. Díaz-Cintas, P. Orero & A. Remael (Eds.), Media for All: Subtitling for the Deaf, Audio Description, and Sign Language (pp. 215–229). Amsterdam: Rodopi.

Żórawska, A., Więckowski, R., Künstler, I., & Butkiewicz, U. (2012). Audiodeskrypcja: Standardy tworzenia. Retrieved from:


The following list includes all the films mentioned in these guidelines. Where an audio-described version is available, the entry gives the language of the AD and, when known, its producer.

A Lot Like Love, N. Cole, 2005 [AD: EN]

Aanrijding in Moscou, C. Van Rompaey, 2008 [AD: NL (Vrienden der blinden)]

Alice in Wonderland, T. Burton, 2010 [AD: EN]

All about Steve, P. Traill, 2009

American Beauty, S. Mendes, 1999

Annie Hall, W. Allen, 1977

Away from Her, S. Polley, 2006

Back to the Future, R. Zemeckis, 1985

Bride Flight, B. Sombogaart, 2008 [AD: NL]

Brokeback Mountain, A. Lee, 2005 [AD: EN, DE]

Butch Cassidy and the Sundance Kid, G. R. Hill, 1969

Cold Creek Manor, M. Figgis, 2003 [AD: EN]

Contagion, S. Soderbergh, 2011 [AD: EN]

Deja Vu, T. Scott, 2006 [AD: EN]

Down with Love, P. Reed, 2003

E.T. the Extra-Terrestrial, S. Spielberg, 1982

East is East, D. O’Donnell, 1999 [AD]

Finding Neverland, M. Forster, 2004 [AD: EN]

Friends, “The One with the Breast Milk”, episode 283, M. Lembeck, 1995

Girl with a Pearl Earring, P. Webber, 2003 [AD: EN]

Grown Ups, D. Dugan, 2010

Hero, Y. Zhang, 2002 [AD: EN]

Hitch, A. Tennant, 2005 [AD: EN]

Hitchcock, S. Gervasi, 2012

In the Name of the Father, J. Sheridan, 1993

Inglourious Basterds, Q. Tarantino, 2009 [AD: IT (Senza Barriere)]

It’s Complicated, N. Meyers, 2009 [AD: EN]

Julie & Julia, N. Ephron, 2009 [AD: EN]

Loft, E. Van Looy, 2008 [AD: NL (The Subtitling Company)]

Londyńczycy, G. Zgliński, 2008 [AD: PL (Künstler, I., Butkiewicz, U., & Zawadzka, M.)]

Love Actually, R. Curtis, 2003

Memoirs of a Geisha, R. Marshall, 2005 [AD: EN]

Nights in Rodanthe, G.C. Wolfe, 2008 [AD: EN]

No Country for Old Men, E. Coen and J. Coen, 2007

North By Northwest, A. Hitchcock, 1959 [AD: EN, DE]

Nosferatu, F. W. Murnau, 1922 [AD: IT (VITAC)]

Pride and Prejudice, S. Langton, 1995

Ransom, R. Howard, 1996 [AD: EN]

Rock of Ages, A. Shankman, 2012

Saving Private Ryan, S. Spielberg, 1998

Sexy Beast, J. Glazer, 2000 [AD: EN]

Shutter Island, M. Scorsese, 2010 [AD: EN]

Sin City, F. Miller et al., 2005 [AD: EN]

Slumdog Millionaire, D. Boyle & L. Tandan, 2008 [AD: EN, DE]

Spy Kids, R. Rodriguez, 2001

Stand Up Guys, F. Stevens, 2012

Talk to Her, P. Almodóvar, 2002 [AD: EN, DE]

Taxi Driver, M. Scorsese, 1976

The 39 Steps, A. Hitchcock, 1935

The Birds, A. Hitchcock, 1963 [AD: DE (Arte)]

The Brave One, N. Jordan, 2007 [AD: EN (Vickers, M., ITFC)]

The Notebook, N. Cassavetes, 2004

The Counselor, R. Scott, 2013 [AD: EN]

The Curious Case of Benjamin Button, D. Fincher, 2008 [AD: EN]

The Devil Wears Prada, D. Frankel, 2006 [AD: PL (Laskowski, M.)]

The English Patient, A. Minghella, 1996 [AD: EN, DE (Arte)]

The Forgotten, J. Ruben, 2004 [AD: EN]

The Girl with the Dragon Tattoo, D. Fincher, 2011

The History Boys, N. Hytner, 2006 [AD: EN]

The Hours, S. Daldry, 2002 [AD: EN]

The Imposter, B. Layton, 2012

The King's Speech, T. Hooper, 2010 [AD: DE]

The Lady Vanishes, A. Hitchcock, 1938 [AD]

The Ladykillers, E. Coen and J. Coen, 2004 [AD: EN]

The Lord of the Rings, P. Jackson, 2001-2003

The Lovely Bones, P. Jackson, 2009 [AD: EN]

The Ring, G. Verbinski, 2002

The Shining, S. Kubrick, 1980

The Simpsons

The Water Horse, J. Russell, 2007

The Wedding Planner, A. Shankman, 2001

The Wizard of Oz, V. Fleming, 1939

Tootsie, S. Pollack, 1982 [AD: EN, DE]

Vicky Cristina Barcelona, W. Allen, 2008

Women in Love, K. Russell, 1969
