Captioning Best Practices

Captions allow deaf and hard of hearing viewers to follow the spoken content of a video by displaying the words in sync with the audio. The following are generally accepted captioning standards for accessibility compliance.

Best Practices for Caption Timing and Positioning:

  • Each caption frame should hold 1 to 3 lines of text onscreen at a time and remain viewable for 3 to 7 seconds. No line should exceed 32 characters.
  • Each caption frame should be replaced by another caption.
  • All caption frames should be precisely synchronized with the audio.
  • A caption frame should be repositioned if it obscures onscreen text or other essential visual elements.
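
The numeric limits above (1 to 3 lines, 3 to 7 seconds, 32 characters per line) lend themselves to an automated check. The following is a minimal Python sketch, not a reference implementation: the CaptionFrame type and the check_frame function are hypothetical names introduced here for illustration, and times are assumed to be given in seconds.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the timing guidelines above.
MAX_LINES = 3
MAX_CHARS_PER_LINE = 32
MIN_SECONDS = 3.0
MAX_SECONDS = 7.0


@dataclass
class CaptionFrame:
    """Hypothetical container for one caption frame (times in seconds)."""
    start: float
    end: float
    lines: List[str]


def check_frame(frame: CaptionFrame) -> List[str]:
    """Return a list of guideline violations; an empty list means none found."""
    problems = []

    # 1 to 3 lines of text per frame.
    if not 1 <= len(frame.lines) <= MAX_LINES:
        problems.append(
            f"expected 1 to {MAX_LINES} lines, found {len(frame.lines)}"
        )

    # No line longer than 32 characters.
    for number, line in enumerate(frame.lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(
                f"line {number} has {len(line)} characters "
                f"(limit {MAX_CHARS_PER_LINE})"
            )

    # Onscreen for 3 to 7 seconds.
    duration = frame.end - frame.start
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(
            f"duration {duration:.2f}s is outside "
            f"{MIN_SECONDS:g} to {MAX_SECONDS:g} seconds"
        )

    return problems
```

For example, check_frame(CaptionFrame(0.0, 4.0, ["Hello there.", "How are you?"])) returns an empty list, while a frame shown for 1.5 seconds with a 40-character line is reported with two problems. Synchronization and repositioning still need human review; this sketch covers only the countable limits.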

Best Practices for Caption Style and Formatting:

  • Spelling should be at least 99% accurate.
  • When multiple speakers are present, it can be helpful to identify who is speaking, especially when the video does not make this clear.
  • Both upper and lowercase letters should be used.
  • The font should be sans-serif, such as Helvetica Medium.
  • Non-speech sounds, such as [MUSIC] or [LAUGHTER], should be included in square brackets.
  • Punctuation should be used for maximum clarity in the text, not necessarily for textbook style.
  • Captions should preserve and identify slang or accents.
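
A few of the style guidelines above can also be checked mechanically. The sketch below is illustrative only and all function names are hypothetical: it extracts bracketed non-speech tags, flags caption text written entirely in capitals, and works out spelling accuracy as a percentage. It assumes the number of misspelled words has already been counted by a reviewer or a separate spell checker.

```python
import re

# Matches a bracketed non-speech tag such as [MUSIC] or [LAUGHTER].
NON_SPEECH_TAG = re.compile(r"\[([^\[\]]+)\]")


def non_speech_tags(text):
    """Return the contents of every square-bracketed tag in the text."""
    return NON_SPEECH_TAG.findall(text)


def is_all_caps(text):
    """Return True if the spoken text (tags removed) uses only capitals."""
    # Tags like [MUSIC] are shown in capitals in the guideline examples,
    # so they are removed before testing the remaining letters.
    spoken = NON_SPEECH_TAG.sub("", text)
    letters = [c for c in spoken if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


def spelling_accuracy(total_words, misspelled_words):
    """Return the percentage of correctly spelled words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * (total_words - misspelled_words) / total_words


def meets_spelling_target(total_words, misspelled_words, target=99.0):
    """Return True if accuracy reaches the 99% target from the guidelines."""
    return spelling_accuracy(total_words, misspelled_words) >= target
```

As a worked example of the 99% threshold, a 1,000-word transcript may contain at most 10 misspelled words: meets_spelling_target(1000, 10) is True, while meets_spelling_target(1000, 11) is False.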