There are 3 steps to creating captions:
- transcribing the content
- breaking it into caption blocks
- aligning the caption blocks to the video.
A variety of software programs can help with each step in this process. Throughout this process, you should review the captions for quality and accuracy based on our Captioning Quality Guidelines.
Transcribing the Content
If you create your own videos, we recommend writing a script ahead of time so that you can reuse it as the transcript when you create captions later. No machine-generated captions will be consistently accurate enough on their own; you will need to review and correct the output of any machine-aided transcription.
- Software and Programs for Machine-Aided Transcription
- Google Docs Voice Typing
- Dragon Naturally Speaking
- Note: If you are using any of the above software except YouTube to transcribe pre-recorded audio, you may need to route your computer’s audio output back into the computer as audio input for better transcription quality. Suggestions are available on how to do so for Mac and Windows.
- Software and Programs for Manual Transcription
Breaking into Caption Blocks
Next, you will need to split your transcript into separate caption blocks that will appear on-screen, one after the other. For easier reading, try to avoid splitting your captions in the middle of a phrase.
If you are manually transcribing your audio in a captioning program like Amara, you can create each caption block in the software as you transcribe. If you provide YouTube with a transcript of a video, it can automatically set the timings for you.
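As a rough illustration of the splitting step, the sketch below groups a transcript into caption blocks of at most two lines. The 32-character line width and the function name `split_into_blocks` are assumptions made for this example, not requirements of any captioning tool. It only breaks at word boundaries and does not detect phrases, so the output still needs a manual pass to move breaks that land mid-phrase.

```python
import textwrap


def split_into_blocks(transcript, max_chars=32, max_lines=2):
    """Wrap a transcript into lines, then group the lines into caption blocks.

    max_chars is an assumed line width for illustration only.
    """
    # Break only at whitespace; long or hyphenated words are kept whole,
    # even if that makes a line longer than max_chars.
    lines = textwrap.wrap(
        transcript,
        width=max_chars,
        break_long_words=False,
        break_on_hyphens=False,
    )
    # Group consecutive lines, max_lines at a time, into one block each.
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]


if __name__ == "__main__":
    text = (
        "Welcome to the course. Today we will cover how captions "
        "are created and reviewed."
    )
    for block in split_into_blocks(text):
        print(block)
        print("---")
```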
Syncing to the Audio
In this step, each caption block is assigned a start and end time so that it appears at the correct point in the video. Many subtitle programs require you to do this manually.
YouTube will do this process automatically if you have a transcript prepared and select “set timings”. The results may not be perfectly accurate; check any long gaps in time or blocks with non-speech sounds to ensure they are aligned accurately.
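If you have the timings available as data, a small script can point out where to look first. The sketch below is a hypothetical helper, not part of YouTube or any caption editor: it assumes each caption block is a `(start_seconds, end_seconds, text)` tuple and uses an arbitrary 5-second threshold for a "long" gap. It only lists candidates for review; it cannot tell whether the alignment is actually correct.

```python
import re

# Matches bracketed non-speech descriptions such as "[car honks]".
NON_SPEECH = re.compile(r"\[[^\]]+\]")


def find_blocks_to_review(cues, max_gap=5.0):
    """Return indexes of blocks that follow a long gap or contain a non-speech sound.

    cues: list of (start_seconds, end_seconds, text) tuples, in display order.
    max_gap: assumed threshold, in seconds, for what counts as a long gap.
    """
    flagged = []
    for i, (start, end, text) in enumerate(cues):
        # Gap between the end of the previous block and the start of this one.
        after_long_gap = i > 0 and (start - cues[i - 1][1]) > max_gap
        has_non_speech = NON_SPEECH.search(text) is not None
        if after_long_gap or has_non_speech:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    cues = [
        (0.0, 2.0, "Sarah: Hello."),
        (2.5, 4.0, "[car honks]"),
        (12.0, 14.0, "Sarah: As I was saying."),
    ]
    print(find_blocks_to_review(cues))
```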
Review for Quality
Once you create your caption file, you should review it for quality. A short summary of quality issues to check for is listed below. Please reference the Captioning Quality Guidelines for a more extensive list.
- Identify all changes in speaker (e.g. “Sarah: ”, “Person”, or “>>” if the speaker’s name is unknown).
- Add any meaningful non-speech sounds in brackets (e.g. [car honks]).
- Ensure all spoken content is transcribed exactly, not paraphrased.
- Do not include more than 2 lines of text per caption block.
- Ensure the caption blocks appear long enough to be easily read; generally they should appear for at least 1 second.
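The two measurable items in this list, the 2-line limit and the 1-second minimum, can be checked mechanically. The following is a minimal sketch under the assumption that each caption block is available as a start time, an end time (both in seconds), and its text; `check_block` is a hypothetical name. Speaker labels, non-speech sounds, and exact transcription still need a human reviewer.

```python
def check_block(start, end, text, max_lines=2, min_duration=1.0):
    """Return a list of problems found in one caption block (empty if none)."""
    problems = []
    line_count = len(text.splitlines())
    if line_count > max_lines:
        problems.append(f"{line_count} lines of text (limit is {max_lines})")
    duration = end - start
    if duration < min_duration:
        problems.append(
            f"on screen for {duration:.2f} s (minimum is {min_duration:.2f} s)"
        )
    return problems


if __name__ == "__main__":
    print(check_block(0.0, 2.0, "Sarah: Hello.\n[car honks]"))
    print(check_block(5.0, 5.5, "one\ntwo\nthree"))
```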
Save or Export Your File
If you create your captions in software other than the media player that will display them, you will need to save or export the captions from the caption editor so that they can be uploaded to the destination media repository.
Caption files are typically saved with one of the following extensions: .srt, .vtt, .sbv, .dfxp, .sami, or .ttml. SRT is the simplest format and can be edited easily in any text editor. However, it does not support features like vertical caption placement or text markup. If those features are required, VTT is recommended for ease of editing.
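To show how close the two text formats are, the sketch below writes the same caption blocks as SRT and as VTT. It assumes the blocks are `(start_seconds, end_seconds, text)` tuples, and the function names are hypothetical. SRT numbers each block and uses a comma before the milliseconds; VTT starts with a `WEBVTT` header line and uses a period. Positioning and markup settings are left out.

```python
def format_timestamp(seconds, separator=","):
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(cues):
    """Build SRT text: numbered blocks separated by blank lines."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues):
    """Build VTT text: a WEBVTT header, then blocks separated by blank lines."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    cues = [(0.0, 2.5, "Sarah: Hello."), (3.0, 4.5, "[car honks]")]
    print(to_srt(cues))
    print(to_vtt(cues))
```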
If you create your captions in a captioning editor like YouTube or Amara, you can export your file to a variety of caption formats which can then be uploaded to Kaltura, Vimeo, or any other player that accepts standard caption formats.
Creating Captions with Others
If you would like to work with a group to create a caption file, consider using Amara.
- Amara Public Editor: Amara lets any user contribute to an existing captioning project, and logs contributions by username. You don’t have final control over which changes are approved; any submission is approved automatically, but you have access to every version that has been uploaded.