Documenting the Wichita Language on Digital Video & Audio

Processing the Digital Video and Audio Tapes

Once we returned from our fieldwork we were faced with the task of processing and annotating the approximately 16 hours of digital video and 75 hours of digital audio. We have followed a separate procedure for each of these components for various reasons that are summarized below.

mini-DV Tapes

The digital video material requires a great deal of hard-disk storage space, which we lack. We decided to leave the digital capture of these video tapes to the DOBES technical team since (i) they already have a workflow set up for processing digital tapes and (ii) they have much more expertise and storage space than we can afford.

The workflow with DOBES involved the following stages of pre-processing to get the video material ready for annotation and clipping into sessions.

(a) Back in our lab in Boulder we made backup copies of our mini-DV tapes (i.e. a mini-DV to mini-DV copy using two DV playback machines).

(b) We sent the original DV tapes to the MPI at Nijmegen for pre-processing (see next step).

(c) The DOBES team at MPI digitally captured our tapes onto storage space and compressed the files into high-quality MPEG-2 format. These digital media files (DMFs), each corresponding to one of our long language sessions, were burned onto CDs and sent to us.

(d) In our lab in Boulder we transferred these long MPEG files onto our computers and started the process of cutting out individual (sometimes short but sometimes long) "sessions" from each of the language meetings. This process is rather tedious and involves hours of playback in order to decide what the appropriate sessions should be and what the precise components and timing of each session are. We have used QuickTime, Windows Media Player, ELAN, and Sound Forge at various stages, first to find approximate start and end time stamps for the individual sessions and then to make them precise.

(e) Once we decided on the session points we used the IMDI tool from DOBES to fill out session metadata; this includes detailed information about the participants, the researchers, contacts, time of recording, nature of data, and what the time frames of the particular session are. These IMDI files contain all the information that DOBES needs to make individual session DMFs for us.

(f) We sent our IMDI files to DOBES (via email) and they made the appropriate cuts of the media files for each of the sessions and sent the .mpeg and .wav files to us on CDs.

(g) We copied the individual session DMFs onto our computers and began the procedure of annotating the sessions using the ELAN annotation tool from DOBES.
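The session cut points decided in steps (d) and (e) can be thought of as a small table of start and end time stamps per long recording. The following is a minimal Python sketch of how such a table might be sanity-checked before it is sent off for cutting. It is an illustration only and is not part of the IMDI tool or of the DOBES workflow: the `SessionCut` class, the session names, and the times are all hypothetical, and the check assumes that sessions cut from one recording are not meant to overlap.

```python
from dataclasses import dataclass


@dataclass
class SessionCut:
    name: str   # hypothetical session label
    start: str  # "HH:MM:SS" offset into the long recording
    end: str    # "HH:MM:SS" offset into the long recording


def to_seconds(stamp: str) -> int:
    """Convert an "HH:MM:SS" time stamp to whole seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def check_cuts(cuts):
    """Return a list of problems: inverted ranges or overlapping sessions."""
    problems = []
    ordered = sorted(cuts, key=lambda cut: to_seconds(cut.start))
    for cut in ordered:
        if to_seconds(cut.end) <= to_seconds(cut.start):
            problems.append(f"{cut.name}: end is not after start")
    # Compare each session with the one that starts next.
    for earlier, later in zip(ordered, ordered[1:]):
        if to_seconds(later.start) < to_seconds(earlier.end):
            problems.append(f"{earlier.name} overlaps {later.name}")
    return problems


# Made-up cut points for one long recording.
cuts = [
    SessionCut("session_A", "00:00:30", "00:12:05"),
    SessionCut("session_B", "00:12:00", "00:40:10"),
    SessionCut("session_C", "00:41:00", "01:02:45"),
]
for problem in check_cuts(cuts):
    print(problem)
```

With the made-up values above, the check reports that session_A runs five seconds into session_B, which is the kind of slip that is easy to make after hours of playback.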

We are currently in the process of continued annotation of the Summer 2002 fieldwork DV tapes and are about to start cutting individual sessions from the Summer 2003 tapes.

DAT Audio Tapes

We decided to do our own digital capture, processing, and annotation of the digital audio tapes since we were able to use some portion of our equipment budget to purchase an optical DAT capture device and an external hard drive, which gives us enough space for the processing. Furthermore, the audio capture process is familiar to us from the digitization of the older reel and cassette tapes.

The DAT tapes serve mostly as an archive of, and reference for, our fieldwork notes, made as we worked with native speakers on the annotation of new and old material. They are therefore mostly for our own use. However, some of the tapes contain music sessions, which are very much a part of the culture and, from an outsider's perspective, the setting in which the language seems to be used most frequently. The tapes also contain individual personal narratives and important cultural information that we see as an important component of the present-day state of the Wichita language.

The DAT audio capture involved the following stages of processing, all of which took place in our lab.

(a) We connected the following equipment together:

       - Apple PowerBook with a 40 GB hard disk, USB, FireWire, and DVD writer
       - EDIROL UA-5 external digital optical audio capture device
       - Sony DAT recorder/player with digital optical output
       - 250 GB LaCie FireWire external hard drive

(b) Using standard audio software on the laptop, we played back the DAT tapes, capturing the audio data in real time at 44-48 kHz, in 16-bit stereo format, onto the large LaCie hard drive.

(c) The captured audio files are named according to the audio tape label (which follows the standard set by DOBES for tape labeling). Each digital media file for a tape is also accompanied by a text file that gives information about the content of the tape (i.e., some metadata as well as a general summary of what is on the tape).

(d) Portions of the tapes that contain music, Wichita words and phrases that cannot be found anywhere else in our archive, and some important personal narratives that the speakers allow us to make publicly accessible, will be cut into individual "sessions" and included in the main part of the archive at DOBES.

(e) The long audio DMFs are burned onto DVD-Rs for backup and storage; this will make our fieldwork notes much more accessible to those who are interested in hearing them.
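The storage needs behind this setup can be estimated with simple arithmetic. The sketch below assumes uncompressed PCM audio at the upper capture rate mentioned above (48 kHz, 16-bit, stereo) and ignores file-header overhead; the function name `pcm_bytes` is hypothetical and the figures are estimates, not measurements of our files.

```python
def pcm_bytes(hours, sample_rate_hz=48_000, bits_per_sample=16, channels=2):
    """Approximate size in bytes of uncompressed PCM audio of the given length."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return int(hours * 3600 * bytes_per_second)


GB = 10**9  # decimal gigabytes, as used on hard-drive labels

one_tape = pcm_bytes(2)    # one DAT tape of roughly two hours
all_audio = pcm_bytes(75)  # the roughly 75 hours of field audio

print(f"one 2-hour tape: {one_tape / GB:.2f} GB")
print(f"75 hours of audio: {all_audio / GB:.2f} GB")
```

Under these assumptions a two-hour tape comes to about 1.4 GB and the full 75 hours to about 52 GB. That is more than the laptop's 40 GB internal disk can hold but well within the 250 GB external drive, and a single long file of this size fits on a standard 4.7 GB DVD-R.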

So far we have captured about 20 DAT tapes, each approximately 2 hours in length, and are writing annotation notes for each one. It is important to note that during the "language meetings" in our fieldwork we simultaneously video and audio taped the sessions; the separate DAT audio tapes from these meetings are also being captured in our lab by the above audio-processing procedure. We currently face the small complication that the time codes on our "simultaneous" video and audio tapes do not match exactly.
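One possible way to relate the two sets of time codes, offered only as an illustration and not as the procedure we have settled on, is to pick two events that can be identified on both the video and the audio recording and to interpolate linearly between them. The sketch below assumes that the mismatch is a constant offset plus a slow, steady drift. The function name `make_audio_to_video` and the sync-point values are hypothetical.

```python
def make_audio_to_video(sync_a, sync_b):
    """Build a converter from audio time to video time (both in seconds).

    Each sync point is an (audio_seconds, video_seconds) pair for one event
    that can be located on both recordings.
    """
    (audio_a, video_a), (audio_b, video_b) = sync_a, sync_b
    if audio_a == audio_b:
        raise ValueError("sync points must lie at different audio times")
    # Video seconds that elapse per audio second (1.0 means no drift).
    rate = (video_b - video_a) / (audio_b - audio_a)

    def convert(audio_seconds):
        return video_a + (audio_seconds - audio_a) * rate

    return convert


# Made-up sync points: one event near the start, one about an hour later.
convert = make_audio_to_video((12.0, 10.0), (3612.0, 3610.9))
print(round(convert(1812.0), 2))
```

If the two recordings turn out to differ only by a fixed offset, the computed rate is 1.0 and the conversion reduces to adding that offset.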

