Language documentation today involves a complex and ever-evolving set of tools and methods, and the study and development of their use (especially the identification and promotion of best practices) can be considered a sub-field of language doc...
- Repository URL:
- Toolbox; media files; ELAN; time-aligned transcription; Transcriber
This article focuses on our documentation project’s use of Toolbox with media files, i.e., the source audio/video material that our transcripts are based on: why we set things up the way we do, and how. The process begins with an appropriate media file. This file is marked up in Transcriber to produce a series of time-aligned annotations, each corresponding to an intonation unit in the recording and containing a transcript and a speaker name. The resulting file is converted to a text format that can be used natively in Toolbox and easily imported into ELAN. The article also covers techniques for managing and querying the resulting data, both within Toolbox and with spreadsheets and relational databases. Further, it discusses some other language-oriented programs (especially Transcriber and ELAN) insofar as they affect our use of Toolbox. When Toolbox is used in close conjunction with source media files, it becomes particularly powerful: some common tasks become easier, and new types of enquiry become possible. This is largely the result of Toolbox’s ability to play discrete segments from a sound file. There is no single established methodology for linking Toolbox to media files in this way, and there are many possible uses for the results. This article offers one account.
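To make the conversion step concrete, the following is a minimal sketch of how time-aligned annotations might be written out as backslash-marked text records of the kind Toolbox reads. It is an illustration under stated assumptions, not the project's actual converter: the `Annotation` class, the `to_toolbox_records` function, the record-ID scheme, and the marker names (`\ref`, `\start`, `\end`, `\spkr`, `\tx`) are all placeholders, and parsing of the Transcriber file itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One time-aligned intonation unit (hypothetical structure)."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: str   # speaker name
    text: str      # transcript of the unit


def to_toolbox_records(annotations, prefix="rec"):
    """Render annotations as backslash-marked text records.

    All marker names here are placeholders chosen for illustration;
    a real project would use whatever markers its Toolbox database
    type defines.
    """
    records = []
    for i, a in enumerate(annotations, start=1):
        lines = [
            f"\\ref {prefix}.{i:03d}",    # record identifier
            f"\\start {a.start:.3f}",     # segment start (seconds)
            f"\\end {a.end:.3f}",         # segment end (seconds)
            f"\\spkr {a.speaker}",        # speaker name
            f"\\tx {a.text}",             # transcript text
        ]
        records.append("\n".join(lines))
    # A blank line separates consecutive records.
    return "\n\n".join(records) + "\n"


if __name__ == "__main__":
    sample = [
        Annotation(0.0, 1.25, "SPK1", "first unit"),
        Annotation(1.25, 2.5, "SPK2", "second unit"),
    ]
    print(to_toolbox_records(sample))
```

Keeping the start and end times in their own fields is what would allow each record to be tied back to a discrete segment of the sound file, and it keeps the data easy to load into a spreadsheet or relational database.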