Session Handle:
http://hdl.handle.net/2196/48069396-15c6-4b31-ad54-95496ba80df3
Title:
Lista 001 tonos completos 2010-12-12-b
Description:
This file (and derivative files, e.g., mono conversions of stereo elicitations and edited mono files [see below]) was given on a hard disk to the Endangered Language Archive (ELAR) via David Nathan on 9 Jan. 2011 at the Pittsburgh LSA Conference. Along with other material on the disk, it was not accessioned, and on 4 April 2012 a second hard disk with these and other files was given to Tom Castle.

Word list 001 is archived in three related files, Archived-elicitation-list-001_261-word-tokens-for-all-tonal-patterns_261-words, with .doc (Word document), .xls (Excel spreadsheet), and .pdf/A (portable document format) extensions. These files present the lists of the words pronounced in the 20 sound files used for segmentation.

Ten speakers were asked to repeat 261 words in two sessions each, so there should be 20 wav files. There are actually 21, as the first recording was redone and never segmented for Praat analysis: Yolox_Elict_CTB501_Lista-001-tonos-completos_2010-12-08-a.wav. The other 20 files (2 sessions x 10 speakers) were all segmented (see below). The ten speakers were: Constantino Teodoro Bautista, Teodoro Celso, Esteban Castillo García, Esteban Guadalupe Sierra, Estela Santiago Castillo, Guillermina Nazario Sotero, Rey Castillo García, Soledad García Bautista, Victorino Ramos Rómulo, and Zoila Guadalupe Sierra.

Each speaker was asked to repeat the 261 words 3 times in each session (x 2 sessions = 6 tokens per word). The targeted speaker was miked on one channel (usually left) and Rey Castillo García on the other (usually right). Rey would try to elicit without pronouncing the target word, though this wasn't always possible. Rey would listen and, if the speaker uttered a tonal sequence that was not the targeted pattern, he would re-elicit; thus there were sometimes 4 or 5 tokens.

ANALYSIS AND SEGMENTATION: The first process was to isolate the channel of the targeted speaker.
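The channel-isolation step can be sketched in Python with only the standard library; this is a minimal illustration, not the tool actually used in the project, and it assumes 16-bit stereo WAV input (the function name is mine):

```python
import wave

def extract_channel(stereo_path: str, mono_path: str, channel: int = 0) -> None:
    """Copy one channel (0 = left, 1 = right) of a 16-bit stereo WAV to a mono WAV."""
    with wave.open(stereo_path, "rb") as src:
        assert src.getnchannels() == 2 and src.getsampwidth() == 2
        framerate = src.getframerate()
        frames = src.readframes(src.getnframes())
    # Frames are interleaved 16-bit samples: L0 R0 L1 R1 ...
    # Keep every other sample, starting at the requested channel.
    # (Assumes a little-endian host, which matches WAV sample order.)
    samples = memoryview(frames).cast("h")
    mono = samples[channel::2].tobytes()
    with wave.open(mono_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(framerate)
        dst.writeframes(mono)
```

For the targeted speaker this would be called with `channel=0` (usually the left channel), producing the `_mono.wav` files named in the description.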
Thus from the file Yolox_Elict_CTB501_Lista-001-tonos-completos_2010-12-08-c.wav, the left channel was isolated as a mono file and named accordingly: Yolox_Elict_CTB501_Lista-001-tonos-completos_2010-12-08-c_mono.wav. Rey Castillo then cut out of Yolox_Elict_CTB501_Lista-001-tonos-completos_2010-12-08-c_mono all the sounds that were not the targeted words. This left a clean sound file of pure tokens, an average of 3 per word per session (3 x 261 = 783 tokens).

At this point William Poser segmented each token in an automated process. Rey Castillo had previously supplied a list of the number of repetitions of each word (e.g., 001,3; 002,3; 003,4; 004,2 ...). Poser then segmented all 20 sessions into tokens and recombined the tokens of each word into a single file (e.g., 001x6_CTB501.wav). Leandro DiDomenico, a graduate student in France, was then hired to segment in Praat the phonemes of the first and second utterances of each session; generally these were the first, second, fourth, and fifth tokens of the six-token sequence. Much later, while on a postdoc at Haskins Laboratories, Christian DiCanio went over and corrected each TextGrid (e.g., Yolox_Elict_List-01_0001x6_CTB501.TextGrid, associated with Yolox_Elict_List-01_0001x6_CTB501.wav). These revised four-token TextGrids will soon be superseded by complete six-token TextGrids, which are the TextGrids that will be archived at ELAR and AILLA. The total number of hand-segmented tokens, therefore, is 10 speakers x 6 repetitions x 261 words = 15,660 individual tokens.

Finally, as part of the NSF project led by Doug Whalen, two automated segmenters were evaluated for accuracy against the hand-segmented tier. A short article whose principal author is Christian DiCanio was written about the results of this test: "Assessing agreement level between forced alignment models with data from endangered language documentation corpora."

Duration: 070:13. Recording device: Marantz PMD 670. Microphone:
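The per-word repetition list quoted above (e.g., 001,3; 002,3; 003, 4; 004,2) lends itself to simple machine parsing, which is presumably what made the automated segmentation possible. The sketch below is illustrative only; the function name is hypothetical and not part of the archived materials:

```python
def parse_repetitions(spec: str) -> dict[str, int]:
    """Parse entries like '001,3; 002,3; 003, 4' into {word_id: repetition_count}."""
    counts: dict[str, int] = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        word_id, n = (part.strip() for part in entry.split(","))
        counts[word_id] = int(n)
    return counts

# The fragment quoted in the description:
counts = parse_repetitions("001,3; 002,3; 003, 4; 004,2")
total = sum(counts.values())  # 3 + 3 + 4 + 2 = 12 tokens for these four words
```

Such per-word counts tell an automated segmenter how many tokens to expect in each session's clean mono file, including the occasional re-elicited 4th or 5th repetition.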
Date created:
2010-12-12