Towards a sustainable handling of inter-linear-glossed text in language documentation
- Author(s):
- Johann-Mattis List, Nathaniel A. Sims
- Date:
- 2019
- Group(s):
- Classical Philology and Linguistics, Digital Humanists, Digital Humanities East Asia, Linguistics
- Subject(s):
- Historical linguistics, Computational linguistics
- Item Type:
- Article
- Tag(s):
- retro-standardization, inter-linear-glossed text, Sino-Tibetan language, standardisation
- Permanent URL:
- http://dx.doi.org/10.17613/gscz-mb13
- Abstract:
- Efforts in language documentation have increased in recent years. While the amount of digital data on the world's languages is growing, only a small share of it is sustainable, since data reuse is often hampered by idiosyncratic formats and a neglect of standards that could help to increase the comparability of linguistic data. The sustainability problem is clearly reflected in the current practice of handling inter-linear-glossed text, one of the crucial resources produced in language documentation. Although large collections of glossed texts have been produced, current data-handling practice greatly impedes their reuse. In order to address this problem, we propose an initial framework for the computer-assisted, sustainable handling of inter-linear-glossed text resources. Building on recent standardization proposals for word lists and structural datasets, combined with state-of-the-art methods for automated sequence comparison in historical linguistics, we show how our workflow can be used to lift a collection of inter-linear-glossed texts in Qiang (an endangered language spoken in Sichuan, China), and how the lifted data can assist linguists in their research.
- Notes:
- Manuscript preprint, currently under review.
- Status:
- Published
- Last Updated:
- 4 years ago
- License:
- All Rights Reserved
Downloads
Item Name: list-sims-2019-inter-linear-glossed-text-preprint.pdf