Multiple sequence alignment in historical linguistics. A sound class based approach
- Author(s):
- Johann-Mattis List
- Date:
- 2012
- Group(s):
- Linguistics
- Subject(s):
- Historical linguistics, Computational linguistics
- Item Type:
- Article
- Tag(s):
- phonetic alignment, multiple alignment
- Permanent URL:
- http://dx.doi.org/10.17613/hv4x-9x04
- Abstract:
- In this paper, a new method for multiple sequence alignment in historical linguistics is presented. The algorithm is based on the traditional framework of progressive multiple sequence alignment (cf. Durbin et al. 2002:143-149), whose shortcomings are addressed by (1) a sound class representation of phonetic sequences (cf. Dolgopolsky 1986, Turchin et al. 2010) accompanied by specific scoring functions, (2) the modification of gap scores based on prosodic context, and (3) a new method for the detection of swapped sites in already aligned sequences. The algorithm is implemented as part of the LingPy library (http://lingulist.de/lingpy), a suite of open source Python modules for various tasks in quantitative historical linguistics. The method was tested on a benchmark dataset of 152 manually edited multiple alignments covering data for 192 Bulgarian dialects (Prokić et al. 2009). The results show that the new method yields alignments which differ from the gold standard in only 5 % of all sequences.
- Published as:
- Journal article
- Pub. Date:
- 2012
- Journal:
- Proceedings of ConSOLE
- Volume:
- XIX
- Issue:
- 1
- Page Range:
- 241 - 260
- Status:
- Published
- Last Updated:
- 3 years ago
- License:
- Attribution-NonCommercial