NER on Ancient Greek with minimal annotation

    Author(s):
    Chiara Palladino, Farimah Karimi, Brigitte Mathiak
    Date:
    2020
    Group(s):
    DH2020
    Subject(s):
    Digital humanities, Research, Methodology
    Item Type:
    Conference paper
    Conf. Title:
    DH2020
    Conf. Org.:
    ADHO
    Conf. Loc.:
    online
    Conf. Date:
    21.07.2020
    Tag(s):
    Conditional Random Fields, Herodot, Named Entity Recognition, Digital humanities research and methodology
    Permanent URL:
    http://dx.doi.org/10.17613/j7jt-b052
    Abstract:
    This paper presents the results of adapting a new workflow for Named Entity Recognition and classification to Ancient Greek. We used a model of data extraction and pattern discovery based on machine learning algorithms, which is easily customizable for different languages. Starting from a small manually annotated list, this allowed us to create a dataset of automatically classified place-names and ethnonyms. We worked on the assumption that premodern textual sources encode space in a recognizably systematic way in their language, which makes them a test case for automatic context-based methods. The idea is that we should be able to train the machine to recognize an entity from recurring elements in its context, without providing a large training dataset in advance.
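The context-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the seed entries, the feature names, the window size, and the helper functions `context_features`, `seed_label`, and `sentence_to_training_pair` are all assumptions made for illustration. The sketch describes each token by its surrounding words and labels tokens from a small seed list, which is the kind of input a Conditional Random Fields tagger is commonly trained on.

```python
from typing import Dict, List, Tuple

# Hypothetical seed list standing in for a small manually annotated list.
# The entries are illustrative only, not the authors' data.
SEED_LABELS: Dict[str, str] = {"Ἀθήνας": "PLACE", "Πέρσαι": "ETHNONYM"}


def context_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, object]:
    """Describe token i by its own shape and by the words around it."""
    token = tokens[i]
    features: Dict[str, object] = {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "suffix3": token[-3:],
    }
    # Recurring context elements: neighbouring words within the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"word[{offset:+d}]"
        features[key] = tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return features


def seed_label(tokens: List[str]) -> List[str]:
    """Label tokens found in the seed list; everything else is 'O' (outside)."""
    return [SEED_LABELS.get(token, "O") for token in tokens]


def sentence_to_training_pair(tokens: List[str]) -> Tuple[List[Dict[str, object]], List[str]]:
    """Build one (features, labels) pair for a tokenized sentence."""
    feats = [context_features(tokens, i) for i in range(len(tokens))]
    return feats, seed_label(tokens)


if __name__ == "__main__":
    sentence = ["ἐς", "τὰς", "Ἀθήνας", "ἀπίκοντο"]
    feats, labels = sentence_to_training_pair(sentence)
    for token, label, feat in zip(sentence, labels, feats):
        print(token, label, feat)
```

Lists of per-token feature dictionaries paired with label sequences are the usual input shape for CRF toolkits. A model trained on such pairs can, in principle, label unseen names that occur in contexts similar to those of the seed entries, which is the intuition the abstract describes.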
    Metadata:
    Published as:
    Online publication    
    Status:
    Published
    Last Updated:
    3 years ago
    License:
    Attribution-NonCommercial-ShareAlike

    Downloads

    Item Name: pdf places-and-ethnicities-in-herodot-dh-2020.pdf
    Activity: Downloads: 290