METHODS AND MODELS OF AUTOMATIC KEYWORD EXTRACTION

Светлана Олеговна Шереметьева; Павел Григорьевич Осминин

Authors

Svetlana Olegovna Sheremetyeva
Pavel Grigor'evich Osminin

Abstract

The paper presents an overview and classification of the major approaches to the automatic
extraction of keywords from text documents. The approaches fall into two types, statistical and
hybrid, each of which can be further classified as corpus-based or document-based.
The advantages and shortcomings of particular approaches are analyzed. It is claimed that
statistical keyword extraction methods are problematic for inflecting languages such as Russian.
Requirements for an efficient model of automatic keyword extraction from Russian texts are
formulated, and specific recommendations for meeting these requirements are given. It is emphasized
that creating an effective keyword extractor requires taking into account the linguistic type of
the natural language (analytical, inflecting, agglutinative, isolating), the domain (sublanguage), and the
availability of linguistic and programming resources. The approach is illustrated by a case study of a
keyword extractor for Russian texts on mathematical modeling.

Author Biographies

Svetlana Olegovna Sheremetyeva

PhD (Habilitation), Professor at the Department of Linguistics and Intercultural
Communication, South Ural State University (Chelyabinsk), linklana@yahoo.com

Pavel Grigor'evich Osminin

Assistant Professor at the Department of Linguistics and Intercultural Communication, South
Ural State University (Chelyabinsk), osperevod@gmail.com

ON METHODS AND MODELS OF KEYWORD AUTOMATIC EXTRACTION
