Archive of Scientific Articles

ISSUE:    Almanac of Modern Science and Education. 2016. Issue 12
COLLECTION:    Technical Sciences

SOFTWARE TOOLS FOR INFORMATION EXTRACTION FROM NATURAL-LANGUAGE TEXTS

Andrei Valer'evich Rubailo
Chelyabinsk State University

Maksim Yur'evich Kosenko
Chelyabinsk State University


Submitted: January 17, 2017
Abstract. The article surveys existing tools for extracting named entities from natural-language texts. The tools are compared to identify the one best suited to extracting named entities from unformatted Russian-language texts. The authors substantiate the practical efficiency of Tomita-parser for this task.
Key words and phrases:
named entity extraction
text processing
information processing
automation
Tomita-parser
Named Entity Recognition
GATE
PullEnti SDK
Eureka Engine


© 2006-2025 GRAMOTA Publishing