An analysis of intelligent methods and algorithms for unlabeled data processing



Abstract

Intelligent methods and algorithms are well-suited to many data-processing problems in which unlabeled data are abundant. We survey previously used selection strategies for intelligent models and propose two novel algorithms that address their shortcomings, focusing on Active Learning (AL). Although AL has already been shown to markedly reduce annotation effort for many sequence labeling tasks compared to random selection, it remains unconcerned with the internal structure of the selected sequences (typically, sentences). We therefore propose a semi-supervised AL approach for sequence labeling.
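The selection strategies surveyed in the abstract can be illustrated by the classic least-confidence criterion: query the unlabeled items on which the current model's most probable prediction has the lowest probability. The sketch below is illustrative only; the function and model names are assumptions, not the authors' implementation.

```python
# Minimal sketch of uncertainty-based active learning selection
# (least-confidence strategy). All names are illustrative.

def least_confidence(probs):
    """Uncertainty of one prediction: 1 - probability of the top class."""
    return 1.0 - max(probs)

def select_batch(unlabeled, predict_proba, k):
    """Pick indices of the k unlabeled items the model is least sure about."""
    scored = [(least_confidence(predict_proba(x)), i)
              for i, x in enumerate(unlabeled)]
    scored.sort(reverse=True)           # most uncertain first
    return [i for _, i in scored[:k]]

# Toy stand-in for a trained classifier: it grows more confident
# on longer inputs (purely for demonstration).
def toy_model(x):
    p = min(0.5 + 0.1 * len(x), 0.99)
    return [p, 1.0 - p]

pool = ["ab", "abcdef", "a", "abcd"]
print(select_batch(pool, toy_model, 2))   # the two most uncertain items
```

In a real sequence-labeling setting, `predict_proba` would come from a sequence model (e.g. a CRF or HMM), and the per-token uncertainties would be aggregated over a whole sentence before ranking.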



Copyright (c) 2011 Engel E.A., Engel' E.A.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
