An analysis of intelligent methods and algorithms for unlabeled data processing


Abstract

Intelligent algorithms and methods are well suited to many data-processing problems in which unlabeled data may
be abundant. We survey previously used selection strategies for intelligent models and propose two novel algorithms
to address their shortcomings, focusing on Active Learning (AL). While AL has already been shown to markedly reduce
annotation effort for many sequence labeling tasks compared to random selection, it remains unconcerned with
the internal structure of the selected sequences (typically, sentences). We propose a semi-supervised AL approach
for sequence labeling.
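To make the AL setting concrete, the following is a minimal sketch of the standard least-confidence query strategy for sequence labeling (a common baseline, not the authors' proposed algorithms): each unlabeled sentence is scored by how unsure the model is about its full label sequence, and the least confident one is sent for annotation. The pool structure and probabilities here are illustrative assumptions.

```python
def least_confidence(prob_seq):
    """Score a sentence by model uncertainty.

    prob_seq: per-token label probability distributions.
    Confidence of the most likely labeling is approximated as the
    product of per-token max probabilities; lower confidence means
    the sentence is more informative to annotate.
    """
    conf = 1.0
    for token_probs in prob_seq:
        conf *= max(token_probs)
    return 1.0 - conf

def select_query(pool):
    """Pick the sentence the model is least confident about.

    pool: mapping sentence_id -> list of per-token label distributions.
    """
    return max(pool, key=lambda sid: least_confidence(pool[sid]))

# Toy unlabeled pool with two-label distributions per token:
pool = {
    "s1": [[0.9, 0.1], [0.8, 0.2]],    # model fairly confident
    "s2": [[0.55, 0.45], [0.6, 0.4]],  # model uncertain
}
print(select_query(pool))  # -> s2
```

In a real AL loop, the selected sentence would be labeled by an annotator, added to the training set, and the model retrained before the next query.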

About the authors

E Engel

Ekaterina Engel'



Copyright © Engel E.A., Engel' E.A., 2011

Creative Commons License
This article is available under the Creative Commons Attribution 4.0 International License.
