ABSTRACT
Given a model family and a set of unlabeled examples, one could either label specific examples or state general constraints---both provide information about the desired model. In general, what is the most cost-effective way to learn? To address this question, we introduce measurements, a general class of mechanisms for providing information about a target model. We present a Bayesian decision-theoretic framework, which allows us to both integrate diverse measurements and choose new measurements to make. We use a variational inference algorithm, which exploits exponential family duality. The merits of our approach are demonstrated on two sequence labeling tasks.
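The abstract's mention of exponential family duality refers to the correspondence between natural parameters and mean parameters: the gradient of the log-partition function is the expected feature vector. As a minimal illustrative sketch (not the paper's algorithm), here is how a single "measurement" of label proportions can be matched by gradient ascent on the dual objective; the function name `fit_maxent` and the toy setup are ours, assumed for illustration:

```python
import numpy as np

def fit_maxent(target_means, feats, iters=2000, lr=0.5):
    """Fit natural parameters theta of an exponential family over K outcomes,
    p(y) proportional to exp(theta . f(y)), so that E_p[f(y)] matches the
    measured means. By duality, grad A(theta) = E_p[f(y)], so the dual
    gradient is (target_means - E_p[f])."""
    theta = np.zeros(feats.shape[1])
    for _ in range(iters):
        logits = feats @ theta
        p = np.exp(logits - logits.max())   # stable softmax
        p /= p.sum()
        mean = p @ feats                    # mean parameters: grad of A(theta)
        theta += lr * (target_means - mean) # ascent on theta.m - A(theta)
    return theta, p

# Toy measurement: with one-hot label features, the measured means are
# simply label proportions, here (0.7, 0.2, 0.1) over three labels.
feats = np.eye(3)
target = np.array([0.7, 0.2, 0.1])
theta, p = fit_maxent(target, feats)
```

The fitted distribution `p` matches the measured proportions; the same dual view underlies matching richer measurements than exact labels.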
Index Terms
- Learning from measurements in exponential families