
Uncertainty-Based Sample Optimization Strategies for Large Forest Samples Set

  • Conference paper
  • First Online:
Computational Intelligence and Intelligent Systems (ISICA 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 575)

Abstract

Our study focuses on optimizing the large training sample set selected from a global forest cover change detection system. An automatic training-sample delineation procedure labels tens of millions of samples as forest or non-forest. To improve precision, reduce computational complexity, and avoid over-fitting, we need to select from this very large set the samples that are most helpful for training a classifier. In this paper, two methods were used to optimize a large sample set derived from Landsat-7 ETM+ data and obtain training samples for the classifier. The first was the traditional stratified systematic sampling strategy. The second was an uncertainty-based sample set optimization that selects training samples by examining each sample's uncertainty measure and its distribution in feature space, combining subtractive clustering, k-nearest neighbors (KNN), and a support vector machine. Precision evaluation of our experiments showed that the uncertainty-based sampling strategy achieves better results than the stratified systematic sampling strategy.
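The abstract describes the uncertainty-based selection only at a high level. The Python snippet below is a minimal, hypothetical sketch of that idea, not the authors' actual implementation: an SVM trained on a small seed set supplies a margin-based uncertainty score, and a KNN distance term stands in for the feature-space distribution examined in the paper. All function names, thresholds, and weightings here are assumptions introduced for illustration.

```python
# Hypothetical sketch of uncertainty-based training-sample selection.
# NOT the paper's implementation: SVM margin distance stands in for the
# "uncertainty measure" and a KNN density term for the feature-space
# distribution. scikit-learn and NumPy are assumed to be available.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import NearestNeighbors

def select_uncertain_samples(X, y, seed_size=1000, k=10, n_select=5000,
                             alpha=0.7, random_state=0):
    """Rank candidate samples by a combined uncertainty/density score and
    return the indices of the n_select highest-ranked samples."""
    rng = np.random.default_rng(random_state)
    seed_idx = rng.choice(len(X), size=seed_size, replace=False)

    # 1) Train a provisional SVM on a small random seed set.
    svm = SVC(kernel="rbf", gamma="scale")
    svm.fit(X[seed_idx], y[seed_idx])

    # 2) Uncertainty: samples close to the decision boundary score high.
    margin = np.abs(svm.decision_function(X))
    uncertainty = 1.0 / (1.0 + margin)

    # 3) Feature-space distribution: mean distance to the k nearest
    #    neighbours, so isolated (low-density) samples are down-weighted.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)
    density = 1.0 / (1.0 + dist[:, 1:].mean(axis=1))

    # 4) Combine the two criteria and keep the top-ranked samples.
    score = alpha * uncertainty + (1.0 - alpha) * density
    return np.argsort(score)[::-1][:n_select]
```

In the paper the feature-space structure is captured with subtractive clustering rather than the simple KNN density term used above; the density score is only a rough stand-in for that step.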


Author information


Corresponding author

Correspondence to Yan Guo.


Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Guo, Y., Liu, W., Liu, F. (2016). Uncertainty-Based Sample Optimization Strategies for Large Forest Samples Set. In: Li, K., Li, J., Liu, Y., Castiglione, A. (eds) Computational Intelligence and Intelligent Systems. ISICA 2015. Communications in Computer and Information Science, vol 575. Springer, Singapore. https://doi.org/10.1007/978-981-10-0356-1_55

Download citation

  • DOI: https://doi.org/10.1007/978-981-10-0356-1_55

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0355-4

  • Online ISBN: 978-981-10-0356-1

  • eBook Packages: Computer Science, Computer Science (R0)
