
Uncertainty-Based Sample Optimization Strategies for Large Forest Samples Set

  • Conference paper
  • First Online:
Computational Intelligence and Intelligent Systems (ISICA 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 575)

Abstract

Our study focuses on optimizing the large training sample set selected from a global forest cover change detection system. An automatic training-sample delineation procedure labels tens of millions of samples as forest or non-forest. To improve precision, reduce computational complexity, and avoid over-fitting, we need to select from this very large set the samples that are most helpful for training a classifier. In this paper, two methods were used to optimize a large sample set derived from Landsat-7 ETM+ data and obtain training samples for the classifier. The first was the traditional stratified systematic sampling strategy. The second was an uncertainty-based sample set optimization that selects training samples by examining each sample's uncertainty measure and its distribution in feature space, combining subtractive clustering, k-nearest neighbors (KNN), and a support vector machine. Precision evaluation of our experiments showed that the uncertainty-based sampling strategy achieves better results than the stratified systematic sampling strategy.
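The abstract describes the uncertainty-based selection only at a high level. The Python snippet below is a minimal, hypothetical sketch of that idea, not the authors' actual implementation: an SVM trained on a small seed set supplies a margin-based uncertainty score, and a KNN distance term stands in for the feature-space distribution examined in the paper. All function names, thresholds, and weightings here are assumptions introduced for illustration.

```python
# Hypothetical sketch of uncertainty-based training-sample selection.
# NOT the paper's implementation: SVM margin distance stands in for the
# "uncertainty measure" and a KNN density term for the feature-space
# distribution. scikit-learn and NumPy are assumed to be available.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import NearestNeighbors

def select_uncertain_samples(X, y, seed_size=1000, k=10, n_select=5000,
                             alpha=0.7, random_state=0):
    """Rank candidate samples by a combined uncertainty/density score and
    return the indices of the n_select highest-ranked samples."""
    rng = np.random.default_rng(random_state)
    seed_idx = rng.choice(len(X), size=seed_size, replace=False)

    # 1) Train a provisional SVM on a small random seed set.
    svm = SVC(kernel="rbf", gamma="scale")
    svm.fit(X[seed_idx], y[seed_idx])

    # 2) Uncertainty: samples close to the decision boundary score high.
    margin = np.abs(svm.decision_function(X))
    uncertainty = 1.0 / (1.0 + margin)

    # 3) Feature-space distribution: mean distance to the k nearest
    #    neighbours, so isolated (low-density) samples are down-weighted.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)
    density = 1.0 / (1.0 + dist[:, 1:].mean(axis=1))

    # 4) Combine the two criteria and keep the top-ranked samples.
    score = alpha * uncertainty + (1.0 - alpha) * density
    return np.argsort(score)[::-1][:n_select]
```

In the paper the feature-space structure is captured with subtractive clustering rather than the simple KNN density term used above; the density score is only a rough stand-in for that step.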


Author information


Corresponding author

Correspondence to Yan Guo.


Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Guo, Y., Liu, W., Liu, F. (2016). Uncertainty-Based Sample Optimization Strategies for Large Forest Samples Set. In: Li, K., Li, J., Liu, Y., Castiglione, A. (eds) Computational Intelligence and Intelligent Systems. ISICA 2015. Communications in Computer and Information Science, vol 575. Springer, Singapore. https://doi.org/10.1007/978-981-10-0356-1_55

Download citation

  • DOI: https://doi.org/10.1007/978-981-10-0356-1_55

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0355-4

  • Online ISBN: 978-981-10-0356-1

  • eBook Packages: Computer Science, Computer Science (R0)
