Abstract
In this article, we propose a unified geolocation framework that automatically determines where on Earth a web video was shot. We analyze social, visual, and textual relationships in a real-world dataset and identify four relationships that carry clear geographic clues usable for web video geolocation. We then formulate geolocation as an optimization problem that takes the social, visual, and textual relationships into account simultaneously. The optimization problem is solved by an iterative procedure, which can be interpreted as propagating geographic information across the web video social network. Extensive experiments on a real-world dataset demonstrate the effectiveness of the proposed framework, which achieves higher geolocation accuracy than state-of-the-art approaches.
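The abstract does not give the objective function or the update rule, so the following is only a minimal sketch of the general idea, assuming a generic graph-based label propagation scheme rather than the article's actual formulation. Everything named here is hypothetical: the three toy relationship graphs, their mixing weights, the damping factor `alpha`, and the use of discrete location cells as labels. It shows how location evidence from a few geotagged videos could spread iteratively to untagged videos over a combined relationship graph.

```python
from math import sqrt


def combine_graphs(graphs, weights):
    """Weighted sum of several n x n affinity matrices (lists of lists)."""
    n = len(graphs[0])
    return [
        [sum(w * g[i][j] for g, w in zip(graphs, weights)) for j in range(n)]
        for i in range(n)
    ]


def propagate(affinity, seeds, alpha=0.8, iterations=50):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y.

    S is the symmetrically normalised affinity matrix and Y holds the
    seed labels (one row per video, one column per location cell).
    """
    n = len(affinity)
    k = len(seeds[0])
    degree = [sum(row) for row in affinity]
    # S = D^(-1/2) W D^(-1/2); entries touching an isolated node are zero.
    norm = [
        [
            affinity[i][j] / sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    scores = [row[:] for row in seeds]
    for _ in range(iterations):
        scores = [
            [
                alpha * sum(norm[i][j] * scores[j][c] for j in range(n))
                + (1 - alpha) * seeds[i][c]
                for c in range(k)
            ]
            for i in range(n)
        ]
    return scores


# Toy data (assumed, not from the article): four videos, two location cells.
social = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
visual = [[0, 0, 0, 0], [0, 0, 0.2, 0], [0, 0.2, 0, 0], [0, 0, 0, 0]]
textual = [[0, 0.5, 0, 0], [0.5, 0, 0, 0], [0, 0, 0, 0.5], [0, 0, 0.5, 0]]

# Video 0 is geotagged in cell 0 and video 3 in cell 1; videos 1 and 2 are untagged.
seeds = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]

affinity = combine_graphs([social, visual, textual], weights=[1.0, 1.0, 1.0])
scores = propagate(affinity, seeds)

# Assign each video to its highest-scoring cell.
predicted = [max(range(len(row)), key=row.__getitem__) for row in scores]
print(predicted)
```

In this toy setup, each untagged video ends up in the cell of the geotagged video it is most strongly connected to. A real system would need a principled way to weight the relationships and to represent locations, neither of which the abstract specifies.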
Index Terms
- A Unified Geolocation Framework for Web Videos