A Unified Geolocation Framework for Web Videos

Authors:
Yicheng Song, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Yongdong Zhang, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Juan Cao, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Jinhui Tang, School of Computer Science, Nanjing University of Science and Technology, Nanjing, China
Xingyu Gao, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Jintao Li, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

ACM Transactions on Intelligent Systems and Technology, Volume 5, Issue 3, Article No. 49, pp. 1–22. https://doi.org/10.1145/2533989

Published: 17 July 2014

Abstract

In this article, we propose a unified geolocation framework that automatically determines where on Earth a web video was shot. We analyze the social, visual, and textual relationships in a real-world dataset and identify four relationships that carry clear geographic clues usable for web video geolocation. We then formulate geolocation as an optimization problem that takes the social, visual, and textual relationships into account simultaneously. The problem is solved by an iterative procedure that can be interpreted as propagating geographic information across the web video social network. Extensive experiments on a real-world dataset demonstrate the effectiveness of the proposed framework, which achieves higher geolocation accuracy than state-of-the-art approaches.
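
The abstract does not state the objective function or the update rule, so the following is only a minimal sketch of the general idea of propagating location labels over a graph of related videos. It uses a generic label-propagation update, F <- alpha * W * F + (1 - alpha) * Y, which is an assumption and not the authors' formulation. The function name `propagate_locations`, the single combined edge weight, the region indices, and the parameters `alpha` and `iterations` are all hypothetical.

```python
from typing import Dict, List, Tuple


def propagate_locations(
    edges: Dict[Tuple[int, int], float],
    seeds: Dict[int, int],
    num_nodes: int,
    num_regions: int,
    alpha: float = 0.8,
    iterations: int = 50,
) -> List[int]:
    """Generic graph label propagation (illustrative, not the paper's method).

    edges: undirected similarity weights between videos (hypothetical single
           weight standing in for social, visual, and textual relationships).
    seeds: video index -> known region index for geotagged videos.
    Returns the highest-scoring region index for every video.
    """
    # Build a symmetric weight matrix from the edge list.
    w = [[0.0] * num_nodes for _ in range(num_nodes)]
    for (i, j), weight in edges.items():
        w[i][j] = weight
        w[j][i] = weight

    # Row-normalise so each video averages over its neighbours.
    for row in w:
        total = sum(row)
        if total > 0:
            for j in range(num_nodes):
                row[j] /= total

    # Y holds the known labels as one-hot rows; unlabeled rows stay zero.
    y = [[0.0] * num_regions for _ in range(num_nodes)]
    for node, region in seeds.items():
        y[node][region] = 1.0

    # Iterate F <- alpha * W * F + (1 - alpha) * Y.
    f = [row[:] for row in y]
    for _ in range(iterations):
        f = [
            [
                alpha * sum(w[i][k] * f[k][r] for k in range(num_nodes))
                + (1.0 - alpha) * y[i][r]
                for r in range(num_regions)
            ]
            for i in range(num_nodes)
        ]

    # Assign each video the region with the largest propagated score.
    return [max(range(num_regions), key=lambda r: f[i][r]) for i in range(num_nodes)]


# Four videos in a chain; videos 0 and 3 are geotagged in regions 0 and 1.
example = propagate_locations(
    edges={(0, 1): 1.0, (1, 2): 0.2, (2, 3): 1.0},
    seeds={0: 0, 3: 1},
    num_nodes=4,
    num_regions=2,
)
print(example)  # [0, 0, 1, 1]
```

In this toy graph, video 1 is strongly tied to geotagged video 0 and video 2 to geotagged video 3, so each unlabeled video inherits the region of its closest labeled neighbour. A fuller treatment would keep separate weights per relationship type, which this sketch deliberately collapses into one.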

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
2. Information systems
  1. Information systems applications

Published in
ACM Transactions on Intelligent Systems and Technology Volume 5, Issue 3
Special Section on Urban Computing
September 2014
361 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2648782
Editor: Qiang Yang, Hong Kong University of Science and Technology, Hong Kong
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 July 2014
- Accepted: 1 September 2013
- Revised: 1 July 2013
- Received: 1 March 2013

Author Tags
Unified geolocation framework
geotag
web video
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 5
- Total Downloads: 297
- Downloads (Last 12 months): 8
- Downloads (Last 6 weeks): 0