DOI: 10.1145/3038912.3052695
WWW conference proceedings, research article

AttriInfer: Inferring User Attributes in Online Social Networks Using Markov Random Fields

Published: 03 April 2017

ABSTRACT

In the attribute inference problem, we aim to infer users' private attributes (e.g., locations, sexual orientation, and interests) using their public data in online social networks. State-of-the-art methods leverage both a user's public friends and public behaviors (e.g., page likes on Facebook, apps that the user reviewed on Google Play) to infer the user's private attributes. However, these methods suffer from two key limitations: 1) when inferring a certain attribute for a target user from a training dataset, they leverage only the labeled users who have the attribute and ignore the label information of users who do not have it; 2) they are inefficient because they infer attributes for target users one by one. As a result, they have limited accuracy and applicability in real-world social networks.

In this work, we propose AttriInfer, a new method to infer user attributes in online social networks. AttriInfer can leverage both friends and behaviors, as well as the label information of training users who have an attribute and of those who do not. Specifically, we model a social network as a pairwise Markov Random Field (pMRF). Given a training dataset, which consists of some users who have a certain attribute and some users who do not have it, we compute the posterior probability that a target user has the attribute and use this posterior probability to infer attributes. In the basic version of AttriInfer, we use Loopy Belief Propagation (LBP) to compute the posterior probability. However, LBP does not scale to very large real-world social networks and is not guaranteed to converge. Therefore, we further optimize LBP to be scalable and guaranteed to converge. We evaluate our method and compare it with state-of-the-art methods using a real-world Google+ dataset with 5.7M users. Our results demonstrate that our method substantially outperforms state-of-the-art methods in terms of both accuracy and efficiency.
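As a rough illustration of the basic idea described above, the following sketch runs standard sum-product belief propagation on a tiny binary pairwise MRF. It is not the authors' implementation and does not include their optimized variant. The function name `loopy_bp`, the prior values (0.9 for a labeled user with the attribute, 0.1 for a labeled user without it, 0.5 for an unlabeled user), and the homophily strength 0.9 are all assumptions chosen for illustration, not values taken from the paper.

```python
from collections import defaultdict


def loopy_bp(edges, priors, homophily=0.9, iters=20):
    """Estimate P(user has the attribute) with sum-product belief propagation.

    edges     : list of undirected friendship pairs (u, v)
    priors    : dict user -> prior probability of having the attribute
    homophily : assumed probability that two friends share the same state
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # msg[(u, v)] is the normalized message from u to v for state 1
    # ("has the attribute"); the message for state 0 is 1 - msg[(u, v)].
    msg = {(u, v): 0.5 for u in neighbors for v in neighbors[u]}

    for _ in range(iters):
        new_msg = {}
        for (u, v) in msg:
            # Combine u's prior with messages from all neighbors except v.
            b1, b0 = priors[u], 1.0 - priors[u]
            for k in neighbors[u]:
                if k != v:
                    b1 *= msg[(k, u)]
                    b0 *= 1.0 - msg[(k, u)]
            # Sum over u's state, weighted by the edge potential.
            m1 = b1 * homophily + b0 * (1.0 - homophily)
            m0 = b1 * (1.0 - homophily) + b0 * homophily
            new_msg[(u, v)] = m1 / (m1 + m0)
        msg = new_msg

    # Posterior belief: prior times all incoming messages, normalized.
    posterior = {}
    for u in neighbors:
        b1, b0 = priors[u], 1.0 - priors[u]
        for k in neighbors[u]:
            b1 *= msg[(k, u)]
            b0 *= 1.0 - msg[(k, u)]
        posterior[u] = b1 / (b1 + b0)
    return posterior


# Toy chain a - c - d - b: "a" is a training user with the attribute,
# "b" is a training user without it, "c" and "d" are target users.
edges = [("a", "c"), ("c", "d"), ("d", "b")]
priors = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.5}
posterior = loopy_bp(edges, priors)
for user in sorted(posterior):
    print(user, round(posterior[user], 3))
```

In this toy example the negative label on "b" pulls the posterior of "d" below 0.5, which illustrates why the label information of users without the attribute is useful. The toy graph is a chain, so the updates settle after a few iterations; on graphs with cycles the same updates are only approximate and may not converge, which is the limitation the abstract mentions.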
