ABSTRACT
In the attribute inference problem, we aim to infer users' private attributes (e.g., location, sexual orientation, and interests) from their public data in online social networks. State-of-the-art methods leverage both a user's public friends and public behaviors (e.g., page likes on Facebook, apps that the user reviewed on Google Play) to infer the user's private attributes. However, these methods suffer from two key limitations: 1) when inferring a given attribute for a target user from a training dataset, they leverage only the labeled users who have the attribute and ignore the label information of users who do not have it; 2) they are inefficient because they infer attributes for target users one by one. As a result, they have limited accuracy and applicability in real-world social networks.
In this work, we propose AttriInfer, a new method to infer user attributes in online social networks. AttriInfer leverages both friends and behaviors, as well as the label information of training users who have an attribute and of those who do not. Specifically, we model a social network as a pairwise Markov Random Field (pMRF). Given a training dataset consisting of some users who have a certain attribute and some users who do not, we compute the posterior probability that a target user has the attribute and use this posterior probability to infer attributes. In the basic version of AttriInfer, we use Loopy Belief Propagation (LBP) to compute the posterior probability. However, LBP does not scale to very large real-world social networks and is not guaranteed to converge. Therefore, we further optimize LBP to be scalable and guaranteed to converge. We evaluate our method and compare it with state-of-the-art methods using a real-world Google+ dataset with 5.7M users. Our results demonstrate that our method substantially outperforms state-of-the-art methods in terms of both accuracy and efficiency.
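To make the basic idea concrete, the following is a minimal sketch of standard loopy belief propagation on a binary pairwise MRF, where each user either has or lacks one attribute. It is an illustration under stated assumptions, not AttriInfer's actual formulation or its optimized variant: the function name `loopy_bp`, the `homophily` parameter, and the prior values used for labeled users are all hypothetical choices made for this example.

```python
from collections import defaultdict


def loopy_bp(edges, priors, homophily=0.9, max_iters=50, tol=1e-6):
    """Approximate P(user has the attribute) on a binary pairwise MRF.

    edges:     iterable of undirected (u, v) friendship pairs
    priors:    dict user -> prior probability of having the attribute
               (assumed here: above 0.5 for labeled positives, below 0.5
               for labeled negatives, 0.5 when omitted); keep priors
               strictly between 0 and 1 to avoid division by zero
    homophily: assumed probability that two linked users share a state
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = set(priors) | set(neighbors)

    # Node potentials; index 0 = lacks the attribute, index 1 = has it.
    phi = {n: (1.0 - priors.get(n, 0.5), priors.get(n, 0.5)) for n in nodes}
    # Edge potential: linked users tend to share the same state.
    psi = ((homophily, 1.0 - homophily), (1.0 - homophily, homophily))

    # msgs[(u, v)] is the normalized message from u to v.
    msgs = {(u, v): (0.5, 0.5) for u in neighbors for v in neighbors[u]}

    for _ in range(max_iters):
        new_msgs = {}
        delta = 0.0
        for (u, v), old in msgs.items():
            # Combine u's potential with messages from all neighbors except v.
            prod = [phi[u][0], phi[u][1]]
            for w in neighbors[u]:
                if w != v:
                    prod[0] *= msgs[(w, u)][0]
                    prod[1] *= msgs[(w, u)][1]
            # Marginalize over u's state for each state of v.
            out = [prod[0] * psi[0][xv] + prod[1] * psi[1][xv] for xv in (0, 1)]
            z = out[0] + out[1]
            new = (out[0] / z, out[1] / z)
            delta = max(delta, abs(new[1] - old[1]))
            new_msgs[(u, v)] = new
        msgs = new_msgs
        if delta < tol:  # messages stopped changing
            break

    # Belief = node potential times all incoming messages, normalized.
    beliefs = {}
    for n in nodes:
        b0, b1 = phi[n]
        for w in neighbors[n]:
            b0 *= msgs[(w, n)][0]
            b1 *= msgs[(w, n)][1]
        beliefs[n] = b1 / (b0 + b1)
    return beliefs


if __name__ == "__main__":
    # Toy chain a - b - c with one labeled positive user (a).
    print(loopy_bp([("a", "b"), ("b", "c")], {"a": 0.9}))
```

On graphs with cycles this iteration may fail to converge within `max_iters`, and multiplying many messages can underflow for high-degree users, which reflects the scalability and convergence concerns raised above.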
Index Terms
- AttriInfer: Inferring User Attributes in Online Social Networks Using Markov Random Fields