Achieving Optimal K-Anonymity Parameters for Big Data

Mohammed Essa Al-Zobbi; Seyed Shahrestani; Chun Ruan

doi:10.17972/ijicta20184136

Mohammed Essa Al-Zobbi, Mr.

Western Sydney University

https://orcid.org/0000-0002-2262-0556

Seyed Shahrestani, Dr.

Western Sydney University

https://orcid.org/0000-0002-8786-9213

Chun Ruan, Dr.

Western Sydney University

Keywords

Access Control, Anonymization, k-anonymity, Big Data, MapReduce

Abstract

Datasets containing private and sensitive information are useful for data analytics, and data owners release such data cautiously, using privacy-preserving publishing techniques. The possibility of personal re-identification is greater than ever before; social media, for instance, has dramatically increased exposure to privacy violations. The well-known k-anonymity technique offers protection against this exposure. It ensures that each released record is indistinguishable from at least k - 1 other records with respect to a chosen set of attributes, known as quasi-identifiers. This reduces the risk of personal re-identification, but it may also lessen the usefulness of the information gained. The value of k should therefore be chosen carefully to balance security against information gain. Unfortunately, there is no standard procedure for defining the value of k, and the problem of optimal k-anonymization is NP-hard. In this paper, we propose a greedy-based heuristic approach that provides an optimal value for k. The approach evaluates the empirical risk with respect to our Sensitivity-Based Anonymization method. It is derived from the fine-grained access and business role anonymization for big data that forms our framework.
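
The k-anonymity property described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's Sensitivity-Based Anonymization method or its greedy heuristic; it only checks the standard definition. The function names, the sample records and the choice of `age` and `postcode` as quasi-identifiers are assumptions made for illustration.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def achieved_k(records, quasi_identifiers):
    """Return the size of the smallest equivalence class (0 for an empty dataset)."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """A dataset is k-anonymous if every equivalence class has at least k records."""
    return achieved_k(records, quasi_identifiers) >= k


# Hypothetical, already generalized records (illustrative data only).
records = [
    {"age": "30-39", "postcode": "27**", "diagnosis": "flu"},
    {"age": "30-39", "postcode": "27**", "diagnosis": "asthma"},
    {"age": "40-49", "postcode": "21**", "diagnosis": "flu"},
    {"age": "40-49", "postcode": "21**", "diagnosis": "diabetes"},
    {"age": "40-49", "postcode": "21**", "diagnosis": "flu"},
]
quasi_identifiers = ["age", "postcode"]

k = achieved_k(records, quasi_identifiers)
print("achieved k:", k)
print("is 3-anonymous:", is_k_anonymous(records, quasi_identifiers, 3))
```

In this sample the smallest group of matching records has two members, so the data is 2-anonymous but not 3-anonymous. Raising k would require coarser generalization of the quasi-identifiers, which is the trade-off between protection and information gain that the abstract describes.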




Published

May 15, 2018

How to Cite

Achieving Optimal K-Anonymity Parameters for Big Data. (2018). International Journal of Information, Communication Technology and Applications, 4(1), 23-33. https://doi.org/10.17972/ijicta20184136

Issue

Vol. 4 No. 1 (2018)

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Copyright © Australasian Association for Information and Communication Technology. General permission to republish, but not for profit, all or part of this material is granted, under the Creative Commons Australian Attribution-NonCommercial-NoDerivs 4.0 Licence, provided that the copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Copyright holder.

Author Biographies

Mohammed Essa Al-Zobbi, Mr., Western Sydney University

PhD student, Western Sydney University, School of Computing, Engineering and Mathematics.

Seyed Shahrestani, Dr., Western Sydney University

Senior Lecturer, Western Sydney University. Seyed is also the head of the Networking, Security and Cloud Research (NSCR) group at UWS.

Chun Ruan, Dr., Western Sydney University

Lecturer, Western Sydney University
