
Table 2 Relevant literature on PPDM in terms of merits and de-merits

From: A comprehensive review on privacy preserving data mining

Columns: references; method category (PPDM in general, PPDM based on data distortion, data mining algorithms, outsourced data mining, distributed methods, or anonymity methods) and approach; merits and de-merits; parameters
Matwin (2013) Surveyed the existing privacy-preserving data mining methods Analyzed the methods PPDM
Vatsalan et al. (2013) Presented methods that permitted the linking of databases between organizations and preserved the privacy of these data Presented taxonomy of PPRL techniques PPDM
Qi and Zong (2012) Stated methods of data mining for privacy protection Classified PPDM methods PPDM
Raju et al. (2009) Applied homomorphic encryption to a secure multiplication protocol Potentially influential in many applications PPDM
Malina and Hajny (2013), Sachan et al. (2013) Analyzed current privacy preserving solutions for cloud services and outlined their solution based on advanced cryptographic components Reported experimental results and compared the performance with related solutions PPDM
Mukkamala and Ashok (2011) Compared a set of fuzzy-based mapping methods Combined the multiple practical values of a data item into a single value PPDM
Kamakshi (2012) Distortion method, a novel idea to identify the sensitive attributes dynamically The data is modified while retaining the original properties of the data (a minimal noise-based distortion sketch follows the table) Privacy
Zhang et al. (2012a) Distortion method, proposed HPNGS Reduced the number of noise requests Privacy and utility
Zhang et al. (2012b) Distortion method, proposed a novel APNGS Improved the effectiveness of privacy protection on noise obfuscation in terms of association probabilities; extra cost in comparison to existing representative strategies is the main demerit Privacy
Li et al. (2009a) Distortion method, proposed anonymous perturbation method Low costs with a high strength Privacy
Kamakshi and Babu (2010) Distortion method, proposed a model comprising three parts: data centers, clients, and the database The roles of the clients and the site database could be interchanged Privacy
Islam and Brankovic (2011) Distortion method, introduced a framework that incorporates several novel techniques to perturb all attributes of a data set Effective in preserving original patterns in a perturbed data set Privacy
Wang and Lee (2008) Distortion method, proposed an approach to avoid Forward-Inference Attacks, generated by the sanitization process Restricted Forward-Inference Attacks Privacy
Shrivastava et al. (2011) Data mining algorithms, Proposed an improved distortion technique for privacy preserving frequent item-set mining Enhanced the performance of the algorithm by reducing the disk access time Privacy and performance
Vijayarani et al. (2010a) Data mining algorithms, introduced various communities Focused on the importance of association rules Privacy
Aggarwal and Yu (2008) Stated that support and confidence are considered the two significant measures within association rule mining Explained the basic elements of association rules PPDM
Belwal et al. (2013) Data mining algorithms, proposed an approach based on reducing the support and confidence of sensitive rules Hid any desired sensitive association rule without side effects; hiding only rules with a single sensitive item on the left-hand side is a disadvantage (a minimal support-reduction sketch follows the table) PPDM
Jain et al. (2011) Data mining algorithms, proposed a new algorithm that increases or decreases the support of the left-hand and right-hand side items of a rule in order to hide association rules Made minimal modifications to the data entries to hide a set of rules with less CPU time than previous work Privacy
Naeem et al. (2010) Data mining algorithms, proposed an architecture which hides the restricted association rules with the complete removal of the known side effects like the generation of unwanted, non-genuine association rules while yielding no hiding failure Used other standard statistical measures instead of conventional framework of support and confidence to generate association rules Privacy
Li and Liu (2009) Data mining algorithms, proposed DDIL based on data disturbance and inquiry limitation Effective, with good privacy and accuracy; the restriction with random parameters is a disadvantage Privacy
Weng et al. (2008) Data mining algorithms, proposed FHSAR, a fast algorithm for hiding sensitive association rules (SAR) Hid sensitive association rules with limited side effects Privacy
Dehkordi et al. (2009) Data mining algorithms, proposed a method for hiding sensitive association rules based on the concept of genetic algorithms Offered security while preserving utility Security and utility
Gkoulalas-Divanis and Verykios (2009) Data mining algorithms, proposed a novel approach that offers best solution to hide sensitive frequent item sets Provided effective solution to hide sensitive frequent item sets Privacy and efficiency
Li et al. (2009b) Data mining algorithms, introduced a new algorithm for sanitizing a transactional database Selection of victim items without affecting the non-sensitive patterns is disadvantageous Privacy
Kasthuri and Meyyappan (2013) Data mining algorithms, proposed a new method to detect the sensitive items for hiding sensitive association rules Found the frequent item sets and generated the association rules Privacy
Quoc et al. (2013) Data mining algorithms, proposed a heuristic algorithm to hide a set of sensitive association rules using the distortion technique Specified the victim item and minimum number of transactions Privacy
Domadiya and Rao (2013) Data mining algorithms, proposed MDSRRC Highly efficient and maintains database quality Privacy, efficiency and quality
Xiong et al. (2006) Data mining algorithms, used the k-nearest neighbor classification technique based on SMC techniques Balanced accuracy, performance, and privacy protection Privacy and accuracy
Singh et al. (2010) Data mining algorithms, attempted to provide a simple and efficient privacy preserving classification for cloud data Facilitated computing local neighbors at each node in the cloud in a secure way and classified the unseen records using a weighted k-NN classification approach Privacy
Baotou (2010) Data mining algorithms, proposed an effective algorithm depending on random perturbation matrix Enhanced privacy protection and the accuracy Privacy and accuracy
Vaidya et al. (2008) Data mining algorithms, developed an approach for mining vertically partitioned data Can be modified and extended to a variety of data mining applications such as decision trees Privacy and efficiency
Kantarcıoglu and Vaidya (2003) Data mining algorithms, discussed the use of secure logarithm and summation, where the distributed naive Bayes classifier can be determined securely Supported the concept that few useful secure protocols facilitated the secure deployment of different types of distributed data mining algorithms Privacy and accuracy
Sathiyapriya and Sadasivam (2013) Data mining algorithms, presented a classification of privacy preserving techniques Optimal sanitization is proven to be NP-hard, and the ever-present trade-off between privacy and accuracy is the notable de-merit Privacy
Yi and Zhang (2013) Data mining algorithms, applied k-means clustering on vertically partitioned data Not applying any secure two-party computation algorithm is the demerit Privacy and security
Raghuram and Gyani (2012) Data mining algorithms, proposed an associative classification model Accuracy is tested Privacy
Lin and Lo (2013) Data mining algorithms, proposed a set of algorithms, containing EWS algorithm, ROD algorithm, SSWS algorithm and the PSWS algorithm Delivered excellent performance with respect to scalability and execution time Privacy, scalability and execution time
Harnsamut and Natwichai (2008) Data mining algorithms, proposed a novel heuristic algorithm to preserve the privacy and maintain the data quality Efficient and highly effective Privacy and efficiency
Seisungsittisunti and Natwichai (2011) Data mining algorithms, proposed an incremental polynomial-time algorithm to transform the data to meet a privacy standard Efficient in every problem setting Privacy and efficiency
Giannotti et al. (2013) Outsourced data mining, proposed a model based on background knowledge of the attacker Strong defense against such an attack; not dealing with other attacks is the demerit PPDM
Worku et al. (2014) Outsourced data mining, improved their method by minimizing bilinear mapping Secure and efficient; the demerit is that it is not wholly active PPDM
Arunadevi and Anuradha (2014) Outsourced data mining, proposed an attack model based on the basic assumption Improved the security of the system PPDM
Lai et al. (2014) Outsourced data mining, proposed the first semantically secure solution for outsourcing association rule mining with data privacy The demerit is it is non-deterministic and secure against an adversary at cloud servers PPDM
Kerschbaum and Julien (2008) Outsourced data mining, proposed a searchable encryption scheme for outsourcing data analytics Secured PPDM
Ying-hua et al. (2011) Distributed, surveyed distributed privacy preserving data mining (DPPDM) Surveyed the DPPDM methods PPDM
Li (2013) Distributed, designed and analyzed a symmetric-key based privacy-preserving scheme for mining support counts Effective in detecting misbehaving nodes and increasing average throughput in the whole network Privacy
Dev et al. (2012) Distributed, combined categorization, fragmentation and distribution to prevent data mining, by maintaining privacy levels, splitting data into chunks and storing these chunks with appropriate cloud providers Provided an effective way to protect privacy from mining-based attacks; the introduced performance overhead is a demerit Privacy
Tassa (2014) Distributed, proposed a protocol based on association rules in horizontally distributed databases Devising an effective protocol for disparity verifications remains a disadvantage Privacy, accuracy and efficiency
Chan and Keng (2013) Distributed, proposed a distributed architecture for privacy preserving outsourcing of association rules mining Computational and storage overheads are significantly reduced in such a scheme Privacy
Dong and Kresman (2009) Distributed, focused on the linking between distributed data mining It is simple to implement with least computing requirements Privacy
Aggarwal et al. (2005) Distributed, discussed developed techniques such as data encryption based services, which cause a large overhead in query processing, and proposed a new distributed framework to enable privacy preservation for the outsourced storage of data Demonstrated a new definition of privacy based on hiding sets of attribute values, discussed how the proposed decomposition approaches help to achieve privacy, and identified the best privacy-preserving decomposition technique Privacy
Xu and Yi (2011) Distributed, proposed taxonomy to categorize those PPDDM protocols into important categories High performance of these protocols Privacy
Inan and Saygin (2010) Distributed, proposed a method which constructs difference matrices in horizontally distributed data mining Provided different comparison functions for either character or numerical data Privacy
Nanavati and Jinwala (2012) Distributed, proposed techniques that protect privacy for global and partial cycles in distributed data Distinguished global cycles in a cooperative setup Privacy
Agrawal and Srikant (2000) Distributed, developed a uniform randomization based association rule method for categorical datasets The reassembled data is sanitized and knowledge based Privacy
Wang et al. (2010) Distributed, proposed an enhanced algorithm (PPFDM) Effective and appropriate for practical application fields Privacy
Nguyen et al. (2012) Distributed, Proposed Enhanced Scheme (EMHS) Performance is better than MHS in specific databases Privacy
Om Kumar et al. (2013) Distributed, used WEKA to predict the patterns in a single cloud and employed a cloud data distributor with a secure distributed approach An effective solution that prevents such mining attacks on the cloud, thus making the cloud a secure platform for service and storage Privacy
Mokeddem and Belbachir (2010) Distributed, proposed a model allowing the detection of class association rules in a shared-nothing architecture Created classification rules in a parallel setting Privacy
Ibrahim et al. (2012) Distributed, presented a practical cryptographic method to compute the KNN classification problem Demonstrated that accuracy of the proposed work is the same as that of a naive scheme without security Privacy
Patel et al. (2012) Distributed, stated an effective algorithm to preserve the privacy of distributed K-means clustering Faster than other algorithms and more appropriate for huge datasets in practical scenarios Privacy
Kumbhar and Kharat (2012) Distributed, analyzed different methods for PPARM Studied the methods that depended on association rules mining on distributed dataset Privacy
Nix et al. (2012) Distributed, implemented two sketching protocols for the scalar (dot) product of two vectors which can be used as sub-protocols in larger data mining tasks Demonstrated accuracy and efficiency through extensive experimentation Privacy, accuracy and efficiency
Keshavamurthy et al. (2013) Distributed, showed that the genetic algorithm (GA) approach has two potential advantages over traditional frequent pattern mining algorithms The fitness function of the GA plays an important role, and the convergence of the search space is directly proportional to the effectiveness of the fitness function; possible duplicate formation in successive GA generations is a de-merit Privacy
Loukides et al. (2012), Machanavajjhala et al. (2007) Anonymity, proposed a novel approach that fulfils data utility requirements Effective Privacy and utility
Wang et al. (2004) Anonymity, studied data mining as an approach for data masking, known as data mining based privacy protection The specific focus on two key factors, quality and scalability, is advantageous Privacy, quality, and scalability
Friedman et al. (2008), Loukides and Gkoulalas-Divanis (2012) Anonymity, presented definitions of k-anonymity (a minimal generalization sketch follows the table) It could be used in many data mining algorithms Privacy
Ciriani et al. (2008) Anonymity, presented the possible threats to k-anonymity and categorized two main approaches for merging k-anonymity in data mining Discussed different methods that could be applied to detect k-anonymity violations Privacy
He et al. (2011), Friedman et al. (2008) Anonymity, proposed an algorithm based on clustering to produce a utility-friendly anonymized version of micro data Utility is improved by their approach Privacy and utility
Patil and Patankar (2013), He et al. (2011) Anonymity, analyzed existing K-anonymity model and its applications Analyzed current K-anonymity model Privacy
Zhu and Chen (2012), Patil and Patankar (2013) Anonymity, studied K-anonymity model Surveyed K-anonymity model Privacy
Soodejani et al. (2012), Zhu and Chen (2012) Anonymity, employed a version of the chase, called standard chase Provided a stronger privacy model for the proposed method and can be valuable Privacy
Karim et al. (2012), Soodejani et al. (2012) Anonymity, proposed a numerical method to mine maximal frequent patterns with privacy preserving capability An efficient data transformation technique, a novel encoded and compressed lattice structure, and MFPM algorithm Privacy
Loukides et al. (2012), Karim et al. (2012) Anonymity, proposed a rule-based privacy model that allows data publishers to express fine-grained protection requirements for both identity and sensitive information disclosure Outperformed the state-of-the-art in terms of retaining data utility, while achieving good protection Privacy, utility and scalability
Vijayarani et al. (2010a, b), Loukides et al. (2012) Anonymity, studied k-anonymity as an interesting approach to protect micro data related to public or semi-public sectors from linking attacks Proposed a novel approach Privacy
Nergiz et al. (2009), Xu and Yi (2011) Anonymity, proposed new clustering algorithms to achieve multi-relational anonymity Provided data utility and efficiency Utility, effectiveness and efficiency
Tai et al. (2013), Vijayarani et al. (2010b) Anonymity, proposed a Distributed k-support Noise Taxonomy tree algorithm, abbreviated as DKNT Achieved good protection and better computation efficiency compared with computation on a single machine Privacy and efficiency
Tai et al. (2010, 2013) Anonymity, introduced a pseudo taxonomy tree and had the third party mine the generalized frequent item-sets instead Achieved very good privacy protection with moderate storage overhead Privacy
Pan et al. (2012), Tai et al. (2010) Anonymity, analyzed and compared the currently developed k-anonymity models and their applications Enhanced and improved k-anonymity Privacy
Deivanai et al. (2011), Pan et al. (2012) Anonymity, proposed a novel method named kactus Accuracy is better than that of other k-anonymity based methods Privacy and accuracy
Monreale et al. (2014), Deivanai et al. (2011) Anonymity, introduced a new definition of k-anonymity for personal sequential data which provides an effective privacy protection model Results are extremely interesting in the case of dense datasets Privacy
Nergiz et al. (2013), Monreale et al. (2014) Anonymity, proposed hybrid generalizations with data relocation Increased the utility of data Privacy and utility
Zhang et al. (2013a, 2014a), Nergiz et al. (2013) Anonymity, proposed a hybrid approach combining Top-Down Specialization (TDS) and Bottom-Up Generalization Improved the scalability and efficiency of TDS Privacy and scalability
Zhang et al. (2014a) Anonymity, proposed a highly scalable two-phase TDS approach using MapReduce on the cloud Scalability and efficiency of TDS are improved significantly over existing approaches Privacy and scalability
Zhang et al. (2013a, b), Zhang et al. (2014a) Anonymity, proposed a method that depends on an efficient quasi-identifier index Protected privacy when new data are added Privacy and efficiency
Nergiz and Gök (2014) Anonymity, proposed hybrid generalizations Ensured the utility of data Privacy and utility
Ding et al. (2013), Zhang et al. (2013c) Anonymity, presented a distributed anonymization protocol for privacy-preserving data publishing from multiple data providers in a cloud system Performed a personalized anonymization to satisfy every data provider’s requirements, with the union forming a global anonymization to be published Privacy
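
To make the distortion rows above (e.g., Kamakshi 2012; Islam and Brankovic 2011) concrete, the sketch below shows the basic additive-noise idea that most perturbation schemes build on: each released value is the original plus zero-mean random noise, so individual records are masked while aggregates such as the mean remain approximately recoverable. This is a minimal illustration under that assumption, not any cited author's algorithm; the function name and the noise scale are chosen for the example only.

```python
import numpy as np

def perturb(values, noise_scale, seed=None):
    """Additive-noise distortion: release value + zero-mean Gaussian noise."""
    rng = np.random.default_rng(seed)
    return values + rng.normal(loc=0.0, scale=noise_scale, size=values.shape)

if __name__ == "__main__":
    # Toy attribute column; the perturbed mean stays close to the original
    # mean because the injected noise averages out across records.
    salaries = np.array([42_000.0, 55_000.0, 61_000.0, 48_000.0, 73_000.0])
    released = perturb(salaries, noise_scale=5_000, seed=1)
    print("original mean :", salaries.mean())
    print("perturbed mean:", released.mean())
```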
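
Several of the data mining algorithm rows (Belwal et al. 2013; Jain et al. 2011; Weng et al. 2008) hide a sensitive association rule by pushing its support or confidence below the mining threshold. The sketch below shows only the core support-reduction step under simplifying assumptions: the victim item is naively taken as the first right-hand-side item, and supporting transactions are distorted one at a time until the rule can no longer be mined. It illustrates the general idea, not a reimplementation of any of the cited algorithms.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def hide_rule(transactions, lhs, rhs, min_support):
    """Remove a victim item from supporting transactions until the rule
    lhs -> rhs drops below `min_support` and can no longer be mined."""
    sensitive = set(lhs) | set(rhs)
    victim = next(iter(rhs))              # naive victim-item choice
    txs = [set(t) for t in transactions]  # sanitized copy of the database
    for t in txs:
        if support(txs, sensitive) < min_support:
            break
        if sensitive <= t:
            t.discard(victim)
    return txs

if __name__ == "__main__":
    db = [{"bread", "milk"}, {"bread", "milk", "eggs"},
          {"bread", "milk"}, {"eggs"}, {"bread"}]
    hidden = hide_rule(db, lhs={"bread"}, rhs={"milk"}, min_support=0.4)
    print(support(db, {"bread", "milk"}), "->", support(hidden, {"bread", "milk"}))
```

The published algorithms differ mainly in how they select victim items and transactions so as to minimize side effects such as lost non-sensitive rules; the loop above ignores that optimization entirely.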
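
Finally, the anonymity rows rest on the k-anonymity requirement: every combination of quasi-identifier values in the released table must be shared by at least k records, which is usually achieved by generalizing or suppressing those values. The sketch below checks that property and applies two hypothetical generalization rules (10-year age bands and postal code truncation); it is a toy illustration of the definition, not a full anonymization algorithm such as TDS or kactus.

```python
from collections import Counter

def generalize_age(age):
    """Coarsen an exact age into a 10-year band."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def generalize_zip(zipcode, keep=3):
    """Suppress the trailing digits of a postal code."""
    return zipcode[:keep] + "*" * (len(zipcode) - keep)

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

if __name__ == "__main__":
    raw = [
        {"age": 34, "zip": "47677", "disease": "flu"},
        {"age": 36, "zip": "47602", "disease": "cancer"},
        {"age": 33, "zip": "47678", "disease": "flu"},
        {"age": 38, "zip": "47605", "disease": "cold"},
    ]
    released = [{"age": generalize_age(r["age"]),
                 "zip": generalize_zip(r["zip"]),
                 "disease": r["disease"]} for r in raw]
    print(is_k_anonymous(raw, ["age", "zip"], k=2))       # False: all records unique
    print(is_k_anonymous(released, ["age", "zip"], k=2))  # True after generalization
```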