Table 2 Relevant literature on PPDM in terms of merits and de-merits

From: A comprehensive review on privacy preserving data mining

References

PPDM, PPDM based on data distortion, data mining, outsourced data mining, distributed and anonymity method

Merits and de-merits

Parameters

Matwin (2013)

Surveyed the existing privacy-preserving data mining methods

Analyzed the methods

PPDM

Vatsalan et al. (2013)

Presented methods that permitted the linking of databases between organizations and preserved the privacy of these data

Presented taxonomy of PPRL techniques

PPDM

Qi and Zong (2012)

Stated methods of data mining for privacy protection

Classified PPDM methods

PPDM

Raju et al. (2009)

Applied homomorphic encryption to a multiplication protocol

Possible influence in many applications

PPDM

Malina and Hajny (2013), Sachan et al. (2013)

Analyzed current privacy preserving solutions for cloud services and outlined their solution based on advanced cryptographic components

Reported the experimental results and compared the performance with related solutions

PPDM

Mukkamala and Ashok (2011)

Compared a set of fuzzy-based mapping methods

Combined the multiple practical values of a data item into a single value

PPDM

Kamakshi (2012)

Distortion method, proposed a novel idea to identify the sensitive attributes dynamically

The data is modified while retaining its original properties

Privacy

Zhang et al. (2012a)

Distortion method, proposed HPNGS

Reduced the noise requests over

Privacy and utility

Zhang et al. (2012b)

Distortion method, Proposed a novel APNGS

Improved the effectiveness of privacy protection on noise obfuscation in terms of association probabilities

Extra cost in comparison to existing representative strategies is the main demerit

Privacy

Li et al. (2009a)

Distortion method, proposed anonymous perturbation method

Low costs with a high strength

Privacy

Kamakshi and Babu (2010)

Distortion method, proposed a model with three parts: data centers, clients, and a database

The roles of customers and their site databases could be interchangeable

Privacy

Islam and Brankovic (2011)

Distortion method, introduced a framework that incorporates several novel techniques to perturb all attributes of a data set

Effective in preserving original patterns in a perturbed data set

Privacy

Wang and Lee (2008)

Distortion method, proposed an approach to avoid Forward-Inference Attacks, generated by the sanitization process

Restricted Forward-Inference Attacks

Privacy

Shrivastava et al. (2011)

Data mining algorithms, Proposed an improved distortion technique for privacy preserving frequent item-set mining

Enhanced the performance of the algorithm by reducing the disk access time

Privacy and performance

Vijayarani et al. (2010a)

Data mining algorithms, introduced various communities

Focused on the importance of association rules

Privacy

Aggarwal and Yu (2008)

Stated that support and confidence are considered the two significant measures within association rule mining

Explained the basic elements of association rules

PPDM

Belwal et al. (2013)

Data mining algorithms, proposed a method based on reducing the support and confidence of sensitive rules

Hid any desired sensitive association rule without any side effect

Hiding only rules that have a single sensitive item on the left side is the disadvantage

PPDM

Jain et al. (2011)

Data mining algorithms, proposed a new algorithm that increases and decreases the support of the left-side and right-side items of the association rule to be hidden

Made minimal modifications to the data entries to hide a set of rules with less CPU time than previous work

Privacy

Naeem et al. (2010)

Data mining algorithms, proposed an architecture which hides the restricted association rules with the complete removal of the known side effects like the generation of unwanted, non-genuine association rules while yielding no hiding failure

Used other standard statistical measures instead of conventional framework of support and confidence to generate association rules

Privacy

Li and Liu (2009)

Data mining algorithms, Proposed DDIL based on data disturbance and inquiry limitation

Effective, good privacy and accuracy

Restriction with random parameters is disadvantageous

Privacy

Weng et al. (2008)

Data mining algorithms, proposed the Fast Hiding Sensitive Association Rules (FHSAR) algorithm

Hid sensitive association rules with limited side effects

Privacy

Dehkordi et al. (2009)

Data mining algorithms, proposed method for hiding sensitive association rules by depending on the concept of genetic algorithms

Offered security while preserving utility

Security and utility

Gkoulalas-Divanis and Verykios (2009)

Data mining algorithms, proposed a novel approach that offers best solution to hide sensitive frequent item sets

Provided effective solution to hide sensitive frequent item sets

Privacy and efficiency

Li et al. (2009b)

Data mining algorithms, introduced a new algorithm for sanitizing a transactional database

Selecting victim items without affecting the non-sensitive patterns is the noted disadvantage

Privacy

Kasthuri and Meyyappan (2013)

Data mining algorithms, proposed a new method to detect the sensitive items for hiding sensitive association rules

Found the frequent item sets and generated the association rules

Privacy

Quoc et al. (2013)

Data mining algorithms, proposed a heuristic algorithm to hide a set of sensitive association rules using the distortion technique

Specified the victim item and minimum number of transactions

Privacy

Domadiya and Rao (2013)

Data mining algorithms, proposed MDSRRC

Highly efficient and maintains database quality

Privacy, efficiency and quality

Xiong et al. (2006)

Data mining algorithms, used a k-nearest neighbor classification technique based on SMC techniques

Balance in accuracy, performance, and privacy protection

Privacy and accuracy

Singh et al. (2010)

Data mining algorithms, attempted to provide a simple and efficient privacy preserving classification for cloud data

Securely computed local neighbors at each node in the cloud and classified unseen records using a weighted k-NN classification approach

Privacy

Baotou (2010)

Data mining algorithms, proposed an effective algorithm based on a random perturbation matrix

Enhanced privacy protection and accuracy

Privacy and accuracy

Vaidya et al. (2008)

Data mining algorithms, developed an approach for mining vertically partitioned data

Can be modified and extended to a variety of data mining applications such as decision trees

Privacy and efficiency

Kantarcıoglu and Vaidya (2003)

Data mining algorithms, discussed the use of secure logarithm and summation, where the distributed naive Bayes classifier can be determined securely

Supported the concept that a few useful secure protocols facilitate the secure deployment of different types of distributed data mining algorithms

Privacy and accuracy

Sathiyapriya and Sadasivam (2013)

Data mining algorithms, presented a classification of privacy preserving techniques

Optimal sanitization is proved to be NP-hard, and the ever-present trade-off between privacy and accuracy is the notable de-merit

Privacy

Yi and Zhang (2013)

Data mining algorithms, applied k-means clustering on vertically partitioned data

The demerit is that it did not apply any secure two-party computation algorithm

Privacy and security

Raghuram and Gyani (2012)

Data mining algorithms, proposed an associative classification model

Accuracy is tested

Privacy

Lin and Lo (2013)

Data mining algorithms, proposed a set of algorithms comprising the EWS, ROD, SSWS, and PSWS algorithms

Delivered excellent performance with respect to scalability and execution time

Privacy, scalability and execution time

Harnsamut and Natwichai (2008)

Data mining algorithms, proposed a novel heuristic algorithm to preserve the privacy and maintain the data quality

Efficient and highly effective

Privacy and efficiency

Seisungsittisunti and Natwichai (2011)

Data mining algorithms, proposed an incremental polynomial-time algorithm to transform the data to meet a privacy standard

Efficient in every problem setting

Privacy and efficiency

Giannotti et al. (2013)

Outsourced data mining, proposed a model based on background knowledge of the attack

Strong defense against an attack

The demerit is that it does not deal with other attacks

PPDM

Worku et al. (2014)

Outsourced data mining, improved their method by minimizing bilinear mapping

Secure and efficient

The demerit is that it is not wholly active

PPDM

Arunadevi and Anuradha (2014)

Outsourced data mining, proposed an attack model based on the basic assumption

Improved the security of the system

PPDM

Lai et al. (2014)

Outsourced data mining, proposed the first semantically secure solution for outsourcing association rule mining with data privacy

The demerit is that it is non-deterministic; it is secure against an adversary at the cloud servers

PPDM

Kerschbaum and Julien (2008)

Outsourced data mining, proposed a searchable encryption scheme for outsourcing data analytics

Secure

PPDM

Ying-hua et al. (2011)

Distributed, surveyed distributed privacy preserving data mining (DPPDM)

Surveyed DPPDM

PPDM

Li (2013)

Distributed, designed and analyzed a symmetric-key based privacy-preserving scheme for mining support counts

Effective in detecting misbehaving nodes and increasing average throughput in the whole network

Privacy

Dev et al. (2012)

Distributed, combined categorization, fragmentation, and distribution to prevent data mining by maintaining privacy levels, splitting data into chunks, and storing these chunks with appropriate cloud providers

Provided an effective way to protect privacy from mining based attacks

The demerit is that it introduced performance overhead

Privacy

Tassa (2014)

Distributed, proposed a protocol based on association rules in horizontally distributed databases

Devised an effective protocol for disparity verifications is disadvantageous

Privacy, accuracy and efficiency

Chan and Keng (2013)

Distributed, proposed a distributed architecture for privacy preserving outsourcing of association rules mining

Computational and storage overheads are significantly reduced in such a scheme

Privacy

Dong and Kresman (2009)

Distributed, focused on the linking between distributed data mining

Simple to implement, with minimal computing requirements

Privacy

Aggarwal et al. (2005)

Distributed, discussed existing techniques such as services based on data encryption, which cause a large overhead in query processing, and proposed a new distributed framework to enable privacy preservation for the outsourced storage of data

Demonstrated a new definition of privacy based on hiding sets of attribute values, discussed how the proposed decomposition approaches help to achieve privacy, and identified the best privacy-preserving decomposition technique

Privacy

Xu and Yi (2011)

Distributed, proposed a taxonomy to categorize PPDDM protocols into important categories

High performance of these protocols

Privacy

Inan and Saygin (2010)

Distributed, proposed a method that constructs a different matrix in horizontally distributed data mining

Provided different comparison functions for character and numerical data

Privacy

Nanavati and Jinwala (2012)

Distributed, proposed techniques that protect privacy for global and partial cycles in distributed data

Distinguished global cycles in a cooperative setup

Privacy

Agrawal and Srikant (2000)

Distributed, developed an association rule method based on uniform randomization for categorical datasets

The data reassembled is sanitized knowledge based

Privacy

Wang et al. (2010)

Distributed, proposed an enhanced algorithm (PPFDM)

Effective and appropriate for practical application fields

Privacy

Nguyen et al. (2012)

Distributed, proposed an enhanced scheme (EMHS)

Performance is better than MHS in specific databases

Privacy

Om Kumar et al. (2013)

Distributed, used WEKA to predict patterns in a single cloud, and used a cloud data distributor with a secure distributed approach

An effective solution that prevents such mining attacks on the cloud, thus making the cloud a secure platform for service and storage

Privacy

Mokeddem and Belbachir (2010)

Distributed, proposed a model allowing the detection of class association rules in a shared-nothing architecture

Created classification rules in a parallel setting

Privacy

Ibrahim et al. (2012)

Distributed, presented a practical cryptographic method to solve the KNN classification problem

Demonstrated that accuracy of the proposed work is the same as that of a naive scheme without security

Privacy

Patel et al. (2012)

Distributed, presented an effective algorithm to preserve privacy of distributed K-Means clustering

Faster than other algorithms and more appropriate for huge datasets in practical scenarios

Privacy

Kumbhar and Kharat (2012)

Distributed, analyzed different methods for PPARM

Studied methods for association rule mining on distributed datasets

Privacy

Nix et al. (2012)

Distributed, implemented two sketching protocols for the scalar (dot) product of two vectors which can be used as sub-protocols in larger data mining tasks

Demonstrated accuracy and efficiency through extensive experimentation

Privacy, accuracy and efficiency

Keshavamurthy et al. (2013)

Distributed, showed that the Genetic Algorithm (GA) approach has two potential advantages in comparison with traditional frequent pattern mining algorithms

The fitness function of GA plays an important role, and the convergence of the search space is directly proportional to the effectiveness of the fitness function

The de-merit is that the GA could produce duplicates in its successive generations

Privacy

Loukides et al. (2012), Machanavajjhala et al. (2007)

Anonymity, proposed a novel approach that fulfils data utility requirements

Effective

Privacy and utility

Wang et al. (2004)

Anonymity, studied data mining as an approach to data masking, known as data mining-based privacy protection

The advantage is its specific focus on two key factors, quality and scalability

Privacy, quality, and scalability

Friedman et al. (2008), Loukides and Gkoulalas-divanis (2012)

Anonymity, presented definitions of k-anonymity

It could be used in many data mining algorithms

Privacy

Ciriani et al. (2008)

Anonymity, presented the possible threats to K-anonymity and categorized two main approaches for merging K-anonymity into data mining

Discussed different methods that could be applied to detect K-anonymity violations

Privacy

He et al. (2011), Friedman et al. (2008)

Anonymity, proposed an algorithm which is based on clustering to produce a utility-friendly anonymized version of micro data

Utility is improved by their approach

Privacy and utility

Patil and Patankar (2013), He et al. (2011)

Anonymity, analyzed existing K-anonymity model and its applications

Analyzed current K-anonymity model

Privacy

Zhu and Chen (2012), Patil and Patankar (2013)

Anonymity, studied K-anonymity model

Surveyed K-anonymity model

Privacy

Soodejani et al. (2012), Zhu and Chen (2012)

Anonymity, employed a version of the chase, called standard chase

Provided a stronger privacy model for the proposed method and can be valuable

Privacy

Karim et al. (2012), Soodejani et al. (2012)

Anonymity, proposed a numerical method to mine maximal frequent patterns with privacy preserving capability

Introduced an efficient data transformation technique, a novel encoded and compressed lattice structure, and the MFPM algorithm

Privacy

Loukides et al. (2012), Karim et al. (2012)

Anonymity, proposed a rule-based privacy model that allows data publishers to express fine-grained protection requirements for both identity and sensitive information disclosure

Outperformed the state-of-the-art in terms of retaining data utility, while achieving good protection

Privacy, utility and scalability

Vijayarani et al. (2010a, b), Loukides et al. (2012)

K-anonymity has been studied as an interesting approach to protect micro data related to public or semi-public sectors from linking attacks

Proposed novel approach

Privacy

Nergiz et al. (2009), Xu and Yi (2011)

Anonymity, proposed new clustering algorithms to achieve multi relational anonymity

Provided utility of data and efficiency

Utility, effectiveness and efficiency

Tai et al. (2013), Vijayarani et al. (2010b)

Anonymity, proposed a Distributed k-support Noise Taxonomy tree algorithm, abbreviated as DKNT

Achieved good protection and better computation efficiency, as compared to the computation efficiency on single machine

Privacy and efficiency

Tai et al. (2010, 2013)

Anonymity, introduced a pseudo taxonomy tree and had the third party mine the generalized frequent item-sets instead

Achieved very good privacy protection with moderate storage overhead

Privacy

Pan et al. (2012), Tai et al. (2010)

Anonymity, analyzed and compared existing K-anonymity models and their applications

Enhanced and improved K-anonymity

Privacy

Deivanai et al. (2011), Pan et al. (2012)

Anonymity, proposed a novel method named kactus

Accuracy is better than other methods based on K-anonymity

Privacy and accuracy

Monreale et al. (2014), Deivanai et al. (2011)

Anonymity, introduced a new definition of K-anonymity for personal sequential data, which provides an effective privacy protection model

Results are extremely interesting in the case of dense datasets

Privacy

Nergiz et al. (2013), Monreale et al. (2014)

Anonymity, proposed hybrid generalizations with data relocation

Increased the utility of data

Privacy and utility

Zhang et al. (2013a, 2014a), Nergiz et al. (2013)

Anonymity, proposed hybrid approach by combining Top-Down Specialization and Bottom-Up Generalization

Improved the scalability and efficiency of TDS

Privacy and scalability

Zhang et al. (2014a)

Anonymity, proposed a highly scalable two-phase TDS approach using MapReduce on cloud

Scalability and efficiency of TDS are improved significantly over existing approaches

Privacy and scalability

Zhang et al. (2013a, b), Zhang et al. (2014a)

Anonymity, proposed a method based on an efficient quasi-identifier index

Protected privacy when new data is added

Privacy and efficiency

Nergiz and Gök (2014)

Anonymity, Hybrid generalizations

Ensured the utility of data

Privacy and utility

Ding et al. (2013), Zhang et al. (2013c)

Anonymity, presented a distributed anonymization protocol for privacy-preserving data publishing from multiple data providers in a cloud system

Performed a personalized anonymization to satisfy every data provider’s requirements, with the union forming a global anonymization to be published

Privacy