  • Essay / Hiding sensitive XML association rules via Bayesian...

    Abstract: Privacy-preserving data mining (PPDM) attracts the attention of researchers in different fields, especially in association rule mining. The purpose of privacy-preserving association rule mining is to minimize the risk of disclosing shared information to external parties. In this paper, we propose a PPDM model for XML association rules (XAR). The proposed model identifies the most probable sensitive element, so that the original data source can be modified with more precision and reliability. Such reliability has not been addressed before in the literature for any methodology used in the PPDM domain, and in particular not in XML association rule mining. The proposed model thus opens a new direction for academia to control sensitive information in a more rigorous way.

    Keywords: XAR, PPDM, K2 algorithm, Bayesian network, association rules

    I. INTRODUCTION

    In data mining, trends and patterns are identified in a large dataset to discover knowledge. A variety of techniques can extract such knowledge, including clustering, classification and association rule mining. Association rules, in particular, provide knowledge about complex data within a domain. The discovered association rules are usually determined by a minimum support s% and a minimum confidence c% over the transactional elements in a database D. A rule takes the form of an implication A ⇒ B, where A is the antecedent and B is the consequent. The problem with presenting rules in this way is that sensitive information is disclosed to the external party when the data is shared. Hence the emergence of privacy preservation in data mining (PPDM) for association rules. In PPDM, sensitive information is contained...

    ...... middle of paper ......

    ...066-1395, IEEE Computer Society Washington, DC, USA

    [7] M. Atallah, E. Bertino, A. Elmagarmid, M. Ibrahim, V. Verykios. "Disclosure Limitation of Sensitive Rules", pages 45-52, 1999, ISBN 0-7695-0453-1, IEEE Computer Society, Washington, DC, USA.
    [8] G. F. Cooper and E. Herskovits. "A Bayesian method for inducing probabilistic networks from data". Mach. Learn., 9(4):309-347, 1992.
    [9] R. Agrawal, T. Imielinski and A. Swami. "Mining association rules between sets of items in large databases". In P. Buneman and S. Jajodia, editors, SIGMOD '93, pages 207-216, Washington, DC, USA, May 1993.
    [10] O. Doguc and J. E. Ramirez-Marquez. "A generic method for estimating system reliability using Bayesian networks", Reliability Engineering and System Safety, 2008.
    [11] http://tunedit.org/repo/UCI/lymph.arff, dataset access date: 31-03-2010.
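
    Supplementary sketch: the introduction says that association rules are selected by a minimum support and a minimum confidence over a database D. The snippet below illustrates those two measures for a rule A ⇒ B using their standard definitions. It is a minimal sketch, not the paper's model: the function names, the toy transactions and the thresholds are all assumptions made for illustration, and it uses flat item sets rather than XML data.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """support(A union B) / support(A) for the rule A => B."""
    a = frozenset(antecedent)
    b = frozenset(consequent)
    sup_a = support(transactions, a)
    if sup_a == 0.0:
        return 0.0
    return support(transactions, a | b) / sup_a


def rule_holds(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
    min_support: float,
    min_confidence: float,
) -> bool:
    """True if A => B meets both thresholds (given as fractions, not percent)."""
    a = frozenset(antecedent)
    b = frozenset(consequent)
    return (
        support(transactions, a | b) >= min_support
        and confidence(transactions, a, b) >= min_confidence
    )


# Hypothetical toy database D with four transactions.
toy_db = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    print(support(toy_db, {"bread", "milk"}))
    print(confidence(toy_db, {"bread"}, {"milk"}))
    print(rule_holds(toy_db, {"bread"}, {"milk"}, 0.4, 0.6))
```

    In this toy database the rule {bread} ⇒ {milk} appears in two of four transactions, and bread appears in three, so a hiding method would need to push one of the two measures below its threshold to keep the rule from being mined.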