Machine learning (ML) has employed various discretization methods to partition numerical attributes into intervals. However, an effective discretization technique remains elusive in many ML applications, such as association rule mining. Moreover, existing discretization techniques do not best reflect the impact of an independent numerical factor on the dependent numerical target factor. This research aims to establish a benchmark approach for numerical attribute partitioning. We conduct an extensive analysis of human perceptions of partitioning a numerical attribute and compare these perceptions with the results obtained from our two proposed measures. We also examine the perceptions of experts in data science, statistics, and engineering by employing numerical data visualization techniques. The analysis of the collected responses reveals that $68.7\%$ of human responses closely align with the values generated by our proposed measures. Based on these findings, our proposed measures may serve as one of the methods for discretizing numerical attributes.
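For context, the two classical baselines against which any new partitioning measure is usually compared can be sketched in a few lines. This is a minimal illustration of equal-width and equal-frequency binning only, not the proposed measures; the function names and sample values are ours.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # degenerate attribute: single bin
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding (roughly) equal counts."""
    order = sorted(range(len(values)), key=values.__getitem__)
    bins = [0] * len(values)
    per_bin = len(values) / k
    for rank, i in enumerate(order):
        bins[i] = min(int(rank / per_bin), k - 1)
    return bins

attribute = [1, 2, 3, 10, 11, 12, 30, 31, 32]
print(equal_width_bins(attribute, 3))      # [0, 0, 0, 0, 0, 1, 2, 2, 2]
print(equal_frequency_bins(attribute, 3))  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Note how the two baselines already disagree on the same attribute (equal-width places 12 with the low cluster's neighbors, equal-frequency does not), which is precisely why a perception-aligned measure is of interest.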
In today's world, artificial intelligence-based smart applications and smart medical devices are developed using models trained on big data. However, what if a training dataset used to train a machine learning module is incorrect or harbors a statistical paradox? Statistical paradoxes are difficult to observe in data but are very important to analyze in every training dataset. This article discusses Simpson's paradox and its effects on various datasets. We show that Simpson's paradox is common across a variety of data and that it can lead to wrong conclusions, potentially with harmful consequences. We provide a mathematical analysis of Simpson's paradox and analyze its effects on continuous data. Experiments on real-world and synthetic datasets clearly show that the paradox severely impacts big data.
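The reversal at the heart of Simpson's paradox can be reproduced in a few lines. The sketch below uses the well-known kidney-stone treatment counts as illustrative data; the variable names are ours.

```python
# (successes, trials) per stratum and per treatment arm.
data = {
    "small": {"A": (81, 87),   "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

# Treatment A has the higher success rate within every subgroup ...
for group, arms in data.items():
    assert rate(*arms["A"]) > rate(*arms["B"])

# ... yet pooling the subgroups reverses the ordering in favor of B.
pooled = {arm: [sum(x) for x in zip(*(g[arm] for g in data.values()))]
          for arm in ("A", "B")}
assert rate(*pooled["A"]) < rate(*pooled["B"])
```

The reversal arises because the stratum (stone size) is associated with both the choice of treatment and the outcome — exactly the confounding structure the article analyzes.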
Content-based image retrieval (CBIR) from large databases has become an area of wide interest in many applications. CBIR techniques use image content to search and retrieve digital images, making CBIR an important research area for managing large image databases. In this paper, we analyze spatial features, collect them into a single frame to view all the spatial features, and examine the scope of incorporating these features into image retrieval. Commercial image search engines available to date include QBIC, VisualSeek, Virage, Netra, PicSOM, FIRE, and AltaVista. Region-Based Image Retrieval (RBIR) is a promising extension of CBIR. Shape and spatial features are simple to derive, effective, and can be extracted in real time. Our analysis leads to a proposed system with the advantage of increasing retrieval accuracy while decreasing retrieval time.
Numerical association rule mining (NARM) is an extended version of association rule mining that determines association rules in numerical data items, primarily via distribution, discretization, and optimization techniques. Under the umbrella of optimization techniques, several evolutionary and swarm intelligence-based algorithms have been proposed to extract association rules from numeric datasets. However, a sufficient understanding of the performance of swarm intelligence-based algorithms, especially for NARM, is still missing. In the state of the art, various swarm intelligence-based optimization algorithms are claimed to be superior based on arbitrary comparisons with algorithms from different classes; e.g., swarm intelligence-based algorithms are compared with genetic algorithms. Unfortunately, they are not compared with algorithms of their own class. Therefore, it is challenging to select an appropriate swarm intelligence-based algorithm for NARM. This article aims at filling this gap by conducting an exhaustive multi-aspect analysis of four popular swarm intelligence-based optimization algorithms (MOPAR, MOCANAR, ACO-R, and MOB-ARM) with four real-world datasets and six major metrics and objectives: performance time, number of rules, support, confidence, comprehensibility, and interestingness. In our analysis, the MOPAR algorithm produces a small number of rules and shows high values of confidence, comprehensibility, and interestingness. The MOCANAR algorithm provides satisfactory results with respect to all six metrics across all the datasets. The ACO-R algorithm produces high-quality rules but needs parameter tuning for datasets with a large number of attributes, and the MOB-ARM algorithm is considerably slower than the other three algorithms.
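For readers unfamiliar with the rule-quality metrics listed above, the two standard ones — support and confidence of a rule $X \rightarrow Y$ — can be sketched as follows. The toy transaction set is ours; comprehensibility and interestingness are composite measures defined in the respective algorithm papers and are omitted here.

```python
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, db):
    """Fraction of transactions containing every item of the itemset."""
    return sum(itemset <= t for t in db) / len(db)

def confidence(antecedent, consequent, db):
    """support(X ∪ Y) / support(X) for the rule X -> Y."""
    return support(antecedent | consequent, db) / support(antecedent, db)

print(support({"bread", "milk"}, transactions))       # 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # ≈ 0.667
```

NARM algorithms compute the same quantities, except that itemset membership is replaced by a value falling inside a numeric interval — which is why the quality of the interval boundaries dominates the quality of the mined rules.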
In the past two decades, there have been tremendous advancements in artificial intelligence (AI), machine learning (ML), and deep learning (DL) across various fields, including healthcare, autonomous driving, personal assistant technology, business, education, and justice. However, despite many success stories and advantages, AI-based systems are often considered biased, unfair, and untrustworthy. In this paper, we argue that statistical paradoxes are one of the well-known challenges for inducing bias in AI systems. Unfortunately, they have not been adequately addressed in the AI application development scenario. To support our claim, we investigate instances of Simpson's paradox, an extreme case of confounding, in various benchmark datasets. In doing so, we demonstrate the severe consequences of statistical paradoxes on AI systems. Thus, to handle confounding effects and deal with the severe impacts of statistical paradoxes in AI systems, the contribution of this paper is threefold. First, we introduce a framework to mitigate bias in training datasets. Second, we present a set of three algorithms capable of identifying and adjusting the impact of potential statistical confounders in both categorical and continuous datasets. Third, on top of the proposed framework and algorithms, we develop a web-based tool which identifies confounding effects, deals with instances of Simpson's paradox, and provides adjusted observations to reduce the impacts of confounders. A series of experiments conducted on multiple real-world and benchmark datasets validates the efficacy and usefulness of the proposed framework and algorithms. Additionally, the web application serves as a valuable tool for data scientists and researchers by automatically detecting and addressing confounding effects. This paper significantly contributes towards fostering fair and trustworthy AI and holds immense potential for further extensions beyond its current use.
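One simple way to adjust for a categorical confounder — stratification followed by direct standardization — can be sketched as follows. The counts are hypothetical and of our own making; this illustrates the general idea of confounder adjustment, not the paper's three algorithms.

```python
strata = {  # stratum -> arm -> (successes, trials); hypothetical counts
    "X": {"A": (9, 10),   "B": (80, 100)},
    "Y": {"A": (30, 100), "B": (2, 10)},
}

def crude_rate(arm):
    """Success rate after pooling all strata (confounded comparison)."""
    wins = sum(arms[arm][0] for arms in strata.values())
    trials = sum(arms[arm][1] for arms in strata.values())
    return wins / trials

def adjusted_rate(arm):
    """Direct standardization: average the stratum-specific rates of an
    arm using one shared set of stratum weights (the stratum sizes)."""
    weights = {g: sum(t for _, t in arms.values())
               for g, arms in strata.items()}
    total = sum(weights.values())
    return sum(arms[arm][0] / arms[arm][1] * weights[g] / total
               for g, arms in strata.items())

# Pooled, arm B looks better; adjusted for the stratum, arm A is better.
assert crude_rate("A") < crude_rate("B")
assert adjusted_rate("A") > adjusted_rate("B")
```

Because both arms are averaged over the same stratum mix, the stratum variable can no longer drive the comparison — the same principle the framework applies at dataset scale.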
Numerical association rule mining is a widely used variant of the association rule mining technique, and it has been extensively used in discovering patterns and relationships in numerical data. Initially, researchers and scientists integrated numerical attributes into association rule mining using various discretization approaches; however, over time, a plethora of alternative methods has emerged in this field. Unfortunately, the proliferation of alternative methods has resulted in a significant knowledge gap in understanding the diverse techniques employed in numerical association rule mining -- this paper attempts to bridge this gap by conducting a comprehensive systematic literature review. We provide an in-depth study of diverse methods, algorithms, metrics, and datasets derived from 1,140 scholarly articles published from the inception of numerical association rule mining in 1996 through 2022. In compliance with the inclusion, exclusion, and quality-evaluation criteria, 68 papers were selected for extensive evaluation. To the best of our knowledge, this systematic literature review is the first of its kind to provide an exhaustive analysis of the current literature and previous surveys on numerical association rule mining. The paper discusses important research issues, the current status, and future possibilities of numerical association rule mining. On the basis of this systematic review, the article also presents a novel discretization measure that contributes a partitioning of numerical data aligned well with human perception of partitions.