Roth (1996) proved that any form of marginal inference with probabilistic graphical models (e.g., Bayesian networks) is NP-hard in general. Introduced and extensively investigated in the past decade, the neural probabilistic circuits known as sum-product networks (SPNs) offer inference in time linear in their size. On another note, research around neural causal models (NCMs) has recently gained traction, demanding a tighter integration of causality into machine learning. To this end, we present a theoretical investigation of if, when, how, and at what cost tractability occurs for different NCMs. We prove that SPN-based causal inference is generally tractable, as opposed to standard MLP-based NCMs. We further introduce a new tractable NCM class that is efficient in inference and fully expressive in terms of Pearl's Causal Hierarchy. Our comparative empirical illustration on simulations and standard benchmarks supports our theoretical results.
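To illustrate why circuit-based models admit tractable inference, the following minimal sketch evaluates a toy SPN over two binary variables in a single bottom-up pass; marginalizing a variable only requires setting its leaf to 1, so the cost stays linear in the number of edges. The two-component structure and all parameters are illustrative assumptions, not the architecture studied here.

```python
def bernoulli_leaf(p, x):
    """Leaf over one binary variable; x=None marginalizes it out."""
    if x is None:
        return 1.0
    return p if x == 1 else 1.0 - p

def spn_query(x1, x2):
    """One bottom-up pass through a tiny mixture-of-products SPN:
    0.6 * [Bern(0.9)(x1) * Bern(0.2)(x2)] + 0.4 * [Bern(0.3)(x1) * Bern(0.7)(x2)]."""
    prod1 = bernoulli_leaf(0.9, x1) * bernoulli_leaf(0.2, x2)
    prod2 = bernoulli_leaf(0.3, x1) * bernoulli_leaf(0.7, x2)
    return 0.6 * prod1 + 0.4 * prod2

# Marginal P(X1=1): X2 is summed out implicitly, still a single pass.
print(spn_query(1, None))  # 0.6*0.9 + 0.4*0.3 = 0.66
```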
Causal inference in hybrid domains, characterized by a mixture of discrete and continuous variables, presents a formidable challenge. We take a step in this direction and propose the Characteristic Interventional Sum-Product Network ($\chi$SPN), which is capable of estimating interventional distributions in the presence of random variables drawn from mixed distributions. $\chi$SPN uses characteristic functions in the leaves of an interventional SPN (iSPN), thereby providing a unified view of discrete and continuous random variables through the Fourier-Stieltjes transform of the probability measures. A neural network is used to estimate the parameters of the learned iSPN from the intervened data. Our experiments on three synthetic heterogeneous datasets suggest that $\chi$SPN can effectively capture the interventional distributions of both discrete and continuous variables while being expressive and causally adequate. We also show that $\chi$SPN generalizes to multiple interventions while being trained on data from only a single intervention.
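As a brief illustration of the unified view, the sketch below writes the characteristic function of a tiny two-component circuit over one continuous (Gaussian) and one discrete (Bernoulli) leaf: sum nodes translate into weighted sums of characteristic functions, and product nodes over disjoint scopes into products. The parameters and the two-component structure are assumptions made for illustration only and do not reflect the learned $\chi$SPN.

```python
import numpy as np

def cf_gaussian(t, mu, sigma):
    """Characteristic function of N(mu, sigma^2): E[exp(i t X)]."""
    return np.exp(1j * mu * t - 0.5 * (sigma * t) ** 2)

def cf_bernoulli(t, p):
    """Characteristic function of Bernoulli(p); same functional form as
    for continuous leaves, which is what unifies the two cases."""
    return (1.0 - p) + p * np.exp(1j * t)

def cf_circuit(t1, t2, w=0.5):
    """Sum node = mixture of characteristic functions; product node over
    independent scopes = product of characteristic functions."""
    comp1 = cf_gaussian(t1, mu=0.0, sigma=1.0) * cf_bernoulli(t2, p=0.2)
    comp2 = cf_gaussian(t1, mu=3.0, sigma=0.5) * cf_bernoulli(t2, p=0.9)
    return w * comp1 + (1.0 - w) * comp2

print(cf_circuit(0.3, 1.2))  # complex-valued joint characteristic function
```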
Structural causal models (SCMs) are a powerful tool for understanding the complex causal relationships that underlie many real-world systems. As these systems grow in size, so do the number of variables and the complexity of the interactions between them, making the models convoluted and difficult to analyze. This is particularly true in the context of machine learning and artificial intelligence, where an ever-increasing amount of data demands new methods to simplify and compress large-scale SCMs. While methods for marginalizing and abstracting SCMs already exist today, they may destroy the causality of the marginalized model. To alleviate this, we introduce the concept of consolidating causal mechanisms to transform large-scale SCMs while preserving consistent interventional behaviour. We show that consolidation is a powerful method for simplifying SCMs, discuss the resulting reduction in computational complexity, and give a perspective on the generalization abilities of consolidated SCMs.
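The following toy sketch conveys the underlying intuition: composing the mechanisms along a chain X -> Z -> Y removes the mediator Z as an explicit variable while leaving the interventional behaviour of Y under do(X) unchanged. The linear mechanisms and noise terms are illustrative assumptions and do not define the consolidation operator introduced in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SCM over the chain X -> Z -> Y with additive exogenous noise.
def f_Z(x, n_z): return 2.0 * x + n_z
def f_Y(z, n_y): return -1.0 * z + n_y

# "Consolidated" mechanism for Y: the mediator Z is folded into f_Y.
def f_Y_consolidated(x, n_z, n_y): return f_Y(f_Z(x, n_z), n_y)

# Interventional behaviour under do(X := 1.5) is preserved exactly.
n_z, n_y = rng.normal(size=1000), rng.normal(size=1000)
y_full = f_Y(f_Z(1.5, n_z), n_y)
y_cons = f_Y_consolidated(1.5, n_z, n_y)
print(np.allclose(y_full, y_cons))  # True: identical interventional samples
```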
There has been a recent push to make machine learning models more interpretable so that their performance can be trusted. Although successful, this push has primarily focused on deep learning methods, while simpler optimization methods have been essentially ignored. Consider linear programs (LPs), a workhorse of the sciences. Even though LPs can be considered whitebox or clearbox models, they are not easy to understand in terms of the relationships between inputs and outputs, contrary to common belief. Since a linear program solver only provides the optimal solution to an optimization problem, further explanations are often helpful. We extend attribution methods for explaining neural networks to linear programs, thereby taking a first step towards what might be called explainable optimization. These attribution methods explain a model by providing relevance scores for its inputs that show the influence of each input on the output. Alongside classical gradient-based attribution methods, we also propose a way to adapt perturbation-based attribution methods to LPs. Our evaluations on several different linear and integer problems show that attribution methods can generate helpful explanations for these models. In particular, we demonstrate that the explanations can yield interesting insights into large, real-world linear programs.
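To make the perturbation idea concrete, here is a minimal sketch on a toy LP using scipy: each constraint bound is nudged by a small step and the resulting change in the optimal objective value is recorded as a relevance score. The specific LP, the finite-difference score, and the step size are illustrative assumptions rather than the attribution methods evaluated in the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Toy LP: minimize c @ x subject to A_ub @ x <= b_ub, x >= 0.
c = np.array([-1.0, -2.0])
A_ub = np.array([[1.0, 1.0],
                 [1.0, 3.0]])
b_ub = np.array([4.0, 6.0])

base = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)

# Perturbation-based attribution: how sensitive is the optimum to each bound?
eps = 1e-2
scores = []
for i in range(len(b_ub)):
    b_pert = b_ub.copy()
    b_pert[i] += eps
    pert = linprog(c, A_ub=A_ub, b_ub=b_pert, bounds=[(0, None)] * 2)
    scores.append((pert.fun - base.fun) / eps)

print("optimal value:", base.fun)                    # -5.0 at x = (3, 1)
print("relevance of each constraint bound:", scores)
```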
Neurally-parameterized structural causal models in the Pearlian notion of causality, referred to as NCMs, were recently introduced as a step towards next-generation learning systems. However, said NCMs are only concerned with the learning aspect of causal inference and miss out on the architecture aspect. That is, actual causal inference within an NCM is intractable in the sense that the NCM will not return an answer to a query in polynomial time. This insight follows as a corollary of the more general statement on the intractability of arbitrary SCM parameterizations, which we prove in this work through a classical 3-SAT reduction. Since future learning algorithms will be required to deal with both high-dimensional data and highly complex mechanisms governing the data, we ultimately believe work on tractable inference for causality to be decisive. We also show that not all ``causal'' models are created equal. More specifically, there are models capable of answering causal queries that are not SCMs, which we refer to as \emph{partially causal models} (PCMs). We provide a tabular taxonomy of the tractability properties of the different model families, namely correlation-based models, PCMs, and SCMs. To conclude, we also provide some initial ideas on how to overcome parts of the intractability of causal inference with SCMs by showing how parameterizing an SCM with SPN modules can at least allow for tractable mechanisms. We hope that our impossibility result, alongside the taxonomy of tractability in causal models, raises awareness for this novel research direction, since achieving success with causality in real-world downstream tasks depends not only on learning correct models but also on the practical ability to access model inferences.
While probabilistic models are an important tool for studying causality, their use suffers from the intractability of inference. As a step towards tractable causal models, we consider the problem of learning interventional distributions using sum-product networks (SPNs) that are over-parameterized by gate functions, e.g., neural networks. Given an arbitrarily intervened causal graph as input, effectively subsuming Pearl's do-operator, the gate function predicts the parameters of the SPN. The resulting interventional SPNs are motivated and illustrated by a structural causal model themed around personal health. Our empirical evaluation on three benchmark data sets as well as a synthetic health data set clearly demonstrates that interventional SPNs are indeed both expressive in modelling and flexible in adapting to interventions.
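A minimal sketch of the gate-function idea follows: a tiny feed-forward network maps an encoding of the (possibly intervened) causal graph to the mixture weights of an SPN sum node, so changing the intervention changes the predicted SPN parameters. The encoding, the layer sizes, and the restriction to a single sum node are illustrative assumptions, not the iSPN architecture itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Gate function: one hidden layer mapping a 4-dim graph encoding to the
# weights of a 3-component SPN sum node.
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)

def gate(graph_encoding):
    h = np.tanh(W1 @ graph_encoding + b1)
    return softmax(W2 @ h + b2)

# Illustrative encoding: a binary flag per variable marking an intervention.
weights_observational = gate(np.array([0.0, 0.0, 0.0, 0.0]))
weights_do_x2 = gate(np.array([0.0, 1.0, 0.0, 0.0]))  # do(X2)
print(weights_observational, weights_do_x2)
```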
Many researchers have voiced their support for Pearl's counterfactual theory of causation as a stepping stone towards AI/ML research's ultimate goal of intelligent systems. As in any other growing subfield, patience seems to be a virtue, since significant progress on integrating notions from both fields takes time; yet major challenges, such as the lack of ground-truth benchmarks and of a unified perspective on classical problems like computer vision, seem to hinder the momentum of the research movement. This work exemplifies how the Pearl Causal Hierarchy (PCH) can be understood on image data, providing insights into several intricacies, but also challenges, that naturally arise when applying key concepts from Pearlian causality to the study of images.
In explanatory interactive learning (XIL), the user queries the learner, the learner explains its answer to the user, and the loop repeats. XIL is attractive for two reasons: (1) the learner becomes better and (2) the user's trust increases. For both reasons to hold, the learner's explanations must be useful to the user, and the user must be allowed to ask useful questions. Ideally, both questions and explanations should be grounded in a causal model, since this avoids spurious fallacies. Ultimately, we seem to seek a causal variant of XIL. We believe the question part on the user's end to be solved, since the user's mental model can provide the causal model. But how would the learner provide causal explanations? In this work we show that existing explanation methods are not guaranteed to be causal even when provided with a structural causal model (SCM). Specifically, we use CXPlain, a popular method proclaimed to be causal, to illustrate how the generated explanations leave open the question of truly causal explanations. Thus, as a step towards causal XIL, we propose a solution to the lack of causal explanations: we derive from first principles an explanation method that makes full use of a given SCM, which we refer to as SC$\textbf{E}$ ($\textbf{E}$ standing for explanation). Since SCEs make use of structural information, any causal graph learner can now provide human-readable explanations. We conduct several experiments, including a user study with 22 participants, to investigate the virtue of SCEs as causal explanations of SCMs.
Some argue that scale is all that is needed to achieve AI, even covering causal models. We make it clear that large language models (LLMs) cannot be causal and give reasons why we might sometimes feel otherwise. To this end, we define and exemplify a new subgroup of structural causal models (SCMs) that we call meta SCMs, which encode causal facts about other SCMs within their variables. We conjecture that, in the cases where LLMs succeed at causal inference, an underlying meta SCM exposed correlations between causal facts in the natural-language data on which the LLM was ultimately trained. If our hypothesis holds true, then this would imply that LLMs are like parrots in that they simply recite the causal knowledge embedded in their training data. Our empirical analysis provides supporting evidence that current LLMs are even weak `causal parrots.'
Most algorithms in classical and contemporary machine learning focus on correlation-based dependencies between features to drive performance. Although successful in many relevant problems, these algorithms fail when the underlying causality is inconsistent with the assumed relations. We propose a novel model-agnostic loss function, called Causal Loss, that improves the interventional quality of the prediction using an intervened neural-causal regularizer. In support of our theoretical results, our experimental illustration shows how Causal Loss endows a non-causal associative model (such as a standard neural network or decision tree) with interventional capabilities.
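As a rough illustration of the composite objective, the sketch below adds to a standard associative loss a regularization term that penalizes deviation from the predictions of an intervened neural-causal reference model. The additive form, the mean-squared terms, and the weighting factor are assumptions made purely for illustration; they are not the definition of Causal Loss given in the paper.

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def causal_loss_sketch(pred, target, interventional_ref, alpha=0.5):
    """Associative term plus an interventional regularizer (illustrative)."""
    associative = mse(pred, target)
    interventional = mse(pred, interventional_ref)
    return associative + alpha * interventional

pred = np.array([0.2, 0.8, 0.5])    # outputs of an associative model
target = np.array([0.0, 1.0, 1.0])  # observed labels
ref = np.array([0.1, 0.9, 0.7])     # e.g. predictions of an intervened neural-causal model
print(causal_loss_sketch(pred, target, ref))
```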