Fair machine learning aims to avoid treating individuals or sub-populations unfavourably based on \textit{sensitive attributes}, such as gender and race. Fair machine learning methods built on causal inference ascertain discrimination and bias through causal effects. Although causality-based fair learning is attracting increasing attention, current methods assume that the true causal graph is fully known. This paper proposes a general method to achieve counterfactual fairness when the true causal graph is unknown. To select features that lead to counterfactual fairness, we derive conditions and algorithms for identifying ancestral relations between variables on a \textit{Partially Directed Acyclic Graph (PDAG)}, specifically, a class of causal DAGs that can be learned from observational data combined with domain knowledge. Interestingly, we find that counterfactual fairness can be achieved as if the true causal graph were fully known when specific background knowledge is provided: the sensitive attributes do not have ancestors in the causal graph. Results on both simulated and real-world datasets demonstrate the effectiveness of our method.
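The feature-selection idea can be illustrated in the simpler setting where the causal graph is a fully known DAG. The sketch below is not the paper's PDAG algorithm; it only shows the standard observation that a predictor built from non-descendants of the sensitive attributes is counterfactually fair. The graph, the variable names, and the helper functions `descendants` and `fair_features` are hypothetical.

```python
from collections import deque


def descendants(graph, source):
    """Return every node reachable from `source` along directed edges.

    `graph` maps each node to a list of its children (assumed to be a DAG).
    """
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def fair_features(graph, sensitive, features):
    """Keep only features that are not descendants of any sensitive attribute."""
    blocked = set()
    for s in sensitive:
        blocked |= descendants(graph, s) | {s}
    return [f for f in features if f not in blocked]


# Hypothetical toy graph: gender -> occupation -> income, age -> occupation, age -> income.
toy_graph = {
    "gender": ["occupation"],
    "age": ["occupation", "income"],
    "occupation": ["income"],
}
selected = fair_features(toy_graph, ["gender"], ["age", "occupation", "income"])
print(selected)
```

On a PDAG some edges are undirected, so whether a variable is a descendant of a sensitive attribute may be undetermined; that is the case the conditions derived in the paper are meant to resolve.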
The ADMANI datasets (annotated digital mammograms and associated non-image datasets) from the Transforming Breast Cancer Screening with AI programme (BRAIx) run by BreastScreen Victoria in Australia are multi-centre, large-scale, clinically curated, real-world databases. The datasets are expected to aid in the development of clinically relevant Artificial Intelligence (AI) algorithms for breast cancer detection, early diagnosis, and other applications. To ensure high data quality, technical outliers must be removed before any downstream algorithm development. As a first step, we randomly select 30,000 individual mammograms and use a Convolutional Variational Autoencoder (CVAE), a deep generative neural network, to detect outliers. CVAE is expected to detect outliers of all types, although its detection performance differs among types. Traditional image processing techniques such as erosion and pectoral muscle analysis can compensate for the poor performance of CVAE on certain outlier types. We identify seven types of technical outliers: implant, pacemaker, cardiac loop recorder, improper radiography, atypical lesion/calcification, incorrect exposure parameter and improper placement. The outlier recall rate for the test set is 61% if CVAE, erosion and pectoral muscle analysis each select the top 1% of images, ranked in ascending or descending order of outlier score under each detection method, and 83% if each selects the top 5% of images. This study offers an overview of technical outliers in the ADMANI dataset and suggests future directions to improve outlier detection effectiveness.
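The recall figures above combine the top-ranked images from three detectors. A minimal sketch of that bookkeeping follows, assuming the flagged sets are merged by union (the abstract does not state how they are combined). The scores, the outlier labels, and the function names `top_fraction` and `combined_recall` are made up for illustration.

```python
import numpy as np


def top_fraction(scores, fraction, descending=True):
    """Return the indices of the top `fraction` of images by outlier score."""
    n = len(scores)
    k = max(1, int(np.ceil(fraction * n)))
    order = np.argsort(scores)          # ascending order of score
    if descending:
        order = order[::-1]
    return set(order[:k].tolist())


def combined_recall(detectors, true_outliers, fraction):
    """Recall of the union of per-detector selections.

    `detectors` is a list of (scores, descending) pairs, one per method.
    """
    flagged = set()
    for scores, descending in detectors:
        flagged |= top_fraction(scores, fraction, descending)
    truth = set(true_outliers)
    return len(flagged & truth) / len(truth)


# Synthetic scores for 10 images (illustrative values only).
cvae = np.array([0.10, 0.90, 0.20, 0.30, 0.11, 0.21, 0.15, 0.25, 0.05, 0.12])
erosion = np.array([5.0, 6.0, 7.0, 8.0, 1.0, 9.0, 10.0, 11.0, 12.0, 13.0])
pectoral = np.array([0.20, 0.80, 0.10, 0.30, 0.40, 0.50, 0.35, 0.45, 0.25, 0.15])

detectors = [(cvae, True), (erosion, False), (pectoral, True)]
recall = combined_recall(detectors, true_outliers=[1, 4, 7], fraction=0.10)
print(f"recall at top 10%: {recall:.2f}")
```

Because detectors can flag the same image, the union is usually smaller than three times the per-detector budget.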
ABSTRACT BACKGROUND: Little is known about adolescents' food purchasing behaviors in rural areas. This study examined whether purchasing food at stores/restaurants around schools was related to adolescents' participation in school breakfast programs and overall diet in rural Minnesota. METHODS: Breakfast‐skippers enrolled in a group‐randomized intervention in 2014 to 2015 (N = 404 from 8 schools) completed 24‐hour dietary recalls and pre/post surveys assessing food establishment purchase frequency. Healthy Eating Index scores (HEI‐2010) were calculated for each student. Student‐level school breakfast participation (SBP) was obtained from school food service records. Mixed‐effects regression models estimated: (1) whether SBP was associated with store/restaurant use at baseline, (2) whether an increase in SBP was associated with a decrease in store/restaurant use, and (3) whether store/restaurant use was associated with HEI‐2010 scores at baseline. RESULTS: Students with increased SBP were more likely to decrease fast‐food restaurant purchases on the way home from school (OR 1.017, 95% CI 1.005, 1.029), but were less likely to decrease purchases at food stores for breakfast (OR 0.979, 95% CI 0.959, 0.999). Food establishment use was associated with lower HEI‐2010 dairy component scores (p = .017). CONCLUSIONS: Increasing participation in school breakfast may result in modest changes in purchases at food establishments.
Purpose: Aberrant DNA methylation, now recognized as a contributing factor to neoplasia, often shows definitive gene/sequence preferences unique to specific cancer types. Correspondingly, distinct combinations of methylated loci can function as biomarkers for numerous clinical correlates of ovarian and other cancers. Experimental Design: We used a microarray approach to identify methylated loci prognostic for reduced progression-free survival (PFS) in advanced ovarian cancer patients. Two data set classification algorithms, Significance Analysis of Microarray and Prediction Analysis of Microarray, successfully identified 220 candidate PFS-discriminatory methylated loci. Of those, 112 were found capable of predicting PFS with 95% accuracy, by Prediction Analysis of Microarray, using an independent set of 40 advanced ovarian tumors (from 20 short-PFS and 20 long-PFS patients, respectively). Additionally, we showed the use of these predictive loci using two bioinformatics machine-learning algorithms, Support Vector Machine and Multilayer Perceptron. Conclusion: In this report, we show that highly prognostic DNA methylation biomarkers can be successfully identified and characterized, using previously unused, rigorous classifying algorithms. Such ovarian cancer biomarkers represent a promising approach for the assessment and management of this devastating disease.
High-dimensional low sample size (HDLSS) data are becoming increasingly common in statistical applications. When the data can be partitioned into two classes, a basic task is to construct a classifier that can assign objects to the correct class. Binary linear classifiers have been shown to be especially useful in HDLSS settings and preferable to more complicated classifiers because of their ease of interpretability. We propose a computational tool called direction-projection-permutation (DiProPerm), which rigorously assesses whether a binary linear classifier is detecting statistically significant differences between two high-dimensional distributions. The basic idea behind DiProPerm is to work directly with the one-dimensional projections of the data induced by a binary linear classifier. Theoretical properties of DiProPerm are studied under the HDLSS asymptotic regime, in which dimension diverges to infinity while sample size remains fixed. We show that certain variations of DiProPerm are consistent and that consistency is a nontrivial property of tests in the HDLSS asymptotic regime. The practical utility of DiProPerm is demonstrated on HDLSS gene expression microarray datasets. Finally, an empirical power study is conducted comparing DiProPerm to several alternative two-sample HDLSS tests to understand the advantages and disadvantages of each method.
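The three steps named by DiProPerm (direction, projection, permutation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the direction is the normalized mean difference, chosen here for simplicity in place of a trained linear classifier, and the univariate statistic is the absolute difference of projected means. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def mean_difference_direction(x, y):
    """Direction step: unit vector between the two class means (assumed choice)."""
    d = x.mean(axis=0) - y.mean(axis=0)
    return d / np.linalg.norm(d)


def projected_statistic(x, y):
    """Projection step: project both classes onto the direction and
    return the absolute difference of the projected means."""
    w = mean_difference_direction(x, y)
    return abs((x @ w).mean() - (y @ w).mean())


def diproperm(x, y, n_perm=200, seed=0):
    """Permutation step: re-run direction and projection on relabelled data."""
    rng = np.random.default_rng(seed)
    observed = projected_statistic(x, y)
    pooled = np.vstack([x, y])
    n_x = len(x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = projected_statistic(pooled[idx[:n_x]], pooled[idx[n_x:]])
        exceed += int(stat >= observed)
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic HDLSS data: 10 samples per class in 500 dimensions, with a mean shift.
rng = np.random.default_rng(1)
x = rng.normal(size=(10, 500)) + 1.0
y = rng.normal(size=(10, 500))
stat, p_value = diproperm(x, y)
print(f"statistic = {stat:.3f}, empirical p-value = {p_value:.4f}")
```

Recomputing the direction inside every permutation matters: in HDLSS settings a linear direction can separate even randomly labelled data, so the null distribution has to include that fitting step.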
Motivated by the prevalence of high-dimensional, low-sample-size datasets in modern statistical applications, we propose a general nonparametric framework, Direction-Projection-Permutation (DiProPerm), for testing high-dimensional hypotheses. The method is aimed at rigorously testing whether differences seen in lower-dimensional visualizations are statistically significant. Theoretical analysis under the non-classical asymptotic regime of dimension going to infinity for fixed sample size reveals that certain natural variations of DiProPerm can have very different behaviors. An empirical power study both confirms the theoretical results and suggests that DiProPerm is a powerful test in many settings. Finally, DiProPerm is applied to a high-dimensional gene expression dataset.
Continuous-depth neural networks can be viewed as deep limits of discrete neural networks whose dynamics resemble a discretization of an ordinary differential equation (ODE). Although important steps have been taken to realize the advantages of such continuous formulations, most current techniques are not truly continuous-depth as they assume identical layers. Indeed, existing works throw into relief the myriad difficulties presented by an infinite-dimensional parameter space in learning a continuous-depth neural ODE. To this end, we introduce a shooting formulation which shifts the perspective from parameterizing a network layer-by-layer to parameterizing over optimal networks described only by a set of initial conditions. For scalability, we propose a novel particle-ensemble parametrization which fully specifies the optimal weight trajectory of the continuous-depth neural network. Our experiments show that our particle-ensemble shooting formulation can achieve competitive performance, especially on long-range forecasting tasks. Finally, though the current work is inspired by continuous-depth neural networks, the particle-ensemble shooting formulation also applies to discrete-time networks and may lead to a new fertile area of research in deep learning parametrization.
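The link between discrete networks and ODEs mentioned at the start of the abstract can be made concrete with a forward Euler discretization, in which each layer adds a small step to the state. The sketch below illustrates only that background idea; it does not implement the shooting or particle-ensemble formulation. The layer form `tanh(W x + b)` and the function name `euler_network` are assumptions made for illustration.

```python
import numpy as np


def euler_network(x0, weights, biases, step):
    """Apply x_{k+1} = x_k + step * tanh(W_k x_k + b_k) for each layer k.

    With depth-dependent (W_k, b_k) this is a forward Euler discretization of
    dx/dt = tanh(W(t) x + b(t)); reusing one (W, b) for every layer gives the
    'identical layers' special case.
    """
    x = np.asarray(x0, dtype=float)
    for w, b in zip(weights, biases):
        x = x + step * np.tanh(w @ x + b)
    return x


rng = np.random.default_rng(0)
x0 = np.array([1.0, -0.5])
depth = 8
step = 1.0 / depth

# Depth-varying parameters: one (W_k, b_k) pair per layer.
varying_w = [rng.normal(scale=0.5, size=(2, 2)) for _ in range(depth)]
varying_b = [rng.normal(scale=0.1, size=2) for _ in range(depth)]

# Identical layers: the same pair reused at every depth.
shared_w = [varying_w[0]] * depth
shared_b = [varying_b[0]] * depth

print("depth-varying:", euler_network(x0, varying_w, varying_b, step))
print("identical:    ", euler_network(x0, shared_w, shared_b, step))
```

Storing a separate parameter set per layer grows without bound as the step size shrinks, which is the infinite-dimensional parameter space the abstract refers to.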