Publications

(*) denotes equal contribution

2024

ICML

InterpreTabNet: Distilling Predictive Signals From Tabular Data

Jacob Yoke Hong Si, Wendy Yusi Cheng, Michael Cooper, and Rahul Krishnan

In The 41st International Conference on Machine Learning, 2024.

Tabular data are ubiquitous across industry sectors. Neural networks for tabular data, such as TabNet, have been proposed to make predictions while leveraging the attention mechanism for interpretability. We find that the inferred attention masks on high-dimensional data are often dense, hindering interpretability. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL divergence regularizer. The regularizer prevents overlapping feature selection by promoting sparsity, which maximizes the model’s efficacy and improves interpretability when determining which features are important for predicting the outcome. To automate the interpretation of feature interdependencies from our model, we employ GPT-4 and use prompt engineering to map the learned feature mask onto natural language text describing the learned signal. Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy.
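As a rough illustration of the two building blocks named in the abstract, the sketch below draws a relaxed feature mask from a Gumbel-Softmax distribution and computes a KL divergence between two such masks. It is not the paper's implementation: the function names, the example logits, the temperature, and the choice to compare masks from two decision steps are all assumptions made for illustration, and the abstract does not specify how the regularizer enters the training loss.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=0.5, rng=None):
    """Draw one relaxed (soft) one-hot sample over features.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Clip the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(value - peak) for value in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Assumed example: unnormalized feature scores at two decision steps.
rng = random.Random(0)
logits = [2.0, 0.5, 0.1, -1.0]
mask_step_1 = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
mask_step_2 = gumbel_softmax_sample(logits[::-1], temperature=0.5, rng=rng)

# A larger divergence means the two masks put their weight on different features.
divergence = kl_divergence(mask_step_1, mask_step_2)
```

Lower temperatures push each sampled mask closer to a one-hot vector, which is one way a sparser, more readable mask could arise; the specific schedule used in the paper is not stated here.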

2022

Book Chapter

Assessing Infant Mortality Rate: Problems stemming from Household Living Conditions, Women’s Education and Health

Jacob Yoke Hong Si and Rohan Alexander

In "Telling Stories with Data: With Applications in R" by Rohan Alexander

Which areas can be improved to promote the well-being of women in India and hence reduce the infant mortality rate? Using data from the 1998-1999 India National Family Health Survey, provided by the Demographic and Health Surveys (DHS) Program, we describe the demographics of Indian women and infants across different states of India. We find that the root causes of high infant mortality rates stem from poor living conditions, which affect the likelihood that women attain an education and understand the importance of antenatal care and assistance during delivery. We also explore other factors, such as potentially inheritable traits (unhealthy body weight and anaemia) and an infant’s diet. These factors are crucial to an infant’s development and to reducing the infant mortality rate.