Diptesh Das

Researcher at the Department of Computational Biology and Medical Sciences, The University of Tokyo.

Projects (Selected)

Preference-Optimized Pareto Set Learning (PO-PSL) for Blackbox Optimization

Multi-Objective Optimization (MOO) presents a significant challenge in many real-world applications. For complex problems, it is usually impossible to find a single solution that optimizes all objectives simultaneously. In experimental design scenarios, obtaining the entire Pareto set (PS) is beneficial because it allows flexible exploration of the design space. We have developed an efficient Pareto set learning (PSL) algorithm that learns the continuous manifold of the Pareto front (PF). This enables a robot or a domain expert to explore the PF in real time, without reconstructing the PF for each new trade-off preference among objectives.
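To make the notions of a Pareto set and a trade-off preference concrete, here is a minimal sketch on a toy bi-objective problem. It is not the PO-PSL algorithm: it filters a fixed grid of candidates instead of learning a continuous manifold, and the toy objectives, the function names `non_dominated_mask` and `pick_by_preference`, and the weighted Tchebycheff scalarization are all assumptions chosen for illustration.

```python
import numpy as np

def non_dominated_mask(F):
    """Boolean mask of the rows of F (minimization) that no other row dominates."""
    mask = np.ones(F.shape[0], dtype=bool)
    for i in range(F.shape[0]):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(F <= F[i], axis=1)
        better = np.any(F < F[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask

def pick_by_preference(F, weights):
    """Index of the row minimizing a weighted Tchebycheff scalarization.

    The ideal point is taken as the column-wise minimum of F.
    """
    ideal = F.min(axis=0)
    return int(np.argmin(np.max(weights * (F - ideal), axis=1)))

# Toy problem (assumed): minimize f1(x) = x^2 and f2(x) = (x - 2)^2.
xs = np.linspace(-1.0, 3.0, 401)
F = np.column_stack([xs**2, (xs - 2.0) ** 2])

mask = non_dominated_mask(F)
pareto_x, pareto_F = xs[mask], F[mask]

# An equal-weight preference selects the balanced trade-off on the front.
choice = pick_by_preference(pareto_F, np.array([0.5, 0.5]))
balanced_x = pareto_x[choice]
```

For this toy problem the non-dominated designs lie in the interval from 0 to 2, and changing the weight vector moves the selected point along the front. A learned Pareto set model aims to answer such preference queries directly, without re-running a search like this one.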

[paper] [code]

A Confidence Machine for Sparse High-order Interaction Model

The Sparse High-order Interaction Model (SHIM) is an interpretable yet non-linear machine learning model. It can capture interactions among many features, which is crucial in real-world applications such as detecting gene-gene interactions and identifying groups of mutations. However, a point prediction in regression is often not enough: many high-stakes decision-making problems demand a prediction band (or interval) around the point prediction. We developed an efficient algorithm that produces statistically efficient (narrow) prediction intervals containing the point prediction of a SHIM.
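As a rough illustration of what a prediction interval around a point prediction looks like, the sketch below uses split conformal prediction with an ordinary least-squares model. This is a swapped-in, simpler technique and not the algorithm developed for SHIM; the synthetic data, the helper names, and the choice of `alpha = 0.1` are assumptions made for the example.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for a linear model with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def split_conformal(X_fit, y_fit, X_cal, y_cal, X_new, alpha=0.1):
    """Return (lower, point prediction, upper) using split conformal prediction."""
    coef = fit_ols(X_fit, y_fit)
    # Nonconformity scores: absolute residuals on held-out calibration data.
    scores = np.sort(np.abs(y_cal - predict_ols(coef, X_cal)))
    n = len(scores)
    # Finite-sample corrected quantile index (1-based).
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    q = scores[k - 1] if k <= n else np.inf
    center = predict_ols(coef, X_new)
    return center - q, center, center + q

# Synthetic data (assumed): a linear signal with Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=900)

lower, center, upper = split_conformal(
    X[:200], y[:200], X[200:400], y[200:400], X[400:], alpha=0.1
)
coverage = np.mean((y[400:] >= lower) & (y[400:] <= upper))
```

Each interval is centered on the point prediction, and `coverage` should come out close to the nominal 90% level on the held-out points. The split variant trades some statistical efficiency for simplicity, since part of the data is reserved for calibration.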

[paper] [code]

Feature Importance Measurement based on Decision Tree Sampling

Random forest is effective for prediction tasks, but the randomness of tree generation hinders interpretability in feature importance analysis. To address this, we proposed DT-Sampler, a SAT-based method for measuring feature importance in tree-based models. Our method has fewer parameters than random forest and provides higher interpretability and stability in the analysis of real-world problems.

[paper] [code]

Fast and More Powerful Selective Inference for Sparse High-order Interaction Model

Finding statistically significant (low p-value) high-order feature interactions is challenging because of the intrinsic high dimensionality of the combinatorial effects. Another problem in data-driven modeling is the effect of “cherry-picking” (i.e., selection bias). We developed a fast algorithm, based on a branch-and-bound tree pruning strategy, that corrects for selection bias and reports statistically valid (selection-bias-corrected) p-values for high-order feature interactions.

[paper] [code]

Sparse High-order Interaction Model with Rejection option (SHIMR)

SHIMR is an interpretable, non-linear machine learning model that includes a rejection option, which is essential for high-stakes decision-making such as medical diagnosis. The model can identify uncertain regions of the data and refrain from making a decision when it lacks confidence. For instance, it can automatically pinpoint samples that are close to the decision boundary and decline to classify them. SHIMR also addresses class imbalance, and its visualization module illustrates the relationships between model scores, feature interactions, and their importance, making it a valuable tool for promoting trustworthy AI.
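The idea of a rejection option can be sketched with a simple margin rule: predict the sign of a model score, but abstain when the score falls too close to the decision boundary. This is only an illustration under assumed conventions (labels +1 and -1, 0 for "rejected", a fixed band width `delta`), not SHIMR's actual decision rule or training procedure.

```python
import numpy as np

def predict_with_rejection(scores, delta):
    """Return +1 or -1 by the sign of each score, or 0 (reject) when |score| < delta."""
    scores = np.asarray(scores, dtype=float)
    labels = np.where(scores >= 0, 1, -1)
    # Samples inside the band around the decision boundary are left undecided.
    labels[np.abs(scores) < delta] = 0
    return labels

# Hypothetical model scores for four samples.
decisions = predict_with_rejection([-2.0, -0.1, 0.05, 1.5], delta=0.5)
```

Here the two samples with scores near zero are rejected, while the confident ones receive a label. A wider band rejects more samples, so `delta` controls the trade-off between coverage and the risk of a wrong decision.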

[paper] [code]