Projects

A collection of projects, software, etc. that I think is worth showcasing.

Publication

Please refer to the Publication section.

Mini-research projects

Research projects that did not end up in a publication but taught me valuable lessons.

Non-linear Debiasing of Sentence Embeddings with Kernel PCA (report, code)

A project that taught me the importance of not naively believing whatever a paper claims, even if it comes from a famous university and was published at a top-tier conference. While trying to extend a paper from EMNLP, we realized that its proposed method is infeasible in practice. The authors confirmed that they had intentionally skipped the experiments for the proposed method and replaced them with reasonable alternatives, without any mention of the feasibility issue. This was a joint project with a classmate, in which I was responsible for the theoretical analysis and the implementation of kernel PCA.

Fake voice detection (report, code)

This is a project from 2018, when deepfakes had much less recognition than they do nowadays. I generated fake voice clips of former US president Barack Obama using CycleGAN and showed that a simple voice verification system based on GMMs can be spoofed. Though the implementation is very minimal, it has more stars than any of my other code repositories on GitHub. Although I tend to seek more and more technicality in my research, this project made me realize that less technical work answering a timely question can be more valuable than highly technical work that merely extends existing work.

Can Higher-Order Monte Carlo Methods Help Reinforcement Learning? (report, code)

I tried to improve the sample efficiency of policy gradient estimators with the quasi-Monte Carlo (QMC) method, which offers faster convergence than naive Monte Carlo for sufficiently regular integrands. This project gave me a better understanding of the sample inefficiency of RL under sparse rewards, and of higher-order integration with QMC. Though I failed to improve sample efficiency in this project, I later found the following paper, which applies QMC to RL in a more reasonable manner and gives a positive answer to the research question above: Policy Learning and Evaluation with Randomized Quasi-Monte Carlo.
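
The convergence gap between the two estimators is easy to see on a toy integrand. The following sketch (my own illustration using SciPy's `scipy.stats.qmc.Sobol`, unrelated to the project's code) compares plain Monte Carlo with a scrambled Sobol sequence on a smooth 2-D integral:

```python
import numpy as np
from scipy.stats import qmc

# Estimate the integral of f(x, y) = x * y over [0, 1]^2 (true value: 1/4)
# with plain Monte Carlo vs. a scrambled Sobol (QMC) sequence.
f = lambda pts: pts[:, 0] * pts[:, 1]
n = 1024  # power of 2, as recommended for Sobol sequences
true_value = 0.25

rng = np.random.default_rng(0)
mc_estimate = f(rng.random((n, 2))).mean()

sobol = qmc.Sobol(d=2, scramble=True, seed=0)
qmc_estimate = f(sobol.random(n)).mean()

mc_error = abs(mc_estimate - true_value)
qmc_error = abs(qmc_estimate - true_value)
```

For a smooth integrand like this, the QMC error decays nearly as O(1/n) rather than the O(1/sqrt(n)) of plain Monte Carlo, which is precisely the property I hoped to exploit for policy gradients.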

Low-rank GP on discrete domain (report, code)

I conducted a theoretical analysis and implementation of low-rank Gaussian processes, from which I gained a good understanding of, and intuition for, matrix decompositions such as the SVD and the Cholesky decomposition, projection operators, and kernel methods. This knowledge turned out to be quite useful in my recent publication.
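
To give a flavor of what "low-rank" means here, a minimal Nyström-style sketch (my own illustration, not the project's code): an RBF kernel matrix K can be approximated from a small set of landmark points as K ≈ K_nm K_mm^{-1} K_mn, reducing storage and computation from O(n²) to O(nm).

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    # Squared-exponential kernel on 1-D inputs.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
landmarks = x[:20]  # m << n landmark (inducing) points

K = rbf(x, x)
K_nm = rbf(x, landmarks)
K_mm = rbf(landmarks, landmarks)

# Rank-m Nystroem approximation: K ~ K_nm K_mm^{-1} K_mn.
jitter = 1e-8 * np.eye(len(landmarks))  # for numerical stability
K_approx = K_nm @ np.linalg.solve(K_mm + jitter, K_nm.T)

rel_error = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
```

Because the RBF kernel's eigenvalues decay quickly, a handful of landmarks already gives a very accurate reconstruction of the full kernel matrix.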

Software

confounding-robust-inference (code, documentation)

Slightly over-engineered code for our paper A Convex Framework for Confounding Robust Inference, from which I learned how to properly set up a Python package, CI/CD, a test suite, and documentation.

pca-impute (code)

A simple but fast missing-value imputation package based on iterative PCA (i.e., iterative SVD), with a scikit-learn-style API.
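
The underlying algorithm is simple enough to sketch in a few lines (a toy re-implementation for illustration, not the package's actual code): fill missing entries with column means, then alternate between a rank-k SVD reconstruction and re-filling the missing entries.

```python
import numpy as np

def pca_impute(X, rank, n_iters=50):
    """Impute np.nan entries of X by iterative truncated SVD."""
    mask = np.isnan(X)
    filled = np.where(mask, np.nanmean(X, axis=0), X)  # start from column means
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, low_rank, X)  # keep observed entries fixed
    return filled

# Demo on a noiseless rank-2 matrix with ~10% of entries missing.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(X_true.shape) < 0.1
X_obs = np.where(mask, np.nan, X_true)

X_hat = pca_impute(X_obs, rank=2)
rmse = np.sqrt(np.mean((X_hat[mask] - X_true[mask]) ** 2))
```

On low-rank data this recovers the missing entries far more accurately than the column-mean baseline it starts from.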

imputax (code)

Bayesian missing-value imputation with probabilistic PCA and the factor model, implemented in JAX.

OSS contributions

  • scikit-learn (a bug fix for kernel PCA, #19732)
  • SciPy (a bug fix for LatinHypercube, #13654)
  • Optuna (add a multivariate TPE sampler, #1767; add a QMC sampler, #2423; support batched sampling with BoTorch, #4591)

Reading club

Slide decks I presented in previous reading clubs.

"Temporal Parallelization of Bayesian Smoothers", presented in August 2023 (slides)

This paper improves the parallel complexity of Bayesian filtering and smoothing from O(n) (for the traditional forward-backward algorithm) to O(log n), which is a quite striking result.
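
The key trick is that the filtering recursion is an associative operation, so it can be evaluated with a parallel prefix scan. A stripped-down illustration (scalar affine maps standing in for the Kalman recursion; my own sketch, not the paper's algorithm): composing the updates x_k = a_k x_{k-1} + b_k associatively gives the same result under any bracketing, which is what permits O(log n) parallel depth via a tree reduction.

```python
import numpy as np

def combine(e1, e2):
    # Compose two affine maps: applying (a1, b1) first, then (a2, b2),
    # gives x -> a2*(a1*x + b1) + b2 = (a2*a1)*x + (a2*b1 + b2).
    a1, b1 = e1
    a2, b2 = e2
    return a2 * a1, a2 * b1 + b2

rng = np.random.default_rng(0)
elems = [(rng.uniform(0.5, 1.0), rng.normal()) for _ in range(8)]

# Sequential evaluation of x_k = a_k x_{k-1} + b_k starting from x_0 = 1.
x = 1.0
for a, b in elems:
    x = a * x + b

# Tree-structured (parallel-style) reduction of the same elements.
level = elems
while len(level) > 1:
    level = [combine(level[i], level[i + 1]) for i in range(0, len(level), 2)]
a_total, b_total = level[0]
x_tree = a_total * 1.0 + b_total
```

The paper's contribution is to cast the full Kalman/Bayesian filtering and smoothing recursions (not just scalar affine maps) as elements of such an associative operation.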

"Reinforcement Learning via Fenchel-Rockafellar Duality", presented in January 2023 (slides)

A summary of DICE (stationary DIstribution Correction Estimation) techniques in offline RL. The stationary distribution of a fixed policy is known to be the solution of a linear operator equation, and the authors provide a systematic recipe for solving this equation using convex duality.
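
Concretely, the equation in question is the stationarity (Bellman flow) condition on the discounted occupancy d^π; in its standard form (my paraphrase, not a quote from the paper) it reads:

```latex
d^{\pi}(s') = (1 - \gamma)\,\mu_0(s') + \gamma \sum_{s,a} P(s' \mid s, a)\,\pi(a \mid s)\, d^{\pi}(s)
```

This is linear in d^π, and DICE methods solve it (or its dual) from off-policy samples via convex duality.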

"Kernel Instrumental Variable Regression", presented in May 2020 (slides)

An extension of classic linear instrumental variable regression using kernel methods, and one of the pioneering causal ML papers that "kernelized" classic linear methods for causal inference. An interesting technical point is that their method involves learning a linear operator from the feature space of one kernel to the feature space of another, which is not entirely trivial but is still analytically feasible.

"A brief review on over-smoothing in graph neural networks", presented in May 2021 (slides)

Over-smoothing in graph neural networks (GNNs) is a phenomenon where a GNN's performance degrades as the network becomes too deep. I summarized the theoretical insights and proposed solutions available as of early 2021. In theory, over-smoothing is attributed to the spectral decay caused by the repeated application of the same message-passing operator, analogous to power iteration. Proposed solutions at the time were either residual skip connections or sparsification of the message passing, both of which reduce the spectral decay of the operator.
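
The power-iteration analogy can be seen directly in a toy example (my own illustration): repeatedly applying the row-normalized adjacency of a connected graph washes out the differences between node features, because every non-dominant spectral component decays geometrically.

```python
import numpy as np

# Cycle graph on 10 nodes with self-loops; the row-normalized adjacency
# D^{-1}(A + I) is the "average over self and neighbors" message-passing step.
n = 10
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0
A_self = A + np.eye(n)
A_hat = A_self / A_self.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
H = rng.normal(size=(n, 4))  # random node features
std_before = H.std(axis=0).mean()

for _ in range(100):  # 100 rounds of linear message passing
    H = A_hat @ H
std_after = H.std(axis=0).mean()
# std_after is orders of magnitude smaller: the node features have over-smoothed.
```

After enough rounds, every node carries essentially the same feature vector, which is exactly the failure mode the skip-connection and sparsification remedies target.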

My two cents on career and academics

I sometimes provide mentorship and advice to aspiring students (mostly Japanese) interested in STEM education and careers abroad, and I have written a few articles on these topics, which turned out to be quite popular.

"Tips for studying and working abroad for Japanese students" (posts on github)

I wrote a series of "things I wish I knew when I was 18" posts. Japanese people are significantly underrepresented in Western STEM fields compared to other groups (such as Chinese and Koreans). This underrepresentation is partially due to the scarcity of information available in Japanese, so posts like these can be helpful. Indeed, they attracted a considerable level of engagement and got nearly 100 stars on GitHub ⭐ (even more popular than any of my software 🥲)!