I am working on an ML project where we would like to visualize movements in a high-dimensional but sparse vector space (e.g. a 1x75 vector where most of the entries are either one-hot encoded binary or modulo 3). Since the visualization is mainly to help us understand how our model is learning/moving through the space of possible outputs over the course of a given training run, we want something that is quick to set up in each instance. I am vaguely familiar with t-SNE and UMAP, and have a decent amount of experience with autoencoders, but I would prefer an option that requires minimal training overhead and can just quickly map to 2D for visualization purposes (instead of training a new visualization model for each run). What are some quick/simple dimensionality reduction methods which might work for this type of application?
2 Answers
I'm not sure what your challenge with t-SNE or UMAP is in this case, but one quick, computationally inexpensive reduction, if interpretability is not a concern, is random projection. I usually go for PCA, though, since I often do want some interpretability.
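A minimal numpy sketch of both options, assuming toy data shaped like the 1x75 sparse vectors in the question (the data, seed, and dimensions are all hypothetical). Random projection needs no fitting at all, so the same matrix can be reused across every training run; PCA is a single linear fit via SVD.

```python
import numpy as np

# Hypothetical stand-in for one training run: 200 sparse 1x75 vectors,
# mostly zeros with a few ones, as described in the question.
rng = np.random.default_rng(0)
X = (rng.random((200, 75)) < 0.1).astype(float)

# Random projection: multiply by a fixed Gaussian matrix.
# No fitting, so the mapping stays identical across runs.
R = rng.standard_normal((75, 2)) / np.sqrt(2)
X_rp = X @ R                      # shape (200, 2), ready to scatter-plot

# PCA via SVD on the centered data: one linear fit, no iterative training.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:2].T             # shape (200, 2)
```

With scikit-learn available, `GaussianRandomProjection(n_components=2)` and `PCA(n_components=2)` do the same in one call each, and the fitted PCA exposes `components_` for the interpretability mentioned above.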
SOM? Or their descendants? Self-organizing maps. There are more developed versions, but I think the original is quite robust if you can feed it metrics of interest to use in its clustering (besides the raw data).
I wonder whether one could apply it within the latent space of the model, using the ordered variable encodings you mention. I hope I understood your question; I'm not familiar with all the acronyms, and methods like the ones you mentioned may already use such tools internally. I do not think SOM in its original form has high training overhead.
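To make the suggestion concrete, here is a minimal from-scratch SOM in numpy; the function names, grid size, and decay schedules are all illustrative choices, not a reference implementation. Each sample is finally mapped to its best-matching unit's 2-D grid coordinates, which is the quick 2-D visualization the question asks for.

```python
import numpy as np

def train_som(X, grid=8, iters=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Tiny self-organizing map: returns a (grid, grid, d) weight lattice."""
    rng = np.random.default_rng(seed)
    W = rng.random((grid, grid, X.shape[1]))
    # Lattice coordinates, used by the Gaussian neighborhood function.
    gy, gx = np.mgrid[0:grid, 0:grid]
    coords = np.stack([gy, gx], axis=-1).astype(float)   # (grid, grid, 2)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        # Best-matching unit: node whose weight vector is closest to x.
        dists = ((W - x) ** 2).sum(axis=-1)
        bmu = np.unravel_index(dists.argmin(), dists.shape)
        # Learning rate and neighborhood radius decay linearly over time.
        frac = t / iters
        lr = lr0 * (1 - frac)
        sigma = sigma0 * (1 - frac) + 0.5
        h = np.exp(-((coords - np.array(bmu)) ** 2).sum(-1) / (2 * sigma**2))
        W += lr * h[..., None] * (x - W)
    return W

def som_project(X, W):
    """Map each sample to its best-matching unit's 2-D grid coordinates."""
    D = ((X[:, None, None, :] - W[None]) ** 2).sum(-1)   # (n, grid, grid)
    flat = D.reshape(len(X), -1).argmin(axis=1)
    return np.stack(np.unravel_index(flat, W.shape[:2]), axis=1)

# Hypothetical sparse data shaped like the question's 1x75 vectors.
rng = np.random.default_rng(1)
X = (rng.random((300, 75)) < 0.1).astype(float)
W = train_som(X)
xy = som_project(X, W)   # (300, 2) integer grid positions to scatter-plot
```

A few thousand vector updates on an 8x8 lattice run in well under a second, so retraining per run is cheap; snapshots of the model's outputs at different training steps can be projected onto the same trained lattice to watch movement through the space.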