
I am working on an ML project where we would like to visualize movements in a high-dimensional but sparse vector space (e.g. a 1x75 vector where most of the entries are either one-hot encoded binary or modulo 3). Since the visualization is mainly to help us understand how our model moves through the space of possible outputs over the course of a given training run, we want something that is quick to set up in each instance. I am vaguely familiar with t-SNE and UMAP, and have a decent amount of experience with autoencoders, but I would prefer an option with minimal training overhead that can just quickly map to 2D for visualization purposes, rather than training a new visualization model for each run. What are some quick, simple dimensionality reduction methods that might work for this type of application?

pigeon
2 Answers


I'm not sure what your challenge with t-SNE or UMAP is in this case, but one quick reduction that is computationally inexpensive, if interpretability is not a concern, is random projection. I usually go for PCA myself, though, since I often do want some interpretability.
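A minimal sketch of both options, assuming scikit-learn is available; the array `X` here is random stand-in data for your 1x75 snapshots, not your actual model outputs. Note that random projection's `fit` only draws a fixed projection matrix, so the same transformer can map every snapshot of a training run consistently with essentially zero training cost:

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in for model outputs: 200 snapshots of a sparse, mostly-binary 1x75 vector.
X = (rng.random((200, 75)) < 0.1).astype(float)

# Random projection: fast, no real "training", but axes are not interpretable.
rp = GaussianRandomProjection(n_components=2, random_state=0)
X2_rp = rp.fit_transform(X)

# PCA alternative: the two axes are the directions of maximum variance,
# which gives you some interpretability at slightly higher cost.
X2_pca = PCA(n_components=2).fit_transform(X)

print(X2_rp.shape, X2_pca.shape)  # (200, 2) (200, 2)
```

For a training-run visualization you would fit either transformer once (e.g. on the first batch of snapshots) and then call `transform` on each subsequent snapshot, so points from different training steps live in the same 2D coordinate frame.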

Philipp

Self-organizing maps (SOMs), or their descendants? There are more developed versions, but the original is quite robust, and you can feed it metrics of interest to use in its clustering alongside the raw data.

I wonder whether you could apply one within the latent space of your model, using the ordered variable encodings you mention. I hope I understood your question; like the methods you mentioned, such tools may already be in use under acronyms I don't know. In any case, SOM in its original form is not high-overhead learning.
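To make the idea concrete, here is a bare-bones SOM in plain NumPy, written as an illustrative sketch rather than a tuned implementation (the grid size, learning-rate schedule, and neighborhood decay are arbitrary choices, and `X` is again random stand-in data). Each 75-dim sample gets mapped to the grid coordinate of its best-matching unit, which serves as its 2D position:

```python
import numpy as np

def train_som(X, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small SOM; returns unit weights and their grid coordinates."""
    rng = np.random.default_rng(seed)
    h, w = grid
    W = rng.random((h * w, X.shape[1]))                 # unit weight vectors
    # (row, col) coordinate of each unit, for neighborhood distances
    coords = np.array([(r, c) for r in range(h) for c in range(w)], float)
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            t = step / n_steps
            lr = lr0 * (1 - t)                          # decaying learning rate
            sigma = sigma0 * (1 - t) + 1e-3             # shrinking neighborhood
            bmu = np.argmin(((W - x) ** 2).sum(axis=1)) # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            g = np.exp(-d2 / (2 * sigma ** 2))          # neighborhood weights
            W += lr * g[:, None] * (x - W)              # pull units toward x
            step += 1
    return W, coords

def project(X, W, coords):
    """2D position of each sample = grid coordinate of its best-matching unit."""
    dists = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return coords[np.argmin(dists, axis=1)]

rng = np.random.default_rng(1)
X = (rng.random((100, 75)) < 0.1).astype(float)         # stand-in snapshots
W, coords = train_som(X)
pts = project(X, W, coords)
print(pts.shape)  # (100, 2)
```

If you would rather not roll your own, small third-party packages such as MiniSom provide a similar API; either way, training a SOM this size on a few hundred snapshots takes well under a second.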

dbdb