I am working on an ML project where we would like to visualize movements in a high-dimensional but sparse vector space (e.g. a 1x75 vector where most of the entries are either one-hot encoded binary or modulo 3). Since the visualization is mainly to help us understand how our model is learning/moving through the space of possible outputs over the course of a given training run, we want something that is quick to set up in each instance. I am vaguely familiar with t-SNE and UMAP, and have a decent amount of experience with autoencoders, but I would prefer an option that requires minimal training overhead and can just quickly map to 2D for visualization purposes (instead of training a new visualization model for each run). What are some quick/simple dimensionality reduction methods which might work for this type of application?
2 Answers
I'm not sure what your challenge with t-SNE or UMAP is in this case, but one quick, computationally inexpensive reduction, if interpretability is not a concern, is random projection. I usually go for PCA, though, since I often do want some interpretability.
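A minimal numpy sketch of both options, assuming toy data shaped like the 1x75 sparse vectors in the question (the data, seed, and dimensions are all hypothetical). Random projection needs no fitting at all, so the same matrix can be reused across every training run; PCA is a single linear fit via SVD.

```python
import numpy as np

# Hypothetical stand-in for one training run: 200 sparse 1x75 vectors,
# mostly zeros with a few ones, as described in the question.
rng = np.random.default_rng(0)
X = (rng.random((200, 75)) < 0.1).astype(float)

# Random projection: multiply by a fixed Gaussian matrix.
# No fitting, so the mapping stays identical across runs.
R = rng.standard_normal((75, 2)) / np.sqrt(2)
X_rp = X @ R                      # shape (200, 2), ready to scatter-plot

# PCA via SVD on the centered data: one linear fit, no iterative training.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:2].T             # shape (200, 2)
```

With scikit-learn available, `GaussianRandomProjection(n_components=2)` and `PCA(n_components=2)` do the same in one call each, and the fitted PCA exposes `components_` for the interpretability mentioned above.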
SOM? Or their descendants? Self-organizing maps. There are more developed versions, but I think the original is quite robust if you can feed it metrics of interest to use in its clustering (besides the raw data).
I wonder whether one could apply it within the latent space of the model, using the ordered variable encodings you mention. I hope I understood your question; I'm not familiar with all the acronyms, and methods like the ones you mentioned may already use such tools internally. I do not think SOM in its original form has high training overhead.
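To make the suggestion concrete, here is a minimal from-scratch SOM in numpy; the function names, grid size, and decay schedules are all illustrative choices, not a reference implementation. Each sample is finally mapped to its best-matching unit's 2-D grid coordinates, which is the quick 2-D visualization the question asks for.

```python
import numpy as np

def train_som(X, grid=8, iters=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Tiny self-organizing map: returns a (grid, grid, d) weight lattice."""
    rng = np.random.default_rng(seed)
    W = rng.random((grid, grid, X.shape[1]))
    # Lattice coordinates, used by the Gaussian neighborhood function.
    gy, gx = np.mgrid[0:grid, 0:grid]
    coords = np.stack([gy, gx], axis=-1).astype(float)   # (grid, grid, 2)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        # Best-matching unit: node whose weight vector is closest to x.
        dists = ((W - x) ** 2).sum(axis=-1)
        bmu = np.unravel_index(dists.argmin(), dists.shape)
        # Learning rate and neighborhood radius decay linearly over time.
        frac = t / iters
        lr = lr0 * (1 - frac)
        sigma = sigma0 * (1 - frac) + 0.5
        h = np.exp(-((coords - np.array(bmu)) ** 2).sum(-1) / (2 * sigma**2))
        W += lr * h[..., None] * (x - W)
    return W

def som_project(X, W):
    """Map each sample to its best-matching unit's 2-D grid coordinates."""
    D = ((X[:, None, None, :] - W[None]) ** 2).sum(-1)   # (n, grid, grid)
    flat = D.reshape(len(X), -1).argmin(axis=1)
    return np.stack(np.unravel_index(flat, W.shape[:2]), axis=1)

# Hypothetical sparse data shaped like the question's 1x75 vectors.
rng = np.random.default_rng(1)
X = (rng.random((300, 75)) < 0.1).astype(float)
W = train_som(X)
xy = som_project(X, W)   # (300, 2) integer grid positions to scatter-plot
```

A few thousand vector updates on an 8x8 lattice run in well under a second, so retraining per run is cheap; snapshots of the model's outputs at different training steps can be projected onto the same trained lattice to watch movement through the space.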