3

I have about 30 lists of unequal length (some of which are triplicates of the data), corresponding to metrics relating to nodes of different graphs. I want to compare their similarity using a distance metric, but am unsure which method to use given that the lists are of unequal length. I have been exploring dynamic time warping, but wonder whether there is a simpler method.

For example, I was considering creating histograms with the same bin edges and number of bins for each list and using a distance metric on the bin frequencies, but I am not sure how to go about this in Python, or whether there is a function/package that does this already. Is this even a "good" way of doing it?
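A minimal sketch of the histogram idea described above, assuming the lists hold plain numeric values and NumPy is available. The function name `histogram_distance`, the default of 20 bins, and the choice of total variation distance are illustrative assumptions, not a recommendation from the thread; any other distance between normalized histograms could be substituted.

```python
import numpy as np

def histogram_distance(a, b, bins=20):
    """Total variation distance between histograms of two lists.

    Hypothetical helper: both lists are binned on the same edges, so
    their lengths do not need to match.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Shared bin edges spanning the combined range of both lists.
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    edges = np.linspace(lo, hi, bins + 1)

    # Counts per bin, converted to relative frequencies so that
    # list length does not affect the comparison.
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()

    # Total variation distance: 0 for identical histograms,
    # 1 for histograms with no overlapping bins.
    return 0.5 * np.abs(p - q).sum()
```

When comparing all 30 lists pairwise, it is probably safer to compute one set of edges from the pooled data and reuse it for every pair, so that all distances are measured on the same grid. SciPy's `scipy.stats.wasserstein_distance` is an alternative that takes two samples of unequal length directly, without binning.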

I'm also interested in finding a way to measure the statistical significance of the distance measures between the different graphs.
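One possible way to attach a significance level to such a distance is a permutation test, sketched here under strong assumptions: the values in each list are treated as exchangeable, independent samples (which node metrics from a graph may not be), and the distance is a total variation distance between histograms on shared bins. The names `tv_distance` and `permutation_pvalue` and all default parameters are hypothetical.

```python
import numpy as np

def tv_distance(a, b, edges):
    # Total variation distance between two normalized histograms
    # computed on the same bin edges.
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    return 0.5 * np.abs(p / p.sum() - q / q.sum()).sum()

def permutation_pvalue(a, b, bins=20, n_perm=1000, seed=0):
    """Return (observed distance, permutation p-value) for two lists.

    Assumption: under the null hypothesis both lists come from the
    same distribution, so their values can be shuffled between lists.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled = np.concatenate([a, b])

    # Fixed bin edges from the pooled data, reused for every shuffle.
    edges = np.linspace(pooled.min(), pooled.max(), bins + 1)
    observed = tv_distance(a, b, edges)

    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        # Split the shuffled pool into groups of the original sizes.
        d = tv_distance(shuffled[:a.size], shuffled[a.size:], edges)
        if d >= observed:
            count += 1

    # Add-one correction keeps the p-value strictly above zero.
    return observed, (count + 1) / (n_perm + 1)
```

With about 30 lists there are several hundred pairs, so the resulting p-values would need a multiple-comparison correction before being interpreted.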

This is a lot for one question; I am new to this and appreciate any help. Thank you in advance!

user112237

1 Answer

1

Just to clarify the question: do the lists describe different graphs, or do you need the similarities to learn which lists refer to the same graph? Is the data duplicated or triplicated within single lists, or do some lists duplicate other lists? Would you consider removing redundant metrics?

Sorry, this is more of a comment than an answer, but I am not yet able to comment here, so I am starting the discussion this way.

Frankstr