I would like to discuss how to optimally implement the following:
Given a set of elements,
all of them are distinct. Call this input set N, it has size n. These element each have a weight. Now, you need to create a subset (R) of this N of size r, where r is smaller then or equal to n.
This subset, again, cannot contain repetitions. Every element of R needs to be randomly chosen from N, if an element has a higher weight, the chance of it being chosen needs to be higher. If the total weight of all element in N is W, then the chance that element i is chosen needs to be w_i/W.
One last detail is that the weights are prone to change, but only to increase, they cannot decrease.
I want a neat implementation for this. I am working in Java, but I am hoping to find some language agnostic properties or details (or even pseudo code) that are interesting.
Now, for my own solution: I create an Array List of the original set of elements. I make sure that the weights are natural numbers and I add them n times if the weight of the element is n. Then I shuffle the arraylist (collections.shuffle) and I keep on taking element from the shuffled list and adding them to a Java Set until the size of the set is r. If the weight of an element increases, it gets added some more times to the original array list. Shuffle again, make a new subset.
My implementation needs loads of memory and the shuffle thingy is also slower if the set becomes bigger. Is there a better idea out there?