2.3 Weighted Random Sampling 41
Algorithm 2.8:WeightedRandomSample:Merge(S 1=(τ1,T1,L1),
S
2=(τ2,T2,L2),τ1≥τ2)
1mergeL 1andL 2intoL;
2T←T 1,W←τ 1|T|;
3ford←1to|L 2|do
4 X←∅;
5 run lines 7–19 ofAlgorithm 2.7replacingswiths+|L 2|−d;
6ford←1to|T 2|do
7 X←{T 2[d]};
8 W←W+τ 2;
9 run lines 7–19 ofAlgorithm 2.7;
10returnS=(τ,T,L);
For any subsetQ⊆A,letw(Q) =
x∈Q
wx. One canQuerya
WeightedRandomSampleSforw(Q)for an arbitrary subsetQ⊆A,by
simply returning˜w(Q)=
x∈S∩Q
˜wx, that is, we simply add up all adjusted
weights of the items in the sample that fall inQ. This turns out to be an
unbiased estimator of the true total weightw(Q)with strong guarantees on
the variance, as discussed later.
Example.Suppose we are to maintain a weighted random sample of size
s=4 over a sequence of eight items numbered from 1 to 8 in order. The
following example shows one possible execution of theUpdatealgorithm,
together with the contents ofτ,T,L, andX. We use the notationx:w
xto
denote an itemxwith weightw
xor adjusted weight˜w x. Note that all items in
Thave the same adjusted weightτ.
Initialize: τ=0,L=[2 : 1,3:3,1:4,4:8],T=∅;
Update(5:3):Add5:3to L:L=[2 : 1,3:3,5:3,1:4,4:8]
Newτ=7/2,X=[2 : 1,3:3,5:3]
Deletion probabilities: 2 : 5/7,3:1/7,5:1/7
Suppose item 3 is deleted
T={2,5},L=[1 : 4,4:8];
Update(6:5):Add6:5to L:L=[1 : 4,6:5,4:8]
Newτ=16/3,X=[1 : 4,6:5]
Deletion probabilities: 2 : 11/32 ,5:11/32,1:1/4,6:1/16
Suppose item 5 is deleted
T={1,2,6},L=[4 : 8];
Update(7:1):Add7:1to X:X=[7 : 1]
Newτ=17/3,X=[7 : 1]