# So Who Won?Dynamic Max Discovery with the Crowd

In this paper, we focus on one such function, maximum, that finds the highest ranked object or tuple in a set. In particularm we study two problems: given a set of votes (pairwise comparisons among objects), how do we select the maximum? And how do we improve our estimate by requesting additional votes? We show that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard. We then provide heuristic functions to select the maximum given evidence, and to select additional votes.

4.Maximum Likelihood Permutations: We begin by considering graph like Figure 1. In particular, we view weighted arcs as mul- the complexity of computing Equation 4. The problem has been tiple arcs. For instance, if there is an arc from vertex A to B with shown to be NP-Hard . We briefly describe the result, before weight 3, we can instead view it as 3 separate edges from A to B. moving on to the Judgment Problem. We use this alternate representation in our proof. Consider Equation 3. Let ψ(πd , W ) denote the number ofX votes An arc i → j respects a permutation if the permutation has oj in W that agree with permutation πd , e.g. ψ(πd , W ) = wij .ranked higher than oi (and does not if the permutation has oi ranked ij:πd (j)<πd (i) higher than oj ). A Kemeny permutation is simply a permutation Let T (W ) denote the total number of votes in W , e.g. T = ||W ||1 . of the vertices (objects), such that the number of arcs that do not Equation 3 can be rewritten as: respect the permutation is minimum. There may be many such permutations, but there always is at least one such permutation. P (W |πd ) = Zpψ(πd ,W ) (1 − p)T (W )−ψ(πd ,W ) (6) The starting vertex (rank 1 object) in any of these permutations is x a−x a Kemeny winner. It can be shown that finding a Kemeny winner In this expression, Z is a constant. Note that p (1 − p) is an is NP-Hard (using a reduction from the feedback arc set problem, increasing function with respect to x, for 0.5 < p < 1 and constant similar to the proof in Hudry et al. ). a. Therefore, maximizing P (W |πd ) is equivalent to maximiz- We now reduce the Kemeny winner determination problem to ing ψ(πd , W ). This implies that computing arg maxi ψ(πi , W ) is one of finding the maximum object. Consider a directed weighted equivalent to computing the most likely permutation(s), arg maxi P (πi |W ). graph G, where we wish to find a Kemeny winner. We show that ψ(·) is known as the Kemeny metric, and has been well studied in with a suitable probability p, which we set, the maximum object the economic and social choice literature . Referring to the (i.e., the solution to the Judgment Problem) in G is a Kemeny win- example in Figure 1, the Kemeny metric is maximized by two per- ner. As before, the probability that a certain object oj is the maxi- mutations of the objects, (D, C, B, A) or (C, D, B, A). For both mum object is the right hand side of Equation 5 with k set to 1. The these permutations, the Kemeny metric is equal to 2+2+1+3 = 8. denominator can be ignored since it is a constant for all j. We set We next show that maximizing the Kemeny metric is equivalent to worker accuracy p to be very close to 1. In particular, we choose a solving a classical NP-Hard problem. value p such that 1−p p < n!1 . Consider the directed graph Gv = (V, A) representing the votes Now, consider all permutations πd that areX not Kemeny permuta- of W (Figure 1 is an example). The minimum feedback arc set of ′ ′ tions. In this case, it can be shown that P (W |πd ) < Gv is the smallest weight set of arcs A ⊆ A such that (V, A\A ) is d:πd is not Kemeny acyclic. Equivalently, the problem can also be seen as maximizing ′ P (W |πs ) for any Kemeny permutation πs . Thus, the object oj that the weight of acyclic graph (V, A\A ). maximizes Equation 5 (for k = 1) has to be one that is a Kemeny Now, supposing we have a method to solve for the minimum winner. feedback arc set, consider a topological ordering πt of the vertices X To see why P (W |πd ) < P (W |πs ) for a Ke- in the resulting acyclic graph. Referring back to Equation 6, the d:πd is not Kemeny permutation πt maximizes the Kemeny metric pψ(π,W ) . Therefore, meny permutation πs , notice that the left hand side is at most n! × solving the minimum feedback arc set problem for Gv is equivalent ′ ′ P (W |πd ) where πd is the permutation (not Kemeny) that has the to our original problem of finding the most likely permutation given least number of arcs that do not respect the permutation. Note that a set of votes. Referring back to our example in Figure 1, the mini- ′ P (W |πd ) is at most P (W |πs )× 1−p , since this permutation has at mum feedback arc set is 2, equivalent to cutting arc (C, B) and one p least one more mistake as compared to any Kemeny permutation. of arcs (C, D) or (D, C). Therefore, we have shown that, for a suitable p, the maximum Finding the minimum feedback arc set in a directed graph is a object in G is a Kemeny winner. Thus, we have a reduction from classical NP-Hard problem, implying that the problem of finding the Kemeny winner problem to the Judgement problem. Since find- the ML permutation given a set of votes is NP-Hard as well. ing a Kemeny winner is NP-Hard, this implies that finding the max- T HEOREM 1. (Hardness of Maximum Likelihood Permutation)  imum object in G is NP-Hard. Finding the Maximum Likelihood permutation given evidence is NP-Hard. #P-Hardness of Probability Computations: In addition to being NP-Hard to find the maximum object, we can show that evaluating Hardness of the Judgment Problem: In Section 2.2, we presented the numerator of the right hand side of Equation 5 (with k = 1) is a formulation for the Judgment Problem based on ML for finding #P-Hard, in other words: computing P (π −1 (1) = j, W ) is #P- the object most likely to be the max (maximum) object in O. Un- Hard. fortunately, the strategy based on that formulation was computa- We use a reduction from the problem of counting the number tionally infeasible, as it required computation across all n! permu- of linear extensions in a directed acyclic graph (DAG), which is tations of the objects in O. We now show that the optimal solution known to be #P-Hard. to the problem of finding the maximum object is in fact NP-Hard using a reduction from the problem of determining Kemeny win- ners . (Hudry et al.  actually show that determining Slater T HEOREM 3. (#P-Hardness of Probability Computation) Com- winners in tournaments is NP-Hard, but their proof also holds for puting P (π −1 (1) = j, W ) is #P-Hard. Kemeny winners. We will describe the Kemeny winner problem P ROOF. A linear extension is a permutation of the vertices, such below.) Our results and proof are novel. that all arcs in the graph respect the permutation (i.e., a linear ex- tension is the same as a Kemeny permutation for a DAG). T HEOREM 2. (Hardness of the Judgment Problem) Finding the Consider a DAG G = (V, A). We add an additional vertex x maximum object given evidence is NP-Hard. such that there is an arc from each of the vertices in G to x, giv- P ROOF. We first describe the Kemeny winner problem. In this ing a new graph G′ = (V ′ , A′ ). We now show that computing proof, we use an alternate (but equivalent) view of a directed weighted P (π −1 (1) = x, W ) in G′ can be used to compute the number of

5.linear extensions in G. Notice that: Given vote matrix W , we construct a complete graph between all ′ |A | objects where arc weights lji are equal to P (π(i) < π(j)|wij , wji ). P (π −1 (1) = x, W ) = X ai pi (1 − p)|A |−i ′ lji reflects the probability that oi is greater than oj given the local i=0 evidence wij and wji . It is important to note that this method is a ′ heuristic, e.g., we compute P (π(i) < π(j)|wij , wji ), rather than |A | |A′ | X 1 − p |A′ |−i P (π(i) < π(j)|W ), which requires full enumeration over all n! =p × ai ( ) (7) permutations. i=0 p How do we compute arc weight P (π(i) < π(j)|wij , wji )? where ai is the number of permutations where there are i arcs that respect the permutation. Clearly, the number that we wish to de- P (π(i) < π(j)|wij , wji ) = termine is a|A′ | , since that is the number of permutations that cor- P (wij , wji |π(i) < π(j))P (π(i) < π(j)) respond to linear extensions. Equation 7 is a polynomial of degree P (wij , wji ) |A′ | in 1−p p , thus, we may simply choose |A′ | + 1 different values (8) 1−p of p , generate |A′ | + 1 different graphs G′ , and use the probabil- Assuming that a priori all permutations π are equally likely, P (π(i) < ity computation in Equation 7 to create a set of |A′ | + 1 equations π(j)) = P (π(j) < π(i)) by symmetry. Using Equation 8 to involving the ai coefficients. We may then derive the value of a|A| find expressions for P (π(i) < π(j)|wij , wji ) and P (π(i) > using Lagrange’s interpolation formula. π(j)|wij , wji ), we can derive the following: Since vertex x is the only maximum vertex in G′ , by comput- ing P (π −1 (1) = x, W ) in G′ , we count the number of linear P (π(i) < π(j)|wij , wji ) P (wij , wji |π(i) < π(j)) = (9) extensions in DAG G. Since counting the number of linear ex- P (π(i) > π(j)|wij , wji ) P (wij , wji |π(i) > π(j)) tensions in a DAG is #P-Hard, this implies that the computation of P (π −1 (1) = x, W ) in G′ is #P-Hard, which implies that the com- Since P (π(i) < π(j)|wij , wji )+P (π(j) < π(i)|wij , wji ) = 1, putation of P (π −1 (k) = j, W ) for directed graph Gv (associated we can simplify Equation 9 to get the following expression: with vote matrix W ) is #P-Hard. P (π(i) < π(j)|wij , wji ) = P (wij , wji |π(i) < π(j)) (10) Strategy 1 Maximum Likelihood P (wij , wji |π(i) < π(j)) + P (wij , wji |π(i) > π(j)) Require: n objects, probability p, vote matrix W Using Equation 2, P (wij , wji |π(i) < π(j)) and P (wij , wji |π(i) < Ensure: ans = maximum likelihood maximum object π(j)) can be computed directly from the binomial distribution p.m.f. s[·] ← 0 {s[i] is used to accumulate the probability that i is the Therefore, we can compute the arc weight lji = P (π(i) < π(j)|wij , wji ) maximum object} needed for the Indegree scoring function. It should be clear that for each permutation π of n objects do lij + lji = 1. Also, if wij and wji are both equal to zero, then both prob ← 1 {prob is the probability of permutation π given lij and lji are computed to be 0.5. vote matrix W } for each tuple (i, j) : i < j do Strategy 2 Indegree if π(i) < π(j) then` prob ← prob × wijw+w ji p ji (1 − p)wij ´ w Require: n objects, probability p, vote matrix W ij Ensure: ans = predicted maximum object else s[·] ← 0 prob ← prob × wijw+w ´ w p ij (1 − p)wji ` ji ji for i : 1 . . . n do end if for j : 1 . . . n, j = i do end for s[i] ← s[i] + lji {lji = P (π(i) < π(j)|wij , wji )} s[π −1 (1)] ← s[π −1 (1)] + prob end for end for end for ans ← argmaxi s[i] ans ← argmaxi s[i] The Indegree scoring function, displayed in Strategy 2, computes 2.4 Heuristic Strategies P the score of object oj as: s(j) = i lij . Intuitively, vertices with The ML scoring function is computationally inefficient and also higher scores correspond to objects which have compared favorably requires prior knowledge of p, the average worker accuracy, which to other objects, and hence should be ranked higher. The predicted is not available to us in real-world scenarios. We next investigate ranking has been shown to be a constant factor approximation to the performance and efficiency of four heuristic strategies, each of the feedback arc set for directed graphs where all arcs (i, j) are which runs in polynomial time. The heuristics we present, exclud- present and lij + lji = 1 . The running time of this heuristic is ing the Indegree heuristic, do not require explicit knowledge of the dominated by the time to do the final sort of the scores. worker accuracy p. Let us walk through the example graph in Figure 1. First, for Indegree Strategy: The first heuristic we consider is an Indegree those pairs of vertices that do not have any votes between them, scoring function proposed by Coppersmith et al.  to approx- we have lAC = 0.5, lCA = 0.5, lAD = 0.5, and lDA = 0.5. By imate the optimal feedback arc set in a directed weighted graph symmetry, lCD = 0.5 and lDC = 0.5. Given a value of p, we where arc weights lij , lji satisfy lij + lji = 1 for each pair of use Equation 10 to compute the rest of the arc weights. For p = vertices i and j. In this section, we describe how to transform our 0.55, we have lAB = 0.599, lBA = 0.401, lBC = 0.55, lCB = vote matrix W to a graph where this Indegree scoring function can 0.45, lBD = 0.646, and lDB = 0.354. With these computed arc be applied. Let π(i) denote the rank, or index, of object oi in the weights, we obtain the scores: s(A) = 1.401, s(B) = 1.403, s(C) = permutation associated with random variable π. 1.55, and s(D) = 1.65, generating a predicted ranking of (D, C, B, A),

7.Strategy 4 PageRank Heuristic Prediction Require: n objects, vote matrix W , K iterations ML (D, C, B, A) and (C, D, B, A) Ensure: ans = predicted maximum object Indegree (D, C, B, A) construct Gv = (V, A) from W Local (D, C, B, A) compute d+ [·] for each vertex {compute all outdegrees} for i : 1 . . . n do PageRank Maximum object = C if d+ [i] == 0 then Iterative (C, D, B, A), (C, D, A, B), (D, C, B, A), or wii ← 1 (D, C, A, B) end if end for Table 1: Predictions using each heuristic for Figure 1. 1 pr0 [·] ← n {pr0 is the PageRank vector in iteration 0} for k : 1 . . . K do ject C to be the maximum object in O. for i : 1 . . . n do Iterative Strategy: We next propose an Iterative heuristic strategy for j : 1 . . . n, j = i do to determine the maximum object in O. The general framework is w prk [i] ← prk [i] + d+ji [j] prk−1 [j] the following: end for 1. Place all objects in a set. end for 2. Rank the objects in the set by a scoring metric. end for compute period[·] of each vertex using final iterations of pr[·] 3. Remove the lower ranked objects from the set. for i : 1 . . . n do 4. Repeat steps 3 and 4 until only one object remains. s[i] ← 0 {s[·] is a vector storing average PageRank} There are two parameters we can vary in this framework: the for j : 0 . . . period[i] − 1 do scoring metric and the number of objects eliminated each itera- s[i] ← s[i] + prK−j [i] tion. Let us define the dif (i) metric of object oi to be equal to end for wins(i) − losses(i). An implementation of the Iterative strategy s[i] s[i] ← period[i] using the dif metric is displayed in Strategy 5. In our particular im- end for plementation, we emphasize computational efficiency and remove ans ← argmaxi s[i] half of the remaining objects each iteration. The Iterative strat- egy relies upon the elimination of lower ranked objects before re- ranking higher ranked objects. With each iteration, as more objects in terminal SCCs. To handle the oscillating PageRank in terminal are removed, the dif s of the higher ranked objects separate from SCCs, we execute our PageRank update equation (Equation 12) for the dif s of the lower ranked objects. Basically, by removing lower a large number of iterations, denoted as K in Strategy 4. Then, we ranked objects, the strategy is able to more accurately rank the re- examine the final iterations, say final 10%, of the PageRank vector maining set of objects. The strategy can be thought of as iteratively to empirically determine the period of each vertex, where we de- narrowing in on the maximum object. fine the period as the number of iterations for the PageRank value It is important to note that other scoring metrics can be used with of a vertex to return to its current value. In practice, we find that this Iterative strategy as well. For example, by iteratively ranking running PageRank for K iterations, where K = O(n), is sufficient with the Local heuristic, we were able to achieve (slightly) better to detect the period of nearly all vertices in terminal SCCs. For performance than the simple dif metric. Our method is similar to example, consider a graph among 3 objects A, B, C with 3 arcs: the Greedy Order algorithm proposed by Cohen et al. , who con- (A, B), (B, C), and (C, B). All vertices initially have 31 PageR- sidered a problem related to feedback arc set. Our strategy differs ank probability. After 1 iteration, the PageRank vector is (0, 32 , 13 ). in that it is more general (e.g., it can utilize multiple metrics), and After 2 iterations, the PageRank vector is (0, 13 , 32 ). And so on. In our strategy can be optimized (e.g., if we eliminate half of the ob- this example, object B and C each have periods of 2. jects each iteration, we require only a logarithmic number of sorts, With the periods computed for each vertex, we compute an av- as opposed to a linear number). erage PageRank value for each vertex over its period. This aver- The Iterative strategy can also be viewed as a scoring function age PageRank is used as the scoring function s(·) for this strat- s(·), like the prior heuristics we have examined. Denoted as s[·] egy. After the termination of PageRank, we sort the vertices by in Strategy 5, we can assign each object a score equal to the it- decreasing order of s(·), and predict that the vertex with maximum eration number in which it was removed from set T . Using this average PageRank corresponds to the maximum object in O. Note scoring function s(·), the predicted maximum object is then simply that our PageRank heuristic is primarily intended to predict a maxi- argmaxi s(i). mum object, not to predict a ranking of all objects (as many objects Returning to the example graph in Figure 1, the Iterative heuris- will end up with no PageRank). However, for completeness, when tic first computes the dif metric for each object: dif (A) = −2, evaluating PageRank in later experiments, we still do consider the dif (B) = −2, dif (C) = 1 and dif (D) = 3. The objects are predicted ranking induced by PageRank. The details of our imple- then placed in a set and sorted by dif . In the first iteration, objects mentation are displayed in Strategy 4. A and B are assigned ranks 3 and 4 and removed from the set. To illustrate our PageRank heuristic, consider again the example Then, dif is recomputed among all remaining objects in the set, in Figure 1. There are 2 SCCs in the graph: (A) and (B, C, D), dif (C) = 0 and dif (D) = 0. In the second iteration, either object with (B, C, D) being a terminal SCC. Each of the 4 vertices is ini- C or D is removed and assigned rank 2. In the third iteration, the tialized with 0.25 PageRank. After the first iteration, the PageRank remaining object is removed and assigned rank 1. Therefore, the vector is (0, 0.375, 0.35, 0.275). After the second iteration, the predicted ranking of the Iterative heuristic is equally likely to be PageRank vector is (0, 0.375, 0.35, 0.275). After ~20 iterations, (C, D, B, A), (C, D, A, B), (D, C, B, A), or (D, C, A, B), with the PageRank vector oscillates around (0, 0.217, 0.435, 0.348). With the predicted maximum object of the heuristic being object C or D a sufficiently large number of iterations and an appropriately cho- with equal probability. sen convergence threshold, the heuristic determines a period of To summarize, Table 1 displays the predictions for ML and our 1 for both SCCs and computes an average PageRank vector of four heuristics for the example displayed in Figure 1. How can we (0, 0.217, 0.435, 0.348). The PageRank heuristic then predicts ob- determine which heuristic strategy is superior to the others?

8. 0.18 1 1 0.16 0.9 0.9 0.14 0.8 0.8 0.7 0.12 0.7 P at 1 P at 1 P at 1 0.6 0.1 0.6 0.5 0.08 0.5 DEG 0.4 DEG DEG 0.06 LOC 0.3 LOC 0.4 LOC 0.04 PR 0.2 PR 0.3 PR ITR ITR ITR 0.02 0.1 0.2 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 Edge Coverage Edge Coverage Edge Coverage Figure 3: Precision at 1 (P@1) versus Edge Coverage. 100 objects. p=0.55 (left), p=0.75 (middle), p=0.95 (right). Strategy 5 Iterative once. So 5n(n − 1) votes is equivalent to v = 10x Edge Coverage Require: n objects, vote matrix W in our experiments. Ensure: ans = predicted maximum object Each data point (given n, p, v values) in our results graphs is dif [·] ← 0 {dif [·] is the scoring metric} obtained from 5,000 runs. Each run proceeds as follows: We ini- for i : 1 . . . n do tialize W as an n × n null matrix and begin with an arbitrary true for j : 1 . . . n, j = i do dif [j] ← dif [j] + wij ; dif [i] ← dif [i] − wij permutation π ∗ of the objects in O. Let U denote the set of all tu- end for ples (i, j) where i = j. We randomly sample v tuples from U with end for replacement. After sampling a tuple (i, j), we simulate the human initialize set Q {which stores objects} worker’s comparison of objects oi and oj . If π ∗ (i) < π ∗ (j), with for i : 1 . . . n do probability p, we increment wji , and with probability 1 − p, we Q ←Q∪i increment wij . If π ∗ (j) < π ∗ (i), with probability p, we increment end for while |Q| > 1 do wij , and with probability 1 − p, we increment wji . sort objects in Q by dif [·] For each generated matrix W in a run, we apply each of our |Q| for r : ( 2 + 1) . . . |Q| do heuristic strategies to obtain predicted rankings of the objects in remove object i (with rank r) from Q O. Comparing the predicted ranking with π ∗ we record both (a) for j : j ∈ Q do a “yes” if the predicted maximum agrees with the true maximum, if wij > 0 then and (b) reciprocal rank, the inverse rank of the true maximum ob- dif [j] ← dif [j] − wij ; dif [i] ← dif [i] + wij ject in the predicted ranking. Finally, after all runs complete, we end if if wji > 0 then compute (a) Precision at 1 (P@1), the fraction of “yes” cases over dif [i] ← dif [i] − wji ; dif [j] ← dif [j] + wji the number of runs, and (b) the Mean Reciprocal Rank (MRR), the end if average reciprocal rank over all runs. end for As a first experiment, we consider the prediction performance end for of Maximum Likelihood (ML) and the four heuristics for a set of 5 end while objects with p = 0.75, displayed in Figure 2. We choose a small set ans ← S {S is the final object in S} of objects, so that ML can be computed. Looking at Figure 2(left), we find that as the number of votes sampled increases, the P@1 of 1 1 all heuristics (excluding PageRank) increase in a concave manner, 0.9 0.95 approaching a value of 0.9 for 10x Edge Coverage. In other words, 0.8 0.9 if 5n(n−1) votes are uniformly sampled, the heuristics can predict 0.85 the maximum object 90% of the time, even though average worker P at 1 0.7 MRR 0.8 0.6 ML 0.75 ML accuracy is 0.75. Similar prediction curves measuring MRR are 0.5 DEG DEG 0.7 0.4 LOC PR 0.65 LOC PR displayed in Figure 2(right). ITR ITR 0.3 0.6 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 ML has better performance than all the four heuristics. Edge Coverage Edge Coverage Figure 2: Comparison of ML and heuristics. Prediction performance ver- As expected, ML performs the best in Figure 2, but recall that sus Edge Coverage. 5 objects, p=0.75. P@1 (left), MRR (right). ML requires explicit knowledge of p, and it is computationally very expensive. Still, the ML curve is useful, since it tells us how far the 2.5 Experiments heuristics are from the optimal feasible solution (ML). Also, note In this section, we experimentally compare our heuristic strate- that PageRank (PR) performs poorly in Figure 2, indicating that gies: Indegree (DEG), Local (LOC), PageRank, (PR), and Itera- PageRank is poor when the number of objects is small. tive (ITR). We also compare them with the Maximum Likelihood (ML) Strategy, which we consider the best possible way to select Iterative is the best of the four heuristics when the number the maximum. However, since ML is computationally very ex- of votes sampled is n(n−1)2 , e.g. 1x Edge Coverage. pensive, we only do this comparison on a small scenario. For our For a larger experiment, we consider the problem of prediction experiments, we synthetically generate problem instances, varying for n = 100 objects in Figure 3. ML is necessarily omitted from : n (the number of objects in O), v (the number of votes we sample this experiment. Looking at the graphs, we first note that the It- for W ), and p (average worker accuracy). We prefer to use syn- erative (ITR) heuristic performs significantly better than the other thetic data, since it lets us study a wide spectrum of scenarios, with heuristics, particularly when p = 0.55 or p = 0.75. This is best highly reliable or unreliable workers, and with many or few votes. demonstrated by Figure 3(middle), which shows that for p = 0.75 In our base experiments, we vary the number of sampled votes and 10x Edge Coverage, the Iterative heuristic has a P@1 of over v, from 0 to 5n(n − 1) and vary worker accuracy p from 0.55 to 0.9, whereas the second best heuristic, Indegree (DEG), only has 0.95. As a point of reference, we refer to n(n−1) 2 votes as v = 1x a P@1 of approximately 0.5. Looking at the middle graph again, Edge Coverage, e.g. each pair of objects is sampled approximately note how the performance gap between the Iterative heuristic and