# Approximation Algorithms: Figure of Merit and Performance Ratio

Approximation algorithms are algorithms that find near-optimal solutions to optimization problems. Suppose we are working on an optimization problem in which every candidate solution has a positive cost, and we want to find a near-optimal one; in these notes every solution method given has a proven approximation guarantee, and we look for the solution closest to the optimum. When the objective is to minimize cost, the near-optimal cost always exceeds (or equals) the optimal; when the objective is to maximize profit, the near-optimal profit is at most the optimal.

Approximation Algorithms. Subhash Suri. November 27, 2017.

## 1 Figure of Merit: Performance Ratio

- Suppose we are working on an optimization problem in which each potential solution has a positive cost, and we want to find one of near-optimal cost. When the objective is to minimize the cost, the near-optimal cost is always at least the optimal. When the objective is to maximize profit, the near-optimal profit is at most the optimal.
- The standard measure (figure of merit) in the theory of algorithms for the quality of an approximation algorithm $A$ is the ratio
  $$\rho(n) = \max \frac{cost(A)}{cost(OPT)},$$
  where the max is over all input instances of size $n$, and the objective function is minimization. If the objective is maximization, then we instead choose
  $$\rho(n) = \max \frac{profit(OPT)}{profit(A)}.$$

## 2 Approximating the Vertex Cover

- Given a graph $G = (V, E)$, find a vertex cover $C \subseteq V$ of minimum size; that is, for each edge $(u, v) \in E$, either $u$ or $v$ is in $C$.
  - $cost(A)$ = size of the vertex cover found by our algorithm $A$;
  - $cost(OPT)$ = optimal vertex cover size;
  - $\rho(n)$ = max (worst-case) value of $cost(A)/cost(OPT)$ over all $n$-node graphs.
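On a single small instance the ratio $cost(A)/cost(OPT)$ can be checked directly by brute force. A minimal sketch in Python (the helper names `is_vertex_cover` and `optimal_cover_size` are my own, not from the notes):

```python
from itertools import combinations

def is_vertex_cover(cover, edges):
    """Every edge must have at least one endpoint in the cover."""
    return all(u in cover or v in cover for (u, v) in edges)

def optimal_cover_size(vertices, edges):
    """Brute-force optimum; only feasible for tiny graphs."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_vertex_cover(set(subset), edges):
                return k

# A 4-cycle: the optimal cover has size 2, e.g. {0, 2}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
opt = optimal_cover_size(range(4), edges)
approx = {0, 1, 2}          # a cover some algorithm might return
ratio = len(approx) / opt   # performance ratio on this one instance
print(opt, ratio)           # 2 1.5
```

The worst-case ratio $\rho(n)$ is the maximum of this quantity over all size-$n$ instances, which is what the analyses below bound.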

- This problem seems well-suited for greedy strategies. Perhaps the most obvious and natural greedy scheme is the following:
  - repeatedly choose the vertex that covers the most edges, until no edges are left uncovered.
- The number of edges covered by a vertex $v$ is its degree $d(v)$. So, we repeatedly choose the max-degree vertex, delete all of its incident edges, and repeat until no edges are left.
- An example graph.
- When does it not find the optimal? How poorly can this algorithm perform in the worst case? Is the worst-case ratio bounded (does not grow with $n$)?
- Unfortunately not!
- The counter-example is a bipartite graph $G_n = (L + R, E)$, where $L$ is a set of $n$ vertices $1, 2, \ldots, n$.
- Then, for each $i = 2, 3, \ldots, n$, we add another set of vertices $R_i$ where $|R_i| = \lfloor n/i \rfloor$, and each vertex of $R_i$ is joined to $i$ distinct vertices of $L$. Thus, each vertex of $R_i$ has degree $i$.
- $R_n$ consists of 1 vertex, connected to every vertex in $L$. $R_{n-1}$ also consists of a single vertex, connected to $1, 2, \ldots, n-1$. $R_2$ has $\lfloor n/2 \rfloor$ vertices, each of degree 2.
- If we run the greedy algorithm on this example, it will first choose $R_n$, then $R_{n-1}$, and so on, until it has chosen all the vertices of $R$. How many vertices are in $R$? $\lfloor n/2 \rfloor + \lfloor n/3 \rfloor + \cdots + 1 = \Omega(n \log n)$.
- So, $cost(A) = \Theta(n \log n)$.
- However, OPT could have just chosen all the vertices of $L$, of which there are only $n$.
- So, the ratio is at least $\Omega(\log n)$!
- Now, instead consider another simple greedy scheme:
  - while $E$ is not empty:
    1. pick an arbitrary edge $e = (u, v)$
    2. add both $u$ and $v$ to $C$
    3. delete from $E$ all edges incident to $u$ or $v$
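The edge-picking scheme above is short enough to write out directly. A sketch in Python (iterating over the edge list in given order stands in for "pick an arbitrary edge"; the function name is my own):

```python
def vertex_cover_2approx(edges):
    """Edge-picking greedy: take both endpoints of any still-uncovered
    edge. An edge whose endpoints are both outside the cover so far is
    exactly an edge not yet deleted from E."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover

# Star graph K_{1,4}: the optimum is {0}, the greedy picks one edge
# and takes both its endpoints, giving a cover of size 2.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(vertex_cover_2approx(edges))  # {0, 1}
```

The star example shows the factor of 2 can actually be reached, matching the tightness of the bound proved next.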

- Example.
- Theorem: The second greedy vertex cover algorithm finds a cover of size at most twice the optimal.
- Analysis. Let $A$ be the set of edges picked by the greedy. Since no two edges in $A$ can share a vertex, each of them requires a separate vertex in OPT to cover it. So, $|OPT| \geq |A|$. On the other hand, our greedy cover has size $|C| \leq 2|A|$. QED.
- Interestingly, the more natural greedy strategy of repeatedly picking the vertex of max degree can only achieve an approximation ratio of $\Theta(\log n)$.

## 3 Approximating the Traveling Salesman Tour

- Input: $G = (V, E)$, a complete graph, where each edge $e$ has a positive cost $w(e)$.
- Output: the minimum-cost tour visiting all the vertices of $G$ exactly once.
- General Impossibility Theorem. There is no polynomial-time algorithm to approximate TSP to any factor unless $NP = P$.
- Proof.
  - Suppose there is an algorithm that guarantees a factor $X$ approximation for TSP in polynomial time. We prove impossibility by showing that such an algorithm could solve the HAMILTONIAN CYCLE problem in polynomial time.
  - Given an instance $G = (V, E)$ of the HAM problem, construct a TSP instance $G'$ on the same $n$ vertices as follows.
  - Set $w(e) = 1$ if $e \in E$, and $w(e) = nX + 1$ if $e \notin E$.
  - Now ask if $G'$ contains a TSP tour of cost $\leq n$.
  - If $G$ contains a HAM cycle, then the corresponding TSP tour has cost $n$, using the HAM cycle edges, each of cost one.
  - The approximation algorithm $A$ must then find a tour with cost $\leq nX$.
  - Since each edge that is not in $G$ has cost $> nX$, the tour cannot use any such edge. So, the only tours that lead to an acceptable approximation are the HAM cycles in $G$.
  - Conversely, if $A$ returns a tour that costs more than $nX$, it must mean that $G$ does not contain a HAM cycle.
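The construction in the proof is purely mechanical: original edges get cost 1, non-edges get the unaffordable cost $nX + 1$. A sketch of building that cost matrix (the function name is illustrative, not from the notes):

```python
def tsp_instance_from_graph(n, graph_edges, X):
    """Reduction from HAM cycle to X-approximate TSP: edges of G cost 1,
    non-edges cost n*X + 1, so any tour of cost <= n*X uses only edges
    of G and is therefore a Hamiltonian cycle of G."""
    edges = {frozenset(e) for e in graph_edges}
    heavy = n * X + 1
    w = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            w[u][v] = w[v][u] = 1 if frozenset((u, v)) in edges else heavy
    return w

# Triangle on {0, 1, 2} plus an extra vertex 3, claimed factor X = 2:
w = tsp_instance_from_graph(4, [(0, 1), (1, 2), (2, 0)], 2)
print(w[0][1], w[0][3])  # 1 9
```

Feeding this matrix to a hypothetical factor-$X$ TSP approximator and checking whether the returned tour costs at most $nX$ decides HAM cycle, which is the contradiction the proof needs.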

### 3.1 TSP with Triangle Inequality

- The good news about TSP is that if the edge costs satisfy the triangle inequality, that is, $w(a, b) \leq w(a, c) + w(c, b)$ for any $a, b, c$, then we can approximate the tour quite well.
- Factor 2 approximation using the MST.

## 4 Load Balancing: Minimizing Makespan

- Problem: Given a set of $m$ machines $M_1, \ldots, M_m$, and a set of $n$ jobs, where job $j$ needs time $t_j$ for processing, the goal is to schedule the jobs on these machines so that all the jobs are finished as soon as possible.
- That is, find the minimum time $T$ (called the makespan) by which all jobs can be executed collectively.
- Let $A_i$ be the set of jobs assigned to machine $M_i$. Then, $M_i$ needs time $T_i = \sum_{j \in A_i} t_j$; this is called the load on machine $M_i$. We wish to minimize $T = \max_i \{T_i\}$, which is called the makespan.
- The decision problem is NP-complete: it is in NP because, given an assignment, we can check whether its makespan is at most $T$. For completeness, we can reduce SUBSET SUM to it (scheduling on two machines).
- We show that the following simple greedy method gives a good approximation. The algorithm makes one pass over the jobs in any order, and assigns the next job $j$ to the machine with the lightest current load.
- for $j = 1, \ldots, n$:
  1. let $M_i$ be the machine with the minimum load $T_i$ among the machines
  2. assign $j$ to $M_i$
  3. $A_i = A_i \cup \{j\}$
  4. $T_i = T_i + t_j$
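The greedy loop above can be sketched with a min-heap keyed on machine load, so step 1 takes $O(\log m)$ per job (the function name is my own):

```python
import heapq

def greedy_makespan(times, m):
    """List scheduling: assign each job, in the given order, to the
    machine with the currently smallest load."""
    heap = [(0, i) for i in range(m)]        # (load T_i, machine index)
    assignment = [[] for _ in range(m)]      # the sets A_i
    for t in times:
        load, i = heapq.heappop(heap)        # least-loaded machine
        assignment[i].append(t)
        heapq.heappush(heap, (load + t, i))  # T_i = T_i + t_j
    return max(sum(a) for a in assignment), assignment

# The example from these notes: greedy gets makespan 8, optimal is 7.
print(greedy_makespan([2, 3, 4, 6, 2, 3], 3)[0])  # 8
```

Running it on the six-job example from the next slide reproduces the makespan of 8 against the optimal 7, inside the factor-2 guarantee proved below.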

- For instance, given 6 jobs with sizes 2, 3, 4, 6, 2, 3, and 3 machines, the algorithm gives the assignment (2, 6), (3, 2), (4, 3), for a makespan of 8. The optimal makespan is 7: (3, 4), (6), (2, 2, 3).
- Analysis: Let $T$ be the makespan produced by the algorithm, and $T^*$ the optimal.
- Note that, unfortunately, we do not know the value of $T^*$. Nevertheless, we can use a lower bound for $T^*$ to achieve an approximation guarantee.
- A simple lower bound comes from the following trivial observation: since there is a total of $\sum_j t_j$ work to be done by the $m$ machines collectively, the makespan cannot be less than the average load. Thus,
  $$T^* \geq \frac{1}{m} \sum_j t_j \qquad (1)$$
- Unfortunately, this lower bound by itself can be too weak: for instance, if we have one extremely long job, then the best we can do is to put it by itself on a single machine, but the makespan still has to be as long as this job. That greedy solution is optimal, yet the lower bound (1) may be way off if the remaining jobs are all very small. But this suggests another lower bound:
  $$T^* \geq \max_j \{t_j\} \qquad (2)$$
- Theorem: The greedy algorithm produces an assignment with makespan $T \leq 2T^*$.
- Proof. Look at the machine that attains the maximum load (the makespan). Let this be $M_i$.
- Let $j$ be the last job assigned to $M_i$.
- Key observation: when $j$ was assigned, $M_i$ was the machine with the least load! Its load just before the assignment of $j$ was $T_i - t_j$.
- Since this was the lightest load, every other machine has load at least $T_i - t_j$. Thus, adding up all these loads, we have
  $$\sum_k T_k \geq m \times (T_i - t_j),$$
  or equivalently,
  $$T_i - t_j \leq \frac{1}{m} \sum_k T_k$$

- But the sum on the right is just the sum of all the jobs, since each job is assigned to some machine. The quantity on the RHS is thus the lower bound from (1), and therefore
  $$T_i - t_j \leq T^*$$
- Now we account for the last job. Here we simply use inequality (2), which says that $t_j \leq T^*$. Thus,
  $$T_i = (T_i - t_j) + t_j \leq 2T^*. \quad \text{QED}$$
- It is easy to construct examples where this bound is tight. There is a better approximation: if we first sort the jobs in decreasing order of length, and assign them using the greedy strategy, then one can show an approximation factor of 3/2.

## 5 Set Cover

- There is a set $U$ of $n$ elements, and a list $S_1, \ldots, S_m$ of $m$ subsets of $U$. A set cover is a collection of these sets whose union is $U$. Each set $S_i$ has a cost (or weight) $w_i$, and our goal is to find a set cover of minimum weight.
- Imagine $U$ is a set of $n$ baseball cards you wish to collect. The market offers bundles $S_1, \ldots, S_m$ that are subsets of $U$, at prices $w_1, \ldots, w_m$. You wish to collect all the cards in $U$ at the minimum possible total cost. This is the set cover problem.
- This is a fundamental NP-complete problem. We will develop a simple greedy algorithm for it, although the analysis is not trivial.
- The first idea for the greedy is to repeatedly choose the largest set in the list. But this may not be good if the next set includes mostly items already covered (procured). So, a better idea is to choose the next set that covers the most items currently not covered!
- Greedy-Set-Cover
  1. Initialize $R = U$ (the set of remaining uncovered items).
  2. While $R$ is not empty:
     (a) select the set $S_i$ that minimizes $w_i / |S_i \cap R|$;
     (b) delete the elements of $S_i$ from $R$.
  3. Return the selected sets.
- Picture
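Greedy-Set-Cover translates almost line for line into code; a sketch in Python (the function name is my own, and ties in step 2(a) are broken by lowest index):

```python
def greedy_set_cover(universe, sets, weights):
    """Repeatedly pick the set minimizing cost per newly covered item,
    i.e. the ratio w_i / |S_i ∩ R| from the notes."""
    remaining = set(universe)          # R = U
    chosen = []
    while remaining:
        i = min(
            (i for i in range(len(sets)) if sets[i] & remaining),
            key=lambda i: weights[i] / len(sets[i] & remaining),
        )
        chosen.append(i)
        remaining -= sets[i]           # delete elements of S_i from R
    return chosen

# Unit weights: the greedy grabs the big set first, then patches the
# one leftover element with the cheapest remaining option.
sets = [{1, 2, 3, 4}, {1, 2}, {3, 4, 5}, {5}]
print(greedy_set_cover({1, 2, 3, 4, 5}, sets, [1, 1, 1, 1]))  # [0, 2]
```

With unit weights the ratio $w_i/|S_i \cap R|$ reduces to "cover the most uncovered items," the unweighted heuristic described just above the pseudocode.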

### Analysis of the algorithm

- The sets chosen by the algorithm clearly form a set cover. The main question is how much larger the weight of this cover is than the optimal weight $w^*$.
- As in the load balancing problem, we will need a lower bound on the optimal to compare against, but unlike that problem, finding a good lower bound here is more subtle and non-trivial.
- Let us look at our greedy "heuristic" and investigate the intuitive meaning of the ratio $w_i / |S_i \cap R|$.
- We can think of this as the cost paid for covering each new item. Suppose we "charge" this cost $c_s$ to each of the items $s$ newly covered. Note that each item is charged a cost only once: when it is covered for the first time.
- For bookkeeping purposes only, let us add this charging step to the greedy algorithm when $S_i$ is added, to account for this cost.
- First note that when $S_i$ is added, its weight is distributed evenly among the newly covered elements. Thus these costs simply account for the weights of the sets in the cover.
- (Lemma 1:) If $C$ is the greedy set cover, then
  $$\sum_{S_i \in C} w_i = \sum_{s \in U} c_s$$
- The key to the proof is to ask: fix a particular set $S_k$. How much cost $c_s$ can all the elements of $S_k$ incur?
- In other words, compared to $w_k$, how large is $\sum_{s \in S_k} c_s$? We derive an upper bound for any set $S_k$, even those not selected by the greedy.
- (Lemma 2:) For any set $S_k$,
  $$\sum_{s \in S_k} c_s \leq w_k \times H(|S_k|),$$
  where $H(n) = 1 + 1/2 + 1/3 + \cdots + 1/n$ is the harmonic number.
- (Proof.) To simplify the notation, let us assume that the items of $S_k$ are the first $d = |S_k|$ items: $s_1, s_2, \ldots, s_d$.
- Further, assume that these items are labeled in the order in which they are assigned a cost $c$ by the greedy. (This is just a renaming of items.)

- Now consider the iteration in which item $s_j$ is covered by the greedy, for some $j \leq d$. At the start of this iteration, $s_j, s_{j+1}, \ldots, s_d \in R$ (because of our labeling).
- Thus, $|S_k \cap R| \geq d - j + 1$. Therefore, the average cost per item offered by $S_k$ satisfies
  $$\frac{w_k}{|S_k \cap R|} \leq \frac{w_k}{d - j + 1}$$
- This is not necessarily an equality because several elements $s_j, s_{j+1}, \ldots$ may get covered in one step.
- Suppose the set chosen by the greedy in this iteration is $S_i$; then the average cost of $S_i$ has to be at most the average cost of $S_k$.
- The key observation is that it is the average cost of $S_i$ that gets assigned to $s_j$. So, we have
  $$c_{s_j} = \frac{w_i}{|S_i \cap R|} \leq \frac{w_k}{|S_k \cap R|} \leq \frac{w_k}{d - j + 1}$$
- We now add up these costs over all items of $S_k$:
  $$\sum_{s \in S_k} c_s = \sum_{j=1}^{d} c_{s_j} \leq \sum_{j=1}^{d} \frac{w_k}{d - j + 1} = w_k \left( \frac{1}{d} + \frac{1}{d - 1} + \cdots + \frac{1}{2} + 1 \right) = w_k \times H(d).$$
- Let $d^* = \max_i |S_i|$, the size of the largest set. Now comes the final part:
- (Theorem:) The greedy set cover has weight at most $H(d^*) \times w^*$.
- Proof. Let $C^*$ be the optimal set cover, so $w^* = \sum_{S_i \in C^*} w_i$.
- For each of these sets, the previous result (Lemma 2) implies that
  $$w_i \geq \frac{1}{H(d^*)} \sum_{s \in S_i} c_s$$

- But these sets form a set cover, so every item's charge is counted at least once:
  $$\sum_{S_i \in C^*} \sum_{s \in S_i} c_s \geq \sum_{s \in U} c_s$$
- So, we now have
  $$w^* = \sum_{S_i \in C^*} w_i \geq \sum_{S_i \in C^*} \left( \frac{1}{H(d^*)} \sum_{s \in S_i} c_s \right) \geq \frac{1}{H(d^*)} \sum_{s \in U} c_s = \frac{1}{H(d^*)} \sum_{S_i \in C} w_i = \frac{1}{H(d^*)} \, w_C,$$
  where $w_C$ is the weight of the greedy cover $C$, and the next-to-last step is Lemma 1. Rearranging gives $w_C \leq H(d^*) \times w^*$. QED.
