- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
05Data Structures and Algorithms--BSP model
展开查看详情
1 .Data Structures and Algorithms in Parallel Computing Lecture 5
2 .BSP Processors + network + synchronization Superstep Concurrent parallel computation Message exchanges between processors Barrier synchronization All processors reaching this point wait for the rest
3 .Supersteps A BSP algorithm is a sequence of supersteps Computation superstep Many small steps Example: floating point operations (addition, subtraction, etc.) Communication superstep Communication operations each transmitting a data word Example: transfer a real number between 2 processors In theory we distinguish between the 2 types of supersteps In practice we assume a single superstep
4 .Some applications Pagerank Single Source Shortest Path (SSSP) Connected Components
5 .Pagerank Analysis algorithm to determine the importance of a document Based on the number of references to it and the importance of the source documents Named after Larry Page
6 .Pagerank
7 .Pagerank Source: wikipedia
8 .Solving Pagerank System of linear equations Iterative loop till convergence
9 .Pagerank in Pregel
10 .Experimental results On Apache Giraph Taken from http://muratbuffalo.blogspot.ro/2015/09/one-trillion-edges-graph-processing-at.html
11 .SSSP Find shortest path between a single source vertex and every other vertex in the graph Dijsktra’s algorithm for sequential computations
12 .Sequential SSSP Source: wikipedia
13 .SSSP in Pregel
14 .Experimental results Binary trees
15 .Connected components (recap) Label 2 vertices with same label iff there is a path between the two Sequentially it can be achieved by depth first or breadth first search
16 .CC in Pregel Use graph contraction Algorithm Each vertex starts with a label Each vertex sends its label to all neighbors Each vertex replaces its label with the minimum (maximum) value it receives from neighbors Algorithm stops when convergence is achieved
17 .Experimental results
18 .Apache Giraph Pregel is proprietary Giraph is an open source Pregel implementation Runs on standard Hadoop Computation is executed in memory Can be a job in a pipeline (MapReduce) Uses Zookeeper for synchronization
19 .Building an application Create a custom vertex by extending BasicVertex Create a custom input format Adjacency list where each line looks like vertexID neighborID1 neighborID2 … Extend the TextVertexInputFormat Create a custom output format Extend the TextVertexOutputFormat
20 .What’s next? Vertex centric vs. subgraph centric Load balancing Importance of partitioning and graph type ...