16-Computer Architecture and Structured Parallel Programming

Computer Architecture & Structured Parallel Programming • review aspects of computer architecture that are critical to high performance computing • discuss how to think about best algorithm design using structured parallel programming techniques • task vs. data parallelism and why data parallelism is key • introduce TBB, OpenMP* • introduce Intel® Xeon Phi™ architecture.

1. Computer Architecture and Structured Parallel Programming. James Reinders, Intel. Parallel Computing, CIS 410/510, Department of Computer and Information Science. Lecture 17 – Manycore Computing and GPUs

2.

3. Computer Architecture & Structured Parallel Programming • review aspects of computer architecture that are critical to high performance computing (HARDWARE) • discuss how to think about best algorithm design using structured parallel programming techniques (SOFTWARE) • task vs. data parallelism and why data parallelism is key (SOFTWARE) • introduce TBB, OpenMP* (SOFTWARE) • introduce Intel® Xeon Phi™ architecture (HARDWARE) © 2014, Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, Cilk, VTune, Xeon, and Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.

4.

5. See the Forest

6. See the Forest A cliché about someone missing the “big picture” because they focus too much on details: They “cannot see the forest for the trees.”

7. See the Forest I architecture.

8. See the Forest I architecture. but…

9. See the Forest Can you teach parallel programming without first teaching computer architecture?

10. See the Forest Can you teach parallel programming without first teaching computer architecture? (Or without just teaching a single API?)

11. See the Forest TREES • Cores • HW threads • Vectors • Offload • Heterogeneous • Cloud • Caches • NUMA

12. See the Forest TREES / FOREST • Cores: Parallelism, Locality • HW threads: Parallelism, Locality • Vectors: Parallelism, Locality • Offload: Parallelism, Locality • Heterogeneous: Parallelism, Locality • Cloud: Parallelism, Locality • Caches: Parallelism, Locality • NUMA: Parallelism, Locality

13. See the Forest TREES / Advice: proper abstractions / FOREST • Cores: Use tasks / Parallelism, Locality • HW threads: Use tasks / Parallelism, Locality • Vectors: Use SIMD (10:30 talk) / Parallelism, Locality • Offload: Avoid, Use TARGET / Parallelism, Locality • Heterogeneous: Avoid via neo-hetero / Parallelism, Locality • Cloud: What’s a cloud? / Parallelism, Locality • Caches: Use abstractions / Parallelism, Locality • NUMA: Use abstractions / Parallelism, Locality

14. See the Forest TREES / FOREST • Cores: Parallelism, Locality • HW threads: Parallelism, Locality • Vectors: Parallelism, Locality • Offload: Parallelism, Locality • Heterogeneous: Parallelism, Locality • Cloud: Parallelism, Locality • Caches: Parallelism, Locality • NUMA: Parallelism, Locality

15. Teach the Forest Increase exposing parallelism. Increase locality of reference.

16. Teach the Forest Increase exposing parallelism. Increase locality of reference. Why? Because it’s programming that addresses the universal needs of computers today and in the future.

17. Teach the Forest Increase exposing parallelism. Increase locality of reference. THIS IS YOUR MISSION
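
The two goals above can be sketched in a few lines. This is only an illustration, not code from the lecture: the helper names (`chunked`, `sum_of_squares`, `parallel_sum_of_squares`), the chunk size, and the worker count are all hypothetical. The data is split into contiguous chunks (locality), each chunk becomes an independent task (parallelism), and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, chunk_size):
    """Yield contiguous slices so each task touches neighbouring elements."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def sum_of_squares(chunk):
    """Work on one chunk; it depends on no other chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=1024, workers=4):
    """Map independent tasks over the chunks, then reduce the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_of_squares, chunked(data, chunk_size))
        return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), chunk_size=3))  # 285
```

In standard CPython, threads generally will not speed up CPU-bound work like this. The sketch is meant to show the structure (independent tasks over contiguous data, followed by a reduction), not to measure performance.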

18. Why so many cores?

19. Why Multicore? The “Free Lunch” is over, really. But Moore’s Law continues!

20. Processor Clock Rate over Time Growth halted around 2005

21. Transistors per Processor over Time Continues to grow exponentially (Moore’s Law)

22. Moore’s Law Number of components (transistors) doubles about every 18-24 months.
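
To get a feel for what a doubling every 18 to 24 months means over a longer period, the arithmetic can be sketched as below. The function name `growth_factor` and the ten-year horizon are hypothetical choices for illustration. The only input taken from the slide is the doubling period.

```python
def growth_factor(years, doubling_months):
    """Factor by which the component count grows after `years`,
    assuming one doubling every `doubling_months` months."""
    return 2.0 ** (years * 12.0 / doubling_months)


if __name__ == "__main__":
    # Ten years at the slow end of the range: 5 doublings.
    print(growth_factor(10, 24))            # 32.0
    # Ten years at the fast end of the range: 6 2/3 doublings.
    print(round(growth_factor(10, 18), 1))  # about 101.6
```

The spread between the two ends of the range compounds quickly: over the same decade the assumed doubling period changes the result by roughly a factor of three.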

23. [Figure labels, in the order shown: MIC, AVX-512, AVX, SSE, MMX, 80386, 8086, 8088…, 8008, 8080, 4004]

24. 61

25. Is this the Architecture Track?

26. CPU: These were simpler times. [Diagram labels: Memory, CPU]

27. CPU + cache: Memories got “further away” (meaning: CPU speed increased faster than memory speeds). A closer “cache” for frequently used data helps performance when memory is no longer a single clock cycle away. [Diagram labels: Memory, Cache, CPU]
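
The benefit of a closer cache can be estimated with the usual textbook average-memory-access-time formula, sketched below. The formula is standard, but it is not on the slide, and every number in the example (a 1-cycle hit, a 100-cycle trip to memory, a 5% miss rate) is a hypothetical value chosen only for illustration.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles: every access pays the cache
    hit time, and the fraction that misses also pays the memory penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    # Hypothetical machine: 1-cycle cache hit, 100 extra cycles on a miss.
    for miss_rate in (1.0, 0.5, 0.05):
        cycles = average_access_time(1, miss_rate, 100)
        print(f"miss rate {miss_rate:.2f}: {cycles:.1f} cycles on average")
```

Under these assumed numbers, the average cost is dominated by the miss rate, which is why keeping frequently used data in the cache matters more as memory gets “further away”.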

28. CPU + caches: Memories keep getting “further away” (this trend continues today). More “caches” help even more (with temporal reuse of data). [Diagram labels: Memory, L2 Cache, (L1) Cache, CPU]
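
Why temporal reuse matters can be shown with a toy cache model. The sketch below is an assumption-laden illustration, not a description of any real processor: it models a fully associative cache with least-recently-used replacement, tracks individual addresses rather than cache lines, and the name `hit_rate` and the traces are hypothetical.

```python
from collections import OrderedDict


def hit_rate(trace, capacity):
    """Replay an address trace through a fully associative LRU cache of
    `capacity` entries and return the fraction of accesses that hit."""
    cache = OrderedDict()
    hits = 0
    for address in trace:
        if address in cache:
            hits += 1
            cache.move_to_end(address)     # mark as most recently used
        else:
            cache[address] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    reuse = [0, 1, 2, 3] * 4         # the same four addresses, revisited
    streaming = list(range(16))      # sixteen addresses, each touched once
    print(hit_rate(reuse, 4))        # 0.75: only the first pass misses
    print(hit_rate(streaming, 4))    # 0.0: nothing is ever reused
    print(hit_rate(reuse, 2))        # 0.0: working set exceeds the cache
```

The last case hints at why additional, larger cache levels help: the same reuse pattern goes from no hits to mostly hits once the working set fits.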

29. CPU with caches: As transistor density increased (Moore’s Law), cache capabilities were integrated onto CPUs. Higher performance external (discrete) caches persisted for some time while integrated cache capabilities increased. [Diagram labels: Memory, L2, L1, CPU]