Falcon: An Optimizing Java JIT
1. Falcon: An optimizing Java JIT
   Philip Reames, Azul Systems
2. Agenda
   - Intro to Falcon
   - Why you should use LLVM to build a JIT
   - Common objections (and why they're mostly wrong)
3. What is Falcon?
   - Falcon is an LLVM-based just-in-time compiler for Java bytecode.
   - Shipping on by default in the Azul Zing JVM.
   - Available for trial download at: www.azul.com/zingtrial/
4. This talk is about lessons learned, both technical and process.
5. Zing VM Background
   - A HotSpot-derived JVM with an awesome GC (off topic)
   - Multiple tiers of execution:
     - Interpreter: run-once bytecode and rare events
     - Tier1 Compiler: rapidly generated code to collect profiling quickly
     - Tier2 Compiler: compile hot methods for peak performance
6. Business Need
   - The existing C2 compiler is aging poorly:
     - vectorization (a key feature of modern x86_64) is an afterthought
     - very complicated codebase; "unpleasant" bug tails are the norm
     - difficult to test in isolation
   - Looking to establish a competitive advantage:
     - long-term goal is to outperform the competition
     - velocity of performance improvement is key
7. Development Timeline
   - April 2014: Proof of concept completed (in six months)
   - Feb 2015: Mostly functionally complete
   - April 2016: Alpha builds shared with selected customers
   - Dec 2016: Product GA (off by default)
   - April 2017: On by default
   - Team size: 4-6 developers, ~20 person years invested
8. Team Effort
   - Falcon Development: Bean Anderson, Philip Reames, Chen Li, Sanjoy Das, Igor Laevsky, Artur Pilipenko, Daniel Neilson, Anna Thomas, Serguei Katkov, Maxim Kazantsev, Daniil Suchkov, Michael Wolf, Leela Venati, Kris Mok, Nina Rinskaya
   - + VM development team
   - + Zing QA
   - + All Azul (E-Staff, Sales, Support, etc.)
9. Various application benchmarks + SPECjvm + DaCapo
   - Collected on a mix of Haswell and Skylake machines
10. Why you should use LLVM to build a JIT
   - Proven stability, widespread deployments
   - Active developer community, support for new micro-architectures
   - Proven performance (for C/C++)
   - Welcoming to commercial projects
13. First, a bit of skepticism...
   - Is this something you can express in C? If so, LLVM supports it.
   - e.g. deoptimization via spill to a captured on-stack buffer:
       buf = alloca(…);
       buf[0] = local_0;
       …
       a->foo(buf, actual_args…)
   - The real question is: how well is it supported?
14. Functional Corner-cases
   - If you can modify your ABI (calling conventions, patching sequences, etc.), your life will be much easier.
   - You will find a couple of important hard cases. Ours were:
     - "anchoring" for mixed stack walks
     - red-zone arguments to assembler routines
     - support for deoptimization, both checked and async
     - GC interop
15. Be wary of over-design
   - It is common knowledge that the quality of safepoint lowering matters.
   - gc.statepoint design:
     - Major goal: allow in-register updates
     - Topic of a 2014 LLVM Dev talk
     - 2+ person years of effort
   - It turns out that hot safepoints are an inliner bug.
16. Get to Functional Correctness First
   - Priority 1: Implement all interesting cases, get real code running
   - Priority 2: Add tests to show it continues working
   - Design against an adversarial optimizer.
   - Be wary of over-design, while maintaining code quality standards.
   - See backup slides for more specifics on this topic.
17. LLVM is a huge dependency (Objection 2 of 6)
18. Potentially a real issue, but it depends on your definition of large.
   - From our shipping product:
     - libJVM = ~200 MB
     - libLLVM = ~40 MB
   - So, a 20% code size increase.
19. We added an LLVM backend; it produced poor code (Objection 3 of 6)
20. Importance of Profiling
   - Tier 1 collects detailed profiles; Tier 2 exploits them
   - 25% or more of peak application performance
21. Prune Untaken Paths
   - Handle rare events by returning to a lower tier, reprofiling, and then recompiling.
   - [Diagram: Interpreter, Tier1 Code, Tier2 Code, Rare Events]
       %ret = call i32 @llvm.experimental.deoptimize() ["deopt"(..)]
       ret i32 %ret
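The deoptimize-and-recompile loop on this slide can be modelled in a few lines of Python. This is a toy sketch, not Falcon's implementation: `Deoptimize`, `interpret_abs`, `compile_abs` and `run` are hypothetical names, the "compiled" code is just a closure, and the "profile" is a single flag. It shows the upper tier replacing a never-observed branch with a deopt, and the lower tier handling the rare event, updating the profile, and triggering a recompile.

```python
class Deoptimize(Exception):
    """Raised by 'compiled' code when a pruned path is reached (hypothetical)."""

    def __init__(self, state):
        super().__init__("deoptimize")
        self.state = state  # values the lower tier needs in order to resume


def interpret_abs(x, profile):
    """Lower tier: handles every case and records which branches ran."""
    if x < 0:
        profile["negative_seen"] = True
        return -x
    return x


def compile_abs(profile):
    """Upper tier: emit code specialized to the current profile."""
    if profile["negative_seen"]:
        # The branch has been observed, so both paths are kept.
        return lambda x: -x if x < 0 else x

    def fast(x):
        if x < 0:
            # Untaken path pruned: hand the live state back to the lower tier.
            raise Deoptimize({"x": x})
        return x

    return fast


def run(inputs):
    """Execute compiled code, falling back and recompiling on rare events."""
    profile = {"negative_seen": False}
    compiled = compile_abs(profile)
    recompiles = 0
    results = []
    for x in inputs:
        try:
            results.append(compiled(x))
        except Deoptimize as d:
            results.append(interpret_abs(d.state["x"], profile))  # reprofile
            compiled = compile_abs(profile)                       # recompile
            recompiles += 1
    return results, recompiles
```

With the inputs `[1, 2, -3, -4, 5]`, the first negative value deoptimizes once; after the recompile the second negative value stays in compiled code.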
22. Predicated devirtualization
   - Critical for Java, where everything is virtual by default.
   - Falling back to a virtual call:
       switch (type(o))
         case A: A::foo();
         case B: B::foo();
         default: o->foo();
   - Falling back to deoptimization:
       switch (type(o))
         case A: A::foo();
         case B: B::foo();
         default: @deoptimize() ["deopt"(…)]
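The two switch forms on this slide can be sketched as a small Python model. The names here (`devirtualize`, `Deoptimize`, and the classes `A`, `B`, `C`) are hypothetical, and a dictionary lookup on the exact receiver type stands in for the type switch. The sketch only shows the shape of the guard and the two possible fallbacks; a real JIT would additionally inline the direct targets.

```python
class Deoptimize(Exception):
    """Stand-in for a transfer back to a lower tier (hypothetical)."""


class A:
    def foo(self):
        return "A.foo"


class B:
    def foo(self):
        return "B.foo"


class C:
    def foo(self):
        return "C.foo"


def devirtualize(profiled_types, deopt_on_miss):
    """Build a call site specialized for the receiver types seen in a profile."""
    # One statically bound target per profiled type.
    direct = {t: t.foo for t in profiled_types}

    def call_site(o):
        target = direct.get(type(o))  # exact-type guard, like switch (type(o))
        if target is not None:
            return target(o)          # direct call to a known target
        if deopt_on_miss:
            raise Deoptimize(type(o).__name__)  # default: deoptimize
        return o.foo()                # default: ordinary virtual dispatch

    return call_site
```

For example, `devirtualize([A, B], deopt_on_miss=False)(C())` still reaches `C.foo` through the virtual fallback, while the same call with `deopt_on_miss=True` raises `Deoptimize`. The deoptimizing form gives the optimizer less code to reason about, at the cost of a recompile when the profile turns out to be wrong.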
23. Implicit Null Checks
   - IR:
       %is.null = icmp eq i8* %p, null
       br i1 %is.null, label %handler, label %fallthrough, !make.implicit !{}
   - Explicit check:
       test rax, rax
       jz <handler>
       rsi = ld [rax+8]
   - Implicit check:
       rsi = ld [rax+8]    <- fault_pc
       __llvm_faultmaps[fault_pc] -> handler
   - handler: call @__llvm_deoptimize
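As a rough analogy in Python, the explicit and implicit forms compare as follows. This is only a model under loud assumptions: `FAULT_MAP`, `on_fault`, `deopt_handler`, `Point` and the site label are hypothetical, Python's `AttributeError` on `None` stands in for the hardware fault, and the dictionary stands in for the fault map that sends a faulting address to its handler.

```python
FAULT_MAP = {}  # hypothetical stand-in for the fault map: fault site -> handler


def on_fault(site):
    """Register a handler for a given faulting site."""
    def register(handler):
        FAULT_MAP[site] = handler
        return handler
    return register


@on_fault("load_field@site0")
def deopt_handler():
    return "deoptimized"


class Point:
    def __init__(self, field):
        self.field = field


def load_field_explicit(p):
    if p is None:            # test rax, rax ; jz <handler>
        return deopt_handler()
    return p.field           # rsi = ld [rax+8]


def load_field_implicit(p):
    try:
        return p.field       # the load itself doubles as the null check
    except AttributeError:   # stands in for the fault at fault_pc
        return FAULT_MAP["load_field@site0"]()
```

Both functions return the same results for `Point(7)` and for `None`; the implicit form simply pays nothing on the common non-null path and takes the expensive route only when the fault actually happens.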
24. Local Code Layout
   - branch_weights for code layout
   - Sources of slow paths:
     - GC barriers, safepoints
     - handlers for builtin exceptions
     - result of code versioning
   - 50%+ of total code size
   - Before layout: hot cold hot cold hot cold
   - After layout: hot hot hot cold cold cold
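The hot/cold reordering pictured on this slide amounts to a stable partition of basic blocks. The sketch below is a deliberate simplification with hypothetical names (`layout_blocks`, the block labels, and the `cold_threshold` value are all assumptions); a real block-placement pass weighs far more than a single per-block probability.

```python
def layout_blocks(blocks, cold_threshold=0.01):
    """Place hot blocks first and cold blocks last.

    `blocks` is a list of (name, execution_probability) pairs in original
    program order. Order is preserved within the hot group and within the
    cold group, so this is a stable partition.
    """
    hot = [name for name, prob in blocks if prob >= cold_threshold]
    cold = [name for name, prob in blocks if prob < cold_threshold]
    return hot + cold


# Hypothetical function body: fast paths interleaved with rarely taken slow paths.
blocks = [
    ("entry", 1.0),
    ("gc_barrier_slow", 0.001),
    ("loop_body", 0.99),
    ("null_check_throw", 0.0001),
    ("exit", 1.0),
    ("safepoint_poll_slow", 0.001),
]
print(layout_blocks(blocks))
```

Keeping the hot blocks contiguous means the common path runs through fewer taken branches and touches fewer instruction-cache lines, which matters when slow paths make up half or more of the generated code.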
26. Exploiting Semantics
   - LLVM supports a huge space of optional annotations
   - Both metadata and attributes
   - ~6-12 month effort
   - Workflow: performance analysis, root-cause the issue, add an annotation, fix uncovered miscompiles
28. Expect to become an LLVM developer
   - You will uncover bugs, both performance and correctness.
   - You will need to fix them.
   - You will need a downstream process for incorporating and shipping fixes.
   - See backup slides for more specifics on this topic.
29. Status Check
   - We've got a reasonably good compiler for a C-like subset of our source language.
   - We're packaging a modified LLVM library.
   - This is further than most projects get.