Are we there Yet

在我们不断的研究中,我们发现从Oracle、MS SQL或任何其他封闭源代码数据平台迁移对于大型组织来说越来越重要。


1. Are we there Yet?? (The long journey of Migrating from close source to opensource solution) Marco (the Grinch) Tusa Percona

2. About me • Open source enthusiast • Consulting team manager • Principal Architect • Working in DB world from 25 years • Open source developer and community contributor 2

3.Section header Click to add text

4.Gartner predicts by 2022: § Half of existing commercial database instances will have migrated or will be in process of migrating to an Open Source DBMS § More than 70% of greenfield applications will use an Open Source DBMS

5.Migration Services - Mitigating the Risk - Getting it Right Motivations § Average annual spend for enterprise database support: 5M-20M § Annually compounding support costs are no longer tolerable Challenges and Risks § 78% - Reported migration projects more difficult than expected § 65% - Expect the planning to take more than 6 months § 90% - Reported migration projects took longer than expected

6.Three Questions Why migrate What can migrate How to succeed

7.Common reasons to migrate Cost 60% Business Reasons (Politics) 35% Technical requirements 5%

8.Migration was a THING from long • Now a huge business • There is a lot of confusion (too many simplifications) • Business/Politics often overcome Technical reasons and common sense Correct expectations is the key

9.This is not a new thing Migration from close source to open source is OLD 2008 pre SUN 2010 Oracle

10.Migrate to what? Migration from premises to cloud is not new as well Premises VS Cloud Or The place where you know you need SAs and DBAs and fantasyland where all works by magic

11.Migrate to what? Migration to MySQL/MariaDB or PG ? Or something else???

12.What are the most common steps? First of all, do not rush take the right time to do all well • Understand • Map 4 Main phases: • Rewrite 1. Assessment • Code/SQL 2. POC • Move data 3. Migrate • Test 4. Go Live • Go Live

13.Assessment The journey will be long … do not start if you are not sure about Not everything can be migrated, but you don’t know Not all the schemas will take the same time, but you don’t know Not all the code can be rewritten, but you do not know You need to KNOW before you move a finger

14.POC A Proof Of Concept is: evidence sufficient to establish a thing as true, or to produce belief in its truth. As such is a critical moment to define if what was Assess can be done or not. Test … test … test Not only port the data Performance & functionalities Be courageous, and if it will not work, be ready to drop

15.Migrate Apply knowledge acquire during POC to Transform schemas Rewrite or export code Move data Implement DBOps procedure (db optimizations; backup/restore: etc) Test applications

16.Go live Replicate data Keep it up to date Cutover time Sounds simple right? I have news for you, it is not!

17.Defining the Process - Migration Methodology Analyze Understand Map Src/Dest Re/Design Extract Src Something Breaks Convert Import Schemas Logic data Validate Test/POC Index Partition Success

18.Why MySQL Or Postgres I like to start from : • Scalability and Flexibility • High Performance • High Availability • Robust Transactional Support • Web and Data Warehouse Strengths • Strong Data Protection • Comprehensive Application Development • Management Ease • Open Source Freedom and 24 x 7 Support • Lowest Total Cost of Ownership 18

19. 10 things to know about MySQL 1 Subqueries are poorly optimized (still true) 2 Data integrity checking is very weak, and even basic integrity constraints cannot always be enforced. (replication) 3 Most queries can use only a single index per table; multi-index query plans exist in certain cases, but the cost is usually underestimated by the query optimizer, and they are often slower than a table scan. 4 Foreign keys are not supported in most storage engines. 5 Execution plans are not cached globally, only per-connection. 6 There are no integrated or add-on business intelligence, OLAP cube, etc packages. 7 There are no materialized views (also if we can use Event scheduler) 8 Native replication is asynchronous and has many limitations and edge cases. 9 DDL such as ALTER TABLE or CREATE TABLE is non-transactional. It commits open transactions and cannot be rolled back or crash-recovered. 10 Each storage engine can have widely varying behavior, features, and properties. (positive and negative) 19

20.4 things to know about Postgres 1. Changing Primary on a secondary require service restart 2. Partitions implemented as separate tables 3. Also minor version upgrade can be difficult (start from the Primary VS MySQL start from the last Slave) 4. Global Temporary table do not exists 20

21.The Motto Use the right tool for the job 21

22.Let us see some more details Click to add text

23.The assessment Prepare a plan, and do not improvise • Talk with Stakeholders (business/DBA/developers) • Analyze the source (from application to data design) • Identify show stoppers • Identify how to map what to what • Identify how to organize the target Most important: Be ready to do not force migration. If it does not make sense to proceed, STOP!

24.Most common source cases • Database is used only to store data all the logic reside in the application • Database contains logic such as stored procedure and complex package • Database containing data for data warehouse • Real time data and historical records (telephone company) 24

25.The assessment Mitigating risk of failure When analyzing the source database(s) what should be the outcome? • Easy to understand list of what is and out • Identify Source type (Simple data move; data + code; etc.) • In detail review per schema of complexity • Assessment of modification and effort database objects • Assessment of functions/functionalities used (also in the application) not always possible • Application assessment and review 25

26.The assessment how it looks like Migration levels (ML): 1. Migration that can be run automatically 2. Migration with code rewrite and a human-days cost up to 10 days 3. Migration with code rewrite and a human-days cost up to 30 days 4. Migration with code rewrite and a human-days cost up to 60 days 5. Migration with code rewrite and a human-days cost above 60 days 6. Not portable: Application code is dependent by the DB platform and is not possible to change the application code Sub technical levels: ML.1 = trivial: no stored functions and no triggers ML.2 = easy: no stored functions but with triggers, no manual rewriting ML.3 = simple: stored functions and/or triggers, no manual rewriting ML.4 = manual: no stored functions but with triggers or views with code rewriting ML.5 = difficult: stored functions and/or triggers with code rewriting

27.The assessment how it looks like Man days Hestimated Difficulty Color Schema Application Man days per schema instances Level code type ABC Drupal 2.0 1 2 1-1 DEF wordpres 1.0 1 1 1-1 GHI My Java app 1.5 1 1.5 1-1 LMN Sales App 2.0 1 2 1-1 OPQR Booking 2.0 1 2 1-1 STV Insurances 5.0 12 30 3-3 Z1 Tikets 12 10 48 4-5(*)(**) AA1 Advertise 2.00 19 30 1-1* AA2 Network App 2.0 200 80 1-1* *Per instance **Schema may take short time, but Store procedure require rewrite

28. The assessment how it looks like Color Difficulty Color code Schema Application Man days code Level (My) (PG) Schema1 App1 64 5.5 Schema2 App2 1 1.1 Schema3 App2 200 5.5 Schema4 App2 10 2.5 Schema5 App3 250 6.5 Schema6 App4 150 5.4

29. The assessment how it looks like MYSQL COMPLEXITY DISTRIBUTION High 23 high, 8 DBMS_UTILITY 1 UTL_FILE 21 Virtual Column Partitions 1 Low 190 Medium, 30 BTree Indexes 1 Bulk Collect 5 Low, 63 CLOB Data Types 6 Cursors 2 DBMS_OUTPUT 33 Float DataTypes 1 Raw Data Type 3 Sequences 133 Table Triggers 4 Views 2 Medium 90 Functions 51 Stored Procedures 33 User Defined Types 6 Grand Total 303