Embedding SQL Engine to Your Application

-比较流行的嵌入式数据库引擎(Berkeley DB、SQLite、Firebird Embedded、已弃用的libmysqld Embedded)。


1.Embedding SQL Engine to Your Application Iwo Panowicz Percona

2.What’s an Embedded Database? ● A library embedded in the application, that implements methods to access and manipulate data. ● A database running on an embedded computer mostly adapted for embedded purposes. 2

3.Libraries embedded in the application

4.What’s an Embedded Database? ● … but why not just flat flies/xml/json? ○ creating a database management system is harder than you might thought ○ saving time, ○ not reinventing the wheel. ● Mobile applications ● IoT ● Microservices 4

5.Libraries ● Usually lightweight. ○ SQLite can be fit under 300KB and be made to run in minimal stack space (~4KB) and heap space (~100KB). ● Doesn’t require any connection to a database instance. ● Widely used in IoT and mobile applications. ● Low latencies. ● Usually used with much smaller data sets that with conventional DBMS. ● Help processing data ○ In-memory databases are often used as temporary databases that require no permanent storage. ● DBA not needed. 5

6.Sample use case ● Mobile contacts manager ● E-book reader database ● Web browser cookie storage ● … anything that stores data. ○ like airplanes 6

7.MySQL Embedded, libmysqld ● The embedded MySQL server library makes it possible to run a full-featured MySQL server inside a client application. ● The main benefits are increased speed and more simple management for embedded applications. ● Written in C/C++ and available only for C/C++. ● The API is identical for the embedded MySQL version and the libmysqlclient. ● Deprecated in 5.7 and fully removed in 8.0. 7

8.MySQL Embedded, major restrictions ● No user-defined functions (UDFs). ● No networking (handled by MySQL). ○ No Replication. ● No Event Scheduler. ● No Performance Schema. 8

9.MySQL Embedded, restrictions ... #include "mysql.h" MYSQL *mysql; MYSQL_RES *results; MYSQL_ROW record; ... int main(void) { [1] mysql_library_init(num_elements, server_options, server_groups); [2] mysql = mysql_init(NULL); [3] mysql_options(mysql, MYSQL_READ_DEFAULT_GROUP, "libmysqld_client"); [4] mysql_options(mysql, MYSQL_OPT_USE_EMBEDDED_CONNECTION, NULL); 9

10.MySQL Embedded, restrictions [5 ] mysql_real_connect(mysql, NULL,NULL,NULL, "database1", 0,NULL,0); [6 ] mysql_query(mysql, "SELECT column1, column2 FROM table1"); [7 ] results = mysql_store_result(mysql); while((record = mysql_fetch_row(results))) { printf("%s - %s \n", record[0], record[1]); } [8 ] mysql_free_result(results); [9 ] mysql_close(mysql); [10] mysql_library_end(); return 0; } 1 0

11.SQLite ● Small. Fast. Reliable. Choose any three. ● Extremely popular embedded database systems. ProxySQL uses it. ● SQLite author claims that reads and writes small blobs 35% faster than the same blobs can be read from or written to individual files on disk using fread() or fwrite(). YMMV. ● SQLite database holding 10-kilobyte blobs uses about 20% less disk space than storing the blobs in individual files. YMMV. 1 1

12.SQLite ● Zero-Configuration apart for a schema. ● In-Memory databases ● Single Database Files ● SQL statements compile into virtual machine code ● Implements most of the SQL-92 ○ basic support of triggers ○ basic support ALTER TABLE 1 2

13.SQLite 1 3

14.Interface ● SQLite interface elements can be grouped into three categories: ○ List of objects ■ list of all abstract objects/datatypes used (sqlite3_stmt). ○ List of Constants ■ list of numeric constants (#define SQLITE_OK) ○ List of Functions ■ List of all functions and methods (sqlite3_initialize). Most applications only use a handful. 1 4

15.Tokenizer ● When a SQL statement is to be evaluated it is first sent to tokenizer. ● The tokenizer breaks the SQL text into tokens and hands those tokens one by one to the parser. 1 5

16.Parser ● The parser assigns meaning to tokens based on their context. ○ Assembles tokens into a parse tree. ● SQLite uses own solutions, called Lemon. ○ … which does the same job as YACC/BISON. ● Lemon is thread-safe. ● Lemon and its grammar file define the SQL language that SQLite understands. 1 6

17.Code Generator ● Code Generator analyzes the parser tree and generates opcode that performs the work of the SQL statement. ● The opcode is not SQLite’s API. ○ Details about the opcode change between releases. ● Sample opcode: sqlite> EXPLAIN SELECT 1; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 1 0 00 Start at 1 1 Integer 1 1 0 00 r[1]=1 2 ResultRow 1 1 0 00 output=r[1] 3 Halt 0 0 0 00 1 7

18.Virtual Machine ● Executes the Code Generator program. ● Some functions are implemented/inlined as bytecode directly by the code generator: ○ typeof(), coalesce() ● Some opcodes that Virtual Machine is executing are self-altering. 1 8

19.B-Tree ● SQLite stores ○ each table in a separate Binary Tree ○ each index in a separate Binary Tree ● All Binary Trees are stored in a single file. ● The file format is stable and is guaranteed to be compatible moving forward. 1 9

20.Pager ● Records are stored in fixed-sized pages. ○ Page size can be any power of 2 between 512 and 65536. ○ The default is 4096 (4KB). ● It is responsible for reading, writing and caching pages. ● It provides the rollback and atomic commit abstraction ● It takes care of locking of the database file. 2 0

21.OS Interface ● It is called VFS. ● VFS is what makes SQLite portable across operating systems ○ provides methods for reading/writing data from a file ○ obtaining randomness, ○ finding current time, etc. ● Whenever any of the other modules in SQLite needs to communicate with the operating system, they invoke methods in the VFS. 2 1

22.Accessories ● All helpers, that SQL uses: ○ memory allocation, ○ caseless string comparisons, ○ and even own printf() implementation. 2 2

23.SQLite; when not to use ● While it works great for many use-cases, it is not recommended for cases when: ○ Data is separated from the application (client/server). ■ NFS is NOT an option. ● but if you really need it then check if fnctl() works. ● Async I/O might be faster for NFS. ○ The dataset is very large or complex. ○ High concurrency is needed. 2 3

24.SQLite 2 4

25.SQLite 2 5

26.Berkeley DB ● Initially released in 1996. ● Key-value database. ● Database objects can use various access methods: btree, hash, heap, queue, recno. ● Currently, Berkeley DB name is given to three different products: ○ Berkeley DB (C edition) ○ Berkeley DB Java Edition ○ Berkeley DB XML 2 6

27.Berkeley DB ● Each major release cycle has introduced a single new major feature. Most Notable features: 1.85 Last UCB (University of California, Berkeley) release 2.0 Transactions, recovery 3.0 Queue AM, POSIX threads, subdatabases 4.0 Secondary Indices 4.1 Replication (master-slave) 4.2 Encryption 4.4 Sequence numbers 4.5 MVCC 5.0 Full Text, R-Tries, SQL Api, JDBC 2 7

28.Berkeley DB ● Oracle provides four Berkeley DB products: ○ Berkeley DB Data Store ○ Berkeley DB Concurrent Data Store ○ Berkeley DB Transactional Data Store ○ Berkeley DB High Availability 2 8

29.Berkeley DB ● Berkeley DB Data Store ○ An embeddable, high-performance data store. ○ Supports ■ multiple concurrent threads of control ■ multiple concurrent processes of control ○ Does not support locking. ■ Concurrent writes need to be locked on an application side. 2 9