overlayfs: Current Status and Upcoming Future

Overlayfs is a union filesystem for Linux that is widely used, for example by container runtimes. It is stacked on top of one or more underlying filesystems, merges several layers (filesystems or directories) hierarchically into a single tree, and lets different overlay mounts share the objects in the lower layers. Over the past year, many defects have been fixed and several new features have been merged into the upstream Linux kernel, such as "constant inode number", "prevent hardlink break" and "nfs export" by Amir Goldstein, and a user-space checker, fsck.overlay, is under development. However, overlayfs still has a number of drawbacks, for various technical reasons.

1. Overlayfs: Current status and upcoming future
Author / Email: Yi Zhang / yi.zhang@huawei.com
Version: 2018.6
HUAWEI TECHNOLOGIES CO., LTD. (www.huawei.com)

2. Overlayfs
- Introduction
  - What is overlayfs?
  - Overlayfs use cases
  - Layers
- Current status
- Upcoming future

3. Overlayfs - Introduction
- What is overlayfs?
  - OverlayFS is a modern union filesystem implementation for Linux (since v3.18)
  - Covers up files in the lower dir
  - Merges dirs
  - Copy up
  - Multiple lower layers (since v4.0)
- Use cases
  - Docker
  - …
[Figure: two overlay mounts, Overlayfs A and Overlayfs B, built on the same lower layers. Merged layer: D A B C (Overlayfs A), A B C D (Overlayfs B). Upper layer: D B (Overlayfs A), D (Overlayfs B). Lower layer 1: A C. Lower layer 2: A B.]
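The layer stack above maps onto the `lowerdir=`, `upperdir=` and `workdir=` mount options, where the colon-separated `lowerdir=` list is read top-most layer first. The following is a minimal sketch that only builds the option string and the command line; the function names and the example paths are hypothetical, and paths containing `:` or `,` are not escaped.

```python
def overlay_mount_options(lowerdirs, upperdir, workdir):
    """Build the -o option string for a read-write overlay mount.

    lowerdirs is ordered top-most layer first, matching the order in
    which the colon-separated lowerdir= list is interpreted.
    Note: this sketch does not escape ':' or ',' inside paths.
    """
    if not lowerdirs:
        raise ValueError("at least one lower dir is required")
    return ",".join([
        "lowerdir=" + ":".join(lowerdirs),
        "upperdir=" + upperdir,
        "workdir=" + workdir,
    ])


def overlay_mount_command(target, lowerdirs, upperdir, workdir):
    """Return the argv for mounting the overlay on target (needs root to run)."""
    opts = overlay_mount_options(lowerdirs, upperdir, workdir)
    return ["mount", "-t", "overlay", "overlay", "-o", opts, target]


if __name__ == "__main__":
    # Hypothetical paths: lower1 sits on top of lower2.
    print(" ".join(overlay_mount_command(
        "/merged", ["/lower1", "/lower2"], "/upper", "/work")))
```

The upper dir and the work dir have to live on the same filesystem, because copy up moves files from one to the other with a rename.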

4. Overlayfs - Layers
[Figure: software stack, top to bottom. User space: APP, APP, APP, LIB. Kernel: VFS, Overlay, EXT4 / XFS / ..., Block Layer, Device Driver.]

5. Overlayfs
- Introduction
- Current status
  - Basic features: copy up, whiteout, opaque dir.
  - New features: redirect dir, clone up, concurrent copy up, consistent st_ino & st_dev & d_ino, index, nfs export, consistent fd, delayed copy up.
- Upcoming future

6. Overlayfs - Basic features
- Copy up
  - A lower file is copied when it is opened for write
  - The entire file is copied to the upper layer
  - Inconsistent fd, st_dev & st_ino
  - Hard link breakup
[Figure: copy up. File A is copied from the lower layer to a temporary file (Tmp) in the work dir, then renamed to A in the upper layer.]
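The copy-then-rename sequence of basic copy up can be imitated in user space. This is only a sketch of the idea, not the kernel code: `copy_up` and its parameters are hypothetical names, and it assumes the work dir and the upper dir are on the same filesystem so that the final rename is atomic.

```python
import os
import shutil
import tempfile


def copy_up(lower_path, upper_path, work_dir):
    """Copy lower_path to upper_path through a temp file in work_dir."""
    fd, tmp_path = tempfile.mkstemp(dir=work_dir)
    try:
        with os.fdopen(fd, "wb") as tmp, open(lower_path, "rb") as src:
            shutil.copyfileobj(src, tmp)        # copy the entire file
        shutil.copystat(lower_path, tmp_path)   # carry over mode and timestamps
        os.rename(tmp_path, upper_path)         # publish the finished copy atomically
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)                 # never leave a partial copy behind
        raise
    return upper_path


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    dirs = {name: os.path.join(root, name) for name in ("lower", "upper", "work")}
    for path in dirs.values():
        os.mkdir(path)
    lower_a = os.path.join(dirs["lower"], "A")
    with open(lower_a, "w") as f:
        f.write("hello")
    upper_a = copy_up(lower_a, os.path.join(dirs["upper"], "A"), dirs["work"])
    # The copy is a new inode, which is why st_ino changed after copy up
    # before the origin feature existed.
    print(os.stat(lower_a).st_ino, os.stat(upper_a).st_ino)
```

Readers of the upper dir see either no file or the complete file, never a half-written one, because the data is written under a temporary name first.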

7. Overlayfs - Basic features
- Whiteout
  - Covers a deleted object in the lower layer
  - Requires the underlying fs to support d_type and exchange rename
- Opaque dir
  - A new dir covers a deleted object in the lower layer
  - Saves an opaque xattr on the upper object
[Figure: whiteout and opaque dir example with objects B, C and D.]
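How whiteouts and opaque dirs shape the merged view can be shown with a small in-memory model. On disk, a whiteout is a character device with device number 0/0 and an opaque dir carries the `trusted.overlay.opaque` xattr; the `Dir` class, the `WHITEOUT` marker and `merged_listing` below are hypothetical stand-ins for illustration only.

```python
import stat

WHITEOUT = object()  # stand-in for a whiteout entry in a layer


def is_whiteout(mode, rdev):
    """True if st_mode/st_rdev describe a whiteout (char device 0/0)."""
    return stat.S_ISCHR(mode) and rdev == 0


class Dir:
    """One layer's version of a directory."""

    def __init__(self, entries=None, opaque=False):
        self.entries = dict(entries or {})
        self.opaque = opaque  # models the opaque xattr


def merged_listing(layers):
    """Return the visible names of a merged dir; layers are top-most first."""
    visible, hidden = set(), set()
    for layer in layers:
        for name, obj in layer.entries.items():
            if name in visible or name in hidden:
                continue              # already decided by a higher layer
            if obj is WHITEOUT:
                hidden.add(name)      # covers the name in all lower layers
            else:
                visible.add(name)
        if layer.opaque:
            break                     # nothing below an opaque dir shows through
    return sorted(visible)


if __name__ == "__main__":
    lower = Dir({"B": "file", "C": "file"})
    upper = Dir({"B": WHITEOUT, "D": "file"})
    print(merged_listing([upper, lower]))                       # B is covered
    print(merged_listing([Dir({"D": "file"}, opaque=True), lower]))
```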

8. Overlayfs – Redirect dir feature
- Redirect dir (since v4.10)
  - Supports renaming a dir that contains dirs from the lower layer
  - Saves the original path in a redirect xattr on the upper dir
  - Lookup follows the redirect path
[Figure: generic merge dir versus redirect merge dir, showing DirA, DirB and DirC with files A, B, C and D, a "Redirect to DirB" xattr and a whiteout.]

9. Overlayfs – Clone up & Concurrent copy up
- Clone up (v4.10)
  - Requires the underlying fs to support reflink (xfs, btrfs)
  - Lower layers and the upper layer must be on the same file system
- Concurrent copy up (v4.11)
  - Based on tmpfile support in the underlying fs
  - Relaxes the global copy up lock
[Figure: copy up of files A, B, C ... X, Y, Z through the workdir, before and with concurrent copy up.]

10. Overlayfs – Consistent st_dev & st_ino & d_ino
- Origin feature (since v4.12)
  - Requires the underlying fs to support file handles
  - Uses the overlay st_dev and the origin st_ino
- Impure dir feature (since v4.12)
  - A dir may contain copied up objects
  - Saves an impure xattr on the upper parent dir
- Xino feature (since v4.17)
  - Requires the underlying fs to have enough unused bits in the inode number (ext4, xfs)
  - Combines the lower fsid and the origin st_ino
[Figure: file A before and after copy up. The upper copy keeps a file handle of its lower origin, and the upper parent dir is marked impure.]
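The xino idea of combining the lower fsid with the origin st_ino can be written as simple bit arithmetic. This is a simplified sketch under the assumption that the fsid goes into the top `xino_bits` bits of a 64-bit inode number; `xino_map` and its parameters are hypothetical names, not the kernel's.

```python
def xino_map(fsid, real_ino, xino_bits):
    """Pack fsid into the top xino_bits bits of a 64-bit inode number.

    Returns None when real_ino already uses the high bits, in which case
    the two values cannot be combined without a collision.
    """
    if not 0 < xino_bits < 64:
        raise ValueError("xino_bits must be between 1 and 63")
    if fsid >> xino_bits:
        raise ValueError("fsid does not fit in xino_bits")
    shift = 64 - xino_bits
    if real_ino >> shift:
        return None                      # not enough unused bits in real_ino
    return (fsid << shift) | real_ino


if __name__ == "__main__":
    # Same inode number 42 on two underlying filesystems maps to two
    # different overlay inode numbers.
    print(hex(xino_map(0, 42, 2)), hex(xino_map(1, 42, 2)))
```

This is why the slide lists ext4 and xfs: the scheme only works when the underlying filesystem leaves the high bits of its inode numbers unused.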

11. Overlayfs – Index (avoid hard link breakup)
- Index feature (since v4.13)
  - Requires the underlying fs to support file handles
  - Copy up as an index file
  - Saves the file handle and an nlink xattr in the index
  - Links the index to the upper target
[Figure: hard links A, B, C without index and with index. With index, the index dir holds an entry (IDX) that stores the file handle and nlink=U+1.]
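Hard link breakup, and how an index avoids it, can be reproduced with ordinary directories. The sketch below is an illustration only: `naive_copy_up` and `indexed_copy_up` are hypothetical names, and the index entry is named after st_dev and st_ino as a stand-in for the file handle that the slide says is saved in the index.

```python
import os
import shutil
import tempfile


def naive_copy_up(lower_path, upper_path):
    """Copy each name independently: lower hard links become separate files."""
    shutil.copy2(lower_path, upper_path)


def indexed_copy_up(lower_path, upper_path, index_dir):
    """Copy each lower inode once into index_dir, then hard-link it into place."""
    st = os.stat(lower_path)
    # Assumption: dev/ino stands in for the lower file handle here.
    index_path = os.path.join(index_dir, "%x-%x" % (st.st_dev, st.st_ino))
    if not os.path.exists(index_path):
        shutil.copy2(lower_path, index_path)   # first copy up of this inode
    os.link(index_path, upper_path)            # later names share the same inode


def make_layers():
    """Create lower/upper/index dirs with lower/A and lower/B hard-linked."""
    root = tempfile.mkdtemp()
    paths = {n: os.path.join(root, n) for n in ("lower", "upper", "index")}
    for p in paths.values():
        os.mkdir(p)
    with open(os.path.join(paths["lower"], "A"), "w") as f:
        f.write("shared")
    os.link(os.path.join(paths["lower"], "A"), os.path.join(paths["lower"], "B"))
    return paths


if __name__ == "__main__":
    p = make_layers()
    for name in ("A", "B"):
        indexed_copy_up(os.path.join(p["lower"], name),
                        os.path.join(p["upper"], name), p["index"])
    a = os.stat(os.path.join(p["upper"], "A"))
    b = os.stat(os.path.join(p["upper"], "B"))
    print(a.st_ino == b.st_ino)
```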

12. Overlayfs – Constant file descriptor
- Stack file ops (v4.18 ?)
  - Uses an overlay file struct so that every fd refers to the overlay file
  - Finds the real file and switches to operating on the real underlying file
[Figure: Fd1 and Fd2 opened on file A before and after copy up, without and with stacked file operations (overlay f_ops in front of the real file).]

13. Overlayfs – Delayed copy up
- Delayed copy up (metadata copy up, v4.18 ?)
  - Copies up only the metadata when no data is written (chattr, chmod, rename...)
  - Saves a meta copy xattr on the upper file
  - Looks up the real data by name or by redirect xattr
  - Multiple metadata copied up files
[Figure: files A, B and C split into metadata and real data, showing a meta copy up and a redirect to the real data.]

14. Overlayfs
- Introduction
- Current status
- Upcoming future
  - Overlay feature set
  - Offline layer check tool
  - Merge file (partial data copy up)

15. Overlayfs – Upcoming future
- Overlay feature set (work in progress)
  - Features are enabled by mount options
  - Some features currently cannot be disabled completely once they have been enabled
  - Mark compatible, ro-compatible and incompatible feature sets
    - Compatible: origin, impure dir
    - Ro-compatible: index, nfs_export
    - Incompatible: redirect dir, metadata copy up
  - Refuse to mount if an unsupported feature is detected
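The three feature classes suggest a mount-time check along the following lines. This sketch assumes the same semantics that ext-style filesystems give to compatible, ro-compatible and incompatible flags (unknown ro-compatible features still allow a read-only mount); the slide itself only states that a mount is refused when an unsupported feature is detected, and all names below are hypothetical.

```python
from enum import Enum


class MountDecision(Enum):
    READ_WRITE = "rw"
    READ_ONLY = "ro"
    REFUSE = "refuse"


def check_feature_set(on_disk, supported):
    """Decide how layers may be mounted.

    on_disk and supported map 'compat', 'ro_compat' and 'incompat' to
    sets of feature names. Unknown compat features are ignored.
    """
    def unknown(kind):
        return set(on_disk.get(kind, ())) - set(supported.get(kind, ()))

    if unknown("incompat"):
        return MountDecision.REFUSE       # layout cannot be interpreted safely
    if unknown("ro_compat"):
        return MountDecision.READ_ONLY    # safe to read, unsafe to modify
    return MountDecision.READ_WRITE


if __name__ == "__main__":
    layers = {"compat": {"origin"}, "ro_compat": {"index"},
              "incompat": {"redirect_dir"}}
    old_kernel = {"compat": {"origin"}, "ro_compat": set(),
                  "incompat": {"redirect_dir"}}
    print(check_feature_set(layers, old_kernel))
```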

16. Overlayfs – Upcoming future
- Offline layer check tool (RFC)
  - Overlayfs-progs: fsck.overlay
  - Automatically checks and fixes inconsistencies in the underlying dirs
    - Invalid whiteouts: whiteout exposure, failure to remove a dir...
    - Invalid/duplicate redirect xattr: merging with the wrong dir...
    - Missing impure xattr: inconsistent d_ino...
  - Test cases: xfstests
[Figure: examples of inconsistent layers, with files A, C and D, whiteouts, and DirB and DirC both carrying "Redirect to DirA" for the same lower DirA.]
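One check such a tool could run is a scan for whiteouts that cover nothing. The sketch below assumes that a whiteout counts as invalid when no lower layer has an object with the same name; that definition, and the function name, are assumptions made for illustration and are not taken from fsck.overlay itself.

```python
def find_orphan_whiteouts(upper_whiteouts, lower_layers):
    """Return whiteout names that do not cover any lower object.

    upper_whiteouts: set of names that are whiteouts in one upper dir.
    lower_layers: list of sets, the names present in the same dir of
    each lower layer.
    """
    covered = set().union(*lower_layers)
    return sorted(set(upper_whiteouts) - covered)


if __name__ == "__main__":
    # "A" hides a real lower file; "C" hides nothing and could be removed.
    print(find_orphan_whiteouts({"A", "C"}, [{"A"}, {"B"}]))
```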

17. Overlayfs – Upcoming future
- Offline layer check tool: Todo
  - Automatically fix a missing feature set
  - Detect whether the overlayfs is already mounted
  - Check and fix origin xattr, index objects and metadata copy up
  - Set or clear the indicated overlay features
  - Release the first version!

18. Overlayfs – Merge file (Partial data copy up)
- Merge file (work in progress)
  - Copy up across filesystems (or on a filesystem without reflink support) is currently time-consuming and wastes space
  - Copy up metadata and partial data blocks instead of the whole file
  - Create a data map and merge data blocks between the layers
[Figure: file A. Merged view: DataA DataB DataC DataD. Upper layer: DataB DataD. Lower layer: DataA DataB DataC.]
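The data map idea can be sketched with one block map per layer, where a read takes each block from the top-most layer that has it. This is a toy model of the proposal, not the work-in-progress implementation: the names, the tiny block size and the whole-block write are all assumptions.

```python
BLOCK_SIZE = 4  # deliberately tiny, for illustration only


def write_block(upper, index, data):
    """Record one whole block in the upper layer's data map."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("this sketch only handles whole-block writes")
    upper[index] = data          # only the touched block is copied up


def merged_read(layers, nblocks):
    """Read nblocks blocks; layers are {block_index: bytes}, top-most first."""
    out = []
    for i in range(nblocks):
        for layer in layers:
            if i in layer:
                out.append(layer[i])
                break
        else:
            out.append(b"\0" * BLOCK_SIZE)   # hole: no layer has this block
    return b"".join(out)


if __name__ == "__main__":
    lower = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}
    upper = {}
    write_block(upper, 1, b"bbbb")   # overwrite an existing block
    write_block(upper, 3, b"DDDD")   # append a new block
    print(merged_read([upper, lower], 4))
```

A write that covers only part of a block would first have to read the rest of that block from the lower layer, which this sketch leaves out.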

