Linux SCTP is catching up and going above

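SCTP is a transport protocol, like TCP and UDP, that originated in the IETF SIGTRAN working group in the early 2000s, with the initial goal of carrying PSTN signalling over IP networks. It has featured multi-homing and multi-streaming from the start, and many improvements since then help it serve other purposes too, such as support for partial reliability and stream scheduling.
Linux SCTP arrived late and then stalled. It was not up to date with the published RFCs, it was far behind other systems such as BSD, and it suffered from performance problems. Over the past two years we have worked to resolve these issues and focused on many improvements. All features from the published RFCs are now fully supported in Linux, and some features from draft RFCs are already in progress. We have also seen clear performance improvements in a variety of scenarios.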
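In this talk we will first give a quick review of SCTP basics, including:
Background: why SCTP is used for PSTN signalling transport, and why other applications are using or will use SCTP.
Architecture: the general SCTP structures and procedures implemented in the Linux kernel.
VS TCP/UDP: an overview of the features and applicability of SCTP, TCP and UDP.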
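Then we will go through the improvements made in the past two years, including:
SCTP-related projects in Linux: besides the kernel part, there are also lksctp-tools, sctp-tests, tahi-sctp, and others.
Features implemented lately: RFC features such as stream scheduling, message interleaving, stream reconfiguration and partial reliability policies, plus many CMSGs, SndInfos and socket APIs.
Improvements made recently: large patch sets such as SCTP offload, the transport hashtable, SCTP diag and full SELinux support.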
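VS BSD: we will look at how Linux and BSD currently differ with regard to SCTP. You may be surprised to find that we have gone further than other systems.
We will finish by reviewing what is on our radar and the next steps, such as:
Ongoing features: SCTP NAT and SCTP CMT, two important features, are in progress and have taken shape, and more performance improvements in the kernel have also been started.
Code refactor: a new congestion framework will be introduced so that SCTP can be extended with more congestion algorithms more flexibly.
Hardware support: HW CRC checksum and GSO will certainly improve performance. This requires new segmentation logic, with both the segmentation code and the hardware operating on SCTP chunks.
RFC document improvements: we believe that more extensions and revisions will make SCTP more widely used.
Given its power and complexity, SCTP is bound to face many challenges and threats, but we believe that we have made it, and will continue to make it, better than other systems and better than other transport protocols. Please join us; Linux SCTP needs your help too!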


1. Linux SCTP is catching up and going above
Red Hat, Inc.
Marcelo Ricardo Leitner, Xin Long
Linux Plumbers Conference in Vancouver, 2018

2. What and Why is SCTP: Outline
1 What and Why is SCTP
    Architecture
    SCTP vs TCP
2 What We’ve Done on Linux
    Projects
    Improvements Made Recently
    Features Implemented Lately
    LINUX vs BSD
3 What’s Next
    Features Development
    Code Refactor
    Hardware Support

3. What and Why is SCTP / Architecture: Structures overview
1 Endpoint
2 Association
3 Transport
4 Stream
5 Msg
6 Packet
7 Chunk

4. What and Why is SCTP / Architecture: SCTP Structures in Linux

5. What and Why is SCTP / Architecture: SCTP Procedures in Linux

6. What and Why is SCTP / SCTP vs TCP: SCTP vs TCP/UDP on Features

7. What and Why is SCTP / SCTP vs TCP: SCTP vs TCP on Performance
Performance?

8. What We’ve Done on Linux: Outline

9. What We’ve Done on Linux / Projects: lksctp-tools (lib and unit test)
MANIFEST
--------
.
|-- bin
|-- doc
|-- man
|-- src
    |-- apps
    |-- func_tests
    |-- include
    |   `-- netinet
    |-- lib
    |-- testlib
    `-- withsctp
apps:
- sctp_darn, sctp_test
- sctp_status, sctp_xconnect
- peel_client, peel_server
- bindx_test, myftp, nagle_rcv, nagle_snd
Unit Test: "Look in src/func_tests and in lksctp-tests package for examples of tests. Please do not submit code that fails its own tests or any of the unit tests. If it fails a functional test, please document that with the submission."
lib:
- sctp_send, sctp_sendmsg, sctp_recvmsg
- sctp_connectx_orig, sctp_connectx2, sctp_connectx3
- sctp_bindx, sctp_opt_info
- sctp_peeloff, sctp_peeloff_flags
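The library calls listed on this slide sit on top of the kernel's SCTP socket interface. As a rough sketch that does not use lksctp-tools at all, the two socket styles can be probed from Python's standard socket module. The helper name open_sctp is hypothetical, and the fallback value 132 is the IANA protocol number for SCTP, used only if the module does not export IPPROTO_SCTP.

```python
import socket

# Use the module constant when present; 132 is the IANA protocol number for SCTP.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def open_sctp(style):
    """Try to open an SCTP socket (hypothetical helper).

    style is "one-to-one" (SOCK_STREAM) or "one-to-many" (SOCK_SEQPACKET).
    Returns the socket, or None if the kernel or sandbox refuses it.
    """
    sock_type = {
        "one-to-one": socket.SOCK_STREAM,
        "one-to-many": socket.SOCK_SEQPACKET,
    }[style]
    try:
        return socket.socket(socket.AF_INET, sock_type, IPPROTO_SCTP)
    except OSError:
        # For example, SCTP support is not built in or the module is not loaded.
        return None


if __name__ == "__main__":
    for style in ("one-to-one", "one-to-many"):
        sock = open_sctp(style)
        print(style, "available" if sock else "not available")
        if sock:
            sock.close()
```

Calls such as sctp_bindx or sctp_peeloff have no counterpart in the standard socket module, so anything beyond opening the socket would need the C library or a binding to it.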

10. What We’ve Done on Linux / Projects: sctp-tests (regression test)
27 test cases so far

11. What We’ve Done on Linux / Projects: tahi-sctp (conformance test)
RFC4960: Association Initialization
RFC4960: Association Termination
RFC4960: Fault Management
RFC4960: Error Cause
RFC4960: Chunk Bundling
RFC4960: User Data Transfer
RFC4960: Retransmission Timer
RFC4960: Congestion Control
RFC4960: Path MTU Discovery
RFC4960: Multi-Homed Endpoints
RFC4960: Explicit Congestion Notification
RFC4960: Packet Format
RFC4960: Miscellaneous Test
RFC4895: Authentication Chunks
RFC5061: Dynamic Address Reconfiguration
RFC3758: Partial Reliability Extension
RFC3554: Internet Protocol Security

12. What We’ve Done on Linux / Projects: Others
Syzkaller (fuzz test)
Codenomicon (fuzz test)
Packetdrill (conformance test)
Scapy (packet generating)
More?

13. What We’ve Done on Linux / Improvements Made Recently: Transport Rhashtable 1

14. What We’ve Done on Linux / Improvements Made Recently: Transport Rhashtable 2
1 1-to-many (with "the same dport and different dip") lookup is fast
2 1-step to find both transport and asoc
3 Rhashtable (rhlist) features: rcu_lock and resize memory
4 Why not use the key with hash(dport, lport, dip, lip)?
5 Why not make rhashtable per endpoint/socket?
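One way to picture points 1, 2 and 4 is a single table whose key leaves out the local address, with each bucket holding a short list of transports that is filtered by local address afterwards. The Python sketch below is only a model of that idea. The key layout (dip, dport, lport) is an assumption made for illustration, since the slide only asks why the full four-tuple is not used, and the names TransportTable, Transport and Association are hypothetical rather than kernel structures.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Association:
    name: str
    local_addrs: frozenset  # a multi-homed association owns several local IPs
    local_port: int


@dataclass
class Transport:
    peer_addr: str
    peer_port: int
    asoc: Association  # each transport points back at its association


class TransportTable:
    """Toy stand-in for a hashtable whose buckets are lists (rhlist-like)."""

    def __init__(self):
        self._buckets = defaultdict(list)

    @staticmethod
    def _key(peer_addr, peer_port, local_port):
        # Assumed key: the local IP is deliberately not part of it.
        return (peer_addr, peer_port, local_port)

    def insert(self, transport):
        key = self._key(transport.peer_addr, transport.peer_port,
                        transport.asoc.local_port)
        self._buckets[key].append(transport)

    def lookup(self, local_addr, local_port, peer_addr, peer_port):
        """Return (transport, association) in one step, or (None, None)."""
        key = self._key(peer_addr, peer_port, local_port)
        for transport in self._buckets.get(key, []):
            if local_addr in transport.asoc.local_addrs:
                return transport, transport.asoc
        return None, None


if __name__ == "__main__":
    table = TransportTable()
    locals_ = frozenset({"10.0.0.1", "10.0.1.1"})
    # One socket, many peers sharing dport 8888 but with different dip.
    for i, dip in enumerate(["172.16.1.1", "172.16.1.2", "172.16.2.1"]):
        table.insert(Transport(dip, 8888, Association(f"asoc{i}", locals_, 8888)))
    transport, asoc = table.lookup("10.0.1.1", 8888, "172.16.1.2", 8888)
    print(transport.peer_addr, asoc.name)
```

In this model, peers that share a destination port but differ in destination IP land in different buckets, so a one-to-many socket with many associations does not degrade into a long linear scan.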

15. What We’ve Done on Linux / Improvements Made Recently: SCTP Offload 1

16. What We’ve Done on Linux / Improvements Made Recently: SCTP Offload 2

17. What We’ve Done on Linux / Improvements Made Recently: SCTP Diag 1
[iproute2]# ss --sctp -n -l
State    Recv-Q Send-Q Local Address:Port         Peer Address:Port
LISTEN   0      128    172.16.254.254:8888        *:*
LISTEN   0      5      127.0.0.1:1234             *:*
LISTEN   0      5      127.0.0.1:1234             *:*
- ESTAB  0      0      127.0.0.1%lo:1234          127.0.0.1:4321
LISTEN   0      128    172.16.254.254:8888        *:*
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.253.253:8888
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.1.1:8888
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.1.2:8888
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.2.1:8888
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.2.2:8888
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.3.1:8888
- ESTAB  0      0      172.16.254.254%eth1:8888   172.16.3.2:8888
LISTEN   0      0      127.0.0.1:4321             *:*
- ESTAB  0      0      127.0.0.1%lo:4321          127.0.0.1:1234
[iproute2]# ss -Snai
State    Recv-Q Send-Q Local Address:Port         Peer Address:Port
LISTEN   0      1      127.0.0.1:27375            *:*
         locals:127.0.0.1,192.168.42.2, v4mapped:1
ESTAB    0      0      127.0.0.1:37636            127.0.0.1:27375
         locals:0.0.0.0, v4mapped:1
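The listing above appears to be hierarchical: each LISTEN row looks like an endpoint, and the "- ESTAB" rows under it look like the associations that belong to that endpoint. Taking that reading as an assumption, a small Python sketch can group the rows accordingly. The function name group_associations and the trimmed SAMPLE text are made up for illustration, and the parser only handles the plain five-column rows shown on the slide.

```python
SAMPLE = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 5 127.0.0.1:1234 *:*
- ESTAB 0 0 127.0.0.1%lo:1234 127.0.0.1:4321
LISTEN 0 0 127.0.0.1:4321 *:*
- ESTAB 0 0 127.0.0.1%lo:4321 127.0.0.1:1234
"""


def group_associations(text):
    """Group association rows under the endpoint row that precedes them."""
    endpoints = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields:
            continue
        if fields[0].endswith("-"):
            # Association row: a leading marker, then the usual five columns.
            state, _recvq, _sendq, local, peer = fields[1:6]
            if endpoints:
                endpoints[-1]["assocs"].append(
                    {"state": state, "local": local, "peer": peer})
        else:
            state, _recvq, _sendq, local, peer = fields[:5]
            endpoints.append(
                {"state": state, "local": local, "peer": peer, "assocs": []})
    return endpoints


if __name__ == "__main__":
    for ep in group_associations(SAMPLE):
        print(ep["local"], "has", len(ep["assocs"]), "association(s)")
```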

18. What We’ve Done on Linux / Improvements Made Recently: SCTP Diag 2

19. What We’ve Done on Linux / Improvements Made Recently: Others
Dst source address selection
Rwnd improvements
Partial reliability fixes
MTU handling refactor
PMTU discovery (critical) fixes
CRC32c offloading on virtual interfaces
Some code cleanup
More ...
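For context on the CRC32c item: SCTP protects each packet with a CRC32c checksum, which is the per-packet work that offloading takes off the CPU. The following Python sketch is a plain bitwise software version, far slower than anything a kernel or NIC would use. The helper sctp_checksum is a hypothetical name, and it assumes the checksum occupies bytes 8 to 11 of the common header and is treated as zero while the CRC is computed.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def sctp_checksum(packet: bytes) -> int:
    """CRC32c over a packet with the checksum field zeroed (hypothetical helper).

    Assumes the common header layout: source port (2 bytes), destination
    port (2), verification tag (4), checksum (4).
    """
    zeroed = packet[:8] + b"\x00\x00\x00\x00" + packet[12:]
    return crc32c(zeroed)


if __name__ == "__main__":
    print(hex(crc32c(b"123456789")))  # standard check value: 0xe3069283
```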

20. What We’ve Done on Linux / Features Implemented Lately: Overview
Stream Schedulers and User Message Interleaving for the Stream Control Transmission Protocol [RFC8260]
Additional Policies for the Partially Reliable Stream Control Transmission Protocol Extension [RFC7496]
Stream Control Transmission Protocol (SCTP) Stream Reconfiguration [RFC6525]
Sockets API Extensions for the Stream Control Transmission Protocol (SCTP) [RFC6458]
Full SELinux support
More ...

21. What We’ve Done on Linux / Features Implemented Lately: Stream Schedulers

22. What We’ve Done on Linux / Features Implemented Lately: Message Interleaving

23. What We’ve Done on Linux / Features Implemented Lately: PR_SCTP policies
1 Timed Reliability SCTP_PR_SCTP_TTL
2 Limited Retransmissions Policy SCTP_PR_SCTP_RTX
    When dequeuing chunks from A
    When dequeuing chunks from C
    When moving chunks from B to C
    After receiving a SACK, check B and C
3 Priority Policy SCTP_PR_SCTP_PRIO
    Before enqueuing chunk into A
    And No Enough TX Buffer
    Then try to drop C -> B -> A.
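The three policies can be modelled in a few lines of Python. This is a sketch under stated assumptions, not kernel code: the slide does not define the letters, so A is taken to be the queue of unsent chunks, B the transmitted chunks awaiting acknowledgement, and C the retransmit queue. It also assumes that a larger priority value means lower priority. Chunk, abandoned and prune_for are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass

TTL, RTX, PRIO = "ttl", "rtx", "prio"  # stand-ins for SCTP_PR_SCTP_TTL/RTX/PRIO


@dataclass
class Chunk:
    name: str
    size: int
    policy: str = ""    # empty string means fully reliable
    value: int = 0      # lifetime in ms, retransmission limit, or priority
    queued_at: int = 0  # ms timestamp when the chunk was queued
    rtx_count: int = 0  # how many times it has been retransmitted


def abandoned(chunk, now):
    """TTL/RTX test, meant to run at each dequeue or requeue point."""
    if chunk.policy == TTL:
        return now - chunk.queued_at > chunk.value
    if chunk.policy == RTX:
        return chunk.rtx_count > chunk.value
    return False


def prune_for(new_chunk, queues, needed):
    """Priority policy: free `needed` bytes before enqueuing new_chunk.

    `queues` is given in drop order (C, then B, then A). Only PRIO chunks
    with lower priority (larger value) than new_chunk are dropped.
    Returns the names of the dropped chunks.
    """
    dropped = []
    for queue in queues:
        for chunk in list(queue):  # iterate over a copy while removing
            if needed <= 0:
                return dropped
            if chunk.policy == PRIO and chunk.value > new_chunk.value:
                queue.remove(chunk)
                needed -= chunk.size
                dropped.append(chunk.name)
    return dropped


if __name__ == "__main__":
    A = deque([Chunk("a1", 100, PRIO, 5)])
    B = deque([Chunk("b1", 100, PRIO, 1), Chunk("b2", 100, PRIO, 9)])
    C = deque([Chunk("c1", 100, PRIO, 7)])
    new = Chunk("n", 200, PRIO, 3)
    print(prune_for(new, [C, B, A], needed=200))  # ['c1', 'b2']
```

In the demo, c1 is dropped first because C is searched first, b1 survives because its priority is higher than the new chunk's, and a1 is never touched because enough space has been freed by then.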

24. What We’ve Done on Linux / Features Implemented Lately: Stream Reconfig
1 Add Outgoing Streams: No restrictions
2 Add Incoming Streams: No restrictions
3 Reset Outgoing Streams: Reset stream 1, b have to be empty
4 Reset Incoming Streams: Peer will send Outgoing Stream request for which it has to follow the above rule
5 Reset SSN/TSN: All queues have to be empty: A, B, C, a, b, c

25. What We’ve Done on Linux / Features Implemented Lately: Socket APIs
1 User APIs: sctp_sendv, sctp_recvv
2 Snd Info Flags: SENDALL, MSG_MORE
3 Cmsgs: PR_INFO, AUTH_INFO, DSTv4, DSTv6

26. What We’ve Done on Linux / LINUX vs BSD: Linux vs BSD on Features
Chunks
    LINUX: ongoing
    BSD: SCTP_NR_SELECTIVE_ACK (draft), SCTP_PACKET_DROPPED (draft), SCTP_PAD_CHUNK
Others
    LINUX: sctp_do_sm(), transport rhashtable, offload, diag
    BSD: sctp_cc_functions

27. What’s Next: Outline

28. What’s Next / Features Development
Support more chunks, APIs, sockopts and notifications.
Other features from draft RFCs, like SCTP NAT and CMT.
SCTP performance improvement (including sndbuf auto-tuning).
Add more test cases in sctp-tests.

29. What’s Next / Code Refactor
Some huge and messy functions.
Congestion framework.
Refactor lksctp-tools.