In these slides, we discuss the architecture of iptables and show how to implement your own iptables module.
Building on that understanding, we implement a layer 7 DNS parser as an iptables module.
After that, we study how Kubernetes Services work and explain why Kubernetes cannot do layer 7 load balancing for TCP connections but can for UDP.
IPTABLES Series
Introduction to IPTABLES
Learn IPTABLES in a Docker environment.
Implementation of IPTABLES
User Space/Kernel Space
Implement our own iptables modules
Kubernetes Service discussion
Layer 4 load-balancing, why?
Modify the kernel module to make it support Layer 7, really?
IPTABLES/EBTABLES
Example
iptables -t nat -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
ebtables -t filter -I INPUT --log --log-prefix 'ctc/ebtable/filter-input' --log-level debug
Components
Chain -> Insert/Append (-I/-A)
Table
Match (Module) -> built-in/module
Target (Module) -> built-in/module
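To make the four components concrete, here is a minimal Python sketch that picks them out of an iptables command line. The function name `iptables_components` is hypothetical and exists only for illustration; it handles just the `-t`, `-A`/`-I`, `-m`, and `-j` options used in the example above, not the full iptables grammar.

```python
import shlex

def iptables_components(command):
    """Split an iptables command into table, chain, matches, and target.

    Illustrative only: recognises -t, -A/-I, -m and -j, nothing else.
    """
    tokens = shlex.split(command)
    # iptables uses the "filter" table when -t is not given.
    parts = {"table": "filter", "chain": None, "operation": None,
             "matches": [], "target": None}
    for option, value in zip(tokens, tokens[1:]):
        if option == "-t":
            parts["table"] = value
        elif option in ("-A", "-I"):
            parts["operation"] = "append" if option == "-A" else "insert"
            parts["chain"] = value
        elif option == "-m":
            parts["matches"].append(value)
        elif option == "-j":
            parts["target"] = value
    return parts

example = ("iptables -t nat -A OUTPUT ! -d 127.0.0.0/8 "
           "-m addrtype --dst-type LOCAL -j DOCKER")
print(iptables_components(example))
```

For the Docker rule this reports the nat table, an append to the OUTPUT chain, the addrtype match module, and the DOCKER target.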
Last Time
Demo Environment
ContainerA
Linux Bridge
Eth0
Eth0
Veth0
https://github.com/hwchiu/network-study/tree/master/iptables/k8s
Customize IPTABLES module
Drop a DNS packet if its requested domain is the one we expect
Layer 7 processing.
Architecture
UserSpace
Parameter, usage
Kernel space
Implementation
Communicate by
getsockopt
setsockopt
User Space
Repo: git.netfilter.org/iptables
Version: 1.6.1 (In my demo)
Includes
iptables commands, modules
Parses parameters and then sends them to the kernel
Written in C
User Space
Directory
extension
old/new style
iptables
Extension
Parses the arguments and stores them in memory.
iptables sends the data to kernel space later.
Implement
Implement a basic module
Support one argument
domain (string)
Name-> hwchiu
Output
iptables -L
iptables-save
Kernel Space
Repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Version: 4.15 (In my demo)
Whole Linux kernel
Implement the function (match/target)
Built as a built-in function or a kernel module.
Written in C
Kernel Space
net/netfilter/
Implement the function then register to kernel.
Called when someone (iptables) invokes it via
setsockopt
getsockopt
DNS Format (RFC 1035)
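The layer 7 check boils down to reading the question name out of the DNS message. The demo module does this in C inside the kernel; the Python sketch below only illustrates the RFC 1035 layout (12-byte header, then length-prefixed labels ending in a zero byte). The names `parse_dns_qname`, `should_drop`, and `build_query` are hypothetical helpers, and name compression is deliberately not handled.

```python
import struct

def parse_dns_qname(payload):
    """Return the first question name of a raw DNS message (the UDP payload)."""
    if len(payload) < 12:
        raise ValueError("DNS header is 12 bytes")
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (six 16-bit fields).
    _id, _flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", payload[:12])
    if qdcount < 1:
        raise ValueError("message has no question")
    labels, offset = [], 12
    while True:
        if offset >= len(payload):
            raise ValueError("truncated question name")
        length = payload[offset]
        offset += 1
        if length == 0:          # zero-length label terminates the name
            break
        if length > 63:          # top bits set: compression pointer, unsupported here
            raise ValueError("compressed or invalid label")
        label = payload[offset:offset + length]
        if len(label) != length:
            raise ValueError("truncated label")
        labels.append(label.decode("ascii"))
        offset += length
    return ".".join(labels)

def should_drop(payload, blocked_domain):
    """True if the payload is a DNS query asking for blocked_domain."""
    try:
        name = parse_dns_qname(payload)
    except ValueError:
        return False             # not a DNS message we understand: let it pass
    flags = struct.unpack("!H", payload[2:4])[0]
    is_query = (flags & 0x8000) == 0   # QR bit is 0 for queries
    return is_query and name.lower() == blocked_domain.lower()

def build_query(domain):
    """Build a minimal A-record query, used only to exercise the parser."""
    header = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in domain.split("."))
    return header + qname + b"\x00" + struct.pack("!2H", 1, 1)
```

A kernel module would perform the same walk over the packet data after skipping the IP and UDP headers.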
Summary
You can inspect the packet in iptables modules.
Do anything you want
Coding in kernel space
Kubernetes Service
Why do people call it a Layer 4 load balancer?
We could do layer 7 parsing in iptables.
Kubernetes
Service
ClusterIP
NodePort
LoadBalancer
Headless
Demo Environment
Kubernetes
Pod
Service
ClusterIP
Deployment
Three Pods
Pod Pod
ClusterIP
Workflow
PREROUTING
Packets
KUBE-SERVICES -> KUBE-SVC-XXX -> KUBE-SEP-XXX -> DNAT
Jump if the protocol and ClusterIP match (Module: TCP/UDP)
Jump if the random module returns true (Module: Random)
Match protocol
Choose ENDPOINTS
OUTPUT
Again
Random
If P < 0.25 -> Endpoint1
Else if P < 0.33 -> Endpoint2
Else if P < 0.5 -> Endpoint3
Else -> Endpoint4
Endpoint1
Endpoint2
Endpoint3
Endpoint4
10.244.1.3
10.244.2.32
10.244.3.63
10.244.3.23
Request
Endpoint1: P = 1/4 = 0.25
Endpoint2: P = 3/4 * 1/3 = 0.25
Endpoint3: P = 3/4 * 2/3 * 1/2 = 0.25
Endpoint4: P = 1 - 0.75 = 0.25
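The arithmetic above can be checked with a short Python sketch. It assumes the rules are evaluated in order, that rule i of n matches with probability 1/(n - i), and that the last rule always matches. The function names are hypothetical.

```python
from fractions import Fraction

def rule_probabilities(n):
    """Per-rule match probabilities for n endpoints: 1/n, 1/(n-1), ..., 1."""
    return [Fraction(1, n - i) for i in range(n)]

def effective_probabilities(per_rule):
    """Probability that a request ends up at each endpoint.

    A rule is only reached if every earlier rule did not match, so its
    effective probability is its own probability times the remaining share.
    """
    remaining = Fraction(1)
    result = []
    for p in per_rule:
        result.append(remaining * p)
        remaining *= 1 - p
    return result

per_rule = rule_probabilities(4)
print(per_rule)
print(effective_probabilities(per_rule))
```

With four endpoints the per-rule values are 1/4, 1/3, 1/2, and 1, and every endpoint ends up with an effective share of 1/4.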
Workflow
PREROUTING
Packets
KUBE-SERVICES -> KUBE-SVC-XXX -> KUBE-SEP-XXX -> DNAT
Jump if the protocol and ClusterIP match (Module: UDP)
Jump if the random module returns true (Module: Random)
Match protocol
Choose ENDPOINTS
dest=10.96.121.30 at KUBE-SERVICES, KUBE-SVC-XXX, and KUBE-SEP-XXX
dest=10.244.0.3 after DNAT
Kubernetes Service
Two types of load-balancer
Client <----> LB <----> Server
Two connections
Client <----LB-----> Server
One connection.
Kubernetes
Conntrack caches the NAT result.
Packets skip the NAT table if a conntrack entry already exists for them.
For TCP packets
Kubernetes performs DNAT during the three-way handshake.
The handshake carries no application data
For UDP packets
No three-way handshake
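The effect of conntrack can be shown with a toy model in Python. It is a sketch under simplifying assumptions: the class `NatWithConntrack` is hypothetical, the connection is keyed by a plain 5-tuple, and the endpoint is picked uniformly at random in place of the real iptables rule chain.

```python
import random

class NatWithConntrack:
    """Toy model: the NAT decision runs once per flow, then it is cached."""

    def __init__(self, cluster_ip, endpoints, rng=None):
        self.cluster_ip = cluster_ip
        self.endpoints = list(endpoints)
        self.rng = rng or random.Random()
        self.conntrack = {}      # 5-tuple -> translated destination
        self.nat_lookups = 0     # how many times the "NAT table" was consulted

    def translate(self, proto, src_ip, src_port, dst_ip, dst_port):
        key = (proto, src_ip, src_port, dst_ip, dst_port)
        if key in self.conntrack:
            # Existing flow: reuse the cached result, the NAT rules are skipped.
            return self.conntrack[key]
        if dst_ip != self.cluster_ip:
            return dst_ip        # not addressed to the service, no DNAT
        self.nat_lookups += 1
        endpoint = self.rng.choice(self.endpoints)
        self.conntrack[key] = endpoint
        return endpoint

nat = NatWithConntrack("10.96.121.30", ["10.244.1.3", "10.244.2.32"])
first = nat.translate("tcp", "10.244.0.9", 40000, "10.96.121.30", 80)
later = nat.translate("tcp", "10.244.0.9", 40000, "10.96.121.30", 80)
print(first == later, nat.nat_lookups)
```

For TCP the first packet of a flow is the SYN, so the endpoint is fixed before any application data exists. For UDP the first packet already carries data, which is what leaves room for a content-based decision.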
(Demo) Kubernetes
Modify the statistic module to support UDP load-balancing.
Just for fun, to demonstrate what we can do from iptables.
Parameters (/proc)
ClusterIP (focus on a specific service)
Content, PodIP (forward to PodIP if the packet's data includes content)
PodIP list (to know where we are)
Match workflow (limited by Random and the iptables structure)
Roll back if it is a TCP packet.
Roll back if the destination IP != ClusterIP
Return true if (1) the data contains content and (2) PodIP equals the iptables rule's target.
Else, return false
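The match workflow can be sketched in Python as follows. This is not the demo's kernel code. It assumes that "roll back" means deferring to the module's original random decision, which is passed in here as the hypothetical callback `original_match`, and the packet is modelled as a plain dict.

```python
def l7_udp_match(packet, cluster_ip, content, content_pod_ip,
                 rule_pod_ip, original_match):
    """Decide whether the current rule (pointing at rule_pod_ip) matches.

    packet          dict with "proto", "dst_ip" and "data" (bytes)
    content         byte string to look for in the payload
    content_pod_ip  pod that should receive packets containing content
    rule_pod_ip     pod targeted by the rule being evaluated
    original_match  assumption: the unmodified random decision
    """
    if packet["proto"] == "tcp":
        return original_match()      # roll back: TCP keeps the old behaviour
    if packet["dst_ip"] != cluster_ip:
        return original_match()      # roll back: not the service we focus on
    # True only when the payload carries the content and this rule leads
    # to the pod configured for that content.
    return content in packet["data"] and content_pod_ip == rule_pod_ip

pkt = {"proto": "udp", "dst_ip": "10.96.121.30", "data": b"hello hwchiu"}
print(l7_udp_match(pkt, "10.96.121.30", b"hwchiu",
                   "10.244.1.3", "10.244.1.3", lambda: False))
```

Because iptables evaluates the endpoint rules one by one, the module has to answer for each rule whether it is the right one, which is why the PodIP list is needed to know which rule is currently being evaluated.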
One Question
If we send packets to the ClusterIP from a Kubernetes node, what happens?
UDP (Statistic + UDP)
How about ARP?