0x01 Preparation

Three machines, configured as follows:

Hostname    IP              Role
master      192.168.1.60    control-plane node
worker01    192.168.1.61    worker node
worker02    192.168.1.62    worker node

0x02 Basic System Setup

Run on ALL nodes!

Install required packages

apt update
apt full-upgrade -y
apt install sudo apt-transport-https ca-certificates curl gnupg ntpdate ipset ipvsadm -y

Set the time zone & sync the time

# The cron job is stored in /var/spool/cron/crontabs/root
timedatectl set-timezone Asia/Shanghai
CRON_JOB="0 3 * * * /usr/sbin/ntpdate ntp.aliyun.com > /dev/null 2>&1"
(crontab -l 2>/dev/null | grep -Fxq "$CRON_JOB") || (crontab -l 2>/dev/null; echo "$CRON_JOB") | crontab -
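The `grep -Fxq` guard above is what makes the cron entry idempotent. The same pattern can be exercised against a scratch file (a stand-in for the real crontab) to confirm that repeated runs never duplicate the line:

```shell
# Simulate the idempotent append without touching the real crontab
FILE=$(mktemp)   # scratch file standing in for `crontab -l` output
JOB="0 3 * * * /usr/sbin/ntpdate ntp.aliyun.com > /dev/null 2>&1"

add_job() {
  # -F fixed string, -x whole-line match, -q quiet: append only when absent
  grep -Fxq "$JOB" "$FILE" || echo "$JOB" >> "$FILE"
}

add_job
add_job            # second call is a no-op thanks to the guard
wc -l < "$FILE"    # → 1
rm -f "$FILE"
```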

Set the hostname

read -p "Enter hostname: " hostname
hostnamectl set-hostname "$hostname"

Disable swap

# Comment out any line containing swap in /etc/fstab
swapoff -a
nano /etc/fstab
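After editing fstab it is worth confirming nothing is still swapping. A small check, assuming the standard `/proc/swaps` layout (one header line, then one line per active swap device):

```shell
# Count active swap devices: /proc/swaps has one header line
active=$(( $(wc -l < /proc/swaps) - 1 ))
if [ "$active" -eq 0 ]; then
  echo "swap disabled"
else
  echo "swap still active: $active device(s)"
fi
```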

Edit hosts

# Append the cluster entries to /etc/hosts
cat >> /etc/hosts << EOF
192.168.1.60 master
192.168.1.61 worker01
192.168.1.62 worker02
EOF

Network settings

# On Debian, 99-sysctl.conf is a symlink to /etc/sysctl.conf; remove it to avoid conflicting settings
rm -f /etc/sysctl.d/99-sysctl.conf
cat > /etc/sysctl.d/k8s.conf << EOF
# Do not cache metrics from closed TCP connections, so stale history cannot slow new ones
net.ipv4.tcp_no_metrics_save=1
# Disable Explicit Congestion Notification (ECN) to avoid problems with devices that do not support it
net.ipv4.tcp_ecn=0
# Disable Forward RTO-Recovery (F-RTO) to avoid packet-loss issues with incompatible devices
net.ipv4.tcp_frto=0
# Disable TCP path MTU probing to avoid transfer problems when the path MTU changes
net.ipv4.tcp_mtu_probing=0
# Disable the RFC 1337 TIME-WAIT protection (usually unnecessary)
net.ipv4.tcp_rfc1337=0
# Enable TCP Selective Acknowledgment (SACK) for better throughput on high-latency links
net.ipv4.tcp_sack=1
# Enable Forward Acknowledgment (FACK), which builds on SACK to speed up loss recovery
net.ipv4.tcp_fack=1
# Enable TCP window scaling so window sizes can grow beyond 64 KB for faster transfers
net.ipv4.tcp_window_scaling=1
# Ratio of the receive buffer reserved for bookkeeping overhead (window scaling factor)
net.ipv4.tcp_adv_win_scale=1
# Enable automatic tuning of the TCP receive buffer under load
net.ipv4.tcp_moderate_rcvbuf=1
# Maximum system receive buffer size in bytes, for higher receive throughput
net.core.rmem_max=33554432
# Maximum system send buffer size in bytes, for higher send throughput
net.core.wmem_max=33554432
# TCP receive buffer sizes (min, default, max)
net.ipv4.tcp_rmem=4096 87380 33554432
# TCP send buffer sizes (min, default, max)
net.ipv4.tcp_wmem=4096 16384 33554432
# Minimum UDP receive buffer size, for more stable UDP transfers
net.ipv4.udp_rmem_min=8192
# Minimum UDP send buffer size, for more stable UDP transfers
net.ipv4.udp_wmem_min=8192
# Use FQ (Fair Queue) as the default queueing discipline
net.core.default_qdisc=fq
# Use BBR congestion control for better throughput and latency
net.ipv4.tcp_congestion_control=bbr
# Allow routing of loopback (127/8) addresses, needed by some kube-proxy setups
net.ipv4.conf.all.route_localnet=1
# Enable IP forwarding so packets can move between interfaces (required by Kubernetes)
net.ipv4.ip_forward=1
# Enable forwarding on all existing interfaces
net.ipv4.conf.all.forwarding=1
# Enable forwarding on newly created (default) interfaces
net.ipv4.conf.default.forwarding=1
# Pass bridged IPv6 traffic through ip6tables
net.bridge.bridge-nf-call-ip6tables=1
# Pass bridged IPv4 traffic through iptables (required by Kubernetes networking)
net.bridge.bridge-nf-call-iptables=1
EOF

Add the kernel-module config file & load the modules

cat > /etc/modules-load.d/k8s.conf << EOF
# overlayfs, used as a container storage driver
overlay
# Enables netfilter on bridge devices (needed for the bridge-nf sysctls above)
br_netfilter
# IPVS core module: the in-kernel IP Virtual Server
ip_vs
# IPVS round-robin scheduler
ip_vs_rr
# IPVS weighted round-robin scheduler
ip_vs_wrr
# IPVS source-hashing scheduler
ip_vs_sh
# Connection tracking, used to track and manage network connections
nf_conntrack
EOF

modprobe overlay
modprobe br_netfilter
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack

# Apply everything under /etc/sysctl.d (including k8s.conf above)
sysctl --system
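Once the sysctl settings are applied and the modules are loaded, the live values can be spot-checked straight from `/proc/sys`; note the bridge key only exists after `br_netfilter` is loaded:

```shell
# Spot-check a few of the knobs configured above
for key in net/ipv4/ip_forward net/ipv4/tcp_congestion_control net/bridge/bridge-nf-call-iptables; do
  f="/proc/sys/$key"
  name=$(echo "$key" | tr / .)
  if [ -r "$f" ]; then
    printf '%s = %s\n' "$name" "$(cat "$f")"
  else
    echo "$name: not readable (is br_netfilter loaded?)"
  fi
done
```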

0x03 Containerd

Download containerd

wget https://gh-proxy.com/https://github.com/containerd/containerd/releases/download/v1.7.25/cri-containerd-1.7.25-linux-amd64.tar.gz
tar -xzvf cri-containerd-1.7.25-linux-amd64.tar.gz -C /
rm -rf cri-containerd-1.7.25-linux-amd64.tar.gz

Generate the config & register the service

Edit the following entries in /etc/containerd/config.toml:

  • SystemdCgroup = true
  • config_path = "/etc/containerd/certs.d"
  • sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.10"
# Configure a registry mirror for faster pulls in China
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml <<EOF
server = "https://docker.1ms.run"

[host."https://docker.1ms.run"]
  capabilities = ["pull", "resolve", "push"]
EOF

# Write the systemd service unit
cat > /etc/systemd/system/containerd.service <<EOF
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target

EOF

# Generate the default config, then edit the entries listed above
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
nano /etc/containerd/config.toml


# Rollback commands, if the service needs to be removed:
# systemctl stop containerd
# systemctl disable containerd
# systemctl daemon-reload
# rm /etc/systemd/system/containerd.service
systemctl daemon-reload  # reload systemd unit files
systemctl start containerd  # start the containerd service
systemctl enable containerd  # enable start on boot
systemctl status containerd  # check the service status
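A quick sanity check that the three entries from the list above actually landed in the generated config (a simple grep sketch; it degrades gracefully if the file is missing):

```shell
# Show the lines for the three entries edited above
cfg=/etc/containerd/config.toml
if [ -f "$cfg" ]; then
  grep -nE 'SystemdCgroup|config_path|sandbox_image' "$cfg"
else
  echo "config not found: $cfg"
fi
```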

Installation complete

0x04 Cluster Prerequisites

Kubernetes APT key

# The keyrings directory may not exist on older Debian releases
mkdir -p /etc/apt/keyrings

# Official source
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

# Tsinghua mirror (inside China)
curl -fsSL https://mirrors.tuna.tsinghua.edu.cn/kubernetes/core:/stable:/v1.32/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

APT repository

# Official repository (needs external network access)
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list

# Tsinghua mirror
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.tuna.tsinghua.edu.cn/kubernetes/core:/stable:/v1.32/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list

Install the three components

apt update
apt install kubectl kubelet kubeadm -y 
apt-mark hold kubelet kubeadm kubectl

cat > /etc/default/kubelet << EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
EOF

systemctl daemon-reload
systemctl enable kubelet
systemctl status kubelet
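`apt-mark hold` above pins the three packages so a routine `apt full-upgrade` cannot move them to an incompatible version. The hold list can be reviewed at any time (guarded so the sketch degrades on non-APT systems):

```shell
# List packages currently on hold; kubelet, kubeadm and kubectl should appear
if command -v apt-mark >/dev/null 2>&1; then
  apt-mark showhold
else
  echo "apt-mark not available on this system"
fi
```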

0x05 Initialize the Cluster

For the master node only

Generate the kubeadm config

Edit the following entries in kubeadm-config.yaml:

  • advertiseAddress: 192.168.1.60
  • name: master
  • podSubnet: 10.244.0.0/16
  • imageRepository: registry.aliyuncs.com/google_containers
# Generate the default config, then edit the entries above
kubeadm config print init-defaults > kubeadm-config.yaml
nano kubeadm-config.yaml
# Pre-pull the control-plane images
kubeadm config images pull --config=kubeadm-config.yaml

Initialize

kubeadm init --config=kubeadm-config.yaml --upload-certs --v=5
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Output like the following means initialization succeeded!

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

Run the command below on each worker node to join it to the cluster

kubeadm join 192.168.1.60:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:99f6f9bb95fea2f21b8467640faf85b8dd90be9e563f94228c6e8c07624b4472
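The token embedded in the join command expires (24 hours by default). If it has lapsed, a fresh, complete join command can be printed on the master; the guard below just lets the snippet degrade on a machine without kubeadm:

```shell
# Regenerate a full join command (new token + current CA cert hash)
if command -v kubeadm >/dev/null 2>&1; then
  kubeadm token create --print-join-command
else
  echo "kubeadm not installed on this machine"
fi
```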

Configure the Calico network

Edit the network section of calico.yaml

# Uncomment the two lines below; the value must match podSubnet above
# - name: CALICO_IPV4POOL_CIDR
#   value: "10.244.0.0/16"
wget --no-check-certificate https://raw.gitmirror.com/projectcalico/calico/v3.29.1/manifests/calico.yaml
nano calico.yaml

Pull the Calico images

# All nodes can pull at the same time, which saves a little time
crictl pull docker.io/calico/cni:v3.29.1
crictl pull docker.io/calico/node:v3.29.1
crictl pull docker.io/calico/kube-controllers:v3.29.1

Deploy Calico

# Rollback, if needed:
# kubectl delete -f calico.yaml && rm -rf /etc/cni/net.d && systemctl restart kubelet
kubectl create -f calico.yaml && kubectl get pods -A
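The cluster is usable once the `calico-node` DaemonSet (the name used by the manifest) has fully rolled out and every node reports Ready. One way to wait for that, guarded as above:

```shell
# Wait up to 5 minutes for Calico, then list node status
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n kube-system rollout status ds/calico-node --timeout=300s
  kubectl get nodes -o wide   # all nodes should now be Ready
else
  echo "kubectl not installed on this machine"
fi
```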

Done!