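Lossless changes to a gobgp service: the graceful restart feature
Scenario
Our BGP gateway advertises BGP routes to its peers. When we need to release a new feature, stopping the gateway and starting the new version causes the advertised routes to be withdrawn briefly and then advertised again, which produces network jitter.
Expected behavior: lossless change
While the BGP gateway is being changed, we want the routes it has advertised to be kept by the peer for a period of time after the gateway goes down. Only if the gateway still has not come back when that period ends should the peer network device withdraw the routes.
The graceful restart feature
- When the BGP service exits abnormally, graceful restart kicks in and the routes are not withdrawn immediately
- When the BGP service is terminated by SIGTERM, the routes are withdrawn immediately
Explanation:
With this configuration, if the graceful restart capability has been negotiated with the peer, the peer starts the graceful restart helper procedure when gobgpd dies involuntarily or when SIGINT or SIGKILL is sent to gobgpd. Note that when SIGTERM is sent to gobgpd, peers that negotiated graceful restart do not start the helper procedure, because gobgpd sends a NOTIFICATION message to those peers before it dies. (In the demo below, stopping gobgpd with Ctrl+C, i.e. SIGINT, also produced an orderly shutdown with a NOTIFICATION, so on that build SIGINT behaved like SIGTERM.)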
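The rule described above can be sketched as a small decision function. This is a simplified model of what the receiving peer (the "helper") does when a session goes down, not GoBGP code; the function name `helperAction` and its return strings are made up for illustration, and it ignores extensions such as NOTIFICATION-aware graceful restart.

```go
package main

import "fmt"

// helperAction is a hypothetical, simplified model of the helper side.
// grNegotiated: graceful restart capability was negotiated on the session.
// notificationReceived: the neighbor sent a NOTIFICATION before going away.
func helperAction(grNegotiated, notificationReceived bool) string {
	switch {
	case notificationReceived:
		// Orderly shutdown: the neighbor announced it is leaving,
		// so its routes are withdrawn right away.
		return "withdraw"
	case grNegotiated:
		// Session dropped without a NOTIFICATION: keep the routes as
		// stale until the neighbor returns or the restart timer expires.
		return "retain-stale"
	default:
		// No graceful restart negotiated: ordinary BGP behavior.
		return "withdraw"
	}
}

func main() {
	fmt.Println("clean stop with NOTIFICATION:", helperAction(true, true))
	fmt.Println("kill -9 or crash:", helperAction(true, false))
	fmt.Println("graceful restart not negotiated:", helperAction(false, false))
}
```

The two cases exercised in the demo map to the first two calls: an orderly stop leads to an immediate withdrawal, while a hard kill leaves the routes in place for the restart time.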
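Graceful restart demo: gobgp graceful restart example
Scenario:
Node 192.168.128.132 establishes a BGP session with node 192.168.128.134, and 132 advertises a route to 134. We then simulate 132 exiting so that 134 retains the route, which is the graceful restart behavior.
BGP config file on node 192.168.128.132: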
[global.config]
  as = 65001
  router-id = "192.168.128.132"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "192.168.128.134"
    peer-as = 65001
  [neighbors.graceful-restart.config]
    enabled = true
    restart-time = 30
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
      afi-safi-name = "ipv4-unicast"
    [neighbors.afi-safis.mp-graceful-restart.config]
      enabled = true
    [neighbors.afi-safis.long-lived-graceful-restart.config]
      enabled = true
      restart-time = 30
BGP config file on node 192.168.128.134:
[global.config]
  as = 65001
  router-id = "192.168.128.134"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "192.168.128.132"
    peer-as = 65001
  [neighbors.graceful-restart.config]
    enabled = true
    long-lived-enabled = true
    restart-time = 30
    notification-enabled = true
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
      afi-safi-name = "ipv4-unicast"
    [neighbors.afi-safis.mp-graceful-restart.config]
      enabled = true
    [neighbors.afi-safis.long-lived-graceful-restart.config]
      enabled = true
      restart-time = 30
Start both BGP servers:
sudo ./gobgpd -f bgp-graceful.conf -l debug -p -r
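Advertise a route on 132:
./gobgp global rib -a ipv4 add 192.168.3.0/24 origin igp
The gobgpd debug log on 132 shows the session to 134 coming up and, at 0012, the route being sent: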
Key=192.168.128.134 Topic=config
INFO[0000] Add a peer configuration Key=192.168.128.134 Topic=Peer
DEBU[0000] IdleHoldTimer expired Duration=0 Key=192.168.128.134 Topic=Peer
DEBU[0000] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired
DEBU[0005] try to connect Key=192.168.128.134 Topic=Peer
DEBU[0005] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_OPENSENT old=BGP_FSM_ACTIVE reason=new-connection
DEBU[0005] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_OPENCONFIRM old=BGP_FSM_OPENSENT reason=open-msg-received
INFO[0005] Peer Up Key=192.168.128.134 State=BGP_FSM_OPENCONFIRM Topic=Peer
DEBU[0005] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_ESTABLISHED old=BGP_FSM_OPENCONFIRM reason=open-msg-negotiated
DEBU[0005] Now syncing, suppress sending updates. start deferral timer Duration=360 Key=192.168.128.134 Topic=Server
DEBU[0005] received update Key=192.168.128.134 Topic=Peer attributes="[]" nlri="[]" withdrawals="[]"
DEBU[0005] EOR received AddressFamily=ipv4-unicast Key=192.168.128.134 Topic=Peer
INFO[0005] sync finished Topic=Server
DEBU[0005] sent update Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer attributes="[]" nlri="[]" withdrawals="[]"
DEBU[0012] create Destination Nlri=192.168.3.0/24 Topic=Table
DEBU[0012] sent update Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer attributes="[{Origin: i} {Nexthop: 192.168.128.132} {LocalPref: 100}]" nlri="[192.168.3.0/24]" withdrawals="[]"
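On 134, check the routes learned from neighbor 132:
luzejia@luzejia-virtual-machine:~/Desktop$ ./gobgp neighbor 192.168.128.132 adj-in
   ID  Network           Next Hop          AS_PATH  Age       Attrs
   0   192.168.3.0/24    192.168.128.132            00:00:02  [{Origin: i} {LocalPref: 100}]
Stop the BGP server on 132 with Ctrl+C. The log shows it cleaning up before it exits. That cleanup tells peer 134 that this is an orderly exit, so 134 does not need to start graceful restart and can withdraw the route directly.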
sudo ./gobgpd -f bgp-graceful.conf -l debug -p -r
INFO[0000] gobgpd started
INFO[0000] Finished reading the config file Topic=Config
INFO[0000] Add Peer Key=192.168.128.134 Topic=config
INFO[0000] Add a peer configuration Key=192.168.128.134 Topic=Peer
DEBU[0000] IdleHoldTimer expired Duration=0 Key=192.168.128.134 Topic=Peer
DEBU[0000] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired
DEBU[0001] Accepted a new passive connection Key=192.168.128.134 Topic=Peer
DEBU[0001] stop connect loop Key=192.168.128.134 Topic=Peer
DEBU[0001] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_OPENSENT old=BGP_FSM_ACTIVE reason=new-connection
DEBU[0001] peer has restarted, skipping wait for EOR Key=192.168.128.134 State=BGP_FSM_OPENSENT Topic=Peer
DEBU[0001] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_OPENCONFIRM old=BGP_FSM_OPENSENT reason=open-msg-received
INFO[0001] Peer Up Key=192.168.128.134 State=BGP_FSM_OPENCONFIRM Topic=Peer
DEBU[0001] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_ESTABLISHED old=BGP_FSM_OPENCONFIRM reason=open-msg-negotiated
INFO[0001] sync finished Key=192.168.128.134 Topic=Server
DEBU[0001] received update Key=192.168.128.134 Topic=Peer attributes="[]" nlri="[]" withdrawals="[]"
DEBU[0001] EOR received AddressFamily=ipv4-unicast Key=192.168.128.134 Topic=Peer
DEBU[0001] sent update Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer attributes="[]" nlri="[]" withdrawals="[]"
DEBU[0008] create Destination Nlri=192.168.3.0/24 Topic=Table
DEBU[0008] sent update Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer attributes="[{Origin: i} {Nexthop: 192.168.128.132} {LocalPref: 100}]" nlri="[192.168.3.0/24]" withdrawals="[]"
^CINFO[0021] stopping gobgpd server
INFO[0021] Delete a peer configuration Key=192.168.128.134 Topic=Peer
INFO[0021] Peer Down Key=192.168.128.134 Reason=dying State=BGP_FSM_ESTABLISHED Topic=Peer
DEBU[0021] freed fsm.h Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer
Checking the routes on 134 shows that the route has been withdrawn:
luzejia@luzejia-virtual-machine:~/Desktop$ ./gobgp neighbor 192.168.128.132 adj-in
neighbor 192.168.128.132's BGP session is not established
The log on 134 shows why: it received a Cease NOTIFICATION from 132, recorded a peer down, and then withdrew the route.
INFO[0028] Peer Down Key=192.168.128.132 Reason="notification-received code 6(cease) subcode 3(peer deconfigured)" State=BGP_FSM_ESTABLISHED Topic=Peer
DEBU[0028] state changed Key=192.168.128.132 Topic=Peer new=BGP_FSM_IDLE old=BGP_FSM_ESTABLISHED reason="notification-received code 6(cease) subcode 3(peer deconfigured)"
DEBU[0028] Removing withdrawals Key=192.168.3.0/24 Topic=Table
DEBU[0033] IdleHoldTimer expired Duration=5 Key=192.168.128.132 Topic=Peer
DEBU[0033] state changed Key=192.168.128.132 Topic=Peer new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired
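The reason string in the log above, "code 6(cease) subcode 3(peer deconfigured)", is a BGP NOTIFICATION error code and subcode. As a reading aid, here is a small sketch that decodes Cease subcodes using the values from RFC 4486. The function name `describeNotification` is hypothetical and this is not how GoBGP formats its log messages.

```go
package main

import "fmt"

// ceaseCode is the BGP NOTIFICATION error code for Cease.
const ceaseCode = 6

// ceaseSubcodes lists the Cease subcodes defined in RFC 4486.
var ceaseSubcodes = map[uint8]string{
	1: "Maximum Number of Prefixes Reached",
	2: "Administrative Shutdown",
	3: "Peer De-configured",
	4: "Administrative Reset",
	5: "Connection Rejected",
	6: "Other Configuration Change",
	7: "Connection Collision Resolution",
	8: "Out of Resources",
}

// describeNotification returns a readable description of a NOTIFICATION.
// Only the Cease code is decoded; other codes are reported numerically.
func describeNotification(code, subcode uint8) string {
	if code != ceaseCode {
		return fmt.Sprintf("code %d subcode %d (not a Cease)", code, subcode)
	}
	name, ok := ceaseSubcodes[subcode]
	if !ok {
		name = "unknown subcode"
	}
	return "Cease: " + name
}

func main() {
	// Values taken from the Peer Down log line on 134.
	fmt.Println(describeNotification(6, 3)) // prints "Cease: Peer De-configured"
}
```

A Cease with "Peer De-configured" is consistent with the "Delete a peer configuration" line that 132 logged while shutting down.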
Now kill the BGP server on 132 with kill -9 instead:
luzejia@luzejia-virtual-machine:~/bgp$ sudo ./gobgpd -f bgp-graceful.conf -l debug -p -r
INFO[0000] gobgpd started
INFO[0000] Finished reading the config file Topic=Config
INFO[0000] Add Peer Key=192.168.128.134 Topic=config
INFO[0000] Add a peer configuration Key=192.168.128.134 Topic=Peer
DEBU[0000] IdleHoldTimer expired Duration=0 Key=192.168.128.134 Topic=Peer
DEBU[0000] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired
DEBU[0005] try to connect Key=192.168.128.134 Topic=Peer
DEBU[0005] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_OPENSENT old=BGP_FSM_ACTIVE reason=new-connection
DEBU[0005] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_OPENCONFIRM old=BGP_FSM_OPENSENT reason=open-msg-received
INFO[0005] Peer Up Key=192.168.128.134 State=BGP_FSM_OPENCONFIRM Topic=Peer
DEBU[0005] state changed Key=192.168.128.134 Topic=Peer new=BGP_FSM_ESTABLISHED old=BGP_FSM_OPENCONFIRM reason=open-msg-negotiated
DEBU[0005] Now syncing, suppress sending updates. start deferral timer Duration=360 Key=192.168.128.134 Topic=Server
DEBU[0005] received update Key=192.168.128.134 Topic=Peer attributes="[]" nlri="[]" withdrawals="[]"
DEBU[0005] EOR received AddressFamily=ipv4-unicast Key=192.168.128.134 Topic=Peer
INFO[0005] sync finished Topic=Server
DEBU[0005] sent update Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer attributes="[]" nlri="[]" withdrawals="[]"
DEBU[0012] create Destination Nlri=192.168.3.0/24 Topic=Table
DEBU[0012] sent update Key=192.168.128.134 State=BGP_FSM_ESTABLISHED Topic=Peer attributes="[{Origin: i} {Nexthop: 192.168.128.132} {LocalPref: 100}]" nlri="[192.168.3.0/24]" withdrawals="[]"
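Killed
On 134 the route from 132 is still present and carries the S flag, meaning it is being retained as stale. This shows that graceful restart has started: the route is not withdrawn for now while 134 waits for the peer to restart.
luzejia@luzejia-virtual-machine:~/Desktop$ ./gobgp neighbor 192.168.128.132 adj-in
   ID  Network           Next Hop          AS_PATH  Age       Attrs
S  0   192.168.3.0/24    192.168.128.132            00:00:21  [{Origin: i} {LocalPref: 100}]
The log on 134 shows that it recognized the peer as being in graceful restart and did not withdraw the route immediately. Once the restart timer expired without the peer coming back, it withdrew the route after all: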
DEBU[0053] From same AS, ignore Key=192.168.128.132 Path="{ 192.168.3.0/24 | src: { 192.168.128.132 | as: 65001, id: 192.168.128.132 }, nh: 192.168.128.132 }" Topic=Peer
INFO[0071] peer graceful restart Key=192.168.128.132 State=BGP_FSM_ESTABLISHED Topic=Peer
INFO[0071] Peer Down Key=192.168.128.132 Reason=graceful-restart State=BGP_FSM_ESTABLISHED Topic=Peer
DEBU[0071] state changed Key=192.168.128.132 Topic=Peer new=BGP_FSM_IDLE old=BGP_FSM_ESTABLISHED reason=graceful-restart
DEBU[0071] Implicit withdrawal of old path, since we have learned new path from the same peer Key=192.168.3.0/24 Path="{ 192.168.3.0/24 | src: { 192.168.128.132 | as: 65001, id: 192.168.128.132 }, nh: 192.168.128.132 }" Topic=Table
DEBU[0076] IdleHoldTimer expired Duration=5 Key=192.168.128.132 Topic=Peer
DEBU[0076] state changed Key=192.168.128.132 Topic=Peer new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired
DEBU[0085] try to connect Key=192.168.128.132 Topic=Peer
DEBU[0085] failed to connect Error="dial tcp 0.0.0.0:0->192.168.128.132:179: connect: connection refused" Key=192.168.128.132 Topic=Peer
WARN[0101] graceful restart timer expired Key=192.168.128.132 State=BGP_FSM_ACTIVE Topic=Peer
DEBU[0101] stop connect loop Key=192.168.128.132 Topic=Peer
DEBU[0101] state changed Key=192.168.128.132 Topic=Peer new=BGP_FSM_IDLE old=BGP_FSM_ACTIVE reason=restart-timer-expired
DEBU[0101] Removing withdrawals Key=192.168.3.0/24 Topic=Table
DEBU[0106] IdleHoldTimer expired Duration=5 Key=192.168.128.132 Topic=Peer
DEBU[0106] state changed Key=192.168.128.132 Topic=Peer new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired
DEBU[0115] try to connect Key=192.168.128.132 Topic=Peer
DEBU[0115] failed to connect Error="dial tcp 0.0.0.0:0->192.168.128.132:179: connect: connection refused" Key=192.168.128.132 Topic=Peer
After the graceful restart timeout has passed, the route has been withdrawn as expected:
luzejia@luzejia-virtual-machine:~/Desktop$ ./gobgp neighbor 192.168.128.132 adj-in
neighbor 192.168.128.132's BGP session is not established
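Summary:
- When the BGP service exits abnormally, graceful restart kicks in and the routes are not withdrawn immediately
- When the BGP service is terminated by SIGTERM, the routes are withdrawn immediately
Note:
- The immediate withdrawal on SIGTERM is something you have to implement yourself: catch SIGTERM and then call the gobgp server's stop interface. The stop call is what tells the peer that this is an orderly shutdown, so the peer knows it does not need to start graceful restart to retain the routes and can withdraw them directly
An example of catching SIGTERM and handling it, based on the gobgpd source: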
package main

import (
	"fmt"
	"io"
	"net/http"
	_ "net/http/pprof"
	"os"
	"os/signal"
	"runtime"
	"syscall"

	"github.com/coreos/go-systemd/v22/daemon"
	"github.com/jessevdk/go-flags"
	"github.com/kr/pretty"
	"github.com/sirupsen/logrus"
	"golang.org/x/net/context"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"

	"github.com/osrg/gobgp/v3/internal/pkg/version"
	"github.com/osrg/gobgp/v3/pkg/config"
	"github.com/osrg/gobgp/v3/pkg/server"
)

func main() {
	// Deliver SIGTERM and SIGINT to sigCh instead of letting them kill the process.
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGINT)
	......
	logger.Info("gobgpd started")
	bgpServer := server.NewBgpServer(server.GrpcListenAddress(opts.GrpcHosts), server.GrpcOption(grpcOpts), server.LoggerOption(&builtinLogger{logger: logger}))
	go bgpServer.Serve()

	// The SIGHUP (reload) branch below only runs if SIGHUP is also registered
	// with signal.Notify, which is not shown in this excerpt.
	for sig := range sigCh {
		// Any signal other than SIGHUP stops the server cleanly and exits.
		if sig != syscall.SIGHUP {
			stopServer(bgpServer, opts.UseSdNotify)
			return
		}

		// SIGHUP: reload the config file and apply the changes.
		logger.WithFields(logrus.Fields{
			"Topic": "Config",
		}).Info("Reload the config file")
		newConfig, err := config.ReadConfigFile(opts.ConfigFile, opts.ConfigType)
		if err != nil {
			logger.WithFields(logrus.Fields{
				"Topic": "Config",
				"Error": err,
			}).Warningf("Can't read config file %s", opts.ConfigFile)
			continue
		}
		currentConfig, err = config.UpdateConfig(context.Background(), bgpServer, currentConfig, newConfig)
		if err != nil {
			logrus.WithFields(logrus.Fields{
				"Topic": "Config",
				"Error": err,
			}).Warningf("Failed to update config %s", opts.ConfigFile)
			continue
		}
	}
}

// stopServer stops the BGP server, which is the step that notifies peers of
// an orderly shutdown, and optionally reports the stop to systemd.
func stopServer(bgpServer *server.BgpServer, useSdNotify bool) {
	logger.Info("stopping gobgpd server")
	bgpServer.Stop()
	if useSdNotify {
		daemon.SdNotify(false, daemon.SdNotifyStopping)
	}
}