paul | 2 Jun 2009 17:32

[quagga-dev 6613] Re: fd leak in bgpd

On Mon, 20 Apr 2009, Steve Hill wrote:

> My previous patch suffered from a race condition, triggered if a 
> session with 0 prefixes had a hold timer expire; The peer would get 
> stuck in Clearing because the Clearing_Completed event was 
> inadvertently eaten by calling bgp_stop in Clearing.

Ah.

> This new patch just calls bgp_stop at the bottom of
> bgp_fsm_holdtime_expire, which seems to work better.

> I think there is still a an fd leak if TCP_connection_open_failed 
> fires during Established, but I can't convince myself that's 
> actually possible. Using bgp_stop instead of bgp_ignore as the 
> event handler for TCP_connection_open_failed would clear it up.

Hmm, or we stick with your original proposal but remove the 
problematic from bgp_stop (or split up bgp_stop). All transitions 
from Established have to go through Clearing:

Established -> (shutdown) -> Clearing -> (stop) -> Idle

The stop code path == shutdown + flush. shutdown should be 
idem-potent.

Transitions into Idle directly can call either stop or shutdown. 
Calling just shutdown might result in further events in Idle besides 
stop, but can be ignored - unneeded but harmless. That can happen 
with transition into Idle via bgp_stop_with_error, and could be fixed 
by splitting it up into 2 or by having an entry function into Idle - 
but that seems an over-optimisation.

The off-the-cuff, untested, not-ready-for-integration patch is 
attached, what do you think?

regards,
-- 
Paul Jakma	paul <at> clubi.ie	paul <at> jakma.org	Key ID: 64A2FF6A
Fortune:
A language that doesn't affect the way you think about programming is
not worth knowing.
diff --git a/bgpd/bgp_fsm.c b/bgpd/bgp_fsm.c
index d0db6aa..bd96139 100644
--- a/bgpd/bgp_fsm.c
+++ b/bgpd/bgp_fsm.c
 <at>  <at>  -426,17 +426,14  <at>  <at>  bgp_fsm_change_status (struct peer *peer, int status)
 		LOOKUP (bgp_status_msg, peer->status));
 }
 
-/* Administrative BGP peer stop event. */
-int
-bgp_stop (struct peer *peer)
+/* Shutdown and release resources */
+static int
+bgp_shutdown (struct peer *peer)
 {
   afi_t afi;
   safi_t safi;
   char orf_name[BUFSIZ];
 
-  /* Delete all existing events of the peer */
-  BGP_EVENT_FLUSH (peer);
-
   /* Increment Dropped count. */
   if (peer->status == Established)
     {
 <at>  <at>  -573,6 +570,20  <at>  <at>  bgp_stop (struct peer *peer)
   return 0;
 }
 
+/* Administrative BGP peer stop event. 
+ *
+ * Also used by bgp_packet to do its peer-swap hack
+ */
+int
+bgp_stop (struct peer *peer) {
+  bgp_shutdown (peer);
+  
+  /* Delete all existing events of the peer */
+  BGP_EVENT_FLUSH (peer);
+  
+  return 0;
+}
+
 /* BGP peer is stoped by the error. */
 static int
 bgp_stop_with_error (struct peer *peer)
 <at>  <at>  -584,7 +595,7  <at>  <at>  bgp_stop_with_error (struct peer *peer)
   if (peer->v_start >= (60 * 2))
     peer->v_start = (60 * 2);
 
-  bgp_stop (peer);
+  bgp_shutdown (peer);
 
   return 0;
 }
 <at>  <at>  -996,36 +1007,36  <at>  <at>  struct {
   {
     /* Established, */
     {bgp_ignore,               Established}, /* BGP_Start                    */
-    {bgp_stop,                    Clearing}, /* BGP_Stop                     */
-    {bgp_stop,                    Clearing}, /* TCP_connection_open          */
-    {bgp_stop,                    Clearing}, /* TCP_connection_closed        */
-    {bgp_ignore,                  Clearing}, /* TCP_connection_open_failed   */
-    {bgp_stop,                    Clearing}, /* TCP_fatal_error              */
-    {bgp_ignore,                  Clearing}, /* ConnectRetry_timer_expired   */
-    {bgp_fsm_holdtime_expire,     Clearing}, /* Hold_Timer_expired           */
+    {bgp_shutdown,             Clearing}, /* BGP_Stop                     */
+    {bgp_shutdown,             Clearing}, /* TCP_connection_open          */
+    {bgp_shutdown,             Clearing}, /* TCP_connection_closed        */
+    {bgp_shutdown,             Clearing}, /* TCP_connection_open_failed   */
+    {bgp_shutdown,             Clearing}, /* TCP_fatal_error              */
+    {bgp_shutdown,             Clearing}, /* ConnectRetry_timer_expired   */
+    {bgp_fsm_holdtime_expire,  Clearing}, /* Hold_Timer_expired           */
     {bgp_fsm_keepalive_expire, Established}, /* KeepAlive_timer_expired      */
-    {bgp_stop,                    Clearing}, /* Receive_OPEN_message         */
+    {bgp_shutdown,             Clearing}, /* Receive_OPEN_message         */
     {bgp_fsm_keepalive,        Established}, /* Receive_KEEPALIVE_message    */
     {bgp_fsm_update,           Established}, /* Receive_UPDATE_message       */
-    {bgp_stop_with_error,         Clearing}, /* Receive_NOTIFICATION_message */
-    {bgp_ignore,                      Idle}, /* Clearing_Completed           */
+    {bgp_stop_with_error,      Clearing}, /* Receive_NOTIFICATION_message */
+    {bgp_ignore,               Idle}, /* Clearing_Completed           */
   },
   {
     /* Clearing, */
-    {bgp_ignore,  Clearing},	/* BGP_Start                    */
-    {bgp_stop,    Clearing},	/* BGP_Stop                     */
-    {bgp_stop,    Clearing},	/* TCP_connection_open          */
-    {bgp_stop,    Clearing},	/* TCP_connection_closed        */
-    {bgp_stop,    Clearing},	/* TCP_connection_open_failed   */
-    {bgp_stop,    Clearing},	/* TCP_fatal_error              */
-    {bgp_ignore,  Clearing},	/* ConnectRetry_timer_expired   */
-    {bgp_ignore,  Clearing},	/* Hold_Timer_expired           */
-    {bgp_ignore,  Clearing},	/* KeepAlive_timer_expired      */
-    {bgp_ignore,  Clearing},	/* Receive_OPEN_message         */
-    {bgp_ignore,  Clearing},	/* Receive_KEEPALIVE_message    */
-    {bgp_ignore,  Clearing},	/* Receive_UPDATE_message       */
-    {bgp_ignore,  Clearing},	/* Receive_NOTIFICATION_message */
-    {bgp_ignore,  Idle    },	/* Clearing_Completed           */
+    {bgp_ignore,   Clearing},	/* BGP_Start                    */
+    {bgp_shutdown, Clearing},	/* BGP_Stop                     */
+    {bgp_shutdown, Clearing},	/* TCP_connection_open          */
+    {bgp_shutdown, Clearing},	/* TCP_connection_closed        */
+    {bgp_shutdown, Clearing},	/* TCP_connection_open_failed   */
+    {bgp_shutdown, Clearing},	/* TCP_fatal_error              */
+    {bgp_shutdown, Clearing},	/* ConnectRetry_timer_expired   */
+    {bgp_shutdown, Clearing},	/* Hold_Timer_expired           */
+    {bgp_shutdown, Clearing},	/* KeepAlive_timer_expired      */
+    {bgp_shutdown, Clearing},	/* Receive_OPEN_message         */
+    {bgp_shutdown, Clearing},	/* Receive_KEEPALIVE_message    */
+    {bgp_shutdown, Clearing},	/* Receive_UPDATE_message       */
+    {bgp_shutdown, Clearing},	/* Receive_NOTIFICATION_message */
+    {bgp_stop,     Idle    },	/* Clearing_Completed           */
   },
   {
     /* Deleted, */
diff --git a/bgpd/bgp_fsm.c b/bgpd/bgp_fsm.c
index d0db6aa..bd96139 100644
--- a/bgpd/bgp_fsm.c
+++ b/bgpd/bgp_fsm.c
 <at>  <at>  -426,17 +426,14  <at>  <at>  bgp_fsm_change_status (struct peer *peer, int status)
 		LOOKUP (bgp_status_msg, peer->status));
 }
 
-/* Administrative BGP peer stop event. */
-int
-bgp_stop (struct peer *peer)
+/* Shutdown and release resources */
+static int
+bgp_shutdown (struct peer *peer)
 {
   afi_t afi;
   safi_t safi;
   char orf_name[BUFSIZ];
 
-  /* Delete all existing events of the peer */
-  BGP_EVENT_FLUSH (peer);
-
   /* Increment Dropped count. */
   if (peer->status == Established)
     {
 <at>  <at>  -573,6 +570,20  <at>  <at>  bgp_stop (struct peer *peer)
   return 0;
 }
 
+/* Administrative BGP peer stop event. 
+ *
+ * Also used by bgp_packet to do its peer-swap hack
+ */
+int
+bgp_stop (struct peer *peer) {
+  bgp_shutdown (peer);
+  
+  /* Delete all existing events of the peer */
+  BGP_EVENT_FLUSH (peer);
+  
+  return 0;
+}
+
 /* BGP peer is stoped by the error. */
 static int
 bgp_stop_with_error (struct peer *peer)
 <at>  <at>  -584,7 +595,7  <at>  <at>  bgp_stop_with_error (struct peer *peer)
   if (peer->v_start >= (60 * 2))
     peer->v_start = (60 * 2);
 
-  bgp_stop (peer);
+  bgp_shutdown (peer);
 
   return 0;
 }
 <at>  <at>  -996,36 +1007,36  <at>  <at>  struct {
   {
     /* Established, */
     {bgp_ignore,               Established}, /* BGP_Start                    */
-    {bgp_stop,                    Clearing}, /* BGP_Stop                     */
-    {bgp_stop,                    Clearing}, /* TCP_connection_open          */
-    {bgp_stop,                    Clearing}, /* TCP_connection_closed        */
-    {bgp_ignore,                  Clearing}, /* TCP_connection_open_failed   */
-    {bgp_stop,                    Clearing}, /* TCP_fatal_error              */
-    {bgp_ignore,                  Clearing}, /* ConnectRetry_timer_expired   */
-    {bgp_fsm_holdtime_expire,     Clearing}, /* Hold_Timer_expired           */
+    {bgp_shutdown,             Clearing}, /* BGP_Stop                     */
+    {bgp_shutdown,             Clearing}, /* TCP_connection_open          */
+    {bgp_shutdown,             Clearing}, /* TCP_connection_closed        */
+    {bgp_shutdown,             Clearing}, /* TCP_connection_open_failed   */
+    {bgp_shutdown,             Clearing}, /* TCP_fatal_error              */
+    {bgp_shutdown,             Clearing}, /* ConnectRetry_timer_expired   */
+    {bgp_fsm_holdtime_expire,  Clearing}, /* Hold_Timer_expired           */
     {bgp_fsm_keepalive_expire, Established}, /* KeepAlive_timer_expired      */
-    {bgp_stop,                    Clearing}, /* Receive_OPEN_message         */
+    {bgp_shutdown,             Clearing}, /* Receive_OPEN_message         */
     {bgp_fsm_keepalive,        Established}, /* Receive_KEEPALIVE_message    */
     {bgp_fsm_update,           Established}, /* Receive_UPDATE_message       */
-    {bgp_stop_with_error,         Clearing}, /* Receive_NOTIFICATION_message */
-    {bgp_ignore,                      Idle}, /* Clearing_Completed           */
+    {bgp_stop_with_error,      Clearing}, /* Receive_NOTIFICATION_message */
+    {bgp_ignore,               Idle}, /* Clearing_Completed           */
   },
   {
     /* Clearing, */
-    {bgp_ignore,  Clearing},	/* BGP_Start                    */
-    {bgp_stop,    Clearing},	/* BGP_Stop                     */
-    {bgp_stop,    Clearing},	/* TCP_connection_open          */
-    {bgp_stop,    Clearing},	/* TCP_connection_closed        */
-    {bgp_stop,    Clearing},	/* TCP_connection_open_failed   */
-    {bgp_stop,    Clearing},	/* TCP_fatal_error              */
-    {bgp_ignore,  Clearing},	/* ConnectRetry_timer_expired   */
-    {bgp_ignore,  Clearing},	/* Hold_Timer_expired           */
-    {bgp_ignore,  Clearing},	/* KeepAlive_timer_expired      */
-    {bgp_ignore,  Clearing},	/* Receive_OPEN_message         */
-    {bgp_ignore,  Clearing},	/* Receive_KEEPALIVE_message    */
-    {bgp_ignore,  Clearing},	/* Receive_UPDATE_message       */
-    {bgp_ignore,  Clearing},	/* Receive_NOTIFICATION_message */
-    {bgp_ignore,  Idle    },	/* Clearing_Completed           */
+    {bgp_ignore,   Clearing},	/* BGP_Start                    */
+    {bgp_shutdown, Clearing},	/* BGP_Stop                     */
+    {bgp_shutdown, Clearing},	/* TCP_connection_open          */
+    {bgp_shutdown, Clearing},	/* TCP_connection_closed        */
+    {bgp_shutdown, Clearing},	/* TCP_connection_open_failed   */
+    {bgp_shutdown, Clearing},	/* TCP_fatal_error              */
+    {bgp_shutdown, Clearing},	/* ConnectRetry_timer_expired   */
+    {bgp_shutdown, Clearing},	/* Hold_Timer_expired           */
+    {bgp_shutdown, Clearing},	/* KeepAlive_timer_expired      */
+    {bgp_shutdown, Clearing},	/* Receive_OPEN_message         */
+    {bgp_shutdown, Clearing},	/* Receive_KEEPALIVE_message    */
+    {bgp_shutdown, Clearing},	/* Receive_UPDATE_message       */
+    {bgp_shutdown, Clearing},	/* Receive_NOTIFICATION_message */
+    {bgp_stop,     Idle    },	/* Clearing_Completed           */
   },
   {
     /* Deleted, */

Gmane