Strange OSPF Issue - bit urgent.....
Hi All, out of the blue tonight the loopback addresses on both our core switches stopped responding. The switches are still passing data, thank god, but there is clearly an OSPF issue between these Juniper EX series switches and my RouterBOARD 1100AHx2s.
We run a typical core design with two switches and two routers, plus additional border gateways, in a standard redundant design. Tonight, all of a sudden, core router 1 started complaining about the switch neighbours going from ExStart to 2-Way, and I am not getting routes from the switch it's complaining about, nor from the other switch.
I have done some research and it seems that the OSPF error I am seeing is typically related to an MTU mismatch, so I did some checking. On core router 2 my L3 MTU is 1500 and L2 MTU is 9192, but on core router 1 they are 1600 and 9116. I thought I had found the issue, even though no MTU changes have been made, but when I tried to correct the MTU I couldn't: I am plugged into switch 1 via ether13, and for some stupid reason the max L2 MTU on that port is 9116. I can only set the L3 MTU; the L2 MTU is stuck at a max of 9116.
This was working perfectly, with nothing touched, until about an hour ago, so I am forked if I know what is going on.
Both core routers run 5.24 at the moment. I know it's old, but it's been really stable so I haven't wanted to upset the apple cart.
As these are core routers I can't just drop them and upgrade them, or I would have a lot of pretty upset customers. So apart from updating the MTU, which clearly I can't do, does anybody have any quick suggestions on what I could try to fix this before business hours kick in again?
I think I have seen something like this once before and a reboot of the router fixed it. Right now the only impact I can see is that I can't manage those switches, which is not a critical issue. Rebooting the core routers, though, is a whole other story....
Thanks in advance Regards Paul
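For readers following the MTU reasoning above, here is a minimal Python sketch of the check that usually leaves an OSPF neighbour stuck around ExStart. RFC 2328 (section 10.6) says a Database Description packet is rejected if its Interface MTU field is larger than the receiver can accept on that interface without fragmentation. It is the IP (L3) MTU that is compared, not the L2 MTU. The router values of 1600 and 1500 come from the post; the switch MTU of 1500 is purely an assumption for illustration, since the thread never states it, and the names `OspfInterface`, `accepts_dbd` and `adjacency_can_form` are made up for this sketch. Note too that the thread ends up pointing at a 5.x bug rather than MTU, so this only illustrates the rule, not the confirmed cause.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfInterface:
    name: str
    ip_mtu: int  # L3 MTU, advertised in Database Description (DBD) packets


def accepts_dbd(receiver: OspfInterface, sender: OspfInterface) -> bool:
    """Receiver rejects a DBD whose advertised MTU exceeds its own IP MTU."""
    return sender.ip_mtu <= receiver.ip_mtu


def adjacency_can_form(a: OspfInterface, b: OspfInterface) -> bool:
    """Both sides must accept the other's DBD, so in practice the MTUs must match."""
    return accepts_dbd(a, b) and accepts_dbd(b, a)


# L3 MTU values quoted in the post.
core_router_1 = OspfInterface("core-router-1", ip_mtu=1600)
core_router_2 = OspfInterface("core-router-2", ip_mtu=1500)

# ASSUMPTION: the switch-side IP MTU is not given in the thread; 1500 is hypothetical.
switch = OspfInterface("core-switch-1", ip_mtu=1500)

for router in (core_router_1, core_router_2):
    ok = adjacency_can_form(router, switch)
    print(f"{router.name} <-> {switch.name}: adjacency can form = {ok}")
```

Under that assumed switch MTU, only core router 1 fails the check, which would match the symptom of core router 1 being the one complaining. Some implementations can be configured to skip this check, so treat the sketch as a description of the default rule only.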
Well, I tried everything I could. I ended up holding out as late as I could and then rebooted the core routers. Hopefully nobody will complain, but the problem is fixed, just with a reboot. Time to schedule some updates to our core routers, I think.
Regards Paul
-----Original Message----- From: Public [mailto:public-bounces@talk.mikrotik.com.au] On Behalf Of Paul Julian Sent: Wednesday, 12 August 2015 10:41 PM To: public@talk.mikrotik.com.au Subject: [MT-AU Public] Strange OSPF Issue - bit urgent.....
_______________________________________________ Public mailing list Public@talk.mikrotik.com.au http://talk.mikrotik.com.au/mailman/listinfo/public_talk.mikrotik.com.au
I'd be curious to know if this was a random incident, or a known bug in that version. I've not encountered it before, but I wonder if anyone else has? Best of luck :)
On Wed, 12 Aug 2015, Paul Julian wrote:
I have had something similar before on one of those routers. We have a redundant transit link coming in on one of the routers, and when we lit that up one day our whole OSPF area shit itself bigtime. We had to disable the BGP peer to the provider, and the router still wouldn't recover the routing table properly, so we had to reboot to get it working again. I thought that was just an anomaly, but perhaps there are bugs in that version which only show up in certain scenarios.
Regards Paul
-----Original Message----- From: Public [mailto:public-bounces@talk.mikrotik.com.au] On Behalf Of Stephen Sent: Thursday, 13 August 2015 9:45 AM To: MikroTik Australia Public List Subject: Re: [MT-AU Public] Strange OSPF Issue - bit urgent.....
We would randomly have this issue on 5.x. We logged tickets, and from 6.2 onwards we have not seen the issue again.
On Thu, Aug 13, 2015 at 12:06 PM, Paul Julian <paul@oxygennetworks.com.au> wrote:
Thanks Andrew, sounds like an upgrade is required then. Thanks for confirming.
Regards Paul
-----Original Message----- From: Public [mailto:public-bounces@talk.mikrotik.com.au] On Behalf Of Andrew Thrift Sent: Thursday, 13 August 2015 3:25 PM To: MikroTik Australia Public List Subject: Re: [MT-AU Public] Strange OSPF Issue - bit urgent.....
participants (3)
- Andrew Thrift
- Paul Julian
- Stephen