Creative Mikrotik scripting - checking connectivity inside customer networks
Hi all,

On the back of Terry's creative NBN interface management, I thought I would share an interesting "hack" I came up with last week that might be of value to list members.

We provide Managed Private Network services to some customers, where they have multiple sites connected to each other. Our monitoring system does a great job of checking that our routers are alive and accessible on our management IP subnets, but we didn't have a way of ensuring that the inter-site links were online. These could be VPNs, EoIP tunnels, VLANs, or routed connections. We also sometimes need to know if there is an issue with their Internet connectivity.

A while back, I wrote a script that would loop through a Firewall IP Address List to ping specific hosts and send an email if any were offline. However, this never made it to production for various reasons, and had issues such as producing a lot of emails.

Last week I revisited the project, and was looking around to see if I could read the results via SNMP and leverage our existing monitoring platform, with the added advantage of being able to see historical data. But things like Global Variables are not available via SNMP.

Enter the Bridge Interface. These interfaces are available via SNMP, and have an MTU value that can go up to 65535. With a bit of tweaking to my previous script, instead of sending an email it changes the value of the Bridge MTU. The MTU can't go below about 80, so I set it up to have an "everything is good" value of 32768, then reduced the number depending on the services offline. A colleague suggested using powers of 2 so that, with a single number, we can identify exactly which services are offline, up to 15 entries. Our monitoring platform doesn't understand bitwise numbering like that, so it requires some manual processing to work out which services are offline, but it is enough to trigger an event so that somebody takes a look.

This script gets installed on a single device per customer, choosing one that has high stability and connectivity to all other sites, such as a router installed in our data centre.

Any feedback is welcome. My apologies if my code isn't as polished as it could be; it is still being tweaked and improved. This is working well on RouterOS v6.30.4.

Regards,
Philip

# Bridge interface
/interface bridge
add mtu=32768 name=bridge-MPN-Connectivity-Check

# Address list - the first number is the value assigned to the site, and will be used to adjust the MTU of the bridge
/ip firewall address-list
add address=8.8.8.8 comment=1,Internet list=MPN-Connectivity-Check
add address=192.168.250.10 comment=2,Site2 list=MPN-Connectivity-Check
add address=192.168.250.18 comment=4,Site3 list=MPN-Connectivity-Check
add address=192.168.250.54 comment=8,Site4 list=MPN-Connectivity-Check
add address=192.168.250.58 comment=16,Site5 list=MPN-Connectivity-Check

# The script that does the work
/system script
add name=MPN-Connectivity-Check policy=read,write source="# Check MPN paths (VLAN, EoIP, Internet)\r\
    \n\r\
    \n:local tempIP;\r\
    \n:local tempIPcomment;\r\
    \n:local tempIParray;\r\
    \n:local pingResult;\r\
    \n:local errorCount 0;\r\
    \n\r\
    \n:put (\"Running path assurance script...\");\r\
    \n\r\
    \n:foreach listEntry in=[/ip firewall address-list find where list=\"MPN-Connectivity-Check\" disabled=no] do={\r\
    \n    :set tempIP [/ip firewall address-list get \$listEntry address]\r\
    \n    :set tempIPcomment [/ip firewall address-list get \$listEntry comment]\r\
    \n    :set tempIParray [:toarray \$tempIPcomment];\r\
    \n    :set pingResult [/ping \$tempIP count=1]\r\
    \n    :if (\$pingResult = 0) do={\r\
    \n        # Ping failed\r\
    \n        :log warning \"Ping failed to \$tempIP (\$tempIPcomment)\"\r\
    \n        :put \"Ping failed to \$tempIP (\$tempIPcomment)\"\r\
    \n        :set errorCount (\$errorCount + \$tempIParray->0)\r\
    \n    }\r\
    \n}\r\
    \n\r\
    \n# You can't have values below about 80 - but that is enough to cause alarms, so set it to minimum of 80\r\
    \n:if (\$errorCount>32688) do={:set errorCount 32688}\r\
    \n\r\
    \n# Update the MTU value regardless\r\
    \n:put \"Setting alert status\";\r\
    \n/interface bridge set mtu=(32768-\$errorCount) [find name=bridge-MPN-Connectivity-Check]\r\
    \n"

# Make it run every minute
/system scheduler
add interval=1m name=MPN-Connectivity-Check on-event="/system script run MPN-Connectivity-Check" policy=\
    ftp,reboot,read,write,policy,test,password,sniff,sensitive start-time=startup

To get the OID for SNMP monitoring of the new Bridge interface:

/interface
print
[check the ID of bridge-MPN-Connectivity-Check]
print oid
[look for entry ID you noted before, capture OID for actual-mtu]
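
The "manual processing" mentioned in the post, working out which services are offline from a single MTU reading, could be scripted on the monitoring side. What follows is only a minimal sketch in Python, not part of Philip's setup. It assumes the value-to-site mapping from the address-list comments above; the names `SITES`, `decode_mtu` and `is_clamped` are made up for illustration.

```python
# Decode the bridge MTU written by the MPN-Connectivity-Check script.
# Assumption: the healthy value and floor match the script in the post.
HEALTHY_MTU = 32768  # "everything is good" value
MIN_MTU = 80         # the script clamps the MTU so it never drops below this

# Hypothetical mapping, copied from the address-list comments in the post.
SITES = {1: "Internet", 2: "Site2", 4: "Site3", 8: "Site4", 16: "Site5"}


def decode_mtu(mtu, sites=SITES):
    """Return the names of the sites whose bit is set in (HEALTHY_MTU - mtu)."""
    if not MIN_MTU <= mtu <= HEALTHY_MTU:
        raise ValueError(f"unexpected MTU {mtu}")
    error_count = HEALTHY_MTU - mtu
    # Each site owns one power of two, so a bitwise AND tells us if it failed.
    return [name for value, name in sorted(sites.items()) if error_count & value]


def is_clamped(mtu):
    """True when the MTU sits at the floor, so the decoded set may be wrong."""
    return mtu == MIN_MTU


if __name__ == "__main__":
    for reading in (32768, 32767, 32758, 32737):
        print(reading, decode_mtu(reading))
```

A reading of 32758, for instance, gives 32768 - 32758 = 10 = 8 + 2, so Site4 and Site2 are the ones that failed to answer. A reading of exactly 80 should be treated with care, since the script caps the error count at 32688 and the individual bits can no longer be trusted.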
Hey Philip, neat work mate, I never really considered using a bridge like this :-)

Thanks for sharing

Regards
Paul

-----Original Message-----
From: Public [mailto:public-bounces@talk.mikrotik.com.au] On Behalf Of Philip Loenneker
Sent: Tuesday, 6 September 2016 2:44 PM
To: MikroTik Australia Public List
Subject: [MT-AU Public] Creative Mikrotik scripting - checking connectivity inside customer networks

_______________________________________________
Public mailing list
Public@talk.mikrotik.com.au
http://talk.mikrotik.com.au/mailman/listinfo/public_talk.mikrotik.com.au
Some neat topics for the next MUM I think! ;)

Cheers!

-----Original Message-----
From: Public [mailto:public-bounces@talk.mikrotik.com.au] On Behalf Of Paul Julian
Sent: Tuesday, 6 September 2016 9:37 PM
To: 'MikroTik Australia Public List' <public@talk.mikrotik.com.au>
Subject: Re: [MT-AU Public] Creative Mikrotik scripting - checking connectivity inside customer networks

Hey Philip, neat work mate, I never really considered using a bridge like this :-)

Thanks for sharing

Regards
Paul
participants (3)
- Mike Everest
- Paul Julian
- Philip Loenneker