Hi Andrew, all,...

The previous 2 failures display apparent physical hardware problem (even netinstall does not recover) and they have even been accepted for replacement by MikroTik (i.e. they are convinced it is hardware problem ;)

Which makes it quite a mystery indeed!

I have seen this /sort/ of thing happen a few times before, but manifested behaviour is either drop in tx/rx signals (radio damage) or dead internal power supply, e.g:
- lightning damage
- electrical interference (e.g. nearby refrigeration motors)
- industrial/welding machinery

Usually, the effect can be resolved by proper grounding (via both case-to-pole and shielded Ethernet cable) and/or physical relocation of the mounting point by a few meters in any direction.

In this case, however, the problem is not a typical behaviour - maybe something different?

Cheers!
Mike.

-----Original Message-----
From: Public [mailto:public-bounces@talk.mikrotik.com.au] On Behalf Of Andrew Cox
Sent: Tuesday, 29 April 2014 11:47 PM
To: steve@digitronics.com.au; MikroTik Australia Public List
Subject: Re: [MT-AU Public] Problems with multiple RBMetal2SHPn devices failing at one site

Hi Steve,

- checked ram/resource graphs to see if it was perhaps hitting a memory leak and crashing
- tried disabling additional services that were not in use to make sure something isn't causing the crashes (remove l7 filtering anywhere, disable conntrack, stop polling via SNMP for a period of time, disable all but winbox/ssh services)
- (as an inverse to the previous) tried monitoring more information, voltage levels, cpu, interface errors, ambient temp
- enabled watchdog timer with a ping/reboot target
- added some netflow monitoring to report traffic through 1 or more of the units to catch any odd traffic around the time of the lockups
- added firewall log & filter rules to drop any non-critical input-chain traffic hitting the units

Just a couple of things I'd look to run through, depending on how easy they are to do in your current environment/setup.

- Andrew
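
[Editor's sketch, not part of Andrew's message.] Two of the suggestions above, the watchdog timer with a ping target and the input-chain log/drop rules, might look roughly like this on the RouterOS command line. The addresses are placeholders from the documentation range (192.0.2.0/24) and are assumptions, as is the choice to permit only ICMP, SSH and Winbox from a management subnet; adapt them to the actual site before use.

```
# Assumption: 192.0.2.1 is an always-reachable host (e.g. the far end of the link).
# The unit reboots if the watch address stops answering pings for ping-timeout.
/system watchdog set watchdog-timer=yes watch-address=192.0.2.1 ping-timeout=60s

# Assumption: 192.0.2.0/24 is the management subnet.
# Allow ping and management (SSH 22, Winbox 8291) from that subnet only.
/ip firewall filter add chain=input protocol=icmp src-address=192.0.2.0/24 action=accept
/ip firewall filter add chain=input protocol=tcp dst-port=22,8291 src-address=192.0.2.0/24 action=accept

# Log and drop everything else addressed to the unit itself.
/ip firewall filter add chain=input action=drop log=yes log-prefix="input-drop"
```

The final rule only affects traffic addressed to the router itself (the input chain), not traffic forwarded across the link, so the log entries it produces can be compared against the times of the lockups. Test the rules from a session that can be recovered locally, since a mistake in the accepted subnet would lock out remote management.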
On 29 April 2014 23:37, Steve at Digitronics <steve@digitronics.com.au> wrote:
We have lots of groove type devices out there, plastic and metal, but there is one installation where we are having consistent problems, and it is the only place we have had any problems.
Over the last 6 months or so we have had kernel failures and script errors logged on four different devices at the same site, the last three being RBMetal2SHPns. The four devices were installed one after another at the same site, each of the last three replacing the prior unit because it was playing up.
Typically, a device works ok for a while (up to a month) but then starts logging kernel faults and exhibiting other weird symptoms such as script failures, and vanishing scripts. Sometimes only a reboot or a power cycle will get a failed unit going again.
The chance of there being four actually faulty devices is now so vanishingly small that it can be discounted, so we are trying to figure out what it is at this site that could be causing the same persistent failures on the series of devices.
The device mounting and sealing has been checked. The antenna VSWR has been checked. The antenna cabling has been checked. The CAT5 cabling has been checked. The PSU and POE injector have both been changed at different times. The PSU is on a UPS. The site is not subject to lightning strikes or subsequent voltage gradients. The device is only 15m from the POE injector. When it is working the wireless data throughput is as expected. The device is on a private CAN so cannot be publicly hacked. The times and frequency of failures do not follow any obvious throughput, temperature, humidity or time of day patterns. The unit at the other end of the link has never had a failure of any kind.
We are struggling to think of any other possible site specific environmental or equipment influence(s) that could be causing these failures, and I am really hoping that someone on the list can give us some fresh ideas or can share the resolution to similar circumstances they have experienced.
TIA.
Steve.
_______________________________________________
Public mailing list
Public@talk.mikrotik.com.au
http://talk.mikrotik.com.au/mailman/listinfo/public_talk.mikrotik.com.au