Author: Craig Mathias, Principal, Farpoint Group

There’s an old saying that I’m sure you’ve heard – if it ain’t broke, don’t fix it. This maxim has defined much of human behavior regarding improvements (and sometimes, sadly, even routine and required maintenance!) to functioning products and systems for much of the industrial age. The information age, however, has turned this saying into a much more important and relevant rule that we should all live by today: fix it, or it will break! Good enough is never good enough, and any mission-critical element of operational IT deserves, and, yes, demands, regular re-evaluation and re-consideration, even if it’s currently doing the job it is intended for.

The reason for this is that IT solutions tend to have complex and often obscure failure modes, usually, it seems, going south at the worst possible time. And, when that happens, it’s often extremely difficult to determine just what went wrong and how best to remedy the situation, all while productivity goes straight down the tubes because critical information systems are offline.

Sure – outright, total failures are rare, and are usually the result of power failures, though I’ve seen more than a few cases where the wrong Ethernet cable was unplugged, just for example. Unintentional misconfigurations, upgrades gone wrong, and operational and administrative errors clearly come next, and the possibility of security failures leading to intentional damage requires constant vigilance.

More likely, though, are errors that creep in over time. Available capacity gets consumed (end-user demand only grows, after all), but even more importantly, many of the tasks that could be used to monitor for and proactively address potential reliability issues have historically been labor-intensive. This means that support, troubleshooting, and the changes to operational procedures intended to minimize costs often have big costs of their own, and consequently don’t always get done. This is why a wholesale replacement of network infrastructure, most notably Wi-Fi, is desirable for so many shops – newer products can do a much better job of minimizing operational overhead and expense. While updates, upgrades, bug fixes, and related maintenance should absolutely be performed regularly and as required, the best way to fix it before it breaks is increasingly to take advantage of the new technologies and capabilities embodied in newer products.

And this doesn’t mean just gaining access to the latest IEEE 802.11 standards and Wi-Fi Alliance specifications. Leading vendors are today doing a lot more than ever before to justify new investments in capital spending by demonstrating a reduction in operational expense, with a corresponding improvement in reliability – a major win-win if there ever was one.

Just for example, consider the use of artificial intelligence (AI) techniques in minimizing operational expense. Intelligent automation can spot and remedy problems before they become productivity-sapping issues or even outright outages, automatically adjusting operational parameters like channel assignments and signal strength system-wide with no operator intervention. Humans simply cannot do this anymore – the systems we depend upon are too large for any human to manage the numerous low-level details involved. This is just like the case with modern aircraft, where pilots specify what to do and dozens of processors and lots of software decide how, carrying out the operator’s intent and specified policies in real time.

Similarly, Cloud-based solutions push essential complexity into services that address large numbers of otherwise unrelated end-user organizations. Centralized analytics can spot issues and often-subtle negative trends before end-users – and even operations staff – can, with fixes applied, again, before any productivity is lost. The Cloud additionally enables essentially transparent scalability, and all of this at the lowest cost possible.

Interestingly, this future is here today. AI Wi-Fi is already available – take a look at what KodaCloud is offering, again just for example. We can indeed have solutions that are self-configuring, self-optimizing, and self-repairing, and that essentially never break, in the traditional sense, in the first place. Fix it before it breaks, then? No problem!

