Data Mining the Internet of Things
In 1994 my wife came to me while I was working in a fledgling startup department at MCI, which provided Internet access on college campuses through its CampusMCI service. She said to me, “Someone should start a company that has every movie ever made and you can watch them on demand over the internet.”
I brushed her suggestion aside. “Video requires huge amounts of data. It simply couldn’t be delivered fast enough, especially as a widely available service.” Even though I had seen data speeds jump from 300 baud to 56k, I still couldn’t comprehend a world where 10 Mbps cable connections would be widespread, or where fiber would threaten to make those speeds seem archaic. It was my “why would you want more than 640k of memory” moment. Years later, when Netflix announced its streaming service, she enjoyed reminding me.
That same year an executive at MCI came to me and asked me to create a report analyzing the prospects for thin-client computing. Oracle’s CEO, Larry Ellison, just wouldn’t shut up about how disruptive thin-client computing was going to be to the technology industries. Centralized management, lower total cost of ownership and operation, better security: the benefits were just too large for corporate America to continue to ignore. My recommendation was that, without complete shifts in the underlying nature of network infrastructure and back-end servers, thin-client computing was a step backwards to the early days of big-iron mainframes and minicomputers connected to dumb terminals.
Now, 20 years later, we stand at the cusp of a fundamental shift to lower-cost, lower-powered personal computing devices that rely on cloud-based web applications and services. But we still can’t say Larry was right, because we’re still not there. The Chromebook is the realization of the vision that Ellison had in his head all those years ago, but consumers still aren’t flocking to adopt this approach to personal computing. There is hesitation, mistrust and what I think is the good sense to understand that the promise of ubiquitous and persistent wireless connectivity to remote cloud services is not a reality yet, and that without it thin-client computing is a hollow promise full of compromise.
We don’t really think of ChromeOS and cloud-based services as the realization of Larry Ellison’s proposals on thin-client computing, but they are.
So what does the slow adoption of thin-client computing have to do with the Internet of Things and Machine to Machine (M2M) technologies? A lot more than you might think at first glance. There is the same sense of hyperbole and exaggeration around the impact and speed with which IoT will change society. Additionally, many of the barriers to widespread adoption that keep thin-client solutions at bay apply to connected machines.
Consumer reluctance and privacy concerns, cost management and benefit, technological maturity and the ability to rapidly and inexpensively transfer, analyze, interpret and act on the data are all hurdles to widespread M2M adoption.
We’ve seen much analysis of Google’s acquisition of Nest, and quite a bit of speculation as to why Google was willing to pay more than $1 billion above market value to ensure that this property would become part of its family. While that is an interesting part of the story, I’m going to focus on something a little more mundane that hardly ever gets mentioned in the M2M discussion.
GM is a pioneer in this field. Its OnStar system is exactly the kind of automotive telematics system frequently used as an example of what the future holds for machines connected by the Internet of Things. But GM has been doing it since 1996. Your Nike FuelBand doesn’t seem so innovative and revolutionary now, does it?
Several years ago I purchased a used Escalade. My first experience with OnStar came while driving in a heavy storm, looking for a house I had never been to before. In these horrible conditions I drove around the streets of a suburban subdivision at 5 mph for 45 minutes. Suddenly I heard a phone ringing. I pulled out my phone to answer it, but that wasn’t it. I asked my friend if it was his phone. He replied,
“Is it your car?”
I looked down at the odometer and instead of the mileage I saw the word, “ring”. I then recalled seeing an icon of a phone somewhere on the dash. After a few moments of frantic searching, I remembered where it was. I pressed the button and was greeted by an operator who asked for the previous owner of the vehicle.
It took us a few moments to convince the operator that I was the new owner of the automobile. She then asked if I needed assistance. This is the double-edged sword of connected M2M devices. My vehicle was constantly uploading telemetry to servers in GM’s data center, and those servers were running all incoming data through algorithms that looked for anomalous patterns. Once such a pattern was detected, an operator was alerted to call and ensure that everything was okay. It is simultaneously comforting and troubling to think that a computer could determine something was amiss and that someone was standing by to check on me. It was more troubling that I wasn’t even signed up for the service and had neither opted in nor out. It is worth noting that the operator unintentionally revealed personal information about the previous owner.
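To make the idea of "looking for anomalous patterns" concrete, here is a minimal sketch of one rule such a system might apply: flag a vehicle that has been crawling at very low speed for a long, unbroken stretch. This is purely illustrative and is not OnStar's actual logic; the `Sample` type, the `first_crawl_alert` function, and both thresholds are assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Sample:
    minute: int        # minutes since the trip started (hypothetical field)
    speed_mph: float   # reported vehicle speed (hypothetical field)


def first_crawl_alert(
    samples: Iterable[Sample],
    max_speed_mph: float = 10.0,   # assumed threshold for "crawling"
    min_minutes: int = 30,         # assumed duration before alerting
) -> Optional[int]:
    """Return the minute at which a sustained crawl is first flagged, else None."""
    crawl_start = None
    for s in samples:
        if 0 < s.speed_mph <= max_speed_mph:
            if crawl_start is None:
                crawl_start = s.minute
            if s.minute - crawl_start >= min_minutes:
                return s.minute
        else:
            # Normal driving or a full stop resets the window.
            crawl_start = None
    return None


# Ten minutes of normal driving, then a long crawl at 5 mph.
trip = [Sample(m, 35.0) for m in range(10)] + [Sample(m, 5.0) for m in range(10, 56)]
alert_minute = first_crawl_alert(trip)
highway_only = first_crawl_alert([Sample(m, 65.0) for m in range(60)])
```

A real system would presumably combine many signals (location, weather, fault codes) instead of a single speed rule, which is also where the privacy trade-off comes from: the more signals it sees, the better it can guess what the driver is doing.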
I did end up subscribing to the service, which was expensive and which I rarely used. When I had a flat tire on a highway in Montana, I would have had to wait two hours for roadside service, and the agent could not tell me over the phone how to remove my spare tire. My sister-in-law’s iPhone and Google came to the rescue and we did it ourselves. On that same trip the telemetry was not advanced enough to alert me that my transmission was running hot and causing damage. Once the damage was done, OnStar agents ascertained that I had a fault code. Their suggestion was to turn the vehicle off to reset the code, wait 30 minutes, then try again. When the symptoms returned, my second call resulted in the advice to stop driving the car immediately.
These examples help illustrate the concerns about, and the immaturity of, M2M connected devices and sensors. They are already here in a very primitive form. Since 1996 OnStar has morphed from a high-end luxury feature on the most expensive GM vehicles to an aftermarket third-party addition that can be purchased for almost any make, but the subscription cost is still prohibitive. The bulk of that cost comes from the same thing that keeps thin-client solutions like ChromeOS from being viable: the wireless data connection. Your FuelBand connecting to your smartphone via Bluetooth and then uploading data over your regular wireless carrier subscription is one thing. But the truth is that the ongoing debate about Net Neutrality has a huge impact on how quickly IoT technologies can be adopted.
Central to the debate about Net Neutrality is the concept that modern web apps and services are designed to deliver the most satisfying multimedia experience, as if there were no limits to bandwidth capacity. According to wireless carriers, telcos and ISPs, our demand for bandwidth-intensive content is breaking the internet’s back. While I think they exaggerate, I know that if not for my grandfathered unlimited Verizon plan I would frequently exceed the standard 2GB cap. The incremental data created by a single Fitbit is unlikely to cause total collapse of the carrier backbones, but multiply it by a billion connected cars, refrigerators, thermostats, streetlights, and anything else you can extract data from, and the conclusion is clear: we simply do not have pipes big enough to deliver the Internet of Things promised us. Making that a reality will be expensive. Who will foot that bill? The cost savings provided by the analytics have to be amazing to offset the increased costs of moving all that data around, storing it, analyzing it and acting on it. The truth is that the transmission of data remains the most expensive part of the equation.
I’ve said before that Google put the cart before the horse with Chromebooks, and I think that M2M faces similar hurdles. For its part, Google has proactive initiatives to bring competitive fiber to large metropolitan areas. While bringing high-speed internet to 34 additional cities is an aggressive roadmap, that only puts us a notch closer to supporting an M2M future.
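The multiplication in that argument is easy to run as a back-of-the-envelope calculation. The sketch below takes the "billion devices" figure from the text; the per-device volume of 1 MB per day is a placeholder assumption, not a measured number, and `aggregate_load` is a hypothetical helper written for this illustration.

```python
def aggregate_load(devices: int, bytes_per_device_per_day: float):
    """Return (total bytes per day, average bits per second) across all devices."""
    total_bytes_per_day = devices * bytes_per_device_per_day
    # 86,400 seconds in a day; 8 bits per byte.
    avg_bits_per_second = total_bytes_per_day * 8 / 86_400
    return total_bytes_per_day, avg_bits_per_second


# One billion devices (from the text), each sending an ASSUMED 1 MB per day.
total, bps = aggregate_load(1_000_000_000, 1_000_000)

print(f"{total / 1e15:.1f} PB per day, about {bps / 1e9:.1f} Gbps on average")
```

The result scales linearly with both inputs, so swapping the placeholder for a device that streams video or high-frequency sensor readings moves the total by orders of magnitude. An average also hides peak load, which is usually what networks have to be built for.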
Returning to the concerns around the Nest acquisition, M2M benefits come with a unique set of trade-offs. Google Now is a great example of the double-edged sword of data analytics that can anticipate needs and predict behaviors. With Google Now I saw the promise of the “personal” part of PDA for the first time since the Apple Newton. A personal assistant adds value by working so closely with you that they learn your routine and can act as an extension of your will. In return for that, you trust that individual to understand you better than anyone else.
Likewise with a PDA, in order for it to really learn you, you opt in to allowing it to collect, store and analyze very personal data. Many argue that privacy is an illusion, and that those with nothing to hide have nothing to fear. I remain unconvinced that I want to give Google, PepsiCo, Microsoft, GE, GM or any other corporation such unprecedented vision into my personal life. Do you want a chip bag that tracks how many times a day and at what times you access it and how many chips you eat per session? Ruffles does, and that is part of the scope that makes analysts claim that M2M and IoT will be huge.
Recently I toured a food manufacturing plant and talked to an operator about the data that the various hoppers, baggers, feeders and other machines and robots were sending back to the central interface. The depth of detail was incredible. Reports were generated in real time as machines malfunctioned, as spice bins fell beneath critical thresholds, as other alerts were triggered. Then I watched in disappointment as he impatiently dismissed alerts repeatedly and expressed frustration at the volume of unnecessary data generated. I was shocked to find that the plant only had a single T1 and that the manufacturing network was isolated and not sharing data with the operations side.
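The operator's frustration is a classic case of alert fatigue, and one common mitigation is to suppress repeats of the same alert within a cooldown window. The sketch below is a generic illustration of that technique, not a description of the plant's actual software; the `suppress_repeats` function, the source names, and the 300-second window are all assumptions made up for the example.

```python
from typing import Dict, Iterable, List, Tuple

Alert = Tuple[int, str, str]  # (timestamp in seconds, source, message)


def suppress_repeats(alerts: Iterable[Alert], cooldown_s: int = 300) -> List[Alert]:
    """Show an alert only if the same source/message pair was not shown recently.

    Assumes alerts arrive sorted by timestamp.
    """
    last_shown: Dict[Tuple[str, str], int] = {}
    shown: List[Alert] = []
    for ts, source, message in alerts:
        key = (source, message)
        if key not in last_shown or ts - last_shown[key] >= cooldown_s:
            shown.append((ts, source, message))
            last_shown[key] = ts
    return shown


# Hypothetical alert stream: one spice bin repeating itself, one bagger jam.
raw = [
    (0, "spice_bin_3", "level low"),
    (20, "spice_bin_3", "level low"),
    (45, "bagger_1", "jam"),
    (60, "spice_bin_3", "level low"),
    (400, "spice_bin_3", "level low"),
]
shown = suppress_repeats(raw)
```

Here the operator would see three alerts instead of five: the first low-level warning, the jam, and a reminder once the cooldown has elapsed. The design choice is a trade-off, since a window that is too long can hide a condition that cleared and then recurred.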
Generating data and mining that data is key to leveraging the benefits of an M2M future. I’ve spoken about the avalanche of data created by society before. There are valid concerns that a poor signal-to-noise ratio buries the critical data under the trivial. Humans are data hoarders. We’re never sure what to keep and what to discard. Like logs full of arcane Windows or Linux server data, we generate mountains of information. Extracting useful conclusions from that data is the challenge. I am unconvinced that the Internet of Things makes this better or worse. The Internet of Things only becomes viable when we all become experts like Google at separating the wheat from the chaff.
I don’t want to sound completely bearish on the Internet of Things. I think we’ll see a slow and sporadic evolution, with occasional lurches forward, as different technologies evolve. As more capacity becomes available and affordable, industries will design more sophisticated M2M sensors. As that happens, costs will come down. Privacy issues will be discovered and addressed. I just think the current predictions of a wildly Internet-connected world as early as 2020 are optimistic. M2M and IoT align nicely with business principles like just-in-time (JIT) inventory to create leaner, more automated solutions that offset increased costs. It is an inevitable evolution to a knowledge-driven, technology-based society. But it won’t come like a sudden tidal wave. Instead, it is the slow warming of a pot of liquid brought to a boil. By the time it is pervasive and significant, we’ll have long since grown accustomed to thinking of it as normal.