Big Data. Everyone seems to be talking about big data as if it were a panacea for every business problem. While we will freely admit that there are likely some valuable insights buried in all of that data, the real discussion shouldn’t be about big data as a whole. Rather, the question to ask is “how can we quickly and efficiently leverage what’s in all of that data to positively impact our business?”
In many ways, data is a lot like a pile of lumber. By itself, the lumber doesn’t do much for you – it’s what you do with the lumber that really adds value. It’s the same way with data. Just collecting and storing it doesn’t really do much for the business. It’s the insights and analytics derived from the data that offer value to the enterprise.
Let’s also keep things real – that data doesn’t sit in one self-contained repository, and there is no certainty that the data will all share the same format or schema. Instead, data tends to be decentralized and disconnected from one source to the next, making it challenging to explore and analyze.
Companies have a lot of different data sources: sales and marketing data; logistics, supply chain, and inventory data; customer data; accounting data; vendor and partner data; and more. While the list can vary dramatically from one industry to the next, it behooves an organization to be able to access and derive real-time insights and analytics driven by a truly comprehensive 360-degree view of all of its data.
Factor in real-time data, like that generated by IoT devices, and you can easily see that this problem is only going to get bigger as more and more devices come online and begin generating data. By 2020, analysts are predicting that as many as 22 billion IoT-enabled devices will be online and producing data – that works out to about three devices for every person on the planet.
So, how do we get from that pile of disparate data, real-time or otherwise, and arrive at a point where it can provide real value? How are we going to turn this into a better customer experience? How are we going to leverage this data to increase revenue growth? Can we use it to create better efficiencies? What about ensuring that fleets and other critical equipment are working properly? Historically, this is where exploratory analytics enters the equation.
Exploratory Analytics for Real-Time Data
Today’s business environment is more competitive than ever. It demands that companies make informed and timely decisions. Having a comprehensive read of current data is no longer a luxury – it is a necessity. In other words, business leaders are looking for on-demand insights and exploratory analytics of their real-time data and they need to be able to query that data efficiently and intuitively.
We suggest that what people are really trying to do with exploratory analytics is discover something unknown to them. That’s quite a trick – “find that thing that I haven’t seen before, but please do it quickly.”
This is where just writing a query in an effort to “Solve for X” can be challenging. How can a data analyst “solve for X”, when she’s not even sure what “X” represents or even if “X” exists?
Typically, in order to understand what she has, the data analyst has to do a lot of data preparation and cleanup. The structure of the data can be unconventional, and the data itself may be “dirty” and need cleaning before use.
For example, one of our Cinchapi developers tells of a time he needed to clean up data from a brand-impression survey of sporting goods companies. To ensure that respondents were not biased by names contained in drop-down lists or next to radio buttons, data was captured via blank text-based form fields.
Our team member quickly discovered that misspellings were common, and they had to be accounted for – just take the brand name “Adidas” as an example. He had to ensure that all attempts to spell the brand, including variations like “Edinas” and “Addeedus”, were properly tabulated. Multiply that by the thousands of responses to be cleaned and you can see how cleaning just one data point can consume a great deal of time. Yet it needs to be completed before data exploration can truly be effective. While this is admittedly a very simplistic example of messy data, it should serve to illustrate the larger issue.
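To make the problem concrete, here is a minimal sketch of that kind of cleanup using fuzzy string matching from Python’s standard library. The brand list, function name, and similarity cutoff are our own illustrative choices, not the actual survey tooling:

```python
import difflib

# Hypothetical list of known brands the survey responses should map to
KNOWN_BRANDS = ["Adidas", "Nike", "Puma", "Reebok"]

def normalize_brand(response, cutoff=0.5):
    """Map a free-text survey response to the closest known brand,
    or return None if nothing is similar enough."""
    cleaned = response.strip().title()
    matches = difflib.get_close_matches(cleaned, KNOWN_BRANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Misspellings like "Edinas" or "Addeedus" collapse to "Adidas"
print(normalize_brand("Edinas"))    # Adidas
print(normalize_brand("Addeedus"))  # Adidas
print(normalize_brand("zzzz"))      # None
```

Even this toy normalizer shows why cleanup eats so much time: the cutoff has to be tuned per data point, and every remaining ambiguous response still needs a human judgment call.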
One method to speed up the process would be to deploy a team of data analysts to scour these different data sources looking for things that are unusual or out of character. While there is nothing wrong with this approach, it can be cost-prohibitive for a smaller organization. Even for a larger enterprise with deep pockets, the drawback is the amount of time it would take to go through everything, with no certainty that there will be anything of true value at the end of the process.
It’s time for a better way.
Machine Learning and Human Intelligence
At Cinchapi, we realize that there are certain things that machines can do well and there are certain things that the human brain can do that a computer simply cannot. So why not combine the strengths of each? The Cinchapi Data Platform was purpose built to combine these strengths.
Machine learning can make very short work of data prep and cleanup – a process which can consume as much as 80% of a data analyst’s time. It can also look for patterns, anomalies, and relationships which were previously obscured or hidden across decentralized data sources.
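As a rough illustration of the kind of anomaly spotting that can be automated – and emphatically not a description of the Cinchapi Data Platform’s internals – here is a simple z-score sketch that flags values far from the mean of a series:

```python
import statistics

def flag_anomalies(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean.
    A deliberately simple stand-in for automated anomaly detection."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # every value identical: nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical daily order counts with one suspicious spike
daily_orders = [102, 98, 105, 97, 101, 103, 99, 310]
print(flag_anomalies(daily_orders, threshold=2.0))  # [310]
```

The point of the sketch is the division of labor it implies: a machine can scan millions of values for statistical outliers in seconds, but deciding whether 310 orders is a data error or a genuinely great sales day remains a human call.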
Still, just because a pattern or an anomaly exists does not mean that it is relevant or worth additional investigation. That’s where the mind of a data professional comes into play. Machine learning is capable of handling many basic tasks, but it will not be able to do things like make a judgment call anytime soon.
However, by combining human thought with machine learning, the Cinchapi Data Platform makes the process of exploratory analytics far more efficient and intuitive than had previously been possible. With a context-aware and truly conversational natural language interface, users can literally ask questions of the data with everyday English words and phrases.
Instead of stilted queries like “Sales Report Cleveland 30 days”, users can ask real questions like “What is Cleveland looking like this month?”, and the machine will provide analytics as rich visualizations along with descriptive text to give the information desired. With each use, the platform learns more about the industry, company, and user roles to better understand the context of user questions.
Need to drill deeper? Just do what we do in real life – ask a follow-up question like “What does Cincinnati look like?” The machine understands the context of the second question because of the first. It also understands that the user’s role is in sales, so when she asks about Cleveland, she is almost always looking for her sales figures. And it knows that a month is roughly 30 days, so it defaults to that range.
While the Cinchapi Data Platform is an ideal real-time data analytics tool for data analysts and scientists, its three-step Ask, See, and Act workflow makes it easy for business leaders with an interest in data to use as well. Imagine the advantage your business would have if decision makers could gain insights from real-time data just by asking a few simple questions.
Want to learn more? Click here to see a one-minute video overview and to sign up for a live demo.