Introduction – Building Better Companies with Data
It’s a fascinating time to be in the data and analytics space. Companies are more aware than ever of the impact data can have on their business. This year, the prediction we have been making for some time is coming true before our eyes: every company is becoming a data company.
In the coming years, the true value of data will be realized and companies will adjust their business in anticipation of this value. Data will drive processes and operations through analytics and BI. Developers and data teams will emerge and create clear and efficient insights that will drive decisions. Analysts and business users will be handed coherent answers from complex data queries that will lead the way to more successful businesses.
At Datore, we have very forward-looking leaders, innovators, and data professionals working full time on a vision toward building the future. In our newest trends guide — Data and Analytics Trends for 2020 (and 2030!) — we give you our predictions about what will happen in 2020, and what the data landscape will look like in 2030.
2020 – The BI Industry Will Consolidate Into a Handful of Single Stack Options
Embedded analytics have been gradually creeping into more and more of our tools for the past few years, but in 2020, we’ll see analytics incorporated into pretty much all the software we use. We’ll be able to analyze the swathes of data produced by our digital worlds and the software we use, and package it into something we can make sense of, and act on, immediately.
There’s a rhythm to the BI industry that alternates between periods of bundling and unbundling. Around 20 years ago, there were a few big monolithic platforms where most companies did their BI analysis: MicroStrategy, Business Objects, and Cognos. In the early 2000s, point solutions began to spring up to provide superior tools for individual parts of that process: visualization, data science, warehousing, ETL. For example, Tableau is great at visualization, so companies began to build stacks that incorporated Tableau into their existing monolithic stack. Once Tableau emerged, there was an opportunity for vendors like Qlik and Power BI to grow. Those point solutions are all great at what they do, but a data stack that involves too many one-off additions introduces room for error and runs into scalability and consistency issues.
What we started to see in 2019 was a motion to combine several of the more popular point solutions into larger stacks. Google acquired Looker, Salesforce acquired Tableau, and Sisense merged with Periscope Data. This is a trend that will continue for the next few years until a few major players emerge as viable end-to-end solutions. The point solutions that teams love today will likely be merged into one of those big stacks or forced into a very difficult independent route.
For BI buyers, this means that the long-term BI stack options will be reduced. It also means that the packages will include an entire suite of tools and a buyer will need to be comfortable using each of those tools. As the industry continues bundling, organizations will buy one of the new monoliths; it will no longer be a tenable solution to customize a BI stack with a combination of preferred point solutions.
2030 – New BI Technology Will Introduce a Period of Unbundling
It’s a very interesting thought experiment to imagine what technology will drive the next round of unbundling. Will it be AI, new analysis or interaction methods, virtual reality, or something we can’t even imagine today? By 2030, it will be about the right time for this new technological breakthrough to occur. From a BI stack standpoint, this new technology will be so valuable that some organizations will break their existing monolithic stack to incorporate it as a point solution.
Once that breakthrough technology has been established, other tools will be created that work in conjunction with or compete against it. There may be an entirely new industry that appears based on this one big new innovation. As new layers are added to that technology, it opens the door for another round of stack unbundling to begin.
2020 – Product Management Will Turn Into a Test-Driven Process
The traditional (flawed) method of improving a product has been to talk to customers and observe market trends before synthesizing a hypothesis based on those anecdotes. From there, a small amount of data is usually hand-picked to support the conclusion that a product manager has already reached. This approach occasionally results in good decisions but has obvious downsides.
The products that are truly winning with data today are revolutionizing the product improvement process by shifting to a methodology that starts with initial hypotheses and executes a series of tests to conclusively determine an effect. Once a hypothesis becomes a quantifiable insight instead of just a hunch, it’s easier for a product team to invest in the right feature improvements.
This new approach is essentially the scientific method of product improvement. The types of people who will succeed in this new environment are those who can effectively translate product questions into highly measurable, single-variant tests with results that clearly verify or reject a hypothesis. These product managers will no longer make suggestions like “I think we should choose color A over color B.” They’ll upgrade those recommendations to “I ran a test and color A resulted in X% more product clicks than color B, so we should prioritize that change.”
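As a minimal sketch of what such a test might look like, the example below compares click rates for two color variants with a two-proportion z-test. The function name and the click counts are hypothetical, invented only for illustration; a real product team would pick the statistical test and sample size to suit its own data.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (relative_lift, z, p_value) for variant A versus variant B."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (rate_a - rate_b) / rate_b
    return relative_lift, z, p_value


# Hypothetical counts, invented for illustration only.
lift, z, p = two_proportion_z_test(clicks_a=260, views_a=2000,
                                   clicks_b=200, views_b=2000)
print(f"Color A lift over color B: {lift:.1%} (z = {z:.2f}, p = {p:.4f})")
```

A small p-value suggests the difference is unlikely to be noise, which is what turns "I think color A is better" into a claim the team can act on.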
2030 – High Levels of Product Testing and Improvement Will Be Ubiquitous
In today’s data environment, it’s relatively expensive to maintain a fully staffed data team to design and run all these tests. As the testing-savvy product managers ascend and the testing technology improves to be more user-friendly for that new audience, the trend will only accelerate further.
At a certain point, data will be so massive and the costs of running tests will be so small that a product team will be able to take small chunks of data and conduct conclusive testing in a very short time span. As the cost and time needed to run a test decrease, the number of tests a team can run will increase, so the people and teams running the best tests will very quickly prevail. In 10 years, it’s reasonable to expect that data-driven products will have a roadmap entirely designed by tests like this.
2020 – Companies Will Be Hybrid
Data and storage will not be on-premises or cloud, but rather a hybrid of the two. When I say hybrid, there are a few dimensions I am talking about:
- Data storage will be hybrid — data will be stored both on-premises and in the cloud and companies will be agnostic to a cloud vendor
- Product features will be hybrid — there will be some product features that work in the cloud, some that work on-premises, and some that work on both
There are a few reasons why the hybrid solution will gain popularity. First, by storing data both in the cloud and on-premises, a company can be agnostic to a cloud vendor. With the data and analytics world changing so quickly, you don’t want to tether the entire future of your company to a specific technology or vendor.
Second, although cloud implementations are gaining traction, there are very good reasons for some companies to keep their data on-premises. Data security is one of the primary reasons that most companies make that decision.
However, one of the most overlooked benefits of keeping data on-premises, and one that will become very prevalent in the future, is the customer experience. The cloud can only provide so much performance. In order to provide a truly interactive experience with your customer, you need to accelerate the performance of the data, and that means moving it closer to the customer. In other words, you will need to cache certain snippets of data on-premises to meet customers’ growing expectations.
2030 – Logical Mashup
Data privacy issues will become even more pronounced in the future. We can already see that companies are going to further restrict access to their data sources. Pulling data directly from separate sources to mash up in a central location will no longer be possible because the risk of a security breach will be a big concern.
This is when the use of data federations is going to increase dramatically. Companies will need to use data federations, or query federations, to treat multiple data sources logically — as if they were one source.
There are a few companies already starting to develop this type of technology, and we are going to see more of this logical mashup in the future.
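To make the idea of a logical mashup concrete, here is a rough sketch using Python's built-in sqlite3 module. Two separate in-memory databases stand in for independent sources, and a single query joins them as if they were one. The schema names (crm, billing), tables, and rows are all hypothetical, and a production federation engine would query remote systems in place rather than attach local databases.

```python
import sqlite3

# Each ATTACH creates its own in-memory database; together they stand in for
# two independent data sources. All names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO billing.invoices VALUES (?, ?)",
                 [(1, 120.0), (1, 80.0), (2, 50.0)])


def revenue_by_customer(connection):
    """Join across both sources in one query, without copying either one."""
    query = """
        SELECT c.name, SUM(i.amount) AS revenue
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


print(revenue_by_customer(conn))
```

The point of the sketch is that the analyst writes one query against logical names, while the data itself stays in its original source.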
2020 – The Rise of the BI Operating System
As developers create more experiences related to a company’s data flow, they will begin to enrich the system with analytic apps that reduce the time needed to take actions based on data — actually closing the BI loop. The flow will look like this: Data > Insight > Action > Data. This will transform the traditional dashboard into an operating system that continually drives the business.
As this happens, a couple of things will occur:
UX will be King:
Once we master building an analytic app, the next step in that progression will be to build a better app where code and design thinking come together. Once this happens, BI developers will begin to craft experiences that are more appealing, native to the BI platform, and most importantly — effective.
The rise of the BI ecosystem:
Workflows will begin to feel like one unified system. Users will manage one asset (through the BI and analytics platform) instead of multiple assets. A unified system will be presented to the user instead of siloed platforms.
2030 – Autonomous Operating Systems Through Analytics and BI
If we consider the next 10 years as a training set for data-driven businesses in a unified environment (using BI apps), we can assume that this process will evolve to produce an autonomous operating system.
Business users will have minimal touches with data. They will spend most of their time asking questions of analytic apps and getting operational data paths in return.
Data experts will handle the quality and integrations that support those operational paths. That setup will deliver the best experience for anyone interacting with the system, a highly customized interface based on a user’s role and specific needs.
These autonomous operating systems will reduce the share of time spent looking for information (around 20-30% in 2019) to a much lower number in 2030. Every minute that can be spent on novel and operative analysis rather than data “massaging” will translate into an improvement in overall lifestyle and well-being.
Data Storage and Streaming
2020 – Operationalizing the Data Lake
Over the past few years, data lakes have become a more popular way of storing data for most companies. This is due to the massive amounts of data that can be stored at a very low cost and the separation of storage from computing. The data, however, is stored in an unstructured way without a predefined schema and most companies find themselves using less than 10% of their stored data for analytics and BI purposes.
In the coming year, we will see more and more companies using existing and emerging technologies and products to make the huge amounts of unused data accessible for any type of analysis – from BI to data science.
The new breed of products will include new and improved engines to run queries against the data lake, tools to transform and prepare data in the data lake, tools to move data between environments, and tools to manage the overall data environment, including pipelines, cataloging, and role-based access control (RBAC).
2030 – Streaming Analysis Will Become an Analytics Service
Streaming Analytics allows companies to analyze data as soon as it becomes available, significantly enhancing real-time analytics.
This is a very difficult level of BI. Think of Uber trying to pick your driver in seconds from hundreds of choices. Until now, this has been a complicated process involving a full back-end team of DevOps and data engineers to constantly monitor and manipulate the data so that real-time data can be acted upon at any given moment.
Over the next few years, we will see more and more services offering a simple point-and-click interface for streaming real-time data. This will allow companies to gain the value of streaming analytics while cutting the cost of the enabling infrastructure. The value of the data will also grow, as companies will be able to analyze risks before they occur.
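At its core, streaming analysis means updating a result every time a new event arrives instead of waiting for a batch. The sketch below shows one simple form of this, a rolling average over a trailing time window. The function name and the sample events are hypothetical, and a real streaming service would also handle out-of-order events, scale, and fault tolerance.

```python
from collections import deque


def rolling_average(events, window_seconds):
    """Yield (timestamp, average) after each event, over a trailing time window."""
    window = deque()
    total = 0.0
    for timestamp, value in events:
        window.append((timestamp, value))
        total += value
        # Evict events that have fallen out of the trailing window.
        while window[0][0] <= timestamp - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield timestamp, total / len(window)


# Hypothetical (timestamp_in_seconds, wait_time) events, invented for illustration.
stream = [(0, 4.0), (10, 6.0), (20, 8.0), (70, 2.0)]
for ts, avg in rolling_average(stream, window_seconds=60):
    print(ts, avg)
```

Because the function is a generator, each result is available as soon as its event is processed, which is the property that makes acting on real-time data possible.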
AI and Analytics
2020 – AI Will Put the Answers in the Hands of Business Users
As AI becomes more well-known and effective in operational analytics, we will begin to see companies, even data teams, combining different AI components at different times in the analytics lifecycle to come up with better suggestions and insights.
If you look at the AI trends today (graphs, Explainable AI, Continuous AI), you can see that each one facilitates a specific aspect of analysis, or is used for a specific purpose. Graphs show relationships, Explainable AI helps with transparency behind the analytics, and Continuous AI helps in constantly discovering new insights and revealing a data story within a complete context.
Each one, by itself, brings great insight to the organization. Combine them, and you will get a much more powerful solution.
A typical example would be a company that uses AI to find outliers and then combines that insight with AI for key drivers. Now you will be able to find a cause (key drivers) for situations that vary from the norm (outliers) and make decisions about how to change them.
In a retail environment, a company that sells socks may discover a spike in sales (the outlier) compared with a previous month. The next step would be to analyze the key drivers to find out what is behind the spike. For example, contributors might be the color red (product type dimension) or the region of Nebraska (geo-location dimension). Decision-makers can slice and dice these dimensions to find further insights, like the demographics of the customers (age groups, etc.), and make decisions to boost sales in other territories with similar demographics.
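The sock example can be sketched in a few lines. The code below is a simple statistical stand-in, not an AI model: it flags outlier months with a standard-deviation threshold, then ranks the values of one dimension by how much each contributed to the change. The function names, the threshold, and all sales figures are hypothetical.

```python
from statistics import mean, stdev


def find_outliers(monthly_sales, threshold=2.0):
    """Return months whose sales are more than `threshold` standard deviations from the mean."""
    values = list(monthly_sales.values())
    mu, sigma = mean(values), stdev(values)
    return [m for m, v in monthly_sales.items() if abs(v - mu) > threshold * sigma]


def key_drivers(baseline, outlier):
    """Rank dimension values by their contribution to the change in sales."""
    keys = set(baseline) | set(outlier)
    deltas = {k: outlier.get(k, 0) - baseline.get(k, 0) for k in keys}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sock sales, invented for illustration only.
sales = {"Jan": 100, "Feb": 102, "Mar": 98, "Apr": 101,
         "May": 99, "Jun": 100, "Jul": 103, "Aug": 160}
july_by_color = {"red": 30, "blue": 40, "black": 33}
august_by_color = {"red": 84, "blue": 42, "black": 34}

print(find_outliers(sales))
print(key_drivers(july_by_color, august_by_color))
```

The same `key_drivers` call could be repeated for the region dimension, which is the "slice and dice" step described above.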
2030 – Analytic Apps Will Automate Consumption
We’ve been working on analytic apps since the beginning of last year, not because this is a buzzword, but because we believe that analytic apps will further automate the process of AI consumption, making it easier for data teams and business users to use different types of AI.
By 2030 though, quantum computing will be well-established :-). So for the sake of this trend, let’s say by 2025 every analytic app will be built on a sort of template, or “business question block”. This “block” is the perfect AI guide for the data expert or business user.
Why? In general, analysts and business users know their data. But they don’t know which type of analysis they need on the data. By choosing a template, they will inherit the analysis that needs to be done.
A good example of this is an analytic app for marketing. If you want to show your marketing attribution, your analytics app with analysis components will automatically suggest which data sources you should connect to — Salesforce, Gainsight, Zendesk, or maybe all of them — in order to get the desired analytical results.
Now the focus is on the questions, KPIs, and insights needed instead of the analytics functionality that needs to happen to get there. This will automate the use of AI in the organization and make it more friendly and effective for insights and analysis.
2020 – Knowledge Graphs Will Drive More Database Technologies
It’s the perfect time to start using knowledge graphs for analytics. The technology is now standardized, and there is more data than ever before that can be manipulated and cast in a semantic graph format. However, there is more than one way to build a graph, and there are billions of ways to connect the data. Along with companies using knowledge graphs for BI, there will be a need to improve the ways to build the database powering the graphs.
In 2020, companies will begin to store their data in a format that is conducive to graphs, and we will start to see more technologies that help organize the data in a data lake. These graph database technologies will improve the ways companies can build the database and improve the results that are seen in the knowledge graphs.
2030 – Knowledge Graphs Will Start Using More AI Technologies
Once companies have become comfortable with the knowledge graph concept, they will begin to build their own algorithms so that they can maximize the value of their own semantic relationships in the graph.
From here, we will see companies adding AI to get deeper insights.
A recommendation system, another AI technology, can be based on the graph and be leveraged in applications like autocomplete to make personalized and timely suggestions. By using AI technologies, combining them, and putting them on top of the relationships found in the spider-like knowledge graphs, companies can discover deeper insights that are greater than anything we have today.
For instance, using an AI technology such as a recommendation service on a knowledge graph can help a dashboard designer with sharing suggestions. The dashboard designer will receive recommendations about dashboard visibility based on the relationships found in the knowledge graph from previous dashboards that were created and shared.
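A toy version of the sharing recommendation might look like the sketch below. It stores the knowledge graph as (subject, relation, object) triples and suggests viewers for a new dashboard by counting who received earlier dashboards built on the same data sources. The triples, relation names, and function name are all hypothetical, and simple counting stands in here for a trained recommendation model.

```python
from collections import Counter

# Hypothetical knowledge-graph triples: (subject, relation, object).
TRIPLES = [
    ("sales_q1", "uses", "salesforce"),
    ("sales_q1", "shared_with", "alice"),
    ("sales_q1", "shared_with", "bob"),
    ("pipeline", "uses", "salesforce"),
    ("pipeline", "shared_with", "alice"),
    ("support_load", "uses", "zendesk"),
    ("support_load", "shared_with", "carol"),
]


def recommend_viewers(triples, new_dashboard_sources, top_n=3):
    """Suggest users to share with, based on who received dashboards built on the same sources."""
    # Earlier dashboards that use at least one of the new dashboard's sources.
    related = {s for s, rel, o in triples
               if rel == "uses" and o in new_dashboard_sources}
    # Count how often each user received one of those related dashboards.
    scores = Counter(o for s, rel, o in triples
                     if rel == "shared_with" and s in related)
    return [user for user, _ in scores.most_common(top_n)]


print(recommend_viewers(TRIPLES, {"salesforce"}))
```

Users who appear most often along the "uses" and "shared_with" relationships rank first, so the suggestions come directly from the structure of the graph.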
Chief Data Officers
2020 – CDOs Will Begin Transitioning Into CEOs
The CDO is in a unique position at a company. Unlike other line-of-business executives, the CDO is charged with creating value from projects across multiple teams at an organization. Through the lens of companywide data collection and analysis, the CDOs are establishing a unique perspective at the executive table. When data-driven companies look at the line of succession to the CEO, a CDO’s cross-team optimizations will stand out since that work is already an important part of a CEO’s job description.
In the short term, this changes how data builders are measured in their daily work. As a data leader takes on more responsibilities, they will be viewed as having a transformative role with an effect felt across the entire organization. This is true for CDOs, but it’s also true for ambitious leaders anywhere at a company. Individual data builders will begin to feel more appreciated for analytical insights that impact their colleagues. Modern organizations are already building processes to collect and study data from every line of business, so it is a logical next step to start viewing those data experts as cross-team operations experts.
2030 – 20 to 30% of CEOs Will Have a CDO Background
Today, 0% of companies have a CEO who comes from a position of data leadership. Data at a large scale is so new that a lot of companies are still figuring out how to translate it into business value. The CDO position is still young too; so young that none of them have had the time to establish their expertise and ascend to the ranks of CEO. Ten years from now, early experiments will prove successful and more companies will recognize the value of a CEO who comes from a data leadership position.
As companies embrace the ubiquity of Big Data, it’s a natural step to see CDOs take on greater responsibility. Organizations that truly transform with data will value those data-based insights and will put stock in the people responsible for uncovering them. Organizations that are winning with data will begin to groom young data talent for more ambitious leadership roles in the decade ahead.
This is going to be the year when everything comes up data. Data will drive processes and operations through analytics and BI, and developers and data teams will emerge into the business and create what every business analyst and end user has been waiting for: clear and efficient insights that drive data-driven decision-making from the top down and lead to more successful businesses.
Here’s to every company becoming a data company and our predictions for analytics and BI trends for 2020 and beyond.