Data management, retention and availability are an ongoing challenge for most organizations. In this article, Gordon Cullum explains where data virtualization can help, and where it won’t.
Data virtualization has often been heralded as the answer for enterprises caught in a vicious circle in a world riddled with data, both online and offline. However, it is important to remember that no technical solution is a silver bullet, and data virtualization should not be thought of as a one-stop solution for all of an enterprise’s needs.
Businesses want to act and improve their decision-making in real time whilst containing costs and supporting business-as-usual activities, which can leave CIOs struggling to navigate through an array of complex applications and systems.
When data virtualization is deployed with the right capabilities and methodology, businesses can leverage existing investment to meet current and future analytic needs without compromising on quality, budget or time.
Don’t get caught in the data maze
It seems like a Catch-22 situation: businesses need data to derive meaningful insight and improve decision-making, yet many large enterprises have accumulated a variety of data resources over years of operation, which makes it difficult to access and utilise information across numerous business systems.
Businesses are increasingly implementing retention strategies, which means that the industry is witnessing a proliferation of structured and unstructured customer information. As a result, enterprises feel compelled to feed the analytical needs of the business with complex enterprise data warehouses (EDWs) and business intelligence (BI) solutions.
On the face of it, investing in BI solutions may seem like the clear 'get-out-of-jail-free' card. However, these systems can create a whirlpool of data management challenges. From master data management to data integration and data storage, BI systems lack agility and flexibility. Moreover, the complexity of the data landscape makes it difficult for BI systems to accommodate additional business needs with ease.
These analytic solutions combine multi-vendor product deployments and disciplines across complex integration patterns. Unsurprisingly, they are deployed at the cost of lengthy timeframes and excessive capital investments. While the solutions address several operational use cases of the business, they struggle to provide quick and actionable insights.
In such a disparate business and IT landscape, data virtualization comes to the rescue. The need of the hour is to make the most of existing technologies and retain business engagement without having to start all over again. Rather than replace existing EDWs, which is time-consuming and can result in loss of data, enterprises should utilise the knowledge already available and combine it with other systems to address and resolve business issues in a matter of days, not weeks, months or years.
With 35 percent of enterprises planning to implement some form of data virtualization as a forward-thinking option for data integration by 2020, the technology is increasingly gaining favour as a versatile tool in the enterprise data toolbox. Data virtualization bridges the gap between existing systems and processes without requiring a complex transformation, delivering quick results and unlocking value without increasing resources, processes or technology investments.
However, enterprises must be aware that data virtualization is not a silver bullet. It should be deployed with the right capabilities and methodology to achieve the desired results with an integrated view of the business.
When is data virtualization viable?
Transforming business dynamics requires that enterprises access information in a variety of formats across numerous business systems. As a result, enterprises are still getting to grips with the data that is supposed to empower them. Data virtualization is an agile and effective way for organizations to stay on top of their ever-changing data needs and should be used to harmonise their existing enterprise data landscape. As structured and unstructured information grows exponentially, organizations must capitalise on data to gain a competitive advantage. Their IT departments are surrounded by a plethora of reporting solutions including databases, data marts and data warehouses. Each of these solutions aims to address business users’ needs, but together they result in data silos and a lack of governance.
Businesses are constantly trying to address disparate data systems by building big data platforms. However, data virtualization allows disparate data sources to be combined within a logical layer or ‘virtual database’. Such a solution will result in quicker access to data, reduce development and implementation timelines, minimise data replication, reduce cost, and deliver an agile approach that can adjust to new business needs.
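To make the idea of a logical layer concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular data virtualization product works: the two sources (an in-memory SQLite table standing in for a CRM system and a CSV export of orders), the sample rows and the function name customer_order_totals are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CRM database (in-memory SQLite stands in for it).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Hypothetical source 2: an order export held as CSV text.
ORDERS_CSV = "customer_id,amount\n1,120.50\n2,80.00\n1,30.00\n"


def customer_order_totals():
    """Virtual view: combines both sources at query time and stores nothing."""
    totals = {}
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        cid = int(row["customer_id"])
        totals[cid] = totals.get(cid, 0.0) + float(row["amount"])
    # Customer names are read from the live source on every call.
    for cid, name in crm.execute("SELECT id, name FROM customers ORDER BY id"):
        yield {"customer": name, "total": totals.get(cid, 0.0)}
```

Because nothing is copied, a change in either source shows up the next time customer_order_totals() is called. The same property means every query places load on the underlying systems.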
When virtualization won't work
While it is easy to understand the merits of data virtualization, this raises the obvious question of when enterprises should avoid using it.
Trend and analytics reporting requires voluminous data to be crunched using complex business rules. Crunching huge volumes of data virtually could impact performance and slow down analysis, so it is better to create a physical copy of the required data in order to boost performance.
Most of the time, source systems are already stretched to the limit and cannot process any more queries. A data virtualization solution requires frequent reads of source system data to combine data sets and deliver insights, so it is advisable to copy such data into physical storage before applying virtualization.
The accuracy of data is of paramount importance for any analytical system. Poor-quality source data, if fed directly into reporting, would deliver incorrect results, so it is imperative that data undergoes a rigorous data quality check before it is made available for consumption through virtualization.
Deriving new facts from old ones often requires complex merging of incremental data into a consolidated version of the facts. Such operations require enormous processing power and memory, and are best accomplished by an ETL solution rather than data virtualization.
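The trade-off between querying a live source and querying a physical copy can be sketched in a few lines of Python. The SourceSystem and Snapshot classes below are hypothetical stand-ins with made-up sample rows, not part of any real product; they simply count how often the source is read.

```python
class SourceSystem:
    """Hypothetical operational system that counts how often it is read."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.reads = 0

    def fetch_all(self):
        self.reads += 1
        return list(self._rows)


def total_virtual(source):
    # Virtual access: every query goes back to the live source system.
    return sum(amount for _, amount in source.fetch_all())


class Snapshot:
    """Physical copy taken once; later queries never touch the source."""

    def __init__(self, source):
        self._rows = source.fetch_all()

    def total(self):
        return sum(amount for _, amount in self._rows)


source = SourceSystem([("2024-01", 100), ("2024-02", 250)])  # sample data

for _ in range(3):
    total_virtual(source)
reads_virtual = source.reads  # three queries cost three source reads

snapshot = Snapshot(source)
for _ in range(3):
    snapshot.total()
reads_after_snapshot = source.reads - reads_virtual  # only the initial copy
```

In this sketch three virtual queries cost three source reads, while three queries against the snapshot cost a single read to build the copy. The price is freshness: the snapshot does not see changes made to the source after it was taken.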
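As a rough illustration of what merging incremental data involves, the Python sketch below applies a batch of changes to an existing set of facts. The function name merge_increment, the keying by order ID and the use of None as a deletion marker are assumptions made for this example; a real ETL tool would add ordering, auditing and error handling on top.

```python
def merge_increment(current, increment):
    """Return a new fact table with the increment applied on top of current.

    Rows in the increment replace rows with the same key; a value of None
    marks a deletion (a convention assumed for this sketch).
    """
    merged = dict(current)
    for key, row in increment.items():
        if row is None:
            merged.pop(key, None)
        else:
            merged[key] = row
    return merged


# Sample facts keyed by a hypothetical order ID.
facts = {
    101: {"status": "open", "amount": 50},
    102: {"status": "open", "amount": 75},
}
delta = {
    101: {"status": "closed", "amount": 50},  # updated row
    102: None,                                # deleted row
    103: {"status": "open", "amount": 20},    # new row
}

new_facts = merge_increment(facts, delta)
```

Even this toy version has to hold the full set of facts in memory to apply the change set, which hints at why the processing and memory demands grow quickly at enterprise volumes.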
What does the future hold?
As data virtualization comes of age, it is going a long way towards solving the problem of today’s proliferation of data. By providing organizations with the ability to combine data from a variety of disparate data sources into a common format, it not only addresses problems of data compatibility and volume but also eliminates issues relating to expertise in specific programming languages.
Data virtualization is a formidable ally and will deliver faster ROI and agility in decision-making based on actionable insights. As an alternative to big bang data warehouse solutions, data virtualization offers a lightweight, cost-effective solution in a rapidly changing marketplace that enables businesses to remain competitive in their sector.
Gordon Cullum, CTO, Mastek
Gordon is a technology enthusiast with a professional background in bespoke enterprise software development and architecture, regularly designing and delivering major integration programmes across a number of industries including travel and leisure, telecommunications and healthcare.