I recently wrote about the potential benefits of data mesh. As I noted, data mesh is not a product that can be acquired, or even a technical architecture that can be built. It’s an organizational and cultural approach to data ownership, access and governance. While the concept of data mesh is agnostic to the technology used to implement it, technology is clearly an enabler for data mesh. For many organizations, new technological investment and evolution will be required to facilitate adoption of data mesh. Meanwhile, the concept of the data fabric, a technology-driven approach to managing and governing data across distributed environments, is rising in popularity. Although I previously touched on some of the technologies that might be applicable to data mesh, it is worth diving deeper into the data architecture implications of data mesh, and the potential overlap with data fabric.
I recently wrote about the importance of data pipelines and the role they play in transporting data between the stages of data processing and analytics. Healthy data pipelines are necessary to ensure data is integrated and processed in the sequence required to generate business intelligence. The concept of the data pipeline is nothing new of course, but it is becoming increasingly important as organizations adapt data management processes to be more data driven.
Topics: business intelligence, Analytics, Data Governance, Data Integration, Data, Digital Technology, Digital transformation, data lakes, AI and Machine Learning, data operations, digital business, data platforms, Analytics & Data, Streaming Data & Events
Data governance is an issue that impacts all organizations large and small, new and old, in every industry, and every region of the world. Data governance ensures that an organization’s data can be cataloged, trusted and protected, improving business processes to accelerate analytics initiatives and support compliance with regulatory requirements. Not all data governance initiatives will be driven by regulatory compliance; however, the risk of falling foul of privacy (and human rights) laws ensures that regulatory compliance influences data-processing requirements and all data governance projects. Multinational organizations must be cognizant of the wide variety of regional data security and privacy requirements, not least the European Union’s General Data Protection Regulation (GDPR). The GDPR became enforceable in 2018, protects the privacy of personal or professional data, and carries with it the threat of fines of up to 20 million euros ($22 million) or 4% of a company’s global revenue. Europe is not alone in regulating against the use of personally identifiable information (other similar regulations include The California Consumer Privacy Act) but Ventana Research’s Data Governance Benchmark Research illustrates that there are differing attitudes and approaches to data governance on either side of the Atlantic.
I recently described the growing level of interest in data mesh which provides an organizational and cultural approach to data ownership, access and governance that facilitates distributed data processing. As I stated in my Analyst Perspective, data mesh is not a product that can be acquired or even a technical architecture that can be built. Adopting the data mesh approach is dependent on people and process change to overcome traditional reliance on centralized ownership of data and infrastructure and adapt to its principles of domain-oriented ownership, data as a product, self-serve data infrastructure and federated governance. Many organizations will need to make technological changes to facilitate adoption of data mesh, however. Starburst Data is associated with accelerating analysis of data in data lakes but is also one of several vendors aligning their products with data mesh.
Data mesh is the latest trend to grip the data and analytics sector. The term has been rapidly adopted by numerous vendors — as well as a growing number of organizations —as a means of embracing distributed data processing. Understanding and adopting data mesh remains a challenge, however. Data mesh is not a product that can be acquired, or even a technical architecture that can be built. It is an organizational and cultural approach to data ownership, access and governance. Adopting data mesh requires cultural and organizational change. Data mesh promises multiple benefits to organizations that embrace this change, but doing so may be far from easy.
Topics: business intelligence, Analytics, Data Governance, Data Integration, Data, Digital Technology, Digital transformation, data lakes, data operations, digital business, data platforms, Analytics & Data, Streaming Data & Events
Despite widespread and increasing use of the cloud for data and analytics workloads, it has become clear in recent years that, for most organizations, a proportion of data-processing workloads will remain on-premises in centralized data centers or distributed-edge processing infrastructure. As we recently noted, as compute and storage are distributed across a hybrid and multi-cloud architecture, so, too, is the data it stores and relies upon. This presents challenges for organizations to identify, manage and analyze all the data that is available to them. It also presents opportunities for vendors to help alleviate that challenge. In particular, it provides a gap in the market for data-platform vendors to distinguish themselves from the various cloud providers with cloud-agnostic data platforms that can support data processing across hybrid IT, multi-cloud and edge environments (including Internet of Things devices, as well as servers and local data centers located close to the source of the data). Yellowbrick Data is one vendor that has seized upon that opportunity with its cloud Data Warehouse offering.
As businesses become more data-driven, they are increasingly dependent on the quality of their data and the reliability of their data pipelines. Making decisions based on data does not guarantee success, especially if the business cannot ensure that the data is accurate and trustworthy. While there is potential value in capturing all data — good or bad — making decisions based on low-quality data may do more harm than good.
I recently described the emergence of hydroanalytic data platforms, outlining how the processes involved in generating energy from a lake or reservoir were analogous to those required to generate intelligence from a data lake. I explained how structured data processing and analytics acceleration capabilities are the equivalent of turbines, generators and transformers in a hydroelectric power station. While these capabilities are more typically associated with data warehousing, they are now being applied to data lake environments as well. Structured data processing and analytics acceleration capabilities are not the only things required to generate insights from data, however, and the hydroelectric power station analogy further illustrates this. For example, generating hydroelectric power also relies on pipelines to ensure that the water is transported from the lake or reservoir at the appropriate volume to drive the turbines. Ensuring that a hydroelectric power station is operating efficiently also requires the collection, monitoring and analysis of telemetry data to confirm that the turbines, generators, transformers and pipelines are functioning correctly. Similarly, generating intelligence from data relies on data pipelines that ensure the data is integrated and processed in the correct sequence to generate the required intelligence, while the need to monitor the pipelines and processes in data-processing and analytics environments has driven the emergence of a new category of software: data observability.
Ventana Research recently announced its 2022 Market Agenda for Data, continuing the guidance we have offered for nearly two decades to help organizations derive optimal value and improve business outcomes.
Few trends have had a bigger impact on the data platforms landscape than the emergence of cloud computing. The adoption of cloud computing infrastructure as an alternative to on-premises datacenters has resulted in significant workloads being migrated to the cloud, displacing traditional server and storage vendors. Almost one-half (49%) of respondents to Ventana Research’s Analytics and Data Benchmark Research currently use cloud computing products for analytics and data, and a further one-quarter plan to do so. In addition to deploying data workloads on cloud infrastructure, many organizations have also adopted cloud data and analytics services offered by the same cloud providers, displacing traditional data platform vendors. Organizations now have greater choice in relation to potential products and providers for data and analytics workloads, but also need to think about integrating services offered by cloud providers with established technology and processes. Having pioneered the concept, Amazon Web Services has arguably benefitted more than most from adoption of cloud computing, and is also in the process of expanding and adjusting its portfolio to alleviate challenges and encourage even greater adoption.