I previously described the concept of hydroanalytic data platforms, which combine the structured data processing and analytics acceleration capabilities associated with data warehousing with the low-cost and multi-structured data storage advantages of the data lake. One of the key enablers of this approach is interactive SQL query engine functionality, which facilitates the use of existing business intelligence (BI) and data science tools to analyze data in data lakes. Interactive SQL query engines have been in use for several years — many of the capabilities were initially used to accelerate analytics on Hadoop — but have evolved along with data lake initiatives to enable analysis of data in cloud object storage. The open source Presto project is one of the most prominent interactive SQL query engines and has been adopted by some of the largest digital-native organizations. Presto managed-services provider Ahana is on a mission to bring the advantages of Presto to the masses.
I previously explained how the data lakehouse is one of two primary approaches being adopted to deliver what I have called a hydroanalytic data platform. Hydroanalytics involves the combination of data warehouse and data lake functionality to enable and accelerate analysis of data in cloud storage services. The term data lakehouse has been rapidly adopted by several vendors in recent years to describe an environment in which data warehousing functionality is integrated into the data lake environment, rather than coexisting alongside. One of the vendors that has embraced the data lakehouse concept and terminology is Dremio, which recently launched the general availability of its Dremio Cloud data lakehouse platform.
As I recently described, it is anticipated that the majority of database workloads will continue to be served by specialist data platforms targeting operational and analytic workloads, albeit with growing demand for hybrid data processing use-cases and functionality. Specialist operational and analytic data platforms have historically been the preferred option, but there have always been general-purpose databases that could be used for both analytic and operational workloads, with tuning and extensions to meet the specific requirements of each.
I recently wrote about the potential benefits of data mesh. As I noted, data mesh is not a product that can be acquired, or even a technical architecture that can be built. It’s an organizational and cultural approach to data ownership, access and governance. While the concept of data mesh is agnostic to the technology used to implement it, technology is clearly an enabler for data mesh. For many organizations, new technological investment and evolution will be required to facilitate adoption of data mesh. Meanwhile, the concept of the data fabric, a technology-driven approach to managing and governing data across distributed environments, is rising in popularity. Although I previously touched on some of the technologies that might be applicable to data mesh, it is worth diving deeper into the data architecture implications of data mesh, and the potential overlap with data fabric.
I recently described the use cases driving interest in hybrid data processing capabilities that enable analysis of data in an operational data platform without impacting operational application performance or requiring data to be extracted to an external analytic data platform. Hybrid data processing functionality is becoming increasingly attractive to aid the development of intelligent applications infused with personalization and artificial intelligence-driven recommendations. These applications can be used to improve customer service and engagement, detect and prevent fraud, and increase operational efficiency. Several database providers now offer hybrid data processing capabilities to support these application requirements. One of the vendors addressing this opportunity is SingleStore.
I recently described how the operational data platforms sector is in a state of flux. There are multiple trends at play, including the increasing need for hybrid and multicloud data platforms, the evolution of NoSQL database functionality and applicable use-cases, and the drivers for hybrid data processing. The past decade has seen significant change, with the emergence of new vendors, data models and architectures as well as new deployment and consumption approaches. As organizations adopted strategies to address these new options, a few things remained constant – one being the influence and importance of Oracle. The company’s database business continues to be a core focus of innovation, evolution and differentiation, even as it has expanded its portfolio to address cloud applications and infrastructure.
I recently wrote about the importance of data pipelines and the role they play in transporting data between the stages of data processing and analytics. Healthy data pipelines are necessary to ensure data is integrated and processed in the sequence required to generate business intelligence. The concept of the data pipeline is nothing new, of course, but it is becoming increasingly important as organizations adapt data management processes to be more data-driven.
Data governance is an issue that impacts all organizations, large and small, new and old, in every industry and every region of the world. Data governance ensures that an organization’s data can be cataloged, trusted and protected, improving business processes to accelerate analytics initiatives and support compliance with regulatory requirements. Not all data governance initiatives will be driven by regulatory compliance; however, the risk of falling foul of privacy (and human rights) laws ensures that regulatory compliance influences data-processing requirements and all data governance projects. Multinational organizations must be cognizant of the wide variety of regional data security and privacy requirements, not least the European Union’s General Data Protection Regulation (GDPR). The GDPR became enforceable in 2018, protects the privacy of personal or professional data, and carries with it the threat of fines of up to 20 million euros ($22 million) or 4% of a company’s global revenue. Europe is not alone in regulating against the use of personally identifiable information (similar regulations include the California Consumer Privacy Act), but Ventana Research’s Data Governance Benchmark Research illustrates that there are differing attitudes and approaches to data governance on either side of the Atlantic.
I recently described the growing level of interest in data mesh, which provides an organizational and cultural approach to data ownership, access and governance that facilitates distributed data processing. As I stated in my Analyst Perspective, data mesh is not a product that can be acquired or even a technical architecture that can be built. Adopting the data mesh approach is dependent on people and process change to overcome traditional reliance on centralized ownership of data and infrastructure and adapt to its principles of domain-oriented ownership, data as a product, self-serve data infrastructure and federated governance. Many organizations will need to make technological changes to facilitate adoption of data mesh, however. Starburst Data is associated with accelerating analysis of data in data lakes but is also one of several vendors aligning their products with data mesh.
Data mesh is the latest trend to grip the data and analytics sector. The term has been rapidly adopted by numerous vendors, as well as a growing number of organizations, as a means of embracing distributed data processing. Understanding and adopting data mesh remains a challenge, however. Data mesh is not a product that can be acquired, or even a technical architecture that can be built. It is an organizational and cultural approach to data ownership, access and governance. Adopting data mesh requires cultural and organizational change. Data mesh promises multiple benefits to organizations that embrace this change, but doing so may be far from easy.