All Analyst Perspectives
Posted by Matt Aslett on Jun 24, 2022 3:00:00 AM
I previously explained how the data lakehouse is one of two primary approaches being adopted to deliver what I have called a hydroanalytic data platform. Hydroanalytics involves the combination of data warehouse and data lake functionality to enable and accelerate analysis of data in cloud storage services. The term data lakehouse has been rapidly adopted by several vendors in recent years to describe an environment in which data warehousing functionality is integrated into the data lake environment, rather than coexisting alongside. One of the vendors that has embraced the data lakehouse concept and terminology is Dremio, which recently launched the general availability of its Dremio Cloud data lakehouse platform.
Posted by Matt Aslett on Jun 14, 2022 3:00:00 AM
As I recently described, it is anticipated that the majority of database workloads will continue to be served by specialist data platforms targeting operational and analytic workloads, albeit with growing demand for hybrid data processing use-cases and functionality. Specialist operational and analytic data platforms have historically been the preferred option, but there have always been general-purpose databases that could be used for both analytic and operational workloads, with tuning and extensions to meet the specific requirements of each.
Posted by Matt Aslett on Jun 2, 2022 3:00:00 AM
I recently wrote about the potential benefits of data mesh. As I noted, data mesh is not a product that can be acquired, or even a technical architecture that can be built. It’s an organizational and cultural approach to data ownership, access and governance. While the concept of data mesh is agnostic to the technology used to implement it, technology is clearly an enabler for data mesh. For many organizations, new technological investment and evolution will be required to facilitate adoption of data mesh. Meanwhile, the concept of the data fabric, a technology-driven approach to managing and governing data across distributed environments, is rising in popularity. Although I previously touched on some of the technologies that might be applicable to data mesh, it is worth diving deeper into the data architecture implications of data mesh, and the potential overlap with data fabric.
Posted by Matt Aslett on May 25, 2022 3:00:00 AM
I recently described the use cases driving interest in hybrid data processing capabilities that enable analysis of data in an operational data platform without impacting operational application performance or requiring data to be extracted to an external analytic data platform. Hybrid data processing functionality is becoming increasingly attractive to aid the development of intelligent applications infused with personalization and artificial intelligence-driven recommendations. These applications can be used to improve customer service and engagement, detect and prevent fraud, and increase operational efficiency. Several database providers now offer hybrid data processing capabilities to support these application requirements. One of the vendors addressing this opportunity is SingleStore.
Posted by Matt Aslett on May 18, 2022 3:00:00 AM
The server is a key component of enterprise computing, providing the functional compute resources required to support software applications. Historically, the server was so fundamentally important that it – along with the processor, or processor core – was also a definitional unit by which software was measured, priced and sold. That changed with the advent of cloud-based service delivery and consumption models.
Posted by Matt Aslett on May 11, 2022 3:00:00 AM
Over a decade ago, I coined the term NewSQL to describe the new breed of horizontally scalable, relational database products. The term was adopted by a variety of vendors that sought to combine the transactional consistency of the relational database model with elastic, cloud-native scalability. Many of the early NewSQL vendors struggled to gain traction, however, and were either acquired or ceased operations before they could make an impact in the crowded operational data platforms market. Nonetheless, the potential benefits of data platforms that span both on-premises and cloud resources remain. As I recently noted, many of the new operational database vendors have now adopted the term “distributed SQL” to describe their offerings. In addition to new terminology, a key trend that separates distributed SQL vendors from the NewSQL providers that preceded them is a greater focus on developers, laying the foundation for the next generation of applications that will depend on horizontally scalable, relational-database functionality. Yugabyte is a case in point.
Posted by Matt Aslett on May 5, 2022 3:00:00 AM
I recently described how the operational data platforms sector is in a state of flux. There are multiple trends at play, including the increasing need for hybrid and multicloud data platforms, the evolution of NoSQL database functionality and applicable use-cases, and the drivers for hybrid data processing. The past decade has seen significant change in the emergence of new vendors, data models and architectures as well as new deployment and consumption approaches. As organizations adopted strategies to address these new options, a few things remained constant – one being the influence and importance of Oracle. The company’s database business continues to be a core focus of innovation, evolution and differentiation, even as it expanded its portfolio to address cloud applications and infrastructure.
Posted by Matt Aslett on Apr 26, 2022 3:00:00 AM
I recently wrote about the importance of data pipelines and the role they play in transporting data between the stages of data processing and analytics. Healthy data pipelines are necessary to ensure data is integrated and processed in the sequence required to generate business intelligence. The concept of the data pipeline is nothing new, of course, but it is becoming increasingly important as organizations adapt data management processes to be more data-driven.
Posted by Matt Aslett on Apr 20, 2022 3:00:00 AM
Data governance is an issue that impacts all organizations large and small, new and old, in every industry, and every region of the world. Data governance ensures that an organization’s data can be cataloged, trusted and protected, improving business processes to accelerate analytics initiatives and support compliance with regulatory requirements. Not all data governance initiatives will be driven by regulatory compliance; however, the risk of falling foul of privacy (and human rights) laws ensures that regulatory compliance influences data-processing requirements and all data governance projects. Multinational organizations must be cognizant of the wide variety of regional data security and privacy requirements, not least the European Union’s General Data Protection Regulation (GDPR). The GDPR became enforceable in 2018, protects the privacy of personal or professional data, and carries with it the threat of fines of up to 20 million euros ($22 million) or 4% of a company’s global revenue. Europe is not alone in regulating against the use of personally identifiable information (other similar regulations include The California Consumer Privacy Act) but Ventana Research’s Data Governance Benchmark Research illustrates that there are differing attitudes and approaches to data governance on either side of the Atlantic.
Posted by Matt Aslett on Apr 12, 2022 3:00:00 AM
I recently described the growing level of interest in data mesh which provides an organizational and cultural approach to data ownership, access and governance that facilitates distributed data processing. As I stated in my Analyst Perspective, data mesh is not a product that can be acquired or even a technical architecture that can be built. Adopting the data mesh approach is dependent on people and process change to overcome traditional reliance on centralized ownership of data and infrastructure and adapt to its principles of domain-oriented ownership, data as a product, self-serve data infrastructure and federated governance. Many organizations will need to make technological changes to facilitate adoption of data mesh, however. Starburst Data is associated with accelerating analysis of data in data lakes but is also one of several vendors aligning their products with data mesh.
Posted by Matt Aslett on Mar 29, 2022 3:00:00 AM
Data mesh is the latest trend to grip the data and analytics sector. The term has been rapidly adopted by numerous vendors, as well as a growing number of organizations, as a means of embracing distributed data processing. Understanding and adopting data mesh remains a challenge, however. Data mesh is not a product that can be acquired, or even a technical architecture that can be built. It is an organizational and cultural approach to data ownership, access and governance. Adopting data mesh requires cultural and organizational change. Data mesh promises multiple benefits to organizations that embrace this change, but doing so may be far from easy.
Posted by Matt Aslett on Mar 22, 2022 3:00:00 AM
Despite widespread and increasing use of the cloud for data and analytics workloads, it has become clear in recent years that, for most organizations, a proportion of data-processing workloads will remain on-premises in centralized data centers or distributed-edge processing infrastructure. As we recently noted, as compute and storage are distributed across a hybrid and multi-cloud architecture, so, too, is the data it stores and relies upon. This presents challenges for organizations to identify, manage and analyze all the data that is available to them. It also presents opportunities for vendors to help alleviate that challenge. In particular, it provides a gap in the market for data-platform vendors to distinguish themselves from the various cloud providers with cloud-agnostic data platforms that can support data processing across hybrid IT, multi-cloud and edge environments (including Internet of Things devices, as well as servers and local data centers located close to the source of the data). Yellowbrick Data is one vendor that has seized upon that opportunity with its cloud Data Warehouse offering.
Posted by Matt Aslett on Mar 15, 2022 3:00:00 AM
I recently examined how evolving functionality had fueled the adoption of NoSQL databases, recommending that organizations evaluate NoSQL databases when assessing options for data transformation and modernization efforts. This recommendation was based on the breadth and depth of functionality offered by NoSQL database providers today, which has expanded the range of use cases for which NoSQL databases are potentially viable. There remain a significant number of organizations that have not explored NoSQL databases as well as several workloads for which it is assumed NoSQL databases are inherently unsuitable. Given the advances in functionality, organizations would be well-advised to maintain up-to-date knowledge of available products and services and an understanding of the range of use cases for which NoSQL databases are a valid option.
Posted by Matt Aslett on Mar 8, 2022 3:00:00 AM
The various NoSQL databases have become a staple of the data platforms landscape since the term entered the IT industry lexicon in 2009 to describe a new generation of non-relational databases. While NoSQL began as a ragtag collection of loosely affiliated, open-source database projects, several commercial NoSQL database providers are now established as credible alternatives to the various relational database providers, while all the major cloud providers and relational database giants now also have NoSQL database offerings. Almost one-quarter (22%) of respondents to Ventana Research’s Analytics and Data Benchmark Research are using NoSQL databases in production today, and adoption is likely to continue to grow. More than one-third (34%) of respondents are planning to adopt NoSQL databases within two years (21%) or are evaluating (14%) their potential use. Adoption has been accelerated by the evolving functionality offered by NoSQL products and services, the growing maturity of specialist NoSQL vendors, and new commercial offerings from cloud providers and established database providers alike. This evolution is exemplified by the changing meaning of the term NoSQL itself. While it was initially associated with a rejection of the relational database hegemony, it has retroactively been reinterpreted to mean “Not Only SQL,” reflecting the potential for these new databases to coexist with and complement established approaches.
Posted by Matt Aslett on Mar 1, 2022 3:00:00 AM
As businesses become more data-driven, they are increasingly dependent on the quality of their data and the reliability of their data pipelines. Making decisions based on data does not guarantee success, especially if the business cannot ensure that the data is accurate and trustworthy. While there is potential value in capturing all data — good or bad — making decisions based on low-quality data may do more harm than good.
Posted by Matt Aslett on Feb 22, 2022 3:00:00 AM
I recently described the emergence of hydroanalytic data platforms, outlining how the processes involved in generating energy from a lake or reservoir were analogous to those required to generate intelligence from a data lake. I explained how structured data processing and analytics acceleration capabilities are the equivalent of turbines, generators and transformers in a hydroelectric power station. While these capabilities are more typically associated with data warehousing, they are now being applied to data lake environments as well. Structured data processing and analytics acceleration capabilities are not the only things required to generate insights from data, however, and the hydroelectric power station analogy further illustrates this. For example, generating hydroelectric power also relies on pipelines to ensure that the water is transported from the lake or reservoir at the appropriate volume to drive the turbines. Ensuring that a hydroelectric power station is operating efficiently also requires the collection, monitoring and analysis of telemetry data to confirm that the turbines, generators, transformers and pipelines are functioning correctly. Similarly, generating intelligence from data relies on data pipelines that ensure the data is integrated and processed in the correct sequence to generate the required intelligence, while the need to monitor the pipelines and processes in data-processing and analytics environments has driven the emergence of a new category of software: data observability.
Posted by Matt Aslett on Feb 16, 2022 3:00:00 AM
As I stated when joining Ventana Research, the socioeconomic impacts of the pandemic and its aftereffects have highlighted more than ever the differences between organizations that can turn data into insights and are agile enough to act upon it and those that are incapable of seeing or responding to the need for change. Data-driven organizations stand to gain competitive advantage, responding faster to worker and customer demands for more innovative, data-rich applications and personalized experiences. One of the key methods that accelerates business decision-making is reducing the lag between data collection and data analysis.
Posted by Matt Aslett on Feb 11, 2022 3:00:00 AM
I recently described how the data platforms landscape will remain divided between analytic and operational workloads for the foreseeable future. Analytic data platforms are designed to store, manage, process and analyze data, enabling organizations to maximize data to operate with greater efficiency, while operational data platforms are designed to store, manage and process data to support worker-, customer- and partner-facing operational applications. At the same time, however, we see increased demand for intelligent applications infused with the results of analytic processes, such as personalization and artificial intelligence-driven recommendations. The need for real-time interactivity means that these applications cannot be served by traditional processes that rely on the batch extraction, transformation and loading of data from operational data platforms into analytic data platforms for analysis. Instead, they rely on analysis of data in the operational data platform itself via hybrid data processing capabilities to accelerate worker decision-making or improve customer experience.
Posted by Matt Aslett on Feb 1, 2022 3:00:00 AM
Ventana Research recently announced its 2022 Market Agenda for Data, continuing the guidance we have offered for nearly two decades to help organizations derive optimal value and improve business outcomes.
Posted by Matt Aslett on Jan 19, 2022 3:00:00 AM
Few trends have had a bigger impact on the data platforms landscape than the emergence of cloud computing. The adoption of cloud computing infrastructure as an alternative to on-premises datacenters has resulted in significant workloads being migrated to the cloud, displacing traditional server and storage vendors. Almost one-half (49%) of respondents to Ventana Research’s Analytics and Data Benchmark Research currently use cloud computing products for analytics and data, and a further one-quarter plan to do so. In addition to deploying data workloads on cloud infrastructure, many organizations have also adopted cloud data and analytics services offered by the same cloud providers, displacing traditional data platform vendors. Organizations now have greater choice in relation to potential products and providers for data and analytics workloads, but also need to think about integrating services offered by cloud providers with established technology and processes. Having pioneered the concept, Amazon Web Services has arguably benefitted more than most from adoption of cloud computing, and is also in the process of expanding and adjusting its portfolio to alleviate challenges and encourage even greater adoption.
Posted by Matt Aslett on Jan 5, 2022 3:00:00 AM
The need for data-driven decision-making requires organizations not only to transform their approach to business intelligence and data science but also to accelerate the development of new operational applications that support greater business agility, enable cloud- and mobile-based consumption, and deliver more interactive and personalized experiences. To stay competitive, organizations need to prioritize the development of new, data-driven applications. As a result, many have been encouraged to invest in new data platforms designed to support agile development and cloud-based delivery. This is one of the factors driving the growth of MongoDB, and continues to drive the evolution of its document database into what is now described as a cloud-based application data platform.
Posted by Matt Aslett on Dec 30, 2021 3:00:00 AM
The term NoSQL has been a misnomer ever since it appeared in 2009 to describe a group of emerging databases. It was true that a lack of support for Structured Query Language (SQL) was common to the various databases referred to as NoSQL. However, it was always one of a number of common characteristics, including flexible schema, distributed data processing, open source licensing, and the use of non-relational data models (key value, document, graph) rather than relational tables. As the various NoSQL databases have matured and evolved, many of them have added support for SQL terms and concepts, as well as the ability to support SQL format queries. Couchbase has been at the forefront of this effort, recognizing that to drive greater adoption of NoSQL databases in general (and its distributed document database in particular) it was wise to increase compatibility with the concepts, tools and skills that have dominated the database market for the past 50 years.
Posted by Matt Aslett on Dec 23, 2021 3:00:00 AM
Data lakes have enormous potential as a source of business intelligence. However, many early adopters of data lakes have found that simply storing large amounts of data in a data lake environment is not enough to generate business intelligence from that data. Similarly, lakes and reservoirs have enormous potential as sources of energy. However, simply storing large amounts of water in a lake is not enough to generate energy from that water. A hydroelectric power station is required to harness and unleash the power-generating potential of a lake or reservoir, utilizing a combination of turbines, generators and transformers to convert the energy of the flowing water into electricity. A hydroanalytic data platform, the data equivalent of a hydroelectric power station, is required to harness and unleash the intelligence-generating potential of a data lake.
Posted by Matt Aslett on Dec 14, 2021 3:00:00 AM
As I noted when joining Ventana Research, the range of options faced by organizations in relation to data processing and analytics can be bewildering. When it comes to data platforms, however, there is one fundamental consideration that comes before all others: Is the workload primarily operational or analytic? Although most database products can be used for operational or analytic workloads, the market has been segmented between products targeting operational workloads, and those targeting analytic workloads for almost as long as there has been a database market.
Posted by Matt Aslett on Dec 2, 2021 3:00:00 AM
Breaking into the database market as a new vendor is easier said than done given the dominance of the sector by established database and data management giants, as well as the cloud computing providers. We recently described the emergence of a new breed of distributed SQL database providers with products designed to address hybrid and multi-cloud data processing. These databases are architecturally and functionally differentiated from both the traditional relational incumbents (in terms of global scalability) and the NoSQL providers (in terms of the relational model and transactional consistency). Having differentiated functionality is the bare minimum a new database vendor needs to make itself known in such a crowded market, however.
Posted by Matt Aslett on Nov 24, 2021 3:00:00 AM
It has been clear for some time that future enterprise IT architecture will span multiple cloud providers as well as on-premises data centers. As Ventana Research noted in the market perspective on data architectures, the rapid adoption of cloud computing has fragmented where data is accessed or consolidated. We are already seeing that almost one-half (49%) of respondents to Ventana Research’s Analytics and Data Benchmark Research are using cloud computing for analytics and data, of which 42% are currently using more than one cloud provider.
Posted by Matt Aslett on Nov 11, 2021 3:00:00 AM
Enterprises looking to adopt cloud-based data processing and analytics face a disorienting array of data storage, data processing, data management and analytics offerings. Departmental autonomy, shadow IT, mergers and acquisitions, and strategic choices mean that most enterprises now have the need to manage data across multiple locations, while each of the major cloud providers and data and analytics vendors has a portfolio of offerings that may or may not be available in any given location. As such, the ability to manage and process data across multiple clouds and data centers is a growing concern for large and small enterprises alike. Almost one-half (49%) of respondents to Ventana Research’s Analytics and Data Benchmark Research study are using cloud computing for analytics and data, of which 42% are currently using more than one cloud provider.
Posted by Matt Aslett on Oct 30, 2021 3:00:00 AM
I am very happy to announce that I have joined Ventana Research to help lead the expertise area of Digital Technology, including Analytics and Data, Cloud Computing, Artificial Intelligence and Machine Learning, the Internet of Things, Robotic Automation, and Collaborative and Conversational Computing. While the breadth of applications and technology covered by our Digital Technology practice is broad, I will naturally make use of my decades of experience covering data platforms and analytics to help organizations improve the readiness and resilience of business and IT operations.