Matt Aslett's Analyst Perspectives

Data as a Product: Needs and Requirements

Written by Matt Aslett | Oct 2, 2024 10:00:00 AM

I previously wrote about data mesh as a cultural and organizational approach to distributed data processing. Data mesh has four key principles (domain-oriented ownership, data as a product, self-serve data infrastructure and federated governance), each of which is being widely adopted. I assert that by 2027, more than 6 in 10 enterprises will adopt technologies to facilitate the delivery of data as a product as they adapt their cultural and organizational approaches to data ownership, for example in the context of data mesh. Each of the four principles is also gaining momentum outside the context of data mesh, however, with data as a product attracting particular attention because it enables enterprises to streamline and accelerate the delivery of analytics and data initiatives.

The terms “data as a product” and “data product” are often used interchangeably but have distinct meanings. Data as a product is the process of applying product thinking to data initiatives to ensure that the outcome—the data product—is designed to be shared and reused for multiple use cases across the business. The format of the outcome is not a defining characteristic of the data product, which could be a business intelligence (BI) dashboard (and the underlying data warehouse), a decision intelligence application, an algorithm or artificial intelligence/machine learning (AI/ML) model, or a custom-built operational application. All of these have traditionally been delivered on a project-by-project basis, often by a centralized IT team, with little or no effort to ensure the data can easily be accessed and used for other purposes without duplication. The defining characteristic of a data product is the application of product thinking in the development process to ensure that the outcome is designed to be delivered as a reusable asset that can be discovered and consumed by others on a self-service basis. The principle of domain-oriented ownership is also important to the development of data products. Domain-oriented ownership makes business departments responsible for managing the data generated by their applications and making it available to others. Data as a product is primarily concerned with the sharing of data within an enterprise, rather than selling data products to external parties (which is addressed by data as a service), although the two are complementary.

The application of product thinking ensures that consumers of data products are treated as customers. It means that data owners must be aware of data requirements from across the enterprise to understand how the resulting data product will be used, as well as providing instructions and service-level commitments so data consumers can feel confident that the data product is up to date and of sufficient quality to be relied on for business decision-making. This is fulfilled through the development of data contracts, which are created alongside the data product and provide an agreement between the data owner and the data consumer about the nature of the data product. Data contracts should include a description of the data product, defining the structure, format and meaning of the data, as well as licensing terms and usage recommendations. A data contract should also define data quality and service-level key performance indicators and commitments.
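
To make the elements of a data contract more concrete, the following is a minimal sketch in Python of how one might be represented and checked. It is illustrative only and does not reflect any particular data product platform or contract standard. All names here (DataContract, FieldSpec, check_commitments, the "customer_orders" product and its thresholds) are hypothetical, and the two commitments shown (freshness and completeness) are assumed examples of the service-level and data quality indicators a contract might define.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """Structure and meaning of one field in the data product (hypothetical)."""
    name: str
    dtype: str
    meaning: str


@dataclass(frozen=True)
class DataContract:
    """Illustrative data contract; field names are assumptions, not a standard."""
    product_name: str
    description: str
    owner_domain: str             # business domain that owns the product
    data_format: str              # e.g. "parquet"
    fields: Tuple[FieldSpec, ...]
    license_terms: str
    usage_notes: str
    max_staleness: timedelta      # service-level commitment: freshness
    min_completeness: float       # quality KPI: share of non-null values


def check_commitments(
    contract: DataContract,
    last_updated: datetime,
    completeness: float,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of violated commitments (empty list means all are met)."""
    now = now or datetime.now(timezone.utc)
    violations = []
    if now - last_updated > contract.max_staleness:
        violations.append("freshness")
    if completeness < contract.min_completeness:
        violations.append("completeness")
    return violations


# Example contract for a hypothetical "customer_orders" data product.
contract = DataContract(
    product_name="customer_orders",
    description="Daily order records published by the sales domain.",
    owner_domain="sales",
    data_format="parquet",
    fields=(
        FieldSpec("order_id", "string", "Unique identifier of the order"),
        FieldSpec("order_total", "decimal", "Order value in the billing currency"),
    ),
    license_terms="Internal use only",
    usage_notes="Join to customer data on customer_id; do not use for billing.",
    max_staleness=timedelta(hours=24),
    min_completeness=0.98,
)

checked_at = datetime(2024, 10, 2, 10, 0, tzinfo=timezone.utc)
# Data last refreshed 30 hours ago, so the 24-hour freshness commitment is missed.
print(check_commitments(contract, checked_at - timedelta(hours=30), 0.99, now=checked_at))
# prints ['freshness']
```

In practice, a check like this would typically be run by data observability tooling rather than by hand, with the results surfaced to data consumers alongside the contract.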

Enterprise interest in data as a product has driven the emergence of a new category of software designed to provide an environment for the development, publication and consumption of data products. Key capabilities for these data product platforms include a dedicated interface for the development of data products with versioning, change tracking and data lineage capabilities, as well as templates for the classification of data products and data contracts. A data product platform also needs to provide a dedicated interface for the self-service discovery and consumption of data products and their related data contracts. As with any product, consumers of data products should be able to provide feedback, comments and ratings as well as request improvements or new products. Data owners also require the functionality to view and manage requests for data product modifications and the development of new data products, as well as to monitor data product usage and performance metrics. Some data product platforms will also offer functionality to support the sale and licensing of data as a service to external partners or customers.

The development of any product relies on a complex supply chain of components, and data products are no exception. As such, data product platforms need to provide native or integrated data operations (DataOps) functionality, including the development and testing of data pipelines, as well as data orchestration and data observability functionality to provide the all-important information related to the validity, integrity, quality and lineage of the underlying data. Making data available as a product on a self-service basis also increases the importance of agreed-upon data definitions and entity resolution. Only 16% of participants in our Data Governance Benchmark Research say data is well-trusted in their organization, while one-half cite agreement on the definitions of data as a primary concern in managing data effectively. It is therefore important that data product platforms provide native or integrated functionality for data governance, data cataloging and master data management.

Enterprises adopting data as a product stand to benefit from interoperability and the accelerated delivery of data products that more rapidly provide business stakeholders with high-quality trusted data. I recommend that all enterprises evaluate the principle of data as a product and platforms that enable the development and delivery of data products. The capabilities required for delivering data products will be covered in detail in our forthcoming Data Operations Buyers Guide research, which is being expanded this year to assess platforms and tools for Data Products, alongside Data Pipelines, Data Observability, Data Orchestration and overall DataOps.

Regards,

Matt Aslett