The data platforms market has traditionally been divided between products specifically designed to support operational or analytic workloads, with other market segments having emerged in recent years for data platforms targeted specifically at data science and machine learning (ML), as well as real-time analytics. More recently, we have seen vendor strategies evolving to provide a more consolidated approach, with data platforms designed to address a combination of analytics and data science, as well as hybrid operational and analytic processing. Snowflake, which has been hugely successful in recent years with its cloud-based analytic data platform, is a prime example. The company has expanded its purview to address data engineering and data science, as well as transactional data. Additionally, it now provides users with the ability to access and process data in on-premises environments as part of its strategy to address an increasing range of use cases.
Snowflake was founded in 2012 to develop a cloud-based data warehouse with built-in data-sharing capabilities. While data warehousing remains the primary use case, the company has expanded its reach over the years to address data engineering and data science, and its offering is now positioned as a data cloud. Despite entering a crowded and mature market, Snowflake benefitted from rising interest in cloud computing and has made a substantial impact in the analytic data platforms market. It generated total annual revenue of $2.1 billion in fiscal 2023 from 7,828 customers. As is to be expected from a company now in its second decade, Snowflake’s pace of growth has begun to slow. The company provided guidance projecting product revenue growth of 40% for fiscal 2024, compared to 70% in fiscal 2023, 106% in fiscal 2022 and 120% in fiscal 2021. That said, Snowflake’s growth continues to outpace many of its larger and longer-established rivals. The company’s initial success was due in part to its cloud-based architecture, which enabled users to avoid the complexity and management overheads associated with database deployment, tuning and infrastructure management in response to evolving scalability and performance requirements. Another key differentiating feature that drove early adoption is that Snowflake enables customers to instantly share live data in a governed and secured fashion via Snowflake Marketplace. By designating data for sharing and granting the appropriate permissions, organizations using Snowflake can publish or provide access to their data, both internally and externally with partners, without copying or replication.
Although Snowflake was initially focused on data warehousing workloads, enabling the analysis of structured data, it has subsequently taken steps to expand its addressable market with capabilities for data engineering and data science, as well as the analysis of semi-structured and unstructured data. One of the primary enablers of this strategy is the Snowpark developer environment, which was introduced in 2020 and is designed to enable data engineers, data scientists and developers to extend the capabilities of Snowflake by writing code in their preferred languages to execute workloads such as ETL/ELT, data preparation and feature engineering. In 2020, Snowflake also extended its management capabilities to unstructured data, such as audio, video, images and PDF documents. Expanded support for semi-structured and unstructured data was an important aspect of Snowflake’s ability to address data lake workloads, along with the ability to read data stored in customer-managed cloud object storage via External Tables. More recently, the company introduced the public preview of Snowpark for Python, including integration with Streamlit, the Python-based rapid application development and iteration environment, which was acquired by Snowflake in 2022. The company also announced the private preview of its Native Application Framework, enabling developers to create applications to be shared and monetized via Snowflake Marketplace. The ability to read data stored in customer-managed cloud object storage via External Tables has also been complemented by support for the Apache Iceberg table format, initially as an external table, and more recently via private preview support for Iceberg as a first-class table in Snowflake.
Snowflake Data Cloud is exclusively available as a cloud service on Amazon Web Services, Azure or Google Cloud, with cross-cloud interoperability provided by its Snowgrid functionality. However, the company has taken its first steps towards addressing on-premises workloads by utilizing its External Table functionality, announcing the private preview of the ability to query data in on-premises storage environments compatible with Amazon Web Services’ S3. Despite the widespread adoption of cloud databases in the last decade, a significant proportion of database workloads remain on premises. This is likely to remain the case for some time. Large-scale migration of mission-critical workloads from on-premises infrastructure to cloud services can be complex and should not be rushed. Some workloads may never be migrated to the cloud for a variety of reasons, including performance and data sovereignty. More than one-half of participants (52%) in Ventana Research’s Analytics and Data Benchmark Research have hybrid analytics and data deployments. This provides a potential advantage to vendors that offer data platforms that can span on-premises and cloud environments.
Another major announcement from the company in 2022 was the private preview launch of its Unistore workload, which enables Snowflake to be used to store and process transactional data. Unistore takes advantage of Snowflake’s new Hybrid Tables table type to enable users to develop transactional business applications on Snowflake, as well as run analytic queries on transactional data. I previously explained how demand for data-intensive operational applications infused with the results of analytic processes is driving convergence in the data platforms sector, and I assert that through 2026, the development of intelligent applications providing personalized experiences will drive demand for data platforms capable of supporting hybrid operational and analytic processing.
Snowflake made several significant announcements during 2022 that are currently only available in preview, such as Snowpark for Python, the Native Application Framework, Unistore, and the ability to access data in on-premises storage via External Tables. Addressing transactional data and on-premises workloads, in particular, is new territory for Snowflake and is in the early stages of development. Many of these improvements will help address what we identified as opportunities for improvement in our assessment of Snowflake in our Analytics Data Platform Value Index. Snowflake can be expected to bring some of these capabilities to general availability during 2023. While the company continues to expand its use cases and functionality, I recommend that organizations evaluating potential cloud database providers include Snowflake Data Cloud in their evaluations.