Matt Aslett's Analyst Perspectives

Fivetran Automates Data Movement and Transformation

Written by Matt Aslett | Jun 29, 2023 10:00:00 AM

Despite best intentions, many organizations still struggle with some fundamental aspects of data processing and analytics. Taking data from operational applications and making it available for analysis is a first step, but data management remains a perennial challenge. Data movement and transformation difficulties can lead to delays and data quality problems that prevent organizations from generating value from data. The inability to govern and integrate data from multiple data sources prevents more than one-half (54%) of participants in Ventana Research’s Data Governance Benchmark Research from achieving a “single version of the truth” across the enterprise.

The persistence of data movement and transformation challenges means numerous established vendors have products to tackle the problem. New vendors continue to emerge with differentiated approaches. Fivetran is a relative newcomer to the data movement and transformation market but has established itself as a key vendor in the advanced data stack with its data movement and transformation approach.

Fivetran was founded in 2012 and emerged alongside several other companies with offerings designed to simplify and automate data movement and transformation via cloud-based extract, load and transform (ELT) data pipelines. Fivetran initially saw interest in ELT coming from digital startups and small to medium-sized organizations looking to connect and analyze data from SaaS applications. As the company grew, it attracted increased interest from larger, established organizations interested in the ELT approach for enterprise applications.

In September 2021, the company announced that it was acquiring database replication provider HVR to accelerate its ability to serve enterprise accounts. The acquisition coincided with a $565 million Series D funding round led by Andreessen Horowitz, which brought Fivetran’s total funding to $730 million and valued the company at $5.6 billion. Fivetran has subsequently raised a further $125 million in financing from Vista Credit Partners, and in February announced that it surpassed $200 million in annual revenue run rate from more than 5,000 customers.

The acquisition of HVR added database replication capabilities based on change data capture (CDC) techniques, as well as data processing in a customer's own datacenter or virtual private cloud. Rather than synchronizing the entire dataset, CDC facilitates agile data integration and transformation by only synchronizing data as it is inserted, updated or deleted. Fivetran has rebranded HVR's software as Fivetran Local Data Processing and continues to develop it, with a new release (6.2) scheduled for general availability in the fourth quarter. The company is also incorporating the log-based CDC data replication functionality into the Fivetran managed service offering.
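
To illustrate the idea behind CDC, here is a minimal Python sketch of applying a stream of change events to a target rather than recopying the whole dataset. The ChangeEvent type, the apply_changes function and the sample rows are hypothetical names of my own, and an in-memory dictionary stands in for the target table. It is not a representation of how Fivetran or HVR implement log-based replication.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    """Hypothetical change record, as a CDC process might emit it."""
    op: str                      # "insert", "update" or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row values (None for deletes)


def apply_changes(target: dict, events: list) -> dict:
    """Apply change events, in order, to a target table keyed by primary key."""
    for event in events:
        if event.op in ("insert", "update"):
            target[event.key] = event.row
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return target


# The target already holds two rows from an earlier sync.
target = {1: {"name": "Ada"}, 2: {"name": "Grace"}}

# Only the rows that changed at the source are sent, not the full dataset.
events = [
    ChangeEvent("update", 1, {"name": "Ada L."}),
    ChangeEvent("delete", 2),
    ChangeEvent("insert", 3, {"name": "Edsger"}),
]

synced = apply_changes(target, events)
```

Three small events bring the target up to date here, however large the rest of the table might be.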

Fivetran's flagship offering is a cloud-based data movement platform designed to automate the extraction and loading of data from operational and analytic data sources into a variety of target platforms, including data warehouses, data lakes and streaming data platforms. A suite of more than 300 connectors for SaaS applications, databases, storage and streaming platforms enables managed data replication. Fivetran can be, and is, used simply to move or replicate data from a source to a target. Fivetran Transformations utilize the dbt Core open-source data transformation tool to schedule and perform data transformations in the target environment.

This ELT approach of using the target database to perform transformations can be contrasted with traditional extract, transform and load (ETL) data pipelines designed to extract data from a source (typically a database supporting an operational application), transform it in a dedicated staging area and load it into a target environment (typically a data warehouse or data lake) for analysis. Since they are designed for a specific data transformation task, ETL pipelines are rigid, difficult to adapt and ill-suited to continuous and agile processes.
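
The ELT pattern can be sketched in a few lines of Python, with an in-memory SQLite database standing in for the target data platform. The table names, column names and sample rows are my own assumptions for illustration. In practice the target would be a cloud data warehouse and the transformation step would typically be managed by a tool such as dbt rather than hand-written SQL.

```python
import sqlite3

# Rows as they might arrive from a source application: everything is text,
# and amounts are expressed in cents.
source_rows = [
    ("1001", "2023-06-01", "1999"),
    ("1002", "2023-06-01", "500"),
    ("1003", "2023-06-02", "1250"),
]

conn = sqlite3.connect(":memory:")  # stand-in for the target platform

# Extract and load: copy the source rows into the target unchanged.
conn.execute(
    "CREATE TABLE raw_orders (order_id TEXT, order_date TEXT, amount_cents TEXT)"
)
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", source_rows)

# Transform: executed by the target database itself, after loading.
conn.execute(
    """
    CREATE TABLE daily_revenue AS
    SELECT order_date,
           SUM(CAST(amount_cents AS INTEGER)) AS revenue_cents
    FROM raw_orders
    GROUP BY order_date
    """
)

daily_revenue = conn.execute(
    "SELECT order_date, revenue_cents FROM daily_revenue ORDER BY order_date"
).fetchall()
```

Because the raw table is loaded untouched, a change to the transformation only requires re-running SQL in the target. In an ETL pipeline the same change would mean reworking the staging logic that sits between source and target.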

In comparison, ELT pipelines push data transformation execution to the target data platform, resulting in an agile data extraction and loading phase that is adaptable to changing data sources. As I previously explained, we have seen increased interest in ELT pipelines driven by the need for greater agility and flexibility to meet the demands of real-time data processing. I assert that by 2025, more than three-quarters of organizations’ information architectures will support ELT (extract, load, transform) patterns to accelerate data processing and maximize the value of large volumes of data. Fivetran also offers pre-built dbt-compatible data models based on the data model of key SaaS data sources, which are updated as APIs change. Users can monitor data movement, logs and status in the form of data lineage graphs as well as configure and manage connector and transformation alerts. The company charges customers based on monthly active rows for the data that is inserted, updated or deleted using Fivetran’s connectors.
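
As a rough illustration of a usage metric based on monthly active rows, the Python sketch below counts the distinct rows touched in each month. It is a simplification under my own assumptions, namely that a row is identified by its table and primary key and is counted once per month however many times it changes. The function name and sample data are hypothetical, and this is not Fivetran's billing logic.

```python
from collections import defaultdict


def monthly_active_rows(changes):
    """Count distinct (table, primary key) pairs changed in each month.

    Assumption: a row counts once per month, regardless of how many
    inserts, updates or deletes it receives in that month.
    """
    active = defaultdict(set)
    for month, table, key, _op in changes:
        active[month].add((table, key))
    return {month: len(rows) for month, rows in active.items()}


# Hypothetical change log: (month, table, primary key, operation)
changes = [
    ("2023-05", "orders", 1, "insert"),
    ("2023-05", "orders", 1, "update"),    # same row again: still one active row
    ("2023-05", "orders", 2, "insert"),
    ("2023-06", "orders", 1, "update"),
    ("2023-06", "customers", 7, "delete"),
]

mar = monthly_active_rows(changes)
```

Under these assumptions, a table that is large but rarely changes contributes little to the count, while a small table that changes constantly contributes every month.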

Fivetran is now well-established as a key vendor in the data management sector, and its acquisition of HVR has boosted its ability to serve the needs of larger organizations. It has added CDC data movement capabilities to its suite of connectors and its cloud-based approach to ELT. We see opportunities for the company to expand its value with the addition of further data orchestration and data observability capabilities, which we see as key components of DataOps. Nevertheless, I recommend that organizations exploring new approaches to data movement and transformation ensure Fivetran is included in evaluations.

Regards,

Matt Aslett