The need to provide access to data for reporting, analytics, and other business purposes is constantly growing. So, too, is the complexity of setting up and maintaining data pipelines that are needed to deliver that data to the right data consumer at the right time. These issues have brought great attention to data pipeline automation, which, if done correctly, can deliver significant benefits to an organization.
How significant? A recent study released by Ascend.io and conducted by Enterprise Strategy Group (ESG) quantified the benefits of using a data pipeline automation platform. The analysis found that the Ascend Data Platform delivers multiple economic benefits, including major improvements to the productivity of data analytics teams, reduced overall cloud infrastructure costs, lower data platform tooling costs, and greater pipeline reliability.
ESG conducted its standard economic analysis focusing on the quantitative and qualitative benefits organizations can expect. It found that significant savings and benefits could be realized in a number of areas. Some of these areas are obvious; others may not be as immediately apparent. In either case, ESG tried to quantify the benefits where possible. It found that automated data pipelines built using Ascend.io technology and solutions offered the following:
- Greater analytics productivity: According to the report, organizations using Ascend are able to make decisions faster with more complete data. Many users reported 80% reductions in the time required to create new data pipelines, while others said processes that previously took hours are now accomplished in minutes.
- Consolidation of tool spend and cost containment: The end-to-end nature of the Ascend platform lets teams eliminate up to four point-solution components of the modern data stack through data pipeline automation. This lowers both the hard costs of data software and the productivity lost to manually integrating the various parts of the stack in-house. In multiple cases studied, the reduced time needed to build or modify intelligent pipelines directly improved analytics output. In ESG’s financial model, the sample company was able to eliminate $156K in annual tool costs.
- Improved data team productivity: Automation boosts data teams’ throughput by reducing the labor required to achieve the same outcome. In interviews conducted by ESG, engineering teams using Ascend data pipeline automation spent only 25% of their time building and maintaining manual data stack integrations, which translated to a 500-700% gain in engineer efficiency compared with a traditional modern-data-stack approach.
ESG identified other benefits and savings as well. For example, companies found their time to insight was up to 75% faster using automated data pipelines. That pure speed improvement is beneficial on its own, but it also opens the door to more innovation. Specifically, getting insights this quickly lets an organization test additional scenarios and run more, and more varied, analyses.
Addressing the issues of the modern data stack
One of the great challenges data-driven companies face today is the complexity of the modern data stack. As data volumes have grown and more users have needed access to that data, companies have added more and more tools and technologies to handle the various aspects of capturing and collecting data, transforming it into a usable format for each user, ingesting the data into an appropriate database or data warehouse, and more.
The problem with this approach was put into perspective in a recent CDInsights article:
The “modern data stack” has become increasingly prominent in recent years, promising a streamlined approach to data processing. However, this well-intentioned foundation has begun to crack under its own complexity. Engineering leaders are investing more time and energy into their modern data stacks without seeing a proportional return on investment, leaving them to question whether an array of shiny new tools is better than a carefully curated, efficient set.
Such issues have led to talk of a post-modern data stack in which automated data pipelines can play a key role. In particular, the ESG report noted that by using an automated data pipeline platform, organizations were able to reap a number of benefits, including:
- Up to a 91% reduction in the total cost of coding and data preparation time
- Up to 80% less time building data pipelines
- Up to 65% savings on tool costs
A final word on the benefits of automated data pipelines
The collection and preparation of data used for analytics are achieved by building data pipelines that ingest raw data and transform it into useful formats, leveraging cloud data platforms like Snowflake, Databricks, and Google BigQuery. These data pipelines span multiple stages of data preparation, often blend data from several different cloud services, and are frequently interdependent.
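The multi-stage structure described above can be sketched in a few lines of code. This is a minimal, illustrative example only: the record fields, stage functions, and in-memory "warehouse" dictionary are hypothetical stand-ins for the real sources and cloud destinations (such as Snowflake or BigQuery) a production pipeline would use.

```python
# Illustrative three-stage pipeline: ingest raw records, transform
# them into an analytics-friendly shape, then load them into a
# destination table. All data and names here are hypothetical.

def ingest() -> list[dict]:
    """Stage 1: pull raw records from a (simulated) source system."""
    return [
        {"order_id": "A1", "amount_cents": "1250", "region": "us-east"},
        {"order_id": "A2", "amount_cents": "980", "region": "eu-west"},
    ]

def transform(raw: list[dict]) -> list[dict]:
    """Stage 2: cast types and derive the fields analysts need."""
    return [
        {
            "order_id": r["order_id"],
            "amount_usd": int(r["amount_cents"]) / 100,
            "region": r["region"],
        }
        for r in raw
    ]

# Stand-in for a cloud data warehouse: table name -> rows.
warehouse: dict[str, list[dict]] = {}

def load(rows: list[dict], table: str) -> int:
    """Stage 3: write transformed rows to a destination table."""
    warehouse.setdefault(table, []).extend(rows)
    return len(rows)

def run_pipeline() -> int:
    """Chain the stages; each stage depends on the previous one's output."""
    return load(transform(ingest()), "orders")
```

Calling `run_pipeline()` returns the number of rows loaded into the hypothetical `orders` table. Even in this toy form, the interdependence the article describes is visible: a change to `ingest`'s output format cascades into `transform` and everything downstream, which is exactly the maintenance burden automation aims to reduce.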
As the use of data analytics grows, the costs of getting data to those who need it have risen. Changes in one pipeline often cascade down to different teams and projects. Manually building and maintaining those pipelines is a complex process that is costly and time-consuming for engineering and analytics professionals. Data pipeline automation can help address these pain points and deliver significant cost savings and time reductions.
Salvatore Salamone is a physicist by training who has been writing about science and information technology for more than 30 years. During that time, he has been a senior or executive editor at many industry-leading publications including High Technology, Network World, Byte Magazine, Data Communications, LAN Times, InternetWeek, Bio-IT World, and Lightwave, The Journal of Fiber Optics. He also is the author of three business technology books.