One of the leading audio-visual media groups in France, with 15 million subscribers worldwide, diversified its offerings to meet customers’ needs across a multi-channel user experience. This resulted in a significant increase in the number of data points for its business intelligence teams to analyse.
The data team decided to migrate to Amazon Redshift, a fully managed, petabyte-scale cloud data warehouse from AWS. The environment allowed the team to perform cross-tabulations and calculations on two key segments: Video on Demand, to analyse one-off purchases, and Replay, to analyse overall consumption across different platforms.
To support this requirement, the team designed data marts that would be pre-calculated to drive the analyses.
Due to the volume of data, the calculation for a given day could take 12 hours, so the business teams received consolidated figures 24 to 36 hours later. “We only do a distinct count on all our platforms: how many single users used this programme, how many distinct counts per platforms, how many per day,” explained the head of the company’s datalab. The goal was to produce the reporting in less than 1 hour.
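To make the workload concrete, the distinct counts described in the quote can be sketched in a few lines of Python. This is only an illustration: the event layout, the sample rows and the `distinct_users` helper are hypothetical and are not taken from the customer's schema or from Indexima.

```python
from collections import defaultdict

# Hypothetical viewing events: (day, platform, programme, user_id).
events = [
    ("2020-01-01", "web", "show_a", "u1"),
    ("2020-01-01", "tv", "show_a", "u1"),
    ("2020-01-01", "tv", "show_a", "u2"),
    ("2020-01-02", "web", "show_a", "u2"),
    ("2020-01-02", "web", "show_b", "u3"),
]


def distinct_users(rows, key):
    """Count distinct user ids per group; key(row) returns the group label."""
    groups = defaultdict(set)
    for row in rows:
        groups[key(row)].add(row[3])  # row[3] is the user id
    return {group: len(users) for group, users in groups.items()}


per_programme = distinct_users(events, key=lambda r: r[2])
per_platform = distinct_users(events, key=lambda r: r[1])
per_day = distinct_users(events, key=lambda r: r[0])
total_users = len({row[3] for row in events})
```

In this toy data the per-day counts add up to 4 while only 3 distinct users exist overall. Distinct counts cannot be summed across groups, which is one reason every new breakdown of this kind tends to need its own pass over the detailed data.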
Business intelligence teams also wanted to compare consumption and the development of one-off purchases from one year to the next, an analysis that took a long time to run against the data. They also wanted to keep using their standard tooling, Microsoft Power BI, without having to develop any new interfaces.
The Indexima Effect
Adding Indexima Data Hub between the BI tooling and the data warehouse meant that 90% of the queries were intercepted before reaching the warehouse, greatly reducing hits to the Amazon Redshift layer. The key, however, was that queries could now be run on the raw data, collected in real time, avoiding the need to pre-calculate data marts to drive the visualizations. Calculations performed by the Indexima Data Hub returned within 3 minutes over a 24-month data history.
These performance levels opened up new possibilities for the data and business teams.
“We’ve been working with Indexima for 2 years and greatly appreciated their attentiveness and advice. Plus, their solution met our expectations in every respect: their relational engine allows us to make joins as we go along so we don’t have to reprocess the data after the event to enrich it. And queries can be made under SQL Static with their solution!”
Data Warehouse Optimization
Due to the 90% reduction in the number of queries hitting the data warehouse, the customer was able to reduce their Amazon Redshift cluster size by 50% and still guarantee users the expected response time.
Initially, this installation ran 6 x dc2.8xlarge on-demand nodes with 10 TB of data, at a cost of $255,114 per year. Adding a 3-node Indexima Data Hub reduced the annual cost by $126,155.
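As a quick check of these figures, the arithmetic can be worked through as follows. The two dollar amounts come from the text; the variable names are illustrative, and the sketch assumes the quoted reduction is the net annual saving, which the source does not break down further.

```python
# Figures quoted in the text (USD per year).
baseline_annual_cost = 255_114   # 6 x dc2.8xlarge on demand
annual_cost_reduction = 126_155  # after adding a 3-node Indexima Data Hub

# Derived values; assumes the reduction is the net saving.
remaining_annual_cost = baseline_annual_cost - annual_cost_reduction
reduction_pct = annual_cost_reduction / baseline_annual_cost * 100

print(f"Remaining annual cost: ${remaining_annual_cost:,}")
print(f"Reduction: {reduction_pct:.2f}% of the baseline")
```

Under that assumption the remaining cost is $128,959 per year, a reduction of just under half, which is in line with the 50% smaller cluster described above.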
Data Engineering Optimization
Because the business teams could now operate on the raw data, there was no longer any need to pre-calculate data marts. The data engineering effort previously required each time a new insight or report across fine-grained data was needed was eliminated.
The customer was running 4 projects per year, with 3 weeks of data engineering per project, or about 12 weeks in total. Removing that effort improved time-to-market by roughly 3 months.
Indexima Data Hub can reduce the TCO of data warehouses like Amazon Redshift and Snowflake, and can also improve time to market by reducing or removing data engineering effort from project delivery.
Indexima is an AWS Redshift Service Ready partner.
By Florent Voignier, Co-Founder & CTO at Indexima
Darragh O’Flanagan, Sr. Partner Solutions Architect at AWS