30 June 2023
Success story
Migrating data pipelines and database structures from Cloudera to GCP services for a global leader in the Consumer Packaged Goods industry
September 2019 - ongoing
Challenge
Multiple data sources contained various semi-structured data types and suffered from data quality problems. The goal was to improve the cost efficiency of campaigns in linear TV planning and purchasing by building pipelines on Kubeflow services. This approach aimed to improve overall system performance, make data transformation more reliable, and optimize the Python-based advertising procedures.
Our approach
The existing data pipelines have been migrated to DataProc, GCS, and Composer. To improve scalability, we containerized the Python ad optimization code so that it runs as scalable tasks on Kubeflow hosted in GKE. Kubeflow pipelines and node pools let us manage job resources efficiently, because different scenarios have different hardware requirements. Each workload can be matched to the resources it actually needs, which improves resource utilization.
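As a rough illustration of matching scenarios to node pools, the sketch below builds a Kubernetes pod spec fragment that pins a containerized job to a GKE node pool through the standard `cloud.google.com/gke-nodepool` node label. The scenario names, pool names, resource sizes, container name, and image path are all hypothetical; the project's actual configuration is not described here.

```python
from dataclasses import dataclass

# GKE labels every node with the name of its node pool under this key.
GKE_NODEPOOL_LABEL = "cloud.google.com/gke-nodepool"


@dataclass(frozen=True)
class ResourceProfile:
    """Hypothetical hardware profile for one optimization scenario."""
    node_pool: str  # assumed node pool name, not taken from the project
    cpu: str        # Kubernetes CPU quantity, e.g. "8"
    memory: str     # Kubernetes memory quantity, e.g. "64Gi"


# Assumed scenario-to-profile mapping, for illustration only.
PROFILES = {
    "small": ResourceProfile("standard-pool", "2", "8Gi"),
    "large": ResourceProfile("highmem-pool", "8", "64Gi"),
}


def build_pod_spec(scenario: str, image: str) -> dict:
    """Return a pod spec fragment that pins the job to the scenario's node pool."""
    profile = PROFILES[scenario]
    resources = {"cpu": profile.cpu, "memory": profile.memory}
    return {
        "nodeSelector": {GKE_NODEPOOL_LABEL: profile.node_pool},
        "containers": [
            {
                "name": "ad-optimizer",  # hypothetical container name
                "image": image,          # the image tag selects the code version
                "resources": {"requests": resources, "limits": dict(resources)},
            }
        ],
        "restartPolicy": "Never",
    }


# Hypothetical image path; a different tag would run a different code version.
spec = build_pod_spec("large", "gcr.io/example-project/ad-optimizer:1.4.0")
```

Keeping the profile table separate from the spec builder means a new scenario only needs a new entry in the mapping, not a change to the job definition.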
Outcome
Cloudera data pipelines were successfully migrated to the GCP platform. The new data pipelines are more cost-effective and easier to maintain. The BigQuery cache keeps response times fast for repeated queries. With GKE, Kubeflow, and Docker images, jobs can be executed on different code versions and hardware resources. Optimization jobs are now started through Cloud Functions, which simplifies the launch process.
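A trigger of this kind could look roughly like the sketch below, which validates a JSON payload and turns it into run parameters for an optimization job. It is plain Python rather than a deployed Cloud Function: the field names, the run-name format, and the `submit` callable (a stand-in for whatever call actually starts the Kubeflow run) are assumptions made for illustration.

```python
import json
from typing import Callable, Tuple

# Hypothetical payload fields; the real trigger contract is not documented here.
REQUIRED_FIELDS = ("campaign_id", "code_version", "scenario")


def parse_trigger(body: str) -> dict:
    """Validate a JSON trigger payload and return pipeline run parameters."""
    payload = json.loads(body)
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return {
        "run_name": f"optimize-{payload['campaign_id']}-{payload['code_version']}",
        "image_tag": payload["code_version"],  # selects the Docker image version
        "scenario": payload["scenario"],       # selects the hardware profile
    }


def handle_trigger(body: str, submit: Callable[[dict], None]) -> Tuple[str, int]:
    """Entry-point sketch: returns a (message, HTTP status) pair.

    `submit` stands in for the real job-submission call.
    """
    try:
        params = parse_trigger(body)
    except ValueError as err:  # also covers json.JSONDecodeError
        return str(err), 400
    submit(params)
    return params["run_name"], 202
```

Passing `submit` in as an argument keeps the validation logic testable without a cluster; for example, `handle_trigger(body, print)` only prints the parameters that would be submitted.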
Business Impact
The migration to GCP was a success, resulting in better performance, easier maintenance, and improved data reliability. This was made possible by relying on dependable cloud-native services. Thanks to Kubeflow hosted on GKE, the development time for optimization jobs was significantly reduced. The final optimization jobs now run in an environment that is flexible, robust, and cost-optimized.