Accelerating AI Insights: Snowflake-Powered Content Analysis

The client provides third-party, content-level contextual analysis and brand safety solutions to publishers, DSPs, agencies, and advertisers.

Challenges

  • The existing system could not cope with growing data volumes (terabytes) and was not extensible because of its flat structure
  • Insights could be generated only for a limited number of days; longer periods resulted in system timeouts
  • Generating new insights was complex and time-consuming

Solution

  • Snowflake data warehousing enabled generating analytical insights at scale.
  • Transitioned from Athena to Snowflake, delivering new MVPs on Snowflake while migrating the remaining workloads in parallel.
  • Used curated data tables to ensure data readiness and shorten time-to-market.
  • Shortened the learning curve by reusing Apache Airflow to sync data between S3 and Snowflake.
  • Tech Stack:
    • Pipelines: Apache Spark, Apache Airflow, AWS (Athena, S3, Redis, ECS)
    • Reporting: Google Looker, Docker
    • API: Python
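
As a rough illustration of the S3-to-Snowflake sync step, the sketch below builds the kind of COPY INTO statement a scheduled Airflow task could run once per day. The table name, stage name, dt= partition layout, and Parquet file format are all assumptions made for illustration; the actual pipeline is not described here.

```python
from datetime import date


def build_copy_statement(table: str, stage: str, day: date) -> str:
    """Return a Snowflake COPY INTO statement for one day's S3 partition."""
    # Assumed layout: files sit under a dt=YYYY-MM-DD/ prefix in the stage.
    partition_path = f"dt={day.isoformat()}/"
    return (
        f"COPY INTO {table}\n"
        f"FROM @{stage}/{partition_path}\n"
        "FILE_FORMAT = (TYPE = PARQUET)\n"  # assumed file format
        "MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE"
    )


statement = build_copy_statement(
    table="analytics.curated_content_daily",  # hypothetical curated table
    stage="analytics.s3_content_stage",       # hypothetical external stage on S3
    day=date(2024, 1, 15),
)
print(statement)
```

In an Airflow DAG, a statement like this could be handed to a SQL-executing task with the run's logical date substituted for the fixed date, so each run loads only that day's partition.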

Impact Generated

  • 86% Boost: The new explore (Snowflake) returns in < 8 seconds, compared with > 60 seconds for the old explore (Athena query)
  • Scalable: Transitioning from Athena to Snowflake enabled insights over larger date ranges that previously timed out on Athena
  • Extensible: Curated tables support versatile usage across dimensions, for both internal and customer-facing insights
  • Seamless transition: The migration was completed with minimal disruption to ongoing operations
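
One plausible reading of the 86% figure is the relative reduction in query time between the two reported bounds. The short check below works through that arithmetic, treating 60 seconds and 8 seconds as exact values, which is an assumption since the source gives them only as bounds.

```python
old_seconds = 60.0  # reported lower bound for the old explore (Athena)
new_seconds = 8.0   # reported upper bound for the new explore (Snowflake)

# Relative reduction in query time and the equivalent speedup factor.
reduction = (old_seconds - new_seconds) / old_seconds
speedup = old_seconds / new_seconds

print(f"Query time reduced by about {reduction:.1%}")
print(f"Speedup factor: {speedup:.1f}x")
```

Under these assumptions the reduction comes to roughly 86.7%, or a 7.5x speedup, which is consistent with the headline number.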
