<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Generative on Share what you know</title><link>https://pablodelgado.org/tags/generative/</link><description>Recent content in Generative on Share what you know</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 16 Apr 2026 12:00:00 +0000</lastBuildDate><atom:link href="https://pablodelgado.org/tags/generative/index.xml" rel="self" type="application/rss+xml"/><item><title>Powering Netflix's Multimodal feature engineering at scale. Data Engineering forum 2026. San Francisco</title><link>https://pablodelgado.org/blog/2026/04/16/powering-netflixs-multimodal-feature-engineering-at-scale/</link><pubDate>Thu, 16 Apr 2026 12:00:00 +0000</pubDate><guid>https://pablodelgado.org/blog/2026/04/16/powering-netflixs-multimodal-feature-engineering-at-scale/</guid><description>&lt;p&gt;ABSTRACT:&lt;/p&gt;
&lt;p&gt;As multimodal models mature, the challenge increasingly shifts from model architecture to feature engineering and dataset construction at scale. In this talk, we’ll share how Netflix builds and curates multimodal features across large video and image corpora, with LanceDB serving as the core storage and query layer for multimodal data.&lt;/p&gt;
&lt;p&gt;We’ll briefly cover how Ray powers distributed ingestion, filtering, and large-scale batch inference across hundreds of GPUs, enabling us to apply modern vision-language models to extract rich multimodal embeddings from video and image data. These embeddings capture both low-level visual signals and higher-level semantic context, forming the foundation for downstream tasks such as search, retrieval, and dataset curation.&lt;/p&gt;</description></item></channel></rss>