Feature Store

A feature store is the interface between machine learning models and their data. It helps teams:

  • Productionize new features
  • Automate feature computation, backfills, and logging
  • Share and reuse feature pipelines across teams
  • Keep training and serving data consistent

Feature store 101

Feature store supporting batch and realtime use cases

Five main components of a feature store

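  1. Serving - Delivers feature values to models for both training and inference. Using the same feature definitions on both paths avoids training-serving skew.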

Serving
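Training-serving skew typically appears when the training pipeline and the serving path compute the "same" feature with different code. The sketch below illustrates one way to avoid it: both paths call a single shared feature function. The names `avg_order_value`, `build_training_row`, and `build_serving_vector` are hypothetical and exist only for illustration; they are not part of any particular feature store API.

```python
from statistics import mean


def avg_order_value(order_amounts):
    """Hypothetical feature definition shared by training and serving."""
    # An empty history maps to 0.0 so both paths treat new users identically.
    return mean(order_amounts) if order_amounts else 0.0


def build_training_row(order_amounts, label):
    """Training path: feature value plus the label to learn from."""
    return {"avg_order_value": avg_order_value(order_amounts), "label": label}


def build_serving_vector(order_amounts):
    """Serving path: the same feature, computed by the same function."""
    return {"avg_order_value": avg_order_value(order_amounts)}


# Illustrative data only.
history = [20.0, 35.0, 5.0]
training_row = build_training_row(history, label=1)
serving_vector = build_serving_vector(history)
```

Because both builders delegate to one definition, a change to the feature logic reaches training and serving at the same time.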

  2. Storage - Offline and online storage. The offline store, usually an extension of an existing data lake, keeps historical feature values. The online store keeps only the latest feature values for each entity. Redis, DynamoDB, or Cassandra is commonly used for online storage.

Storage
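The split between offline and online storage can be sketched with plain in-memory structures. This is a minimal illustration, not how any specific feature store is implemented: the class `TinyFeatureStorage` and its methods are hypothetical, a dictionary stands in for a key-value store such as Redis, and integer timestamps are an assumption made for brevity.

```python
from bisect import bisect_right
from collections import defaultdict


class TinyFeatureStorage:
    """Hypothetical in-memory stand-in for an offline and an online store."""

    def __init__(self):
        # Offline: full history per entity, kept sorted by timestamp.
        self._offline = defaultdict(list)
        # Online: only the latest value per entity.
        self._online = {}

    def write(self, entity_id, timestamp, value):
        history = self._offline[entity_id]
        history.append((timestamp, value))
        history.sort(key=lambda row: row[0])
        # The online value follows the newest timestamp, not the newest write,
        # so late-arriving rows do not overwrite fresher values.
        self._online[entity_id] = history[-1][1]

    def get_online(self, entity_id):
        """Serving lookup: latest value, or None for an unknown entity."""
        return self._online.get(entity_id)

    def get_as_of(self, entity_id, timestamp):
        """Training lookup: value that was current at the given timestamp."""
        history = self._offline.get(entity_id, [])
        idx = bisect_right([ts for ts, _ in history], timestamp)
        return history[idx - 1][1] if idx else None


# Illustrative data only; the third write arrives late.
store = TinyFeatureStorage()
store.write("user_1", 10, 0.2)
store.write("user_1", 30, 0.7)
store.write("user_1", 20, 0.5)
```

The as-of lookup is what lets a training set use the value that was valid when each label was observed, while the online lookup always returns the current value.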

  3. Transformation - Regular processing of new data into feature values. Feature stores manage and orchestrate the data transformations that produce these values, and they also ingest values produced by external systems.

Transformation
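As a rough illustration of turning new data into feature values, the sketch below aggregates raw purchase events into per-user features over a trailing window. The event fields (`user_id`, `ts`, `amount`), the feature names, and the 7-day window are all assumptions made for this example; a real feature store would typically schedule such a job and write its output to storage.

```python
from collections import defaultdict

SEVEN_DAYS = 7 * 24 * 3600  # window length in seconds (assumed)


def compute_purchase_features(events, now, window_seconds=SEVEN_DAYS):
    """Aggregate raw events into per-user feature values for one window."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in events:
        # Keep only events inside the half-open window (now - window, now].
        if now - window_seconds < event["ts"] <= now:
            totals[event["user_id"]] += event["amount"]
            counts[event["user_id"]] += 1
    return {
        user_id: {
            "purchase_count_7d": counts[user_id],
            "purchase_sum_7d": totals[user_id],
        }
        for user_id in counts
    }


# Illustrative data only; timestamps are seconds.
now = 1_000_000
events = [
    {"user_id": "a", "ts": now - 100, "amount": 10.0},
    {"user_id": "a", "ts": now - 200, "amount": 5.0},
    {"user_id": "b", "ts": now - SEVEN_DAYS - 1, "amount": 99.0},  # too old
]
features = compute_purchase_features(events, now)
```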

  4. Monitoring - Data quality is tracked by monitoring for problems such as drift and training-serving skew.

  5. Registry - A registry of standardized feature definitions and metadata. It is a common catalog for exploring, developing, collaborating on, and publishing new definitions within and across teams.
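One common way to quantify the drift mentioned under Monitoring is the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution at serving time. The sketch below is a minimal pure-Python version under stated assumptions: equal-width bins derived from the training sample, a small floor for empty bins, and non-empty inputs. The source does not say which drift metric a feature store uses, so PSI is chosen here only as an example.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a new sample of one feature."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 when the reference sample is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for value in values:
            # Values outside the reference range are clamped to the edge bins.
            idx = min(max(int((value - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins so the logarithm below stays defined.
        return [max(count / len(values), 1e-6) for count in counts]

    ref, new = proportions(expected), proportions(actual)
    return sum((n - r) * math.log(n / r) for r, n in zip(ref, new))


# Illustrative data only.
training_values = [i / 100 for i in range(100)]
serving_values = [0.5 + i / 200 for i in range(100)]
psi_same = population_stability_index(training_values, training_values)
psi_shifted = population_stability_index(training_values, serving_values)
```

Identical samples give a PSI of zero, and the value grows as the serving distribution moves away from the training one. Any alerting threshold is a choice for the team running the monitor.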

Reference

What is a feature store