Feature Store
A feature store is an interface between models and data. It lets teams:
- Productionize new features
- Automate feature computation, backfills, logging
- Share and reuse feature pipelines across teams
- Keep training and serving data consistent
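As a rough illustration of the consistency point, the sketch below defines one feature once and calls the same function from both a batch (training) path and a request-time (serving) path. The feature `days_since_signup`, the row layout, and `serve_feature` are hypothetical names chosen for this example, not part of any particular feature store.

```python
from datetime import date


def days_since_signup(signup: date, as_of: date) -> int:
    """Single feature definition shared by the training and serving paths."""
    return (as_of - signup).days


# Offline path: backfill the feature for historical training rows (made-up data).
training_rows = [
    {"user_id": 1, "signup": date(2024, 1, 1), "event_date": date(2024, 1, 31)},
    {"user_id": 2, "signup": date(2024, 1, 15), "event_date": date(2024, 1, 31)},
]
training_features = [
    days_since_signup(row["signup"], row["event_date"]) for row in training_rows
]


# Online path: compute the same feature at request time with the same function.
def serve_feature(signup: date, today: date) -> int:
    return days_since_signup(signup, today)
```

Because both paths go through one definition, a change to the feature logic reaches training and serving together, which is the property the last bullet describes.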
Five main components of a feature store
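- Serving - Delivers feature values to models, using the same feature definitions for training and online inference to avoid training-serving skew.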
- Storage - Offline and online storage. Offline storage usually extends an existing data lake and holds historical feature values. Online storage holds only the latest feature values for each entity; Redis, DynamoDB or Cassandra are commonly used for it.
- Transformation - Regular processing of new data into feature values. Feature stores manage and orchestrate the data transformations that produce these values, and also ingest values produced by external systems.
- Monitoring - Data quality is tracked by monitoring for drift, training-serving skew, etc.
- Registry - A registry of standardized feature definitions and metadata. It is a common catalog to explore, develop, collaborate and publish new definitions within and across teams.
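To make the components above more concrete, here is a minimal in-memory sketch of how they might fit together. Everything in it is an assumption made for illustration: `ToyFeatureStore`, `FeatureDefinition`, the method names, and the mean-shift drift check are hypothetical, and plain Python containers stand in for a data lake (offline) and a key-value store such as Redis (online).

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Tuple


@dataclass
class FeatureDefinition:
    """Registry entry: a named, documented transformation for one entity type."""
    name: str
    entity: str
    transform: Callable[[dict], float]
    description: str = ""


@dataclass
class ToyFeatureStore:
    # Registry: feature name -> definition.
    registry: Dict[str, FeatureDefinition] = field(default_factory=dict)
    # Offline storage: append-only log of (feature, entity_id, timestamp, value).
    offline: List[Tuple[str, str, int, float]] = field(default_factory=list)
    # Online storage: (feature, entity_id) -> (timestamp, latest value).
    online: Dict[Tuple[str, str], Tuple[int, float]] = field(default_factory=dict)

    def register(self, definition: FeatureDefinition) -> None:
        self.registry[definition.name] = definition

    def ingest(self, feature: str, entity_id: str, timestamp: int, raw: dict) -> float:
        """Transformation: turn a raw record into a feature value and store it."""
        value = self.registry[feature].transform(raw)
        self.offline.append((feature, entity_id, timestamp, value))
        key = (feature, entity_id)
        # Keep only the most recent value per entity in the online store.
        if key not in self.online or timestamp >= self.online[key][0]:
            self.online[key] = (timestamp, value)
        return value

    def get_online(self, feature: str, entity_id: str) -> float:
        """Serving: latest value for low-latency inference."""
        return self.online[(feature, entity_id)][1]

    def get_history(self, feature: str, entity_id: str) -> List[float]:
        """Serving: historical values, in ingestion order, for building training sets."""
        return [v for f, e, _, v in self.offline if f == feature and e == entity_id]

    def mean_shift(self, feature: str, reference_mean: float) -> float:
        """Monitoring: crude drift signal, the absolute shift of the mean."""
        values = [v for f, _, _, v in self.offline if f == feature]
        return abs(mean(values) - reference_mean)


store = ToyFeatureStore()
store.register(
    FeatureDefinition(
        name="order_total_usd",
        entity="user",
        transform=lambda raw: raw["cents"] / 100,
        description="Order total converted from cents to dollars.",
    )
)
store.ingest("order_total_usd", "u1", timestamp=1, raw={"cents": 1000})
store.ingest("order_total_usd", "u1", timestamp=2, raw={"cents": 2500})
```

In this sketch the online store answers with only the newest value for `u1`, while the offline log keeps both values for training. A production system would replace the mean-shift check with a proper drift test and the in-memory containers with real storage backends.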