Temporal and spatial localities are basic concepts in operating systems, and storage systems rely on localities to perform well. Surprisingly, it is difficult to quantify, in metrics that can be compared across diverse settings, both the localities present in workloads and how storage data path components transform those localities.
In this thesis, we introduce stack- and block-affinity metrics to quantify temporal and spatial localities. We demonstrate that our metrics (1) behave well under extreme and normal loads, (2) can be used to validate synthetic loads at each stage of storage optimization, (3) can capture localities in ways that remain resilient across generations of hardware, and (4) correlate meaningfully with performance.
Our experience also unveiled hidden semantics of localities and identified future research directions.