Computer Science Department
School of Computer Science, Carnegie Mellon University
Performance Modeling of Storage Devices
In our design, the models represent an I/O workload as vectors and model its performance on storage devices as functions over those vectors, learned with a regression tool. We have identified the vector representation of workloads, the regression tool, and the training traces as three important factors in model quality. This thesis provides a thorough evaluation of existing techniques for addressing these issues. In addition, to augment existing work in workload characterization, we have proposed the entropy plot to characterize the spatio-temporal behavior of I/O workloads and the PQRS model to generate traces with given characteristics.
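As a rough illustration of the workload-as-vectors idea, the sketch below turns each request in a trace into a small feature vector and predicts its response time with a regression over those vectors. Everything specific here is an assumption made for the example: the request record layout, the four features, the toy trace and response times, and the use of k-nearest-neighbor averaging as a stand-in for the regression tool. It is not the representation or the tool evaluated in the thesis.

```python
import math

# Hypothetical request record: (arrival_time_s, start_block, size_blocks, is_read).
# The feature choice below is illustrative, not the representation from the thesis.
def request_vector(prev, cur):
    """Describe one request relative to the request that preceded it."""
    inter_arrival = cur[0] - prev[0]
    # Distance from the end of the previous request to the start of this one.
    seek_distance = abs(cur[1] - (prev[1] + prev[2]))
    return [inter_arrival, seek_distance, cur[2], 1.0 if cur[3] else 0.0]

def knn_predict(train_x, train_y, x, k=3):
    """Average the targets of the k training vectors closest to x."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Made-up training trace and response times (ms), for illustration only.
trace = [
    (0.000, 100, 8, True),
    (0.010, 108, 8, True),
    (0.020, 5000, 16, False),
    (0.050, 5016, 16, False),
    (0.060, 200, 8, True),
]
response_ms = [0.4, 6.1, 0.9, 7.3]  # one value per request after the first

# Training: one vector per consecutive pair of requests.
train_x = [request_vector(a, b) for a, b in zip(trace, trace[1:])]

# Prediction for an unseen sequential read.
new_vector = request_vector((1.000, 300, 8, True), (1.010, 308, 8, True))
predicted_ms = knn_predict(train_x, response_ms, new_vector, k=1)
```

The features are left unscaled, so the seek distance dominates the Euclidean distance. A realistic model would need to normalize the features, which is one reason the choice of vector representation affects model quality.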
Our experiments on real-world traces have shown that the learning-based models are fast and accurate when the training and testing traces are similar. Offline training using synthetic traces, however, is less effective because the synthetic trace generators fail to capture the strong correlations between requests. Our error analyses have shown that both the vector representation and the synthetic trace generators have room for further improvement.