Automatic Modeling of File System Workloads Using Two-Level Arrival Processes

Peter P. Ware*, Thomas W. Page, Barry L. Nelson

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

19 Scopus citations

Abstract

This article describes a method for analyzing, modeling, and simulating a two-level arrival-counting process. This method is particularly appropriate when the number of independent processes is large, as is the case in our motivating application, which requires analyzing and representing computer file system trace data for activity on nearly 8,000 files. The method is also applicable to network trace data characterizing communication patterns between pairs of computers. We apply cluster analysis to separate the arrival process into groups or bursts of activity on a file. We then characterize the arrival process in terms of the time between bursts of activity on a file, the time between file events within bursts, and the number of events in a burst. Finally, we model these three components individually, then reassemble the results to produce a synthetic trace generator. In order to gauge the effectiveness of this method, we use synthetically generated (simulated) trace data produced in this way to drive a discrete-event simulation of a distributed replicated file system. We compare the results of the simulation driven by the synthetic trace with the same simulation driven by the original trace data, and conclude that the synthetic data capture the essential characteristics of the empirical trace.
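
The two-level decomposition described in the abstract can be illustrated with a minimal Python sketch for a single file. This is not the authors' implementation. It rests on several assumptions: a fixed gap threshold stands in for the cluster analysis, the time between bursts is measured from the last event of one burst to the first event of the next, and each of the three components is modeled by resampling its empirical values rather than by a fitted distribution. All function and parameter names (split_into_bursts, summarize, generate, gap_threshold) are hypothetical.

```python
import random


def split_into_bursts(times, gap_threshold):
    """Group sorted event times into bursts.

    Assumption: a gap larger than gap_threshold starts a new burst.
    This simple rule stands in for the paper's cluster analysis.
    """
    if not times:
        return []
    bursts = [[times[0]]]
    for prev, cur in zip(times, times[1:]):
        if cur - prev > gap_threshold:
            bursts.append([cur])
        else:
            bursts[-1].append(cur)
    return bursts


def summarize(bursts):
    """Return the three components named in the abstract."""
    # Time between bursts: end of one burst to start of the next (assumed).
    between = [b[0] - a[-1] for a, b in zip(bursts, bursts[1:])]
    # Time between consecutive events inside each burst.
    within = [t2 - t1 for b in bursts for t1, t2 in zip(b, b[1:])]
    # Number of events in each burst.
    sizes = [len(b) for b in bursts]
    return between, within, sizes


def generate(between, within, sizes, n_bursts, rng):
    """Reassemble a synthetic trace by resampling each component independently.

    Requires a non-empty `between` list when n_bursts > 1.
    """
    t, trace = 0.0, []
    for i in range(n_bursts):
        if i > 0:
            t += rng.choice(between)
        trace.append(t)
        # A burst of size k has k - 1 within-burst gaps.
        for _ in range(rng.choice(sizes) - 1):
            t += rng.choice(within)
            trace.append(t)
    return trace


# Hypothetical event times (in seconds) for one file.
times = [0.0, 0.5, 1.0, 10.0, 10.2, 25.0]
bursts = split_into_bursts(times, gap_threshold=2.0)
between, within, sizes = summarize(bursts)
print(sizes)  # [3, 2, 1]
synthetic = generate(between, within, sizes, n_bursts=5, rng=random.Random(0))
print(synthetic)
```

Treating the three components independently keeps the generator simple, which matters when the same procedure must be repeated automatically for thousands of files. Any dependence between burst size and gap length is lost in this sketch.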

Original language: English (US)
Pages (from-to): 305-330
Number of pages: 26
Journal: ACM Transactions on Modeling and Computer Simulation
Volume: 8
Issue number: 3
State: Published - Jul 1, 1998

Keywords

  • Clustering
  • Data replication
  • File access patterns
  • File system
  • Input modeling
  • Replication
  • Synthetic traces
  • Trace driven simulation

ASJC Scopus subject areas

  • Modeling and Simulation
  • Computer Science Applications
