Are You Sure You Want to Use MMAP in Your Database Management System?

Andrew Crotty, Viktor Leis, Andrew Pavlo

Research output: Contribution to conference › Paper › peer-review

23 Scopus citations

Abstract

Memory-mapped (mmap) file I/O is an OS-provided feature that maps the contents of a file on secondary storage into a program's address space. The program then accesses pages via pointers as if the file resided entirely in memory. The OS transparently loads pages only when the program references them and automatically evicts pages if memory fills up. mmap's perceived ease of use has seduced database management system (DBMS) developers for decades as a viable alternative to implementing a buffer pool. There are, however, severe correctness and performance issues with mmap that are not immediately apparent. Such problems make it difficult, if not impossible, to use mmap correctly and efficiently in a modern DBMS. In fact, several popular DBMSs initially used mmap to support larger-than-memory databases but soon encountered these hidden perils, forcing them to switch to managing file I/O themselves after significant engineering costs. In this way, mmap and DBMSs are like coffee and spicy food: an unfortunate combination that becomes obvious after the fact. Since developers keep trying to use mmap in new DBMSs, we wrote this paper to provide a warning to others that mmap is not a suitable replacement for a traditional buffer pool. We discuss the main shortcomings of mmap in detail, and our experimental analysis demonstrates clear performance limitations. Based on these findings, we conclude with a prescription for when DBMS developers might consider using mmap for file I/O.
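The access pattern the abstract describes, where a file is mapped into the address space and pages are read as if they were ordinary memory, can be illustrated with a minimal sketch. This is not code from the paper: it assumes Python's standard `mmap` module, and the helper names `make_demo_file` and `read_page_via_mmap`, as well as the page layout of the demo file, are hypothetical and chosen only for illustration.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # OS page size as reported by the platform


def make_demo_file(num_pages: int = 4) -> str:
    """Write a hypothetical 'database file' in which page i is filled with byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i]) * PAGE)
    return path


def read_page_via_mmap(path: str, page_no: int) -> bytes:
    """Return one page of the file by slicing a read-only memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = page_no * PAGE
            # No explicit read() call: slicing touches the mapped memory and
            # the OS loads the underlying page on demand.
            return m[start:start + PAGE]


if __name__ == "__main__":
    path = make_demo_file()
    try:
        page = read_page_via_mmap(path, 2)
        print(len(page) == PAGE, page[0])
    finally:
        os.remove(path)
```

Note what the sketch leaves out: the program has no say in which pages stay resident or when they are evicted. That loss of control is the trade-off the abstract warns about when mmap is used in place of a buffer pool.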

Original language: English (US)
State: Published - 2022
Event: 12th Annual Conference on Innovative Data Systems Research, CIDR 2022 - Santa Cruz, United States
Duration: Jan 9 2022 to Jan 12 2022

Conference

Conference: 12th Annual Conference on Innovative Data Systems Research, CIDR 2022
Country/Territory: United States
City: Santa Cruz
Period: 1/9/22 to 1/12/22

ASJC Scopus subject areas

  • Artificial Intelligence
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
