Abstract
Learned indexes have provided a new perspective on the well-studied problem of indexing. Despite early skepticism, recent results have shown significant improvements over traditional data structures. While some have attributed these improvements to an ability to adapt to patterns in the data, we believe that the main advantage of learned indexes comes instead from implicit assumptions (e.g., data sortedness) that are not fully leveraged by the comparison points. In this paper, we show that, simply by incorporating these same basic assumptions, we can design a traditional data structure capable of outperforming learned alternatives. To make our case, we propose a new index called a histogram tree (Hist-Tree), which takes advantage of the sortedness and range of the data. We also present an optimized compact layout based on the read-only assumption made by many learned indexes. Our experiments demonstrate that, in terms of lookup latency, Hist-Tree can beat three state-of-the-art learned indexes, including RMI, PGM-index, and RadixSpline, by as much as 1.8-2.7×.
Original language | English (US) |
---|---|
State | Published - 2021 |
Event | 11th Annual Conference on Innovative Data Systems Research, CIDR 2021 - Virtual, Online Duration: Jan 11 2021 → Jan 15 2021 |
Conference
Conference | 11th Annual Conference on Innovative Data Systems Research, CIDR 2021 |
---|---|
City | Virtual, Online |
Period | 1/11/21 → 1/15/21 |
ASJC Scopus subject areas
- Artificial Intelligence
- Information Systems
- Information Systems and Management
- Hardware and Architecture