Basics and Foundational Systems

  1. Architecture of a Database System
  2. The Five Minute Rule Twenty Years Later
  3. A History and Evaluation of System R
  4. The POSTGRES Next-Generation Database Management System
  5. The Gamma Database Machine Project

Query Processing

  1. Access Path Selection in a Relational Database Management System
  2. Query Evaluation Techniques for Large Databases
  3. The Volcano Optimizer Generator: Extensibility and Efficient Search
  4. Eddies: Continuously Adaptive Query Processing
  5. Worst-Case Optimal Join Algorithms
  6. Datalog and Recursive Query Processing

Transactions

  1. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging
  2. Granularity of Locks and Degrees of Consistency in a Shared Data Base
  3. Concurrency Control in Distributed Database Systems
  4. Concurrency Control Performance Modeling: Alternatives and Implications

Indexing

  1. Efficient Locking for Concurrent Operations on B-trees
  2. Improved Query Performance with Variant Indexes
  3. The R*-tree: An Efficient and Robust Access Method for Points and Rectangles
  4. The Log-Structured Merge-Tree (LSM-Tree)

DBMS Architectures Revisited

  1. C-Store: A Column-Oriented DBMS
  2. Hekaton: SQL Server's Memory-Optimized OLTP Engine
  3. Calvin: Fast Distributed Transactions for Partitioned Database Systems
  4. Spanner: Google's Globally-Distributed Database
  5. Building Efficient Query Engines in a High-Level Language

Distributed Data, Weak Isolation, Relaxed Consistency

  1. Transaction Management in the R* Distributed Database Management System
  2. Generalized Isolation Level Definitions
  3. Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
  4. Dynamo: Amazon's Highly Available Key-value Store
  5. CAP Twelve Years Later: How the "Rules" Have Changed
  6. Consistency Analysis in Bloom: A CALM and Collected Approach

Parallel Dataflow

  1. Parallel Database Systems: The Future of High Performance Database Processing
  2. Encapsulation of Parallelism in the Volcano Query Processing System
  3. MapReduce: Simplified Data Processing on Large Clusters
  4. TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks
  5. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing

The Web and Databases

  1. Combining Systems and Databases: A Search Engine Retrospective
  2. The Anatomy of a Large-Scale Hypertextual Web Search Engine
  3. WebTables: Exploring the Power of Tables on the Web

Materialized Views, Cubes and Aggregation

  1. Materialized Views
  2. On the Computation of Multidimensional Aggregates
  3. Implementing Data Cubes Efficiently
  4. Informix under CONTROL: Online Query Processing
  5. BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data

Special-Case Data Models: Streams, Semistructured, Graphs

  1. The CQL Continuous Query Language: Semantic Foundations and Query Execution
  2. DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
  3. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs

Data Integration, Provenance and Transformation

  1. Schema Mapping as Query Discovery
  2. Provenance in Databases: Why, How, and Where
  3. Wrangler: Interactive Visual Specification of Data Transformation Scripts

Systems Support for ML

  1. The MADlib Analytics Library: or MAD Skills, the SQL
  2. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
  3. Scaling Distributed Machine Learning with the Parameter Server