Basics and Foundational Systems
- Architecture of a Database System
- The Five Minute Rule Twenty Years Later
- A History and Evaluation of System R
- The POSTGRES Next-Generation Database Management System
- The Gamma Database Machine Project
Query Processing
- Access Path Selection in a Relational Database Management System
- Query Evaluation Techniques for Large Databases
- The Volcano Optimizer Generator: Extensibility and Efficient Search
- Eddies: Continuously Adaptive Query Processing
- Worst-Case Optimal Join Algorithms
- Datalog and Recursive Query Processing
Transactions
- ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging
- Granularity of Locks and Degrees of Consistency in a Shared Data Base
- Concurrency Control in Distributed Database Systems
- Concurrency Control Performance Modeling: Alternatives and Implications
Indexing
- Efficient Locking for Concurrent Operations on B-trees
- Improved Query Performance with Variant Indexes
- The R*-tree: An Efficient and Robust Access Method for Points and Rectangles
- The Log-Structured Merge-Tree (LSM-Tree)
DBMS Architectures Revisited
- C-Store: A Column-Oriented DBMS
- Hekaton: SQL Server's Memory-Optimized OLTP Engine
- Calvin: Fast Distributed Transactions for Partitioned Database Systems
- Spanner: Google's Globally-Distributed Database
- Building Efficient Query Engines in a High-Level Language
Distributed Data, Weak Isolation, Relaxed Consistency
- Transaction Management in the R* Distributed Database Management System
- Generalized Isolation Level Definitions
- Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
- Dynamo: Amazon's Highly Available Key-value Store
- CAP Twelve Years Later: How the "Rules" Have Changed
- Consistency Analysis in Bloom: a CALM and Collected Approach
Parallel Dataflow
- Parallel Database Systems: The Future of High Performance Database Processing
- Encapsulation of Parallelism in the Volcano Query Processing System
- MapReduce: Simplified Data Processing on Large Clusters
- TAG: A Tiny AGgregation Service for Ad-Hoc Sensor Networks
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
The Web and Databases
- Combining Systems and Databases: A Search Engine Retrospective
- The Anatomy of a Large-Scale Hypertextual Web Search Engine
- WebTables: Exploring the Power of Tables on the Web
Materialized Views, Cubes and Aggregation
- Materialized Views
- On the Computation of Multidimensional Aggregates
- Implementing Data Cubes Efficiently
- Informix under CONTROL: Online Query Processing
- BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
Special-Case Data Models: Streams, Semistructured, Graphs
- The CQL Continuous Query Language: Semantic Foundations and Query Execution
- DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
- PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Data Integration, Provenance and Transformation
- Schema Mapping as Query Discovery
- Provenance in Databases: Why, How, and Where
- Wrangler: Interactive Visual Specification of Data Transformation Scripts
Systems Support for ML
- The MADlib Analytics Library: or MAD Skills, the SQL
- HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
- Scaling Distributed Machine Learning with the Parameter Server