Lab 2021-05-14: Database Exploration

1 Project 2 Q&A

  • There's been some confusion about how we keep track of the array of key-value pairs within each internal or leaf node. Both the leaf and internal page classes inherit a size_ field from the BPlusTreePage class, which stores the number of entries currently in the array, along with the GetSize() and IncreaseSize() methods. Remember that for internal nodes, the size is the number of pointers (page ids), not the number of keys.

Page object
+-------------+
|  metadata   |
|-------------|<--+
|    data_    |   |
|             |   |
|             |   |
|             |   +-- the B+ Tree pages live here
|             |   |
+-------------+<--+

B+ page object (BPlusTreeLeafPage, BPlusTreeInternalPage)
+-------------+
|  metadata   |
|-------------|
|  array      | <---array serves as a pointer to the start
|             |     of the part of the page that contains
|             |     key-value pairs. Since array is the last
|             |     field declared in the object, it implicitly
|             |     fills up the rest of the space in the page
+-------------+

So we can use array to treat this part of the page as an array of key-value pairs. Here's some example syntax:

  • array[0].second = value;
  • array[1] = MappingType(key, value);
  • Shift all array entries one slot to the right:

        for (int i = GetSize(); i > 0; i--) {
            array[i] = array[i - 1];
        }
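
To see how the pieces above fit together, here is a minimal, self-contained sketch. It is not the project code: ToyPage, ToyLeafPage, AsLeaf(), InsertAt(), KeyAt(), ValueAt(), and MaxSize() are hypothetical names made up for illustration, the 4096-byte page size is an assumption, and keys and values are simplified to plain ints. The project classes declare array as their last field; this sketch instead computes the address just past the metadata by hand, which has the same effect without relying on a zero-length array (a compiler extension). Like the real code, it depends on reinterpreting raw page bytes as entries, which works on mainstream compilers even though the C++ standard is stricter about it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

constexpr std::size_t kPageSize = 4096;   // assumed page size
using MappingType = std::pair<int, int>;  // simplified: int keys and values

// Hypothetical stand-in for a B+ tree leaf page. The only metadata kept
// here is size_; the entries occupy the bytes that follow the header.
class ToyLeafPage {
 public:
  int GetSize() const { return size_; }
  void IncreaseSize(int amount) { size_ += amount; }

  // Number of entries that fit in the space left after the metadata.
  static constexpr int MaxSize() {
    return static_cast<int>((kPageSize - sizeof(ToyLeafPage)) /
                            sizeof(MappingType));
  }

  // Insert (key, value) at slot `index`, shifting later entries one slot
  // to the right. Returns false if the index is invalid or the page is full.
  bool InsertAt(int index, int key, int value) {
    if (index < 0 || index > GetSize() || GetSize() >= MaxSize()) {
      return false;
    }
    MappingType *array = Array();
    for (int i = GetSize(); i > index; i--) {
      array[i] = array[i - 1];
    }
    array[index] = MappingType(key, value);
    IncreaseSize(1);
    return true;
  }

  int KeyAt(int index) const { return Array()[index].first; }
  int ValueAt(int index) const { return Array()[index].second; }

 private:
  // Address of the first entry: the byte right after this object's metadata.
  MappingType *Array() {
    return reinterpret_cast<MappingType *>(
        reinterpret_cast<char *>(this) + sizeof(ToyLeafPage));
  }
  const MappingType *Array() const {
    return reinterpret_cast<const MappingType *>(
        reinterpret_cast<const char *>(this) + sizeof(ToyLeafPage));
  }

  int size_ = 0;
};

// Hypothetical stand-in for the Page object: just a fixed-size data_ buffer.
struct ToyPage {
  alignas(8) char data_[kPageSize];
};

// Set up an empty leaf page inside the page's data_ region.
inline ToyLeafPage *AsLeaf(ToyPage *page) {
  return new (page->data_) ToyLeafPage();
}
```

Inserting at slot 0 of a non-empty page runs exactly the shift loop shown above. Inserting at the end (index equal to GetSize()) skips the loop entirely.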

2 Database Exploration

The goals of this activity are:

  • Learn some new database terminology. The focus of this course is on relational, disk-based database systems, but the world of database software is much bigger than that. This activity will introduce you to other types of databases such as in-memory databases, key-value stores, and graph, spatial, and time-series databases.
  • See how much is out there. I've only mentioned a handful of systems by name, but there are hundreds with more being created every year.
  • Get a sense for the most popular systems. Popularity isn't everything, but when choosing a database it's helpful to know what systems other people have used. It may also be easier to find answers to questions about a popular system versus an obscure one.

Your task is to work with your group to answer the questions below. When we come back together as a class, each breakout room will be expected to contribute the answer to one of the questions, as indicated. Here are some resources you can use (you are not limited to these):

  • DB-Engines
    • Popularity rankings for different categories of databases
    • An encyclopedia with brief entries for various terminology
    • Pages for individual systems giving an overview of their capabilities
  • Database of Databases
    • Search for databases using many different criteria
    • More detailed entries for many systems than DB-Engines (with citations)
    • Logos!
  • Wikipedia, including Category:Types of databases, Category:Databases, and pages for specific systems

  1. Redis and Amazon DynamoDB are among the most popular key-value stores. A key-value store behaves much like a dictionary or map: data is stored as keys with associated values (i.e., not as tables with columns). What are some differences between Redis and DynamoDB? Room 1 will share their answer.
  2. What is the difference between a graph database and a spatial database? Can you think of example applications for each? Room 2 will share their answer.
  3. The database Cassandra was originally an internal project at Facebook that was later released as an open-source product. What kind of database is Cassandra? Why might Facebook find a database like this useful? Room 3 will share their answer.
  4. Would it be convenient to store weather data (think hourly temperature readings over some period of time) in a relational database? Is there a type of database that would be better suited? Room 4 will share their answer.
  5. Among databases created since 2018, what is the most common programming language they are written in? (Use the Database of Databases for this.) Room 5 will share their answer.

As you explore different databases, keep an eye out for good database names and/or logos. Submit your favorite to this Google form: https://forms.gle/nkXSwEB2edYopE1s9