Database 101

Database 101: A hands-on tech talk

What Is a Database?

● A collection of data that is organized in a way that makes retrieval relatively easy.

● Typically organized into logical groupings of schemas and tables (though document-store databases are not)

● Typically relational, though not always (NoSQL)

Kinds of Databases

● Relational - MySQL, PostgreSQL, SQLite

● Columnar - Redshift, Cassandra, BigQuery

● Document - MongoDB, CouchDB

● Key/value (in memory) - Memcached, Redis

● Full text (search engines) - Solr, Elasticsearch, SphinxSE

MySQL

● Built by Michael Widenius

● Named after his daughter “My”. MariaDB, a fork of MySQL, is named after his other daughter “Maria”. - Dad points!

● Purchased by Sun in 2008; Sun was in turn acquired by Oracle in 2010

● Popular because it's relational, open source, works on many OSes, and works well for the average use case

Connecting

● mysql -h bookshelf.crm7sspivlug.us-east-1.rds.amazonaws.com -u persuade -p -D bookshelf

● -h is the hostname

● -u is the user to connect as

● -p is the password - but don’t type it, as we will be prompted with a masked input

● -D is the database to use

Looking around

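● show databases

● show tables

● explain <table>

● ; vs \G - end a statement with ; for tabular output, or with \G for vertical output (one column per line)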

SELECT from a table

● A SELECT returns zero or more results

● SELECT is the keyword that starts the query

● * is a column matcher - we can also specify individual columns

● LIMIT caps the size of our result set

SELECT * FROM books LIMIT 1;

Exercises

● Select the names of all the publishers

● Select the titles of the first 5 books

Functions

● Functions can either return a value for each row (e.g. length), or act as an aggregate function (e.g. avg). Note that aggregates collapse the result set into a single row

SELECT count(1) FROM books;

SELECT length(title) FROM books;

SELECT avg(length(title)) FROM books;
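To see the "collapse" for yourself without the workshop server, here is a minimal sketch that runs the same kind of queries against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the table contents are made-up sample rows, not the real bookshelf data.

```python
import sqlite3

# In-memory SQLite stand-in for the talk's MySQL "books" table.
# The three rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO books (title) VALUES (?)",
    [("Sphere",), ("Congo",), ("Jurassic Park",)],
)

# Row-level function: one output row per input row.
per_row = conn.execute("SELECT length(title) FROM books ORDER BY id").fetchall()

# Aggregate functions: the whole result set collapses into a single row.
collapsed = conn.execute("SELECT count(1), avg(length(title)) FROM books").fetchall()

print(per_row)    # [(6,), (5,), (13,)]
print(collapsed)  # [(3, 8.0)]
```

Three rows go in, and the aggregate query hands back exactly one.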

Conditions

● You can pass many expressions to WHERE including column names (e.g. name), or logic expressions (e.g. 1=1).

● Note that “=” is a comparison operator in SQL - not an assignment

● Other comparison operators include >, >=, <, <=, <>, !=, LIKE, IN, etc.

SELECT * FROM authors WHERE name LIKE '%Crichton';

SELECT * FROM authors WHERE name = 'Michael Crichton';

SELECT * FROM authors WHERE id = 28;
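The different kinds of WHERE expressions can be compared side by side in a small sketch. It assumes an in-memory SQLite database (via Python's sqlite3) in place of the MySQL server, a hypothetical helper called names, and three invented author rows.

```python
import sqlite3

# SQLite stand-in for the "authors" table; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)",
    [(27, "Jane Example"), (28, "Michael Crichton"), (29, "Crichton Example")],
)

def names(where, params=()):
    """Hypothetical helper: return author names matching a WHERE clause."""
    sql = "SELECT name FROM authors WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]

suffix = names("name LIKE ?", ("%Crichton",))     # % matches any leading text
exact = names("name = ?", ("Michael Crichton",))  # = compares, it does not assign
by_id = names("id IN (?, ?)", (27, 29))           # membership test
always = names("1 = 1")                           # logic expression, true for every row

print(suffix, exact, by_id, always)
```

Note that '%Crichton' only matches names that end in Crichton, so "Crichton Example" is not returned by the LIKE query.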

Exercises

● Count the number of books with a rating of 4 or higher

Ordering

● Ordering can be done on a column, or an expression

● Can be ASC, or DESC (ASC is the default)

● Can use multiple columns

SELECT name FROM publishers ORDER BY name ASC;

SELECT name FROM publishers ORDER BY name DESC;

SELECT name FROM publishers ORDER BY name, created_at ASC;
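One detail that is easy to miss with multiple columns: ASC or DESC applies only to the column it follows. The sketch below shows that, plus ordering by an expression. It assumes SQLite through Python's sqlite3 rather than MySQL, and the publisher rows are invented.

```python
import sqlite3

# SQLite stand-in for the "publishers" table; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publishers (name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO publishers VALUES (?, ?)",
    [("Knopf", "2015-01-01"), ("Ace", "2016-06-01"),
     ("Ace", "2014-03-01"), ("Tor", "2013-09-01")],
)

# DESC binds to created_at only; name falls back to the default ASC.
rows = conn.execute(
    "SELECT name, created_at FROM publishers ORDER BY name, created_at DESC"
).fetchall()

# Ordering by an expression: shortest names first, ties broken alphabetically.
by_length = [r[0] for r in conn.execute(
    "SELECT name FROM publishers ORDER BY length(name), name"
)]

print(rows)
print(by_length)  # ['Ace', 'Ace', 'Tor', 'Knopf']
```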

Exercises

● Return the title of the oldest published book

● Return the top 5 newest books

Joining to another table

● Relational databases excel at querying normalized data via JOIN

● There are several types of JOIN - inner, left outer, right outer, full outer (note that MySQL does not support FULL OUTER JOIN)

SELECT title FROM books INNER JOIN authors on books.author_id = authors.id WHERE authors.name like '%Crichton';

SELECT * FROM authors LEFT OUTER JOIN books on books.author_id = authors.id WHERE books.id IS NULL; -- authors without a book (are they really authors then?!)
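The difference between the two joins is easiest to see on a tiny data set. This is a sketch only: it uses SQLite through Python's sqlite3 instead of MySQL, and the two authors and two books are made up.

```python
import sqlite3

# SQLite stand-in for the "authors" and "books" tables; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO authors (id, name) VALUES (1, 'Michael Crichton'), (2, 'No Books Yet');
    INSERT INTO books (title, author_id) VALUES ('Sphere', 1), ('Congo', 1);
""")

# INNER JOIN keeps only authors that have at least one matching book.
inner = conn.execute("""
    SELECT authors.name, books.title
    FROM authors INNER JOIN books ON books.author_id = authors.id
    ORDER BY books.title
""").fetchall()

# LEFT OUTER JOIN keeps every author; missing book columns come back as NULL.
left = conn.execute("""
    SELECT authors.name, books.title
    FROM authors LEFT OUTER JOIN books ON books.author_id = authors.id
    ORDER BY authors.id, books.title
""").fetchall()

print(inner)
print(left)  # the last row pairs 'No Books Yet' with None
```

Filtering the left join on books.id IS NULL is what isolates the authors without a book.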

Exercises

● Return all books with title, publisher name, and author name

Grouping

● Aggregates results into groups

● A common use case is to count groups of results

● Use in conjunction with HAVING to apply conditions to the resulting groups

SELECT author_id, count(1) FROM books GROUP BY author_id;

SELECT author_id, count(1) as book_count FROM books GROUP BY author_id HAVING book_count > 10;
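Here is a minimal sketch of GROUP BY with and without HAVING. It assumes SQLite through Python's sqlite3 as a stand-in for MySQL, invented rows, and a threshold of 1 instead of 10 so the tiny data set still shows a difference.

```python
import sqlite3

# SQLite stand-in for the "books" table; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER)")
conn.executemany(
    "INSERT INTO books (author_id) VALUES (?)",
    [(1,), (1,), (1,), (2,), (3,), (3,)],
)

# One output row per distinct author_id.
grouped = conn.execute(
    "SELECT author_id, count(1) AS book_count FROM books "
    "GROUP BY author_id ORDER BY author_id"
).fetchall()

# HAVING filters the groups after aggregation (WHERE filters rows before it).
prolific = conn.execute(
    "SELECT author_id, count(1) AS book_count FROM books "
    "GROUP BY author_id HAVING count(1) > 1 ORDER BY author_id"
).fetchall()

print(grouped)   # [(1, 3), (2, 1), (3, 2)]
print(prolific)  # [(1, 3), (3, 2)]
```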

Exercises

● Calculate the average rating of all books● Calculate the breakdown of ratings● Return the average rating of all books by author● Return the top 5 most prolific author names, and a count of their books

Autocommit

SET AUTOCOMMIT = 0;

BEGIN WORK;

INSERT INTO …;

SELECT * FROM …; -- includes record above

ROLLBACK WORK;

SELECT * FROM …; -- no longer includes record above
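The same begin, insert, roll back sequence can be sketched end to end. This version assumes SQLite through Python's sqlite3 rather than MySQL, with isolation_level=None so that the explicit BEGIN and ROLLBACK statements control the transaction; count_authors is a hypothetical helper.

```python
import sqlite3

# isolation_level=None leaves transaction control to our explicit statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")

def count_authors():
    """Hypothetical helper: number of rows currently visible in authors."""
    return conn.execute("SELECT count(1) FROM authors").fetchone()[0]

conn.execute("BEGIN")
conn.execute("INSERT INTO authors (name) VALUES ('Ben Simpson')")
inside = count_authors()   # the open transaction sees its own insert

conn.execute("ROLLBACK")
after = count_authors()    # the insert has been undone

print(inside, after)  # 1 0
```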

Turn Autocommit off now

In your MySQL prompt, type:

SET AUTOCOMMIT = 0;

BEGIN WORK;

We will use a transaction for the remainder of the slides to isolate changes to the database to just your session. If you need to roll back, do:

ROLLBACK WORK; BEGIN WORK;

INSERT

● Creates one or more new records

● Uses keyword INSERT

● Columns to insert are in parentheses after the table name

● VALUES keyword precedes the values, in the order of the columns listed

● Note that the primary key is automatically populated (when the column is AUTO_INCREMENT)

● Ways to batch insert include LOAD DATA INFILE, and INSERT INTO ... SELECT ... FROM

INSERT INTO authors (name) VALUES ('Ben Simpson');
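A short sketch of the points above: the generated primary key, a multi-row insert, and INSERT INTO ... SELECT. It assumes SQLite through Python's sqlite3 (where AUTOINCREMENT plays the role of MySQL's AUTO_INCREMENT); the authors_backup table and the extra author names are hypothetical.

```python
import sqlite3

# SQLite stand-in for the "authors" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

# We only supply name; the database fills in the primary key.
cur = conn.execute("INSERT INTO authors (name) VALUES (?)", ("Ben Simpson",))
new_id = cur.lastrowid

# Batch insert of several rows with one statement template.
conn.executemany(
    "INSERT INTO authors (name) VALUES (?)",
    [("Author Two",), ("Author Three",)],
)
rows = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()

# INSERT INTO ... SELECT copies rows from one table into another.
conn.execute("CREATE TABLE authors_backup (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO authors_backup (name) SELECT name FROM authors")
copied = conn.execute("SELECT count(1) FROM authors_backup").fetchone()[0]

print(new_id, rows, copied)
```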

UPDATE

● Modifies one or more existing records

● Can be used with conditions to update all records that match

UPDATE authors SET name = 'Benjamin Lee Simpson' WHERE id = 1957;

DELETE

● Deletes one or more records!

● PRO TIP! Write your statement first as a SELECT along with conditions before writing the word DELETE

● Note that DELETE FROM <table> is valid and will affect all records

● DELETE does not reset the auto-increment counter for primary keys. To reset it use TRUNCATE

DELETE FROM authors WHERE id = 1957;
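The pro tip can be sketched as a two-step routine: run the condition as a SELECT, check the preview, then reuse the exact same condition for the DELETE. This assumes SQLite through Python's sqlite3 instead of MySQL (AUTOINCREMENT standing in for AUTO_INCREMENT), and the rows are invented.

```python
import sqlite3

# SQLite stand-in for the "authors" table; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")
conn.executemany("INSERT INTO authors (name) VALUES (?)", [("A",), ("B",), ("C",)])

condition, params = "id = ?", (3,)

# Step 1: preview exactly which rows the condition matches.
preview = conn.execute(
    "SELECT id, name FROM authors WHERE " + condition, params
).fetchall()

# Step 2: reuse the same condition for the DELETE.
deleted = conn.execute("DELETE FROM authors WHERE " + condition, params).rowcount

# The key counter is not reset by DELETE: the next insert gets id 4, not 3.
new_id = conn.execute("INSERT INTO authors (name) VALUES ('D')").lastrowid

print(preview, deleted, new_id)  # [(3, 'C')] 1 4
```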

EXPLAIN

● Shows how the database is planning to execute your SQL statement

● Useful for diagnosing why a query is slow. Helps find where an index would be beneficial

● Prefix your query with keyword EXPLAIN

● Operations like filesort and copying to a tmp table ("Using filesort" and "Using temporary" in the Extra column) are usually BAD

EXPLAIN SELECT * FROM books WHERE rating > 4;

Indexes

● Quick reference to a subset of records that match a common condition

● Foreign keys are often used in JOINs and typically benefit from indexing

● Indexes are not free - when a row is inserted/updated/deleted each index containing that row must be updated as well

SHOW INDEXES FROM authors;

CREATE INDEX books_rating on books (rating);

EXPLAIN SELECT * FROM books WHERE rating > 4;
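To compare the plan before and after creating the index without touching the shared server, here is a rough sketch using SQLite through Python's sqlite3. SQLite's EXPLAIN QUERY PLAN is an assumption standing in for MySQL's EXPLAIN, so the output format differs from what the mysql prompt shows; plan is a hypothetical helper and the 100 rows are generated sample data.

```python
import sqlite3

# SQLite stand-in for the "books" table, filled with generated sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO books (title, rating) VALUES (?, ?)",
    [("Book %d" % i, i % 5 + 1) for i in range(100)],
)

def plan(sql):
    """Hypothetical helper: join the detail column of EXPLAIN QUERY PLAN rows."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM books WHERE rating > 4"

before = plan(query)  # no index on rating exists yet
conn.execute("CREATE INDEX books_rating ON books (rating)")
after = plan(query)   # the planner can now pick books_rating

print(before)
print(after)
```

The exact wording of the plan varies between SQLite versions; the thing to look for is the index name appearing only in the second plan.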