Post on 02-Jan-2016
Indexing
• For very small relational databases, looking up information directly is quick and easy
– But production-level databases can quickly become large enough to create delays while searching
• An index uses a small amount of memory and disk space in order to speed up searching
Index Types
• There are three different index types in MySQL:
– INDEX
– PRIMARY KEY
– FULLTEXT
• Determining what type to use and where to apply it
– ALTER TABLE classics ADD INDEX (author(20));
– CREATE INDEX author ON classics (author(20));
Primary Keys
• Primary keys are unique identifiers for each row in a table
• They can be added after a table is created
– ALTER TABLE classics ADD isbn CHAR(13) PRIMARY KEY;
• But this only works for an unpopulated table, because every existing row would receive the same empty value
– For a populated table, use another column (or create a new one), fill it with unique data for each entry, and only then make it the key
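A minimal sketch of that workaround, assuming the classics table already holds rows. The titles and ISBN values below are placeholders, not real data:

```sql
-- Step 1: add the column without the key, so existing rows are accepted.
ALTER TABLE classics ADD isbn CHAR(13);

-- Step 2: give each existing row its own value (one UPDATE per row).
-- Placeholder titles and ISBNs; substitute the real ones.
UPDATE classics SET isbn='0000000000001' WHERE title='First Title';
UPDATE classics SET isbn='0000000000002' WHERE title='Second Title';

-- Step 3: once every row has a unique isbn, promote it to the primary key.
ALTER TABLE classics ADD PRIMARY KEY (isbn);
```

Step 3 is rejected if any two rows still share a value, which is the same reason the one-line version fails on a populated table.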
FULLTEXT
• Saves each word of the data string in an index that can be searched using natural language
– Still only works for small tables
– Only works for CHAR, VARCHAR, and TEXT columns
• ALTER TABLE classics ADD FULLTEXT(author,title);
Querying
• After creating and deleting databases the most important thing you will need to do is query them
• Basic format is:
– SELECT author, title FROM classics;
• This can then be expanded to refine the information you are receiving back
SELECT Modifications
• SELECT COUNT
– Returns the number of rows that match the query
– SELECT COUNT(*) FROM classics;
• SELECT DISTINCT
– Displays each distinct value just once, even if several rows contain it
– SELECT author FROM classics;
– SELECT DISTINCT author FROM classics;
WHERE & LIKE
• WHERE allows you to narrow down the results that you get based upon certain qualifiers
– SELECT author,title FROM classics WHERE author="Mark Twain";
• This requires specific knowledge of what you need to search for
• LIKE allows for vague searches using the % wildcard character
– SELECT author,title FROM classics WHERE author LIKE "Charles%";
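As a rough illustration of where the wildcard can go, the queries below reuse the classics table. The search strings are only examples; % matches any run of characters (including none) and _ matches exactly one character:

```sql
-- Starts with "Charles"
SELECT author,title FROM classics WHERE author LIKE "Charles%";
-- Ends with "Species"
SELECT author,title FROM classics WHERE title LIKE "%Species";
-- Contains "and" anywhere
SELECT author,title FROM classics WHERE title LIKE "%and%";
-- Exactly four characters long (one underscore per character)
SELECT author,title FROM classics WHERE title LIKE "____";
```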
Activity
Sample Table: empinfo
first      last      id     age  city        state
John       Jones     99980  45   Payson      Arizona
Mary       Jones     99982  25   Payson      Arizona
Eric       Edwards   88232  32   San Diego   California
Mary Ann   Edwards   88233  32   Phoenix     Arizona
Ginger     Howell    98002  42   Cottonwood  Arizona
Sebastian  Smith     92001  23   Gila Bend   Arizona
Gus        Gray      22322  35   Bagdad      Arizona
Mary Ann   May       32326  52   Tucson      Arizona
Erica      Williams  32327  60   Show Low    Arizona
Leroy      Brown     32380  22   Pinetop     Arizona
Elroy      Cleaver   32382  22   Globe       Arizona
Activity
• Display all columns for everyone that is over 40 years old.
• Create a select query that finds all people with the last name “Jones”
• Create a select query that finds people with first names that start with “Er”
• Create a select query that finds people with last names ending in “s”
LIMIT
• LIMIT can be used to make sure that only a certain number of rows are returned in response to a query
– Using just one number means that it will start at the beginning and return that number of rows
– Giving it two numbers will skip the first number of rows and then return the second number of rows
– Offsets begin like arrays (at 0)
• SELECT author,title FROM classics LIMIT 3;
• SELECT author,title FROM classics LIMIT 1,2;
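LIMIT without ORDER BY returns rows in whatever order the server happens to produce them. A small sketch, assuming you want repeatable pages of results sorted by title:

```sql
-- First page: the first 3 rows of the sorted result
SELECT author,title FROM classics ORDER BY title LIMIT 3;
-- Second page: skip 3 rows, return the next 3
SELECT author,title FROM classics ORDER BY title LIMIT 3,3;
-- Same query with the OFFSET keyword, which some find easier to read
SELECT author,title FROM classics ORDER BY title LIMIT 3 OFFSET 3;
```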
MATCH…AGAINST
• Only usable on columns with a FULLTEXT index
• Allows you to do searches like in an Internet search engine (using normal language)
– Certain common words (and, or, not) are ignored by this modifier, and a search made up only of them returns an empty set
• SELECT author,title FROM classics WHERE MATCH(author,title) AGAINST('old shop');
• In Boolean mode, adding a + requires a word and a - bars it from being in the result
– Wrapping words in double quotes searches for the exact phrase
• SELECT author,title FROM classics WHERE MATCH(author,title) AGAINST('"origin of"' IN BOOLEAN MODE);
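A short sketch of the + and - operators, assuming the FULLTEXT index on (author,title) already exists. The search words are examples only:

```sql
-- Rows must contain "charles" and must not contain "species"
SELECT author,title FROM classics
  WHERE MATCH(author,title) AGAINST('+charles -species' IN BOOLEAN MODE);
```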
UPDATE…SET
• UPDATE…SET changes the contents of the rows matched by a search
• The WHERE clause makes sure that only the rows you want changed are updated
• UPDATE classics SET author='Mark Twain (Samuel Langhorne Clemens)' WHERE author='Mark Twain';
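One cautious habit, sketched here as a suggestion rather than a rule: run a SELECT with the same WHERE clause first to see which rows the UPDATE will touch.

```sql
-- Preview the rows that match
SELECT author,title FROM classics WHERE author='Mark Twain';
-- Then apply the change to exactly those rows
UPDATE classics SET author='Mark Twain (Samuel Langhorne Clemens)'
  WHERE author='Mark Twain';
```

Leaving off the WHERE clause would update every row in the table.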
ORDER BY
• Order by is a sort feature that you can apply to one or more columns
• Allows for both ascending and descending with ascending being the default
• SELECT author,title FROM classics ORDER BY author;
• SELECT author,title FROM classics ORDER BY title DESC;
• SELECT author,title,year FROM classics ORDER BY author,year DESC;
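In the last query DESC applies only to year; author is still sorted ascending. Written out explicitly, the same sort reads:

```sql
-- Each column carries its own direction
SELECT author,title,year FROM classics ORDER BY author ASC, year DESC;
```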
GROUP BY
• Groups the results of a search according to the values of a column in the table
• Useful when you want to see how the data relates to other information in the table
• SELECT category,COUNT(author) FROM classics GROUP BY category;
customerid  firstname  lastname  city          state
10101       John       Gray      Lynden        Washington
10298       Leroy      Brown     Pinetop       Arizona
10299       Elroy      Keller    Snoqualmie    Washington
10315       Lisa       Jones     Oshkosh       Wisconsin
10325       Ginger     Schultz   Pocatello     Idaho
10329       Kelly      Mendoza   Kailua        Hawaii
10330       Shawn      Dalton    Cannon Beach  Oregon
10338       Michael    Howell    Tillamook     Oregon
10339       Anthony    Sanchez   Winslow       Arizona
10408       Elroy      Cleaver   Globe         Arizona
10410       Mary Ann   Howell    Charleston    South Carolina
10413       Donald     Davids    Gila Bend     Arizona
10419       Linda      Sakahara  Nogales       Arizona
10429       Sarah      Graham    Greensboro    North Carolina
10438       Kevin      Smith     Durango       Colorado
10439       Conrad     Giles     Telluride     Colorado
10449       Isabela    Moore     Yuma          Arizona
customers
customerid  order_date   item                 quantity  price
10330       30-Jun-1999  Pogo stick           1         28.00
10101       30-Jun-1999  Raft                 1         58.00
10298       01-Jul-1999  Skateboard           1         33.00
10101       01-Jul-1999  Life Vest            4         125.00
10299       06-Jul-1999  Parachute            1         1250.00
10339       27-Jul-1999  Umbrella             1         4.50
10449       13-Aug-1999  Unicycle             1         180.79
10439       14-Aug-1999  Ski Poles            2         25.50
10101       18-Aug-1999  Rain Coat            1         18.30
10449       01-Sep-1999  Snow Shoes           1         45.00
10439       18-Sep-1999  Tent                 1         88.00
10298       19-Sep-1999  Lantern              2         29.00
10410       28-Oct-1999  Sleeping Bag         1         89.22
10438       01-Nov-1999  Umbrella             1         6.75
10438       02-Nov-1999  Pillow               1         8.50
10298       01-Dec-1999  Helmet               1         22.00
10449       15-Dec-1999  Bicycle              1         380.50
10449       22-Dec-1999  Canoe                1         280.00
10101       30-Dec-1999  Hoola Hoop           3         14.75
10330       01-Jan-2000  Flashlight           4         28.00
10101       02-Jan-2000  Lantern              1         16.00
10299       18-Jan-2000  Inflatable Mattress  1         38.00
10438       18-Jan-2000  Tent                 1         79.99
10413       19-Jan-2000  Lawnchair            4         32.00
10410       30-Jan-2000  Unicycle             1         192.50
10315       2-Feb-2000   Compass              1         8.00
10449       29-Feb-2000  Flashlight           1         4.50
10101       08-Mar-2000  Sleeping Bag         2         88.70
10298       18-Mar-2000  Pocket Knife         1         22.38
10449       19-Mar-2000  Canoe paddle         2         40.00
10298       01-Apr-2000  Ear Muffs            1         12.50
10330       19-Apr-2000  Shovel               1         16.75
items_ordered
Activity
• Select the lastname, firstname, and city for all customers in the customers table. Display the results in Ascending Order based on the lastname.
• How many people are in each unique state in the customers table? Select the state and display the number of people in each. Hint: count is used to count rows in a column, sum works on numeric data only.
• How many orders did each customer make?
Joining Tables
• Sometimes the information you need is spread across several tables
• A JOIN allows for that information to be combined in the results table (reducing the amount of information you need to handle)
• Simple joins are very easy to perform: list both tables after FROM and match the shared column in the WHERE clause
• SELECT name,author,title FROM customers,classics WHERE customers.isbn=classics.isbn;
Types of Joins
• Joining can be altered to create more specific results
• NATURAL JOIN – automatically joins the tables on columns that have the same column name
• JOIN…ON – allows you to specify the column on which to join the two tables
• AS – allows you to create aliases that shorten table names when you reference them
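The three variations can be sketched as follows, assuming (as in the simple join on customers and classics) that customers has name and isbn columns and that isbn is the only column name the two tables share:

```sql
-- NATURAL JOIN: matches on the identically named column (isbn)
SELECT name,author,title FROM customers NATURAL JOIN classics;
-- JOIN...ON: the join column is stated explicitly
SELECT name,author,title FROM customers
  JOIN classics ON customers.isbn=classics.isbn;
-- AS: short aliases stand in for the table names
SELECT name,author,title FROM customers AS cust, classics AS class
  WHERE cust.isbn=class.isbn;
```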
Activity
• Using the previous two tables:
• Write a query using a join to determine which items were ordered by each of the customers in the customers table. Select the customerid, firstname, lastname, order_date, item, and price for everything each customer purchased in the items_ordered table.
Logical Operators
• AND, OR, and NOT are used in WHERE clauses to narrow down the results
• Useful when data may be saved in a couple of different ways:
• SELECT author,title FROM classics WHERE author LIKE "%Mark Twain%" OR author LIKE "%Samuel Langhorne Clemens%";
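AND and NOT work the same way as OR. A brief sketch on the classics table, with example search strings:

```sql
-- AND: both conditions must hold
SELECT author,title FROM classics
  WHERE author LIKE "Charles%" AND author LIKE "%Darwin";
-- NOT: exclude rows that match the second pattern
SELECT author,title FROM classics
  WHERE author LIKE "Charles%" AND author NOT LIKE "%Darwin";
```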
Database Design
• Databases thrive on good and efficient design
• The correct layout will improve the efficiency, speed, and usefulness of your site
• Begin by trying to anticipate what types of queries will be commonly needed on your site
– What could they be for the project site?
– What types of information are needed to answer these questions?
– What seem like some naturally occurring groups?
Primary Keys
• One of the most important parts of good database design
– Having a quick unique identifier makes storing and retrieving information much easier
– Keys should be truly unique and not repeatable for different objects
– Auto increment works great for this
– Unfortunately it's not the most natural of keys
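A minimal sketch of an auto-increment key. The table name visitors and its columns are made up for illustration:

```sql
CREATE TABLE visitors (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- MySQL assigns the next number
  name VARCHAR(128) NOT NULL,
  PRIMARY KEY (id)
);
-- id is omitted from the INSERT; each row receives its own value
INSERT INTO visitors (name) VALUES ('Emma Brown'), ('Darren Ryder');
SELECT id,name FROM visitors;
```

The generated numbers are guaranteed to be unique, but they carry no meaning of their own, which is why they are not the most natural of keys.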
Normalization
• You want to avoid duplication of information in the database
– Redundancy increases the size of the database and how long it takes for results to be returned
• Duplicates also make consistency (one of those key principles!) hard to maintain, since you have to make sure all instances of an entry are updated (or deleted)
Normalization Schemas
• There are three separate schemas for normalization (yes I know the list has 4)
– First
– Second
– Third
– Normal Form
• Normalizing for each of these forms will make sure your database stays in that sweet spot
Author 1          Author 2                        Title            ISBN        Price U.S.  Cust. name        Cust. address                             Purch. date
David Sklar       Adam Trachtenberg               PHP Cookbook     0596101015  44.99       Emma Brown        1565 Rainbow Road, Los Angeles, CA 90014  Mar 03 2009
Danny Goodman                                     Dynamic HTML     0596527403  59.99       Darren Ryder      4758 Emily Drive, Richmond, VA 23219      Dec 19 2008
Hugh E. Williams  David Lane                      PHP and MySQL    0596005436  44.95       Earl B. Thurston  862 Gregory Lane, Frankfort, KY 40601     Jun 22 2009
David Sklar       Adam Trachtenberg               PHP Cookbook     0596101015  44.99       Darren Ryder      4758 Emily Drive, Richmond, VA 23219      Dec 19 2008
Rasmus Lerdorf    Kevin Tatroe & Peter MacIntyre  Programming PHP  0596006815  39.99       David Miller      3647 Cedar Lane, Waltham, MA 02154        Jan 16 2009
Table 9-1. A highly inefficient design for a database table
First Normal Form
• For a database to satisfy the First Normal Form, it must fulfill three requirements:
1. There should be no repeating columns containing the same kind of data.
2. All columns should contain a single value.
3. There should be a primary key to uniquely identify each row.
• Columns which are needed but do not fit this form can (and should) be spun off to another table
ISBN        Author
0596101015  David Sklar
0596101015  Adam Trachtenberg
0596527403  Danny Goodman
0596005436  Hugh E. Williams
0596005436  David Lane
0596006815  Rasmus Lerdorf
0596006815  Kevin Tatroe
0596006815  Peter MacIntyre
Table 9-3. The new Authors table
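One way this table might be declared, as a sketch only. The column sizes and the composite key on (isbn, author) are assumptions, not something the table above specifies:

```sql
CREATE TABLE authors (
  isbn   CHAR(10)     NOT NULL,
  author VARCHAR(128) NOT NULL,
  PRIMARY KEY (isbn, author)   -- isbn alone repeats, so it cannot be the key by itself
);
INSERT INTO authors (isbn, author) VALUES
  ('0596101015', 'David Sklar'),
  ('0596101015', 'Adam Trachtenberg');
-- All authors of one book
SELECT author FROM authors WHERE isbn='0596101015';
```

Each author now sits in its own row, so a book can have any number of authors without adding Author 3, Author 4, and so on as columns.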
Second Normal Form
• Only after achieving First Normal Form can Second Normal Form be evaluated
• Second Normal Form is achieved by identifying columns whose data repeats in different places and then removing them to their own tables.
CustNo  Name              Address            City         State  Zip
1       Emma Brown        1565 Rainbow Road  Los Angeles  CA     90014
2       Darren Ryder      4758 Emily Drive   Richmond     VA     23219
3       Earl B. Thurston  862 Gregory Lane   Frankfort    KY     40601
4       David Miller      3647 Cedar Lane    Waltham      MA     02154
Table 9-6. The new Customers table
CustNo ISBN Date
1 0596101015 Mar 03 2009
2 0596527403 Dec 19 2008
2 0596101015 Dec 19 2008
3 0596005436 Jun 22 2009
4 0596006815 Jan 16 2009
Table 9-7. The new Purchases table
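With the data split this way, a join puts a purchase back together with its customer. A sketch, assuming the two tables are created with the names and columns shown in Tables 9-6 and 9-7:

```sql
-- One result row per purchase, with the customer's name looked up by CustNo
SELECT c.Name, p.ISBN, p.`Date`
  FROM Customers AS c
  JOIN Purchases AS p ON c.CustNo = p.CustNo
  ORDER BY c.CustNo;
```

Darren Ryder's address is now stored once, even though he made two purchases.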
Third Normal Form
• Third Normal Form is considered to be the strictest of the three rules to follow and isn’t always needed to have a productive database
• Data that is not directly dependent on the primary key but that is dependent on another value in the table should also be moved into separate tables, according to the dependence.
• In this example it would require making three new tables for zip codes, states, and cities
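A rough sketch of what those three tables might look like. The table names, column names, and types are illustrative assumptions; the Customers table would then keep only Zip and drop City and State:

```sql
CREATE TABLE States (
  StateID      INT UNSIGNED NOT NULL PRIMARY KEY,
  Name         VARCHAR(64)  NOT NULL,
  Abbreviation CHAR(2)      NOT NULL
);
CREATE TABLE Cities (
  CityID  INT UNSIGNED NOT NULL PRIMARY KEY,
  Name    VARCHAR(64)  NOT NULL,
  StateID INT UNSIGNED NOT NULL      -- points at States.StateID
);
CREATE TABLE ZipCodes (
  Zip    CHAR(5)      NOT NULL PRIMARY KEY,
  CityID INT UNSIGNED NOT NULL       -- points at Cities.CityID
);
-- Rebuild the full address by following Zip -> City -> State
SELECT c.Name, c.Address, ci.Name AS City, s.Abbreviation AS State, c.Zip
  FROM Customers AS c
  JOIN ZipCodes AS z  ON c.Zip = z.Zip
  JOIN Cities   AS ci ON z.CityID = ci.CityID
  JOIN States   AS s  ON ci.StateID = s.StateID;
```

The three joins in the final query hint at the cost: every address lookup now touches four tables instead of one.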
When to Use Third Normal Form
• When additional information may be needed
– E.g. the two-letter state abbreviation
• The book suggests that if you answer “yes” to either of these questions then you should follow Third Normal Form:
1. Is it likely that many new columns will need to be added to this table?
2. Could any of this table’s fields require a global update at any point?
When Not to Normalize
• Spreading info across so many tables can make MySQL work hard to return your results
• On a very popular site, if you have normalized tables, your database access will slow down considerably once you get above a few dozen concurrent users
Databases and Anonymity
• A lot of information about people gets stored in databases of dynamic websites
• This can be a benefit or a danger
– How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did