Normalizing Database Files Professor Ralph Westfall May, 2011.
-
Upload
howard-melton -
Category
Documents
-
view
214 -
download
0
Transcript of Normalizing Database Files Professor Ralph Westfall May, 2011.
Normalizing Database Files
Professor Ralph WestfallMay, 2011
Concepts Covered
Using SQL Server Management Studio to take unnormalized data in a file and splitting it into two normalized files For example, repeated data in a field that has a
one to many or many to many relationship with the other fields in the table
Doing the same thing with Excel
Create Non-Redundant Table
CREATE TABLE currenciesCKey (Currency VARCHAR(15))INSERT INTO currencies (Currency)SELECT DISTINCT Currency from nations
Word DISTINCT gets rid of duplicated entries in the normalized new table
Finish Creating New Table
Click Execute button Click in Object Explorer window, then click
Refresh icon and expand Tables node for this database to see the new table
May need to right-click Tables>Refresh for the database used to see table that was just created
Add Normalized New Table Key
Right-click this new table name>Design Right-click top field>Insert Column Change Data Type to int (or something else
appropriate) Right-click this new column>Set Primary Key Set (Is Identity) to Yes in Column Properties Set Identity Seed to a value large enough so all
current and expected rows will have same number of digits in their key
Click diskette icon to save this new table
Create Table with Foreign Key
Define a new table that has a foreign key to reference the data in the second table
Do a join that includes all fields from both tables EXCEPT for the field that went into the separate normalized table the key field from the second table gets added to
the new table but the data the key field references doesn't go into it
Results of this join fill the new table with data
Create Table with Foreign Key
CREATE TABLE [table] ([field [datatype], [field] [datatype], etc.)INSERT INTO [table] ([field], [field], [field],[table].[field], etc.)
WHERE [table].[field]=[other_table].[field] Will have all fields (excluding duplicates) in
the new table Will need to qualify fields with same names in
both tables by name of the table they come from Execute, then Refresh to see the new table See sample code in Notes
Verify Data
Use SQL Server Management Studio to Select Top 1000 Rows to see data in new table visually check foreign keys against values in
other table Or run a SQL query to join and view the data
Backup
Backup data from the original table Right-click database name>Tasks>Export
Data>Next Data source is SQL Server Native Client
10.0>Use Windows Authentication>Database is the one that was being used>Next
Create a new .txt file on desktop>Destination is Flat File Destination>Browse for that file>Format: Delimited>Next
Copy data from one or more etc.>Next Source table or view: [original table]>Next>Next>
Finish
Finish Backup and Delete
Verify that the new table has the same number of rows as the original table SELECT COUNT([key_field]) FROM
[original_table] and then repeat with new table Delete the original table Detach the database
Excel
Could easily normalize small and middle sized data sets in Excel
See Filter for unique values or remove duplicate values for how to get unique values for second table
Copy the column to be normalized into a new table add sequence of numbers in column to the left put captions over each of these two columns follow instructions in the link above
Excel - 2
Add an extra column to the main table create a VLOOKUP column on the second table
you just created that converts the data in the un-normalized column in the main table to the key value
use Paste Special to replace the actual with key values
save both tables as .csv files to load into a database