1 Merging with SQL HRP223 – 2012 October 29, 2012 Copyright © 1999-2012 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.

Merging with SQL

HRP223 – 2012 October 29, 2012

Combining

• When you have data in two tables, you need to tell SQL how the two tables are related to each other.
– Typically you have a subject ID number in both files. The variable that can be used to link information is called the key.

Demographics

Diagnosis

Here the two tables have different variables except the common ID.

I want to know the diagnosis for this cohort.

Merging

• Merging is trivially easy with EG. Choose a table, open the Query Builder, and push the Join Tables button.

Double click on the dividing lines to make the columns wide enough to read.

Notice the name t1. In the SQL statements, variables from this table will have the prefix t1.

This table will be referred to as t2.

The Query Builder noticed that the two tables have the common variable ID. Therefore it is going to match records that have a common value in ID.

Double click the link for details.
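
The aliasing and ID matching described here can be sketched in plain SQL. This is a minimal sketch, not the code EG generates: it runs the query in SQLite through Python's sqlite3 module as a stand-in for SAS, and the table names, columns, and values are all invented for illustration.

```python
import sqlite3

# Hypothetical tables and values, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demographics (ID INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE diagnosis    (ID INTEGER PRIMARY KEY, dx TEXT);
    INSERT INTO demographics VALUES (1, 34), (2, 51), (3, 47);
    INSERT INTO diagnosis    VALUES (1, 'asthma'), (3, 'diabetes');
""")

# t1 and t2 are table aliases. Each column carries the alias of the table
# it comes from, and the ON clause matches records with a common ID value.
query = """
    SELECT t1.ID, t1.age, t2.dx
    FROM demographics AS t1
    INNER JOIN diagnosis AS t2
        ON t1.ID = t2.ID
    ORDER BY t1.ID
"""
alias_rows = conn.execute(query).fetchall()
print(alias_rows)  # [(1, 34, 'asthma'), (3, 47, 'diabetes')]
```

The prefixes matter most for ID, which exists in both tables: writing t1.ID says which copy to return.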

Joins

• You will typically do inner joins and left joins.
– Inner Joins: select the matching records.
– Left Joins: select all records on the left side and any records that match on the right.

Inner Joins

• Inner Joins are useful when you want to keep the information from the tables if, and only if, there are matches in both tables.
– Here you keep the records where you have both demographic and response-to-treatment information on people.
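
The "if and only if" behavior can be checked with a small sketch. It assumes SQLite (via Python's sqlite3) in place of SAS, and the demographics and response tables and their values are hypothetical. ID 1 has no response record and ID 4 has no demographic record, so both should drop out.

```python
import sqlite3

# Hypothetical data: IDs 2 and 3 appear in both tables, 1 and 4 in only one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demographics (ID INTEGER, sex TEXT);
    CREATE TABLE response     (ID INTEGER, improved TEXT);
    INSERT INTO demographics VALUES (1, 'F'), (2, 'M'), (3, 'F');
    INSERT INTO response     VALUES (2, 'yes'), (3, 'no'), (4, 'yes');
""")

# An inner join keeps a record only when the ID is found in both tables.
inner_rows = conn.execute("""
    SELECT t1.ID, t1.sex, t2.improved
    FROM demographics AS t1
    INNER JOIN response AS t2
        ON t1.ID = t2.ID
    ORDER BY t1.ID
""").fetchall()
print(inner_rows)  # [(2, 'M', 'yes'), (3, 'F', 'no')]
```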

Left Joins

• Left joins are useful when you have a table with everybody on the left side of the join and not everyone has records in the right table.
– A typical example has the left side with the IDs of everyone in a family and the right table has information on diagnoses. Not everyone is sick so you want to keep all the IDs on the left and add in diagnoses where you can.
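
A minimal sketch of the family example, assuming SQLite (via Python's sqlite3) as a stand-in for SAS. The family and diagnoses tables and their values are made up. Every ID on the left is kept, and the diagnosis is NULL (None in Python) where there is no match on the right.

```python
import sqlite3

# Hypothetical data: four family members, two of whom have a diagnosis.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE family    (ID INTEGER);
    CREATE TABLE diagnoses (ID INTEGER, dx TEXT);
    INSERT INTO family    VALUES (101), (102), (103), (104);
    INSERT INTO diagnoses VALUES (102, 'hypertension'), (104, 'asthma');
""")

# A left join keeps every row of the left table (family) and fills in
# the right table's columns only where the ID matches.
left_rows = conn.execute("""
    SELECT t1.ID, t2.dx
    FROM family AS t1
    LEFT JOIN diagnoses AS t2
        ON t1.ID = t2.ID
    ORDER BY t1.ID
""").fetchall()
print(left_rows)
# [(101, None), (102, 'hypertension'), (103, None), (104, 'asthma')]
```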

Typical Left Join

Notice the numeric variable is formatted to display with words.

More Complete

Coalesce

• The previous example leaves NULL for the people who are disease free. You probably want to list them as healthy.

• The coalesce function returns the first non-missing value.
– Coalesce works on numeric lists.
– CoalesceC works on character lists.
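
A small sketch of both points, assuming SQLite (via Python's sqlite3) rather than SAS. SQLite has a single COALESCE that accepts numeric and character values alike, so it stands in here for both Coalesce and CoalesceC. The tables and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# COALESCE returns the first non-missing (non-NULL) value in its list.
first_value = conn.execute("SELECT COALESCE(NULL, NULL, 7, 9)").fetchone()[0]
print(first_value)  # 7

# Hypothetical data: two of four family members have a diagnosis.
conn.executescript("""
    CREATE TABLE family    (ID INTEGER);
    CREATE TABLE diagnoses (ID INTEGER, dx TEXT);
    INSERT INTO family    VALUES (101), (102), (103), (104);
    INSERT INTO diagnoses VALUES (102, 'hypertension'), (104, 'asthma');
""")

# After the left join, dx is NULL for people with no diagnosis record;
# COALESCE replaces that NULL with the word 'Healthy'.
status_rows = conn.execute("""
    SELECT t1.ID, COALESCE(t2.dx, 'Healthy') AS status
    FROM family AS t1
    LEFT JOIN diagnoses AS t2
        ON t1.ID = t2.ID
    ORDER BY t1.ID
""").fetchall()
print(status_rows)
# [(101, 'Healthy'), (102, 'hypertension'), (103, 'Healthy'), (104, 'asthma')]
```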

Expand the tree

Use the query builder with an advanced expression….

Complex Coalesce

• If you are using left joins from multiple tables, coalesce can be really useful.
– Say some people have reported disease, other people have verified disease, and the rest are assumed to be healthy. You can coalesce an indicator variable from the verified table and the reported table and call everybody else healthy.

If the tables have indicator variables, once the tables are linked, the coalesce function is easy:
COALESCEC(t3.status2, t2.status1, "Healthy")
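
The same expression can be tried end to end in a minimal sketch. It assumes SQLite (via Python's sqlite3) in place of SAS, so COALESCE stands in for COALESCEC and the literal uses single quotes. The column names status1 and status2 come from the slide; the table names and the data are invented.

```python
import sqlite3

# Hypothetical data: person 2 is both reported and verified, person 4 is neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE everyone (ID INTEGER);
    CREATE TABLE reported (ID INTEGER, status1 TEXT);
    CREATE TABLE verified (ID INTEGER, status2 TEXT);
    INSERT INTO everyone VALUES (1), (2), (3), (4);
    INSERT INTO reported VALUES (1, 'Reported'), (2, 'Reported');
    INSERT INTO verified VALUES (2, 'Verified'), (3, 'Verified');
""")

# Two left joins keep everyone. The verified status is listed first, so it
# wins over the reported status; 'Healthy' is used when both are NULL.
combined = conn.execute("""
    SELECT t1.ID,
           COALESCE(t3.status2, t2.status1, 'Healthy') AS status
    FROM everyone AS t1
    LEFT JOIN reported AS t2 ON t1.ID = t2.ID
    LEFT JOIN verified AS t3 ON t1.ID = t3.ID
    ORDER BY t1.ID
""").fetchall()
print(combined)
# [(1, 'Reported'), (2, 'Verified'), (3, 'Verified'), (4, 'Healthy')]
```

The order of the arguments sets the priority: swapping t3.status2 and t2.status1 would label person 2 as Reported instead.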

No indicator variables?

• If the tables you are coalescing do not have indicator variables, just make them as part of the query by adding a column which has the ID in the child tables (e.g., reported and verified) recoded to a word like “reported” or “verified”.
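
One way to build such indicator columns in SQL is a CASE expression on the child table's ID. This is a sketch under assumptions, not the code the Query Builder's recode produces: it uses SQLite (via Python's sqlite3), and the table names, the column names rep and ver, and the data are hypothetical.

```python
import sqlite3

# Hypothetical child tables that hold nothing but IDs (no indicator variable).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE everyone (ID INTEGER);
    CREATE TABLE reported (ID INTEGER);
    CREATE TABLE verified (ID INTEGER);
    INSERT INTO everyone VALUES (1), (2), (3), (4);
    INSERT INTO reported VALUES (1), (2);
    INSERT INTO verified VALUES (2), (3);
""")

# After a left join, a child table's ID is NULL when there was no match.
# CASE recodes a non-NULL ID to a word; with no ELSE, the result stays NULL.
indicator_rows = conn.execute("""
    SELECT t1.ID,
           CASE WHEN t2.ID IS NOT NULL THEN 'reported' END AS rep,
           CASE WHEN t3.ID IS NOT NULL THEN 'verified' END AS ver
    FROM everyone AS t1
    LEFT JOIN reported AS t2 ON t1.ID = t2.ID
    LEFT JOIN verified AS t3 ON t1.ID = t3.ID
    ORDER BY t1.ID
""").fetchall()
print(indicator_rows)
# [(1, 'reported', None), (2, 'reported', 'verified'),
#  (3, None, 'verified'), (4, None, None)]
```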

The two new indicator columns.

Coalesce the new columns

• Once the new columns are created, create a new variable using the Advanced expression option for a new computed column. Then do coalesce on the new variables. Double click on the new variables and it will insert the code.

After double clicking the ver variable the code is inserted.

Don’t forget the comma before double clicking the rep variable.

After inserting reported and verified, put in another comma and the “healthy” option.
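
The finished expression has three comma-separated arguments: ver, rep, and the "healthy" word. A minimal sketch of the whole query, assuming SQLite (via Python's sqlite3) in place of SAS; the tables and data are hypothetical. Because SQLite cannot refer to a computed column inside the same SELECT, the sketch builds ver and rep in a subquery first, which is a swapped-in technique rather than what the Query Builder writes.

```python
import sqlite3

# Hypothetical child tables holding only the IDs of affected people.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE everyone (ID INTEGER);
    CREATE TABLE reported (ID INTEGER);
    CREATE TABLE verified (ID INTEGER);
    INSERT INTO everyone VALUES (1), (2), (3), (4);
    INSERT INTO reported VALUES (1), (2);
    INSERT INTO verified VALUES (2), (3);
""")

# Inner query: build the two indicator columns (rep, ver).
# Outer query: coalesce them in priority order, ending with 'healthy'.
final_rows = conn.execute("""
    SELECT ID, COALESCE(ver, rep, 'healthy') AS status
    FROM (
        SELECT t1.ID AS ID,
               CASE WHEN t2.ID IS NOT NULL THEN 'reported' END AS rep,
               CASE WHEN t3.ID IS NOT NULL THEN 'verified' END AS ver
        FROM everyone AS t1
        LEFT JOIN reported AS t2 ON t1.ID = t2.ID
        LEFT JOIN verified AS t3 ON t1.ID = t3.ID
    )
    ORDER BY ID
""").fetchall()
print(final_rows)
# [(1, 'reported'), (2, 'verified'), (3, 'verified'), (4, 'healthy')]
```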
