Incremental Load
-
Incremental Load
using QVD files
-
Incremental Load
Is sometimes called: Incremental Load, Differential Load, Delta Load.
-
Incremental Load
Goal: Load only the new or changed records from the database. The rest should already be available, one way or another.
-
Comments on Buffer Load
Buffer (Incremental) Load is a solution only for log files (text files), not for databases.
Buffer (Stale after 7 days) Select is not a good solution: it makes a full load after 7 days, and nothing in between.
-
Incremental Load
1. Load new data from the database table (slow, but few records).
2. Load old data from the QVD file (many records, but fast).
3. Create a new QVD file.
The procedure must be repeated for each table.
-
Different DB changes
If the source allows:
1) Append only. (Log files)
2) Insert only. (No Update or Delete)
3) Insert and Update. (No Delete)
4) Insert, Update and Delete.
-
1) Append only
Must be a log file.
Loads records added at the end of the file.
-
1) Append only
Buffer (Incremental) Load * From LogFile.txt
(ansi, txt, delimiter is '\t', embedded labels);
Done! But it should be renamed to Buffer (Append) Load.
-
2) Insert only
Can be any DB.
Loads INSERTed records.
Needs the field ModificationTime.
-
2) Insert only
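QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;
STORE QV_Table INTO File.QVD;
Almost done, but there is a small chance that a record gets loaded twice: one inserted after the script has started, but before the SELECT runs, is picked up by this run and again by the next.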
-
2) Insert only
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(BeginningThisExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;
STORE QV_Table INTO File.QVD;
Done!
-
3) Insert and Update
Can be any DB.
Loads INSERTed and UPDATEd records.
Needs the fields ModificationTime and PrimaryKey.
-
3) Insert and Update
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT Exists(PrimaryKey);
STORE QV_Table INTO File.QVD;
Done!
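What WHERE NOT Exists(PrimaryKey) does can be tried without a database. The sketch below is an illustration only: both inline tables and all their values are made up, the first standing in for the SQL SELECT result and the second for File.QVD.

```qlikview
// Stand-in for the SQL SELECT result: key 2 was updated, key 4 was inserted.
QV_Table:
LOAD * INLINE [
PrimaryKey,X,Y
2,b_new,20
4,d,40
];

// Stand-in for File.QVD: the rows stored by the previous run.
// Key 2 is already in QV_Table, so its old version is skipped.
Concatenate (QV_Table)
LOAD * INLINE [
PrimaryKey,X,Y
1,a,10
2,b_old,20
3,c,30
]
WHERE NOT Exists(PrimaryKey);
```

QV_Table should end up with keys 2, 4, 1 and 3, where key 2 carries b_new. The order of the two loads matters: the new rows must be loaded first, so that their keys are already known when the old rows are checked.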
-
4) Insert, Update and Delete
Can be any DB.
Loads INSERTed and UPDATEd records.
Removes DELETEd records.
Needs the fields ModificationTime and PrimaryKey.
Tricky to implement.
-
4) Insert, Update and Delete
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT EXISTS(PrimaryKey);
Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;
STORE QV_Table INTO File.QVD;
OK, but slow: the Inner Join reads every primary key from the database on each run.
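The Inner Join step can be tried in isolation. This is a sketch with made-up inline data: the first table stands in for QV_Table after new and old rows have been combined, the second for the result of SQL SELECT PrimaryKey FROM DB_TABLE.

```qlikview
// Stand-in for QV_Table after the new and the old rows were combined.
QV_Table:
LOAD * INLINE [
PrimaryKey,X,Y
1,a,10
2,b,20
3,c,30
];

// Stand-in for the list of keys still present in the database:
// key 2 has been deleted there.
Inner Join (QV_Table)
LOAD * INLINE [
PrimaryKey
1
3
];
```

Only keys 1 and 3 should remain in QV_Table, because the inner join keeps just the rows whose PrimaryKey appears in both tables.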
-
4) Insert, Update and Delete
ListOfDeletedEntries:
SQL SELECT PrimaryKey AS Deleted FROM DB_TABLE
WHERE DeletionFlag = 1 AND ModificationTime >= #$(LastExecTime)#;
QV_Table:
// Preceding load: skip changed rows that are flagged as deleted.
LOAD PrimaryKey, X, Y WHERE NOT Exists(Deleted, PrimaryKey);
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT Exists(PrimaryKey) AND NOT Exists(Deleted, PrimaryKey);
Drop Table ListOfDeletedEntries;
STORE QV_Table INTO File.QVD;
OK, but needs a DeletionFlag
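The two-argument form Exists(Deleted, PrimaryKey) asks whether the current PrimaryKey value has already been loaded into the field Deleted. A small sketch with made-up inline data, standing in for the database queries and for File.QVD:

```qlikview
// Stand-in for ListOfDeletedEntries: key 3 was flagged as deleted.
ListOfDeletedEntries:
LOAD * INLINE [
Deleted
3
];

// Stand-in for the new rows from the database: key 2 was updated.
QV_Table:
LOAD * INLINE [
PrimaryKey,X,Y
2,b_new,20
];

// Stand-in for File.QVD. Key 2 is skipped because a newer version is
// already loaded; key 3 is skipped because it appears in the field Deleted.
Concatenate (QV_Table)
LOAD * INLINE [
PrimaryKey,X,Y
1,a,10
2,b_old,20
3,c,30
]
WHERE NOT Exists(PrimaryKey) AND NOT Exists(Deleted, PrimaryKey);

Drop Table ListOfDeletedEntries;
```

QV_Table should be left with key 2 (carrying b_new) and key 1.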
-
LastExecutionTime & Error handling
Let ThisExecTime = Now();
{ Load sequence }
If ScriptErrorCount = 0 then
  Let LastExecTime = ThisExecTime;
End If
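How $(LastExecTime) ends up in the SQL statement depends on the text format of the variable and on the literal syntax the database accepts. The #...# delimiters are the date literal syntax of Microsoft Access; many other databases expect a quoted string instead. A minimal sketch, assuming a database that accepts 'YYYY-MM-DD hh:mm:ss' string literals (the format string and the quoting are assumptions, not part of the slides):

```qlikview
// Store the start time as text in an explicit format, so that the expanded
// variable does not depend on the timestamp settings of the machine.
Let ThisExecTime = Timestamp(Now(), 'YYYY-MM-DD hh:mm:ss');

QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= '$(LastExecTime)'
AND ModificationTime < '$(ThisExecTime)';

// Move the window forward only if the load sequence ran without errors.
If ScriptErrorCount = 0 then
  Let LastExecTime = ThisExecTime;
End If
```

The literal format has to be checked against the database and driver actually in use.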
-
Final Script
Let ThisExecTime = Now();
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT EXISTS(PrimaryKey);
Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;
If ScriptErrorCount = 0 then
  STORE QV_Table INTO File.QVD;
  Let LastExecTime = ThisExecTime;
End If
-
Summary 1
Incremental Load possible for:
1) Append only. (Log files): Yes!
2) Insert only. (No Update or Delete): Yes!
3) Insert and Update. (No Delete): Yes!
4) Insert, Update and Delete: Slow, or demands a DeletionFlag.
-
Summary 2
Incremental Load is normally not equivalent to Buffer (Incremental) Load.
-
Comment
The solutions above are probably not robust enough on their own. In addition, the complete table should probably be reloaded regularly, perhaps once a month.
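One possible way to combine a regular full reload with the incremental script is sketched below. The trigger is an assumption made for illustration: a full reload on the first day of each month, and also whenever File.QVD does not exist yet (QvdCreateTime returns NULL in that case).

```qlikview
Let ThisExecTime = Now();

If Day(Today()) = 1 or IsNull(QvdCreateTime('File.QVD')) then

  // Full reload: read the complete table and ignore the old QVD.
  // Rows changed while this query runs are read again by the next
  // incremental run, where NOT EXISTS(PrimaryKey) keeps the newer version.
  QV_Table:
  SQL SELECT PrimaryKey, X, Y FROM DB_TABLE;

else

  // Incremental reload, as in the final script.
  QV_Table:
  SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
  WHERE ModificationTime >= #$(LastExecTime)#
  AND ModificationTime < #$(ThisExecTime)#;
  Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
  WHERE NOT EXISTS(PrimaryKey);
  Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;

end if

If ScriptErrorCount = 0 then
  STORE QV_Table INTO File.QVD;
  Let LastExecTime = ThisExecTime;
End If
```

With this trigger, every run on the first of the month is a full reload, and a month in which the script does not run on that day gets none, so a schedule-specific condition may fit better in practice.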
-