Database Farming For Improved Performance Presented By: Russell Yong Supervisor: Prof Wentworth.
Database Farming For Improved Performance
Presented By: Russell Yong
Supervisor: Prof Wentworth
Problem at Hand
Database solution for a large corporation
Expensive software (Oracle Database Enterprise Edition ± US $40k)
Top-end hardware
Microsoft’s SQL Server 2000
Not same level of confidence
Solution
Adapt the popular technique of backend server farming
Apply it to databases to create a high performance database web service
Backend setup being invisible to the user
Hypothesis
Technique will create a more cost effective database farm
Eradicate some problems associated with dealing with large databases
Our Plan
Standard 3-Tier Model
Our Plan
Adapted 3-Tier Model
Conceptually
[Diagram. Labels: Clients, http request, Web-server, Web Service, Multiple Threads of Execution, Pool of Connections, Farm of Databases, DataSet]
DataSet Object
In-memory cache of data
Comparable to a mini-database
Multiple tables
Relationships
Constraints
DataSet Object
Easily serialized into and back out of XML
Structure (tables, columns, etc) described in an XML schema
View and manipulate using either relational or XML methods (unified programming model)
Compatible with other XML-speaking applications
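The XML round trip can be illustrated with a minimal sketch. This is Python using only the standard library, not the ADO.NET DataSet API; the helper names `table_to_xml` and `xml_to_table` and the sample `seen` value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def table_to_xml(dataset_name, table_name, rows):
    """Serialize a list of dict rows into an XML string, one element per row."""
    root = ET.Element(dataset_name)
    for row in rows:
        row_el = ET.SubElement(root, table_name)
        for column, value in row.items():
            ET.SubElement(row_el, column).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_table(xml_text, table_name):
    """Read the rows of one table back out of the XML string."""
    root = ET.fromstring(xml_text)
    return [{col.tag: col.text for col in row_el}
            for row_el in root.findall(table_name)]

# Hypothetical sample row; the column names follow the deck's PingInfo table.
rows = [{"ip": "2464643887", "seen": "sample-timestamp"}]
xml_text = table_to_xml("PingDataSet", "PingInfo", rows)
restored = xml_to_table(xml_text, "PingInfo")
```

All values come back as strings, which is why a schema is still needed to recover column types.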
DataSet Object
Disconnected Model
Sub-queries fill individual datasets
Collector Object
Collect and merge individual sub-queries
Returned to the client
Typed DataSet
Has an implicit schema
Allows for more efficient filling
Faster access
Created via Form Designer, programmatically, or at run time via XSD
XSD File
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="PingDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="PingDataSet" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="PingInfo">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="ip" type="xs:string" />
              <xs:element name="seen" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
    <xs:unique name="PK_PingPrimaryKey" msdata:PrimaryKey="true">
      <xs:selector xpath=".//PingInfo" />
      <xs:field xpath="ip" />
      <xs:field xpath="seen" />
    </xs:unique>
  </xs:element>
</xs:schema>
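To show what the schema encodes, here is a small sketch that reads it back with Python's standard library. It is not how .NET builds a typed DataSet; the function name `describe_schema` is hypothetical, and the XML declaration is omitted from the embedded copy for convenience.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # Clark-notation prefix for xs: tags

XSD_TEXT = """<xs:schema id="PingDataSet" xmlns=""
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="PingDataSet" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="PingInfo">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="ip" type="xs:string" />
              <xs:element name="seen" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
    <xs:unique name="PK_PingPrimaryKey" msdata:PrimaryKey="true">
      <xs:selector xpath=".//PingInfo" />
      <xs:field xpath="ip" />
      <xs:field xpath="seen" />
    </xs:unique>
  </xs:element>
</xs:schema>"""

def describe_schema(xsd_text):
    """Return ({table: [(column, type), ...]}, [primary-key fields])."""
    root = ET.fromstring(xsd_text)
    dataset_el = root.find(f"{XS}element")
    tables = {}
    # Each table is an element nested under the dataset's complexType/choice.
    for table_el in dataset_el.findall(f"{XS}complexType/{XS}choice/{XS}element"):
        columns = table_el.findall(f"{XS}complexType/{XS}sequence/{XS}element")
        tables[table_el.get("name")] = [(c.get("name"), c.get("type")) for c in columns]
    key_fields = [f.get("xpath") for f in dataset_el.findall(f"{XS}unique/{XS}field")]
    return tables, key_fields

tables, key_fields = describe_schema(XSD_TEXT)
```

The schema describes one table, PingInfo, with two string columns and a composite primary key over both.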
Implications for Web-Applications
Resource sensitive approach
“Bulk” approach to communication
Access local cache
Ideal for non-volatile data
Implications for Web-Applications
Optimistic concurrency model
Most applications?
Improved performance (no locking)
No persistent connection required (resources)
Minimize required server resources
Connections used more effectively
Exceptions are dealt with accordingly
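The optimistic concurrency idea can be sketched as a version check at write time. This is an illustrative Python toy, not the ADO.NET mechanism; `VersionedStore`, `ConcurrencyError` and the sample key and values are assumptions.

```python
class ConcurrencyError(Exception):
    """Raised when a row changed between a client's read and its write."""

class VersionedStore:
    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def insert(self, key, value):
        self._rows[key] = (1, value)

    def read(self, key):
        # No lock is taken; the caller just remembers the version it saw.
        return self._rows[key]

    def update(self, key, expected_version, new_value):
        # A real store would need this check-and-write to be atomic.
        current_version, _ = self._rows[key]
        if current_version != expected_version:
            raise ConcurrencyError(f"{key} changed since it was read")
        self._rows[key] = (current_version + 1, new_value)

store = VersionedStore()
store.insert("row-1", "original")
version_a, _ = store.read("row-1")  # client A reads
version_b, _ = store.read("row-1")  # client B reads the same version
store.update("row-1", version_a, "written by A")  # A writes first and succeeds
try:
    store.update("row-1", version_b, "written by B")  # B's version is now stale
    conflict = False
except ConcurrencyError:
    conflict = True  # the exception is handled instead of holding a lock
```

Reads never block, and the cost of a conflict is paid only by the late writer, which suits data that rarely changes.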
Our Database
Excess of “10 Million Records”
Network traffic information
Partitioned in 10 segments
Initial difficulty
Distributed over 3 machines (SQL Server 2000)
Simulating a completely distributed environment
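The deck does not say how the 10 segments were placed on the 3 machines. One plausible layout, shown purely as an assumption, is round-robin placement; the partition and host names below are hypothetical.

```python
PARTITIONS = [f"ping_{i}" for i in range(10)]      # hypothetical segment names
MACHINES = ["db-host-a", "db-host-b", "db-host-c"]  # hypothetical host names

def assign_round_robin(partitions, machines):
    """Spread partitions over machines by cycling through the machine list."""
    placement = {machine: [] for machine in machines}
    for index, partition in enumerate(partitions):
        placement[machines[index % len(machines)]].append(partition)
    return placement

placement = assign_round_robin(PARTITIONS, MACHINES)
```

With 10 segments and 3 machines the split cannot be even: one machine holds 4 segments and the other two hold 3 each.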
Data Providers
SQL Server .NET Data Providers
SqlConnection
SqlDataAdapter
SqlCommand
OLE DB .NET Data Providers
ODBC .NET Data Providers (separate download)
Data Providers
Our Framework
MyQueryHandler
Farming Layer
An instance for each individual user query
Distributor (spawns threads)
Collector, merging DataSets as they return
All encompassing DataSet
Pluggable
MyThreadHandler
Represents individual threads
Fills separate DataSets for each of the partitions in the farm
Returns DataSet to QueryHandler
Pluggable
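The distributor/collector split can be sketched as follows. This is a Python stand-in, not the presenter's .NET code: in-memory SQLite databases play the role of the partitions, and `make_partition`, `run_sub_query`, `farm_query` and the sample rows are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor, as_completed

def make_partition(rows):
    """Create one in-memory database standing in for a partition of the farm."""
    # check_same_thread=False lets a single worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE ping (ip TEXT, seen TEXT)")
    conn.executemany("INSERT INTO ping VALUES (?, ?)", rows)
    conn.commit()
    return conn

def run_sub_query(conn, sql, params):
    """Thread handler role: fill one result set from one partition."""
    return conn.execute(sql, params).fetchall()

def farm_query(partitions, sql, params):
    """Query handler role: one thread per partition, merge results as they return."""
    merged = []
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = [pool.submit(run_sub_query, conn, sql, params) for conn in partitions]
        for future in as_completed(futures):  # collector: merge in arrival order
            merged.extend(future.result())
    return merged

partitions = [
    make_partition([("2464643887", "t1"), ("1", "t2")]),
    make_partition([("2464643464", "t3")]),
    make_partition([("2", "t4")]),
]
sql = "SELECT ip, seen FROM ping WHERE ip = ? OR ip = ?"
result = farm_query(partitions, sql, ("2464643887", "2464643464"))
```

Because results are merged in arrival order, the row order of the combined result is not deterministic; a caller that needs ordering has to sort after the merge.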
Specifying Queries
A couple of queries hard-coded
Defined according to a parameter
Future Extensions…
Tests and Results
Ran queries 100 times
Gauge mean
Filter out any possible influencing factors
Influencing factors
Network traffic
Active machines
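A repeat-and-average measurement of this kind might look like the sketch below. The deck does not show the actual test harness; `time_query` is a hypothetical helper and the timed workload is a placeholder for a real query.

```python
import statistics
import time

def time_query(run_query, repetitions=100):
    """Run the query repeatedly and return the duration of each run in seconds."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations

# Placeholder workload standing in for a database query.
durations = time_query(lambda: sum(range(1000)), repetitions=100)
mean_seconds = statistics.mean(durations)
```

Keeping the individual durations, not only the mean, makes it possible to spot runs disturbed by network traffic or other active machines.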
Testing and Results
Simple query
“SELECT * FROM ping WHERE (ip = 2464643887) OR (ip = 2464643464) OR (ip = '2464639301') OR (ip = '2464625293')”
Returning 11 853 rows
Farming Method
Averaged 35 seconds
Normal Method
Averaged 94 seconds
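For reference, the two reported averages work out as follows. The figures are the ones stated on the slide; only the arithmetic is added.

```python
farming_seconds = 35  # reported average, farming method
normal_seconds = 94   # reported average, normal method

speedup = normal_seconds / farming_seconds        # how many times faster
reduction = 1 - farming_seconds / normal_seconds  # fraction of time saved

print(f"speedup: {speedup:.2f}x, time saved: {reduction:.0%}")
# prints "speedup: 2.69x, time saved: 63%"
```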
Testing and Results
[Chart: Farming Method, Time to Query. Query Time (axis 00:00.000 to 01:17.760) against Query Number (1 to 100).]
[Chart: Normal Method, Time to Query. Query Time (axis 00:00.000 to 02:18.240) against Query Number (1 to 100).]
“SELECT * FROM ping WHERE (ip = 2464643887) OR (ip = 2464643464) OR (ip = '2464639301') OR (ip = '2464625293')”
Hypothesis
Technique will create a more cost effective database farm
Eradicate some problems associated with dealing with large databases
Possible Extensions
Full access to DB via HTTPS
Front-end
Query construction wizard
Investigate partitioning techniques
“Intelligent” querying
Questions?