Database Performance Tuning Artificial Intelligence · 2/17/2016 · Selection of optimal...
Artificial Intelligence Database Performance Tuning
Roel Van de Paar, Percona
Agenda
● GA: How it works, terminology, variables, example
● Database Tuning & Surrounding thoughts
● gaai
● POC
● Results
Define: GA
A Genetic Algorithm (GA) is a lightweight Artificial Intelligence (AI) evolutionary algorithm which mimics Darwin’s theory of natural evolution.
GA Terminology
Population (inc. any offspring)
↓
Chromosomes (“Individuals”)
↓
Genes (“Chromosome Length”)
GA: it’s all about the genes
2 Parents
↓
Children
Children may get:
● Genes mixed from parents (“crossover”)
● Modified (“mutated”) genes
GA Loop
Population is created ‘randomly’ (can be pre-populated)
↓
[loop] >
Population is evaluated (i.e. each individual receives a fitness value)
↓
A new population is created:
Population can be sorted / kept or discarded in part (“selection”)
Crossover, gene mutations etc.
↓
possible intermediary re-eval
< [loop]
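The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the gaai implementation: the function name `run_ga`, the "keep the better half" selection, the per-gene crossover and the mutation rate are all assumptions chosen to keep the example short.

```python
import random

def run_ga(fitness, gene_count, pop_size=20, generations=50,
           mutation_rate=0.1, seed=0):
    """Minimal GA loop: evaluate, select, crossover, mutate, repeat."""
    rng = random.Random(seed)
    # Population is created randomly; each gene here is a float in [0, 1).
    population = [[rng.random() for _ in range(gene_count)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation + selection: sort by fitness and keep the better half.
        population.sort(key=fitness, reverse=True)
        survivors = population[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            # Crossover: each gene is taken from one of the two parents.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: occasionally replace a gene with a new random value.
            child = [rng.random() if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

# Toy run: the fitness is simply the sum of the genes.
best = run_ga(fitness=sum, gene_count=5)
```

Because the survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next in this sketch.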
GA Fitness
A fitness value is the result of a chosen fitness function
As a rather simple/limited example: FITNESS = RAND(A) + RAND(B) + RAND(C)
where A,B,C are 0-1000: Highest fitness value=3000, lowest=0
Optimize towards a negative (lowest value=best) calculated fitness; FITNESS=-FITNESS
i.e. -3000 becomes 3000 so the lowest value becomes best
Optimize towards %: 1/FITNESS or 1-(1/FITNESS) etc.
Basically; anything that can be optimized towards a best result can be GA’ed
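As a rough sketch of the fitness variants above, assuming a GA that always maximizes its fitness value (the function names are illustrative, and the zero guard in the reciprocal is an added assumption):

```python
def raw_fitness(a, b, c):
    """Toy fitness from the slide: the sum of three genes, each in 0..1000."""
    return a + b + c

def minimizing_fitness(a, b, c):
    """Negate the raw value so a maximizing GA prefers the lowest raw fitness."""
    return -raw_fitness(a, b, c)

def reciprocal_fitness(a, b, c):
    """1/FITNESS variant; returns 0.0 when the raw fitness is 0 to avoid dividing by zero."""
    value = raw_fitness(a, b, c)
    return 1.0 / value if value else 0.0

highest = raw_fitness(1000, 1000, 1000)   # upper bound of the raw fitness
lowest = raw_fitness(0, 0, 0)             # lower bound of the raw fitness
```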
GA Variables
GA Variables are often binary (or represented in binary)
where a single bit is a single gene
But they do not need to be!
One step further is variables as genes, where all variables are alike
a=0-100 with step 1, b=0-100 with step 1, c=0-100 with step 1
The most advanced is variables that are in disparate ranges
a=-1 to 1 with step 0.01, b=0-100 with step 0.5, etc.
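One possible way to represent genes with disparate ranges is to give each gene its own (low, high, step) definition and draw values on that grid. The names `GENE_SPECS`, `random_gene` and `random_chromosome` are hypothetical; the two ranges are the ones from the example above.

```python
import random

# Hypothetical gene definitions: name -> (low, high, step).
GENE_SPECS = {
    "a": (-1.0, 1.0, 0.01),
    "b": (0.0, 100.0, 0.5),
}

def random_gene(low, high, step, rng):
    """Pick a random value on the grid low, low+step, ..., high."""
    steps = round((high - low) / step)
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return round(low + rng.randint(0, steps) * step, 10)

def random_chromosome(rng):
    """Build one individual with one value per gene definition."""
    return {name: random_gene(*spec, rng) for name, spec in GENE_SPECS.items()}

chromosome = random_chromosome(random.Random(1))
```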
GA Dev TIP: GA Value-store Genes
Everyone working with GA can benefit from this hack/approach
Example: significant genes (used in fitness): a,b,c
non-significant genes (not used in fitness): d
i.e. sub-eval (think sub-total) data can be stored in another gene
where such gene is never set/updated/mutated, but only used
for tracking certain calculations, results, statuses, etc.
This optimizes (though not in all cases) the number of calculations
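A small sketch of how a value-store gene might work, under the assumption that "never set/updated/mutated" refers to the GA operators and that the fitness code itself may cache a sub-total there. The sub-total `a * b` is only a stand-in for an expensive calculation.

```python
import random

SIGNIFICANT = ("a", "b", "c")   # genes used in the fitness
STORE = "d"                     # value-store gene: never randomized by the GA

def fitness(chromosome):
    """Return the fitness, reusing the sub-total cached in the store gene."""
    if chromosome[STORE] is None:
        # Stand-in for an expensive sub-evaluation (hypothetical).
        chromosome[STORE] = chromosome["a"] * chromosome["b"]
    return chromosome[STORE] + chromosome["c"]

def mutate(chromosome, rng):
    """Mutate one significant gene; the store gene is only invalidated when stale."""
    name = rng.choice(SIGNIFICANT)
    child = dict(chromosome)
    child[name] = rng.randint(0, 100)
    if name in ("a", "b"):
        child[STORE] = None   # the cached sub-total depends on a and b
    return child

parent = {"a": 2, "b": 3, "c": 4, "d": None}
parent_fitness = fitness(parent)          # fills the store gene with 2 * 3
child = mutate(parent, random.Random(0))
child_fitness = fitness(child)
```

When only `c` is mutated the cached sub-total is reused, which is where the saving in calculations comes from.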
So why do we need GA’s?
To optimize...
everything
How many persons can we fit...
GA Application Domains #1
● Bayesian inference links to particle methods in Bayesian statistics and hidden Markov chain models
● Artificial creativity
● Chemical kinetics (gas and solid phases)
● Calculation of bound states and local-density approximations
● Code-breaking, using the GA to search large solution spaces of ciphers for the one correct decryption.
● Computer architecture: using GA to find out weak links in approximate computing such as lookahead.
● Configuration applications, particularly physics applications of optimal molecule configurations for particular systems like C60 (buckyballs)
● Construction of facial composites of suspects by eyewitnesses in forensic science.
● Data Center/Server Farm.
● Distributed computer network topologies
● Electronic circuit design, known as evolvable hardware
● Feature selection for Machine Learning
● Feynman-Kac models
● File allocation for a distributed system
● Filtering and signal processing
● Finding hardware bugs.
● Game theory equilibrium resolution
● Genetic Algorithm for Rule Set Production
● Scheduling applications, including job-shop scheduling and scheduling in printed circuit board assembly.
● Learning robot behavior using genetic algorithms
● Image processing: Dense pixel matching
● Learning fuzzy rule base using genetic algorithms
● Molecular structure optimization (chemistry)
● Optimisation of data compression systems, for example using wavelets.
● Power electronics design.
● Traveling salesman problem and its applications
SOURCE: https://en.wikipedia.org/wiki/List_of_genetic_algorithm_applications
GA Application Domains #2
● Climatology: Estimation of heat flux between the atmosphere and sea ice
● Climatology: Modelling global temperature changes
● Design of water resource systems
● Groundwater monitoring networks
● Design of anti-terrorism systems
● Linguistic analysis, including grammar induction and other aspects of Natural language processing (NLP) such as word sense disambiguation.
● Automated design of sophisticated trading systems in the financial sector
● Representing rational agents in economic models such as the cobweb model
● Real options valuation
● Audio watermark insertion/detection
● Airlines revenue management
● Automated design of mechatronic systems using bond graphs and genetic programming (NSF)
● Automated design = computer-automated design
● Automated design of industrial equipment using catalogs of exemplar lever patterns
● Automated design, including research on composite material design and multi-objective design of automotive components for crashworthiness, weight savings, and other characteristics
● Container loading optimization
● Control engineering
● Marketing mix analysis
● Mechanical engineering
● Mobile communications infrastructure optimization.
● Plant floor layout
● Pop music record production
● Quality control
● Timetabling problems, such as designing a non-conflicting class timetable for a large university
● Vehicle routing problem
● Optimal bearing placement
GA Application Domains #3
● Computer-automated design
● Bioinformatics: Multiple Sequence Alignment
● Bioinformatics: RNA structure prediction
● Bioinformatics: Motif Discovery
● Biology and computational chemistry
● Building phylogenetic trees.
● Gene expression profiling analysis.
● Medicine: Clinical decision support in ophthalmology
● Computational Neuroscience: finding values for the maximal conductances of ion channels in biophysically detailed neuron models
● Protein folding and protein/ligand docking
● Selection of optimal mathematical model to describe biological systems
● Operon prediction.
● Neural Networks; particularly recurrent neural networks
● Training artificial neural networks when pre-classified training examples are not readily obtainable (neuroevolution)
● Clustering, using genetic algorithms to optimize a wide range of different fit-functions.
● Multidimensional systems
● Multimodal Optimization
● Multiple criteria production scheduling
● Multiple population topologies and interchange methodologies
● Mutation testing
● Parallelization of GAs/GPs including use of hierarchical decomposition of problem domains and design spaces
● Nesting of irregular shapes using feature matching and GAs.
● Rare event analysis
● Solving the machine-component grouping problem required for cellular manufacturing systems
● Stochastic optimization
● Tactical asset allocation and international equity strategies
● Wireless sensor/ad-hoc networks.
Simple GA Example
@ https://github.com/Percona-QA/gaai/blob/master/ga_example/ga_example.lua
git clone https://github.com/Percona-QA/gaai.git
cd gaai/ga_example
lua ga_example.lua
Population: 100, Genes: 10, Generations: 100
This GA simply takes sum(rand(0,9999999/10),rand(idem),...rand(n))
i.e. each gene is a random number between 0 and 9999999 divided by the number of genes, and the genes are summed (max fitness=9999999)
A bit of (MySQL) database tuning history
● Past: very poor defaults/templates, settings tuning a must
● Current: more optimized/increased defaults, settings tuning may still be recommended for high-use production systems
● Future: automatically adjusting settings (GA or logic based)
Past: MANUAL > Future: AUTOMATED
Automated systems are less error prone and can be optimized over time!
GA Database Tuning: a new concept / mindset
● It does not really matter which workload GA optimizes
○ i.e. there is no “right”, “wrong”, “common” or “specific” one
○ GA will be able to optimize any of them
● This is dissimilar to past performance benchmarking, which is usually tuned towards/optimized for a specific load (or set of loads)
● It matters less here how much effective % is gained using a specific set of options for a specific semi-synthetic workload
● It matters much more here how much overall improvement is seen over time as the workload changes (real production workloads)
Thoughts on Database Tuning #1
● R/O variable optimization requires a restart: not suitable for production systems
● Sysbench load is uniform/synthetic (easier to optimize), though I expect that actual user loads will achieve similar (i.e. 80%) ROIs, unless the data being processed is highly random
● Tuning various memory buffers can be complex and requires surrounding “safety” code calculations (or value ranges) to avoid OOM
● Things may change over time, for ex. the number of client connections
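The “safety” calculation around memory buffers mentioned above could look roughly like the sketch below. It is an assumption-heavy illustration: the helper `clamp_buffer_sizes`, the 70% RAM budget and the proportional scaling are not taken from gaai, and real mysqld memory use also depends on per-connection buffers that this ignores.

```python
def clamp_buffer_sizes(proposed, total_ram_bytes, max_fraction=0.7):
    """Scale proposed buffer sizes down so their sum stays within a RAM budget."""
    budget = int(total_ram_bytes * max_fraction)
    total = sum(proposed.values())
    if total <= budget:
        return dict(proposed)
    # Shrink every buffer by the same factor; int() rounds down, so the
    # scaled sum can never exceed the budget.
    scale = budget / total
    return {name: int(size * scale) for name, size in proposed.items()}

GIB = 2 ** 30
# Values a GA might propose on a machine with 8 GiB of RAM (illustrative only).
proposed = {"innodb_buffer_pool_size": 6 * GIB, "tmp_table_size": 2 * GIB}
safe = clamp_buffer_sizes(proposed, total_ram_bytes=8 * GIB)
```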
Thoughts on Database Tuning #2
● It would be good to cover for special events like checkpoints (longer sample runtimes may be sufficient to cover this)
● Not all mysqld variables automatically lend themselves to “pure performance tuning” as some variables are features - setting them changes the performance, but only because the server functionality was modified also - i.e. the performance offset may be expected (credit: Laurynas Biveinis)
● Some vars require longer runtime to sample (e.g. InnoDB buffer pool)
Thoughts on Database Tuning #3
● Optimizations are system dependent!
○ For example, using tmpfs vs ssd vs slow hdd’s, fast vs slow I/O controllers, number of cpu threads, OS configuration etc. - the speed will vary differently for different systems
○ This is one of the great strengths of using GA for database tuning:
■ Optimizes per-load, i.e. load-specific var adjustments/tuning
■ Optimizes per-system, i.e. hardware/OS optimized var tuning
■ Optimizes per-moment in time, i.e. changes in any area over time
■ Optimizes across all factors combined
○ For humans, this is possible only in a (very) limited fashion, and requires a good understanding of each optimization plane.
Thoughts on Database Tuning #4
● A human may miss non-obvious areas of optimization
○ For example, if many buffers were automatically made smaller then there would be more room for other workload-specific performance-affecting buffers.
● Performance drops (usually light & short) may be seen
○ A possible fix: gradual/staged/stepped changes
■ Example: stepped changes, i.e. +100/-100 instead of random
Note: the actual change would still be random (e.g. from -100 to +100 with step 1)
■ This may also help with variables that need a larger sample duration window - needs further evaluation
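The stepped change described above can be illustrated with a small mutation helper. This is a sketch only: `stepped_mutation` and its parameters are hypothetical names, and clamping to the variable's allowed range is an added assumption.

```python
import random

def stepped_mutation(current, low, high, max_delta=100, step=1, rng=random):
    """Move a value by a random offset in [-max_delta, +max_delta], clamped to [low, high]."""
    # The offset is still random, but bounded, so the setting drifts
    # gradually instead of jumping anywhere in its full range.
    delta = rng.randrange(-max_delta, max_delta + 1, step)
    return min(high, max(low, current + delta))

rng = random.Random(3)
value = 500
history = [value]
for _ in range(1000):
    value = stepped_mutation(value, low=0, high=1000, rng=rng)
    history.append(value)
```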
Thoughts on Database Tuning #5
● Optimization towards other fitness values is possible
○ For example, with what set of options do I see the least amount of client rejects (locking etc.), network disconnects, etc.
○ These can take a smaller/secondary importance value in the fitness (though “combining incongruous fitness values” is a complex topic)
○ Another example is taking errors as a guide for what value areas to avoid for given parameters, though human smarts is better (OOM etc.)
● Each variable will be more or less optimizable by GA. For example, https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_thread_concurrency would seem highly optimizable
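One simple (and admittedly naive) way to give secondary measures a smaller weight in the fitness is a weighted sum. The function name, the inputs and the 0.1 weights below are all assumptions for illustration; as noted above, combining unlike fitness values properly is a complex topic and the weights would need tuning per system.

```python
def combined_fitness(qps, client_rejects, disconnects,
                     reject_weight=0.1, disconnect_weight=0.1):
    """Primary fitness (QPS) minus small penalties for secondary measures."""
    penalty = reject_weight * client_rejects + disconnect_weight * disconnects
    return qps - penalty

clean_run = combined_fitness(qps=1000, client_rejects=0, disconnects=0)
noisy_run = combined_fitness(qps=1000, client_rejects=50, disconnects=0)
```

With these weights a run with the same QPS but more client rejects scores lower, while a large QPS gain still dominates the result.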
gaai
● An advanced proof-of-concept/small framework which could easily be expanded to become a full-fledged GA performance optimizer, or could easily be adapted to use more complex/different GA algo’s etc.
● Code is GPL v2 licensed, GA code is MIT licensed
● Not connected in any way with Ottertune. They’ve done some interesting work also https://db.cs.cmu.edu/papers/2017/p1009-van-aken.pdf
● As a POC, starts with a very poorly optimized server and tunes 13 InnoDB parameters automatically to improve performance
gaai continued
● There are likely many other variables which can be tuned by GA
● The POC is (made) hyper-fast in applying changes, but for actual production machines the pace of change can be:
1) slower
2) further controlled with sanity checks etc. (avoids major drops)
● Further GA algo optimization is possible
○ Limit or eliminate the number of re-evals
○ Use a faster/more advanced GA algorithm
POC: Start with poorly optimized server
● MYSQLD_PRECONFIG="--innodb-buffer-pool-size=5242880 --table-open-cache=1 --innodb-io-capacity=100 --innodb-io-capacity-max=100000 --innodb-thread-concurrency=1 --innodb-concurrency-tickets=1 --innodb-flush-neighbors=2 --innodb-log-write-ahead-size=512 --innodb-lru-scan-depth=100 --innodb-random-read-ahead=1 --innodb-read-ahead-threshold=0 --innodb-commit-concurrency=1 --innodb-change-buffer-max-size=0 --innodb-change-buffering=none"
POC: Genes: Tune 13 InnoDB Variables
= Approx 8.134713296270707e+36 possible combinations
Sysbench Prepare/Run
● Prepare
sysbench /usr/share/sysbench/oltp_insert.lua --mysql-storage-engine=innodb --table-size=${TABLESIZE} --tables=${NROFTABLES} --mysql-db=test --mysql-user=root --db-driver=mysql --mysql-socket=${BASEDIR}/socket.sock prepare
TABLESIZE=1000000, NROFTABLES=4
● Run
sysbench /usr/share/sysbench/oltp_read_write.lua --report-interval=${1} --time=0 --events=0 --index_updates=10 --non_index_updates=10 --distinct_ranges=15 --order_ranges=15 --threads=${2} --table-size=${3} --tables=${4} --percentile=95 --verbosity=3 --mysql-db=test --mysql-user=root --db-driver=mysql --mysql-socket=${BASEDIR}/socket.sock run
$1=1 (1 SEC SAMPLING), $2=5 (5 THREADS), $3=1000000, $4=4
[Results chart 1] (X: QPS, Y: TIME, 5 Threads)
[Results chart 2] (X: QPS, Y: TIME, 5 Threads)
The future
● Query indexing GA optimization (with thanks @Peter Zaitsev)
● Overall GA, per-option GA, or rule based “smarts” can all be explored
● Variables which require larger sample duration windows; stepped?
● Tuned options expansion, more surrounding logic, solid ranges
● Far future; learning across systems, more advanced AI
● More immediate; real workload testing
Real Workload GA Optimization
If you are interested in trying GA optimized performance for your
production load, we are happy to work with you!
Contacts
● Percona
○ https://www.percona.com/about-percona/contact
○ https://twitter.com/Percona
○ https://www.linkedin.com/company/percona
● Roel Van de Paar
○ [email protected]
○ https://twitter.com/RoelVandePaar
○ https://au.linkedin.com/in/roelvandepaar
Join the Percona Product Managers for Lunch!
● With Tyler Duzan, Michael Coburn, and Alexander Rubin
● Share your feedback
● Get to see the product roadmaps
Wednesday @ the reserved area in back of Gaia Restaurant
Thank You Sponsors!!
Rate My Session: Example