Indexing and Fast Search
8/8/2019 Indexing and Fast Search
Indexing and Fast Search
NBITSearch engine parameters
www.nbitsearch.com Novosib-BIT LLC
The System is Designed for
compact indexing of huge arrays of data on a hard disk;
high-speed exact and fuzzy search for objects with minimum use of RAM.
Exact and Fuzzy Search
Interval queries provide fuzzy (inexact) search.
Precise (exact) search is a special case of fuzzy search.
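A minimal sketch of this relationship, assuming a query is a closed interval [low, high] over a single numeric parameter. The function names and sample values are illustrative and are not part of NBITSearch.

```python
def matches(value, low, high):
    """Return True when value lies in the closed interval [low, high]."""
    return low <= value <= high


def interval_search(values, low, high):
    """Fuzzy search: every value that falls inside [low, high]."""
    return [v for v in values if matches(v, low, high)]


def exact_search(values, target):
    """Exact search written as the degenerate interval [target, target]."""
    return interval_search(values, target, target)


weights = [175, 182, 190, 182]             # illustrative sample values
near_182 = interval_search(weights, 180, 185)
only_175 = exact_search(weights, 175)
```

Here the exact query reuses the interval code path unchanged: its two bounds simply coincide.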
Indexable Objects
Objects of any type with precise (exact, point) parameters:
Volume  Weight  Speed
54      175     500
100     182     700
Indexable Objects
Objects of any type with fuzzy (inexact, interval) parameters:
Volume    Weight    Speed
54-59     175-180   500-600
100-300   182-200   700-800
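One way to picture objects with interval parameters is sketched below. It assumes that an object matches a query when its interval overlaps the query interval on every queried parameter. The slides do not state the matching rule, so this rule, the object names "A" and "B", and the function names are assumptions.

```python
# Each object stores one (low, high) interval per parameter (values from the table above).
objects = {
    "A": {"volume": (54, 59), "weight": (175, 180), "speed": (500, 600)},
    "B": {"volume": (100, 300), "weight": (182, 200), "speed": (700, 800)},
}


def overlaps(a, b):
    """Two closed intervals overlap when neither one ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(objects, query):
    """Names of objects whose intervals overlap the query on every queried parameter."""
    return [
        name
        for name, params in objects.items()
        if all(overlaps(params[key], interval) for key, interval in query.items())
    ]


by_volume = search(objects, {"volume": (50, 120)})
by_volume_and_speed = search(objects, {"volume": (50, 120), "speed": (650, 750)})
```

A point parameter fits the same structure as an interval whose two bounds are equal.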
Indexable Objects
Example: see www.nbitsearch.com
Indexing of Objects
Step 1:
First, the user maps the source objects to so-called primitives:
precise/fuzzy parameters, piecewise functions, or matrices.
Indexing of Objects
Step 2:
The system automatically transforms the primitives into numeric masks.
These masks are spatial hashes of the objects.
Then the system automatically indexes the masks.
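The slides do not disclose how NBITSearch computes its masks. Purely as an illustration of the general idea of a spatial hash, the sketch below uses a Z-order (Morton) code, a well-known technique that is swapped in here as a stand-in: each parameter is quantized to a grid cell and the bits of the cells are interleaved into one integer. The function names, value ranges, and bit width are all assumptions.

```python
def quantize(value, low, high, bits):
    """Map value from [low, high] onto an integer grid cell in [0, 2**bits - 1]."""
    max_cell = (1 << bits) - 1
    clamped = min(max(value, low), high)
    return round((clamped - low) / (high - low) * max_cell)


def morton_mask(cells, bits):
    """Interleave the bits of several grid cells into one integer (Z-order code)."""
    mask = 0
    dims = len(cells)
    for bit in range(bits):
        for d, cell in enumerate(cells):
            mask |= ((cell >> bit) & 1) << (bit * dims + d)
    return mask


# Hypothetical parameter ranges: volume in [0, 1000], weight in [0, 1000].
cells = [quantize(54, 0, 1000, 8), quantize(175, 0, 1000, 8)]
mask = morton_mask(cells, 8)   # one integer per object; sorting such integers yields an index
```

Objects that are close in every parameter tend to receive nearby codes, which is what makes a sorted list of such integers usable for interval queries.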
Sizes of Indexable Arrays
The effect is most tangible for arrays of primitives
that hold 50-100 million or more objects per index.
Arrays of indexable objects can be 10-100 terabytes and larger.
Indexing Limitations
One index supports up to 2 billion objects of its own.
Limits on the number of indexes are artificial.
What is a Billion?
1 billion seconds is about 32 years.
1 billion pages of laser-printer paper form a pile about 100 km high.
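Both figures can be checked with a few lines of arithmetic. The sheet thickness of 0.1 mm is an assumption (typical office paper); the slides do not give it.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # Julian year: 31,557,600 s
SHEET_THICKNESS_MM = 0.1                   # assumed thickness of one sheet of paper
BILLION = 10**9

years = BILLION / SECONDS_PER_YEAR
pile_km = BILLION * SHEET_THICKNESS_MM / 1_000_000   # millimetres -> kilometres

print(f"{years:.1f} years, {pile_km:.0f} km")   # prints "31.7 years, 100 km"
```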
Indexing Speed
Estimate:
T ~ N * LOG(N), where
T is the time to build one index,
N is the number of indexable objects.
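To see what this estimate implies in practice, the sketch below compares build times for two array sizes under T ~ N * LOG(N). The constant factor cancels in the ratio, so no real timings are assumed; the function name is illustrative.

```python
import math


def relative_build_cost(n, base_n):
    """Ratio T(n) / T(base_n) under the estimate T ~ N * log(N)."""
    return (n * math.log(n)) / (base_n * math.log(base_n))


# Growing an index from 1 million to 100 million objects:
ratio = relative_build_cost(10**8, 10**6)   # about 133, i.e. 100 * (8 / 6)
```

So 100 times more objects cost only slightly more than 100 times the build time.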
Search Speed
Estimated time to determine the address of the first potential block of data:
T ~ LOG(N), where
T is the time of logical probing,
N is the number of indexed objects.
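Binary search over sorted keys is the textbook example of logarithmic probing. The sketch below counts probes while locating the first key that can satisfy a query. It is an illustration of the estimate, not the NBITSearch algorithm, and the key layout is an assumption.

```python
def first_candidate(sorted_keys, low):
    """Binary search for the first position whose key is >= low; also count probes."""
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < low:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes


keys = range(0, 2_000_000, 2)   # 1,000,000 sorted even keys, generated lazily
position, probes = first_candidate(keys, 1_000_001)
```

For one million keys the loop needs at most 20 probes, since each probe at least halves the remaining range.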
Search Speed
For large data arrays, the speed of fetching interval-query results
from a hard disk can be 10-100 times higher than
the speed of a similar operation in a standard relational DBMS.
Search Speed
For large data arrays, the speed of fetching interval-query results
from a hard disk can be 1000 times (or more) higher than
the speed of solving the same problem by brute force.
Search Speed
The time to fetch interval-query results from a hard disk
depends linearly on the number of objects in the result set.
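A sorted in-memory list can stand in for data laid out in key order on disk. In the sketch below, locating the two ends of the run is logarithmic, and copying out the run is linear in the number of results. The function name and sample keys are assumptions.

```python
from bisect import bisect_left, bisect_right


def range_query(sorted_keys, low, high):
    """Return the contiguous run of keys that lie in [low, high]."""
    start = bisect_left(sorted_keys, low)    # logarithmic: first key >= low
    stop = bisect_right(sorted_keys, high)   # logarithmic: first key > high
    return sorted_keys[start:stop]           # linear in the number of results


keys = [3, 8, 8, 15, 23, 42]
hits = range_query(keys, 8, 23)
```

Because the matches form one contiguous run, they can be read sequentially from start to stop.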
Search Memory
The size of the memory buffers used to fetch data depends on the user's needs.
This size is often very small (~10 megabytes).
Reading of Result Set
Reading the result set from a hard disk into RAM is optimal:
the magnetic head does not move back and forth.
Multidimensionality of Indexes
Indexes are multidimensional, but there is no data explosion effect.
Efficient indexes of objects can be built on 1-32 parameters.
Multifunctionality
Indexes are multifunctional:
indexing and searching in tables can be arranged by multiple virtual columns,
whose values are arbitrary functions of the values of actual columns.
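A virtual column can be pictured as a function applied to each row, with the index built over the function's output instead of a stored column. The sketch below assumes a sorted list as the index; the "density" column (weight divided by volume) and all function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

rows = [
    {"volume": 54, "weight": 175, "speed": 500},
    {"volume": 100, "weight": 182, "speed": 700},
]


def build_virtual_index(rows, func):
    """Sort (virtual value, row position) pairs; func derives the virtual column."""
    return sorted((func(row), pos) for pos, row in enumerate(rows))


def query(index, low, high):
    """Row positions whose virtual value lies in [low, high], in index order."""
    values = [value for value, _ in index]
    start, stop = bisect_left(values, low), bisect_right(values, high)
    return [pos for _, pos in index[start:stop]]


# Hypothetical virtual column: weight / volume.
density_index = build_virtual_index(rows, lambda row: row["weight"] / row["volume"])
dense_rows = query(density_index, 3, 4)
```

Several such indexes, each with its own function, can coexist over the same table without changing the stored columns.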
THANK YOU!
www.nbitsearch.com
Technology developed with support from FASIE formed by the Government of Russian Federation
Novosib-BIT LLC 2004 - 2010 Patented
http://www.nbitsearch.com/
http://www.fasie.ru/