Supporting RFID-based Item Tracking Applications in Oracle DBMS Using a Bitmap Datatype
description
Transcript of Supporting RFID-based Item Tracking Applications in Oracle DBMS Using a Bitmap Datatype
Supporting RFID-based Item Tracking Applications in Oracle
DBMS Using a Bitmap Datatype
Ying Hu, Seema Sundara, Timothy Chorma, Jagannathan Srinivasan
Oracle New England Development Center
One Oracle Drive, Nashua, NH 03062
VLDB 2005
Talk Outline
• Background• A Bitmap Datatype• Applications• Implementation• Performance Study• Conclusions
VLDB 2005
Background
VLDB 2005
Radio Frequency Identification
• RFID uses radio frequency to automatically detect and identify individual items, which have RFID tags associated with them.
• RFID technology is increasingly being used in applications such as– Asset Tracking, – Supply Chain Management, and– Retail Management.
VLDB 2005
Electronic Product Codes (EPCs)
• EPC is a standard naming scheme proposed by Auto-Id Center for RFID applications.
• An EPC uniquely identifies an item. It contains <Header, Manager No., Object Class, Serial No.>
• The header identifies the format of the EPCtype, the next two components identify the product, whereas the last component uniquely identifies the item within the product.
VLDB 2005
Item Tracking
• RFID Technology enables item tracking applications
• However, there is a need for efficient mechanism for managing the high volume of item tracking information.
VLDB 2005
A Bitmap Type for Item Tracking
• Key Observation: RFID Items can be efficiently tracked by tracking the groups to which an item belongs. For example– Groups of items in the same proximity: e.g.
on a shelf, on a shipment– Groups of items with same property: e.g.
items of a single product, items of a single product with same expiry date
• Our Solution: Use a bitmap type for modeling a collection of EPCs that can occur in item tracking applications.
VLDB 2005
A Bitmap Datatype
VLDB 2005
Example: Product Inventory
• With EPC Collections
• With epc_bitmaps
Store_id
Prod_id Time Item_collection
s1 p1 t1 epc11, epc12, epc13,
…
s1 p2 t2 epc21, epc22, epc23,
…
… … … …
Store_id
Prod_id
Time Item_bmap
s1 p1 t1 bmap1
s1 p2 t2 bmap2
… … … …
VLDB 2005
epc_bitmap Datatype
• A new type to represent a collection of EPCs with a common prefix
CREATE TYPE epc_bitmap_segment ( epc_length NUMBER, epc_suffix_length NUMBER, epc_prefix RAW, epc_suffix_start RAW, epc_suffix_end RAW, epc_suffix_bitmap RAW );
• epc_bitmap type to represent a collection of EPCs
CREATE TYPE epc_bitmap IS TABLE of epc_bitmap_segment;
VLDB 2005
epc_bitmap_segment Datatype
Header EPC_Manager Object_Class Serial_Number
2-bits 21-bits 17-bits 24-bits
0x4AA890001F62C160 ………………………… 0x4AA890001FA0B38E
Len Suff_len
Prefix Suff_start Suff_end bitmap
64 24 0x4AA890001F 0x62C160 0xA0B38E 101001…00010
VLDB 2005
epc_bitmap Operations
Conversion Operations
epc2Bmap, bmap2Epc, and bmap2Count
Pairwise Logical Operations
bmapAnd, bmapOr, bmapMinus, and bmapXor
Maintenance Operations
bmapInsert and bmapDelete
Membership Testing Operation
bmapExists
Comparison Operation
bmapEqual
VLDB 2005
Applications
VLDB 2005
Applications: Shelf Analytics
• Determine the items added to a shelf between time t1 and t2
SELECT bmap2Epc(bmapMinus(s2.item_bmap, s1.item_bmap))
FROM Shelf_Inventory s1, Shelf_Inventory s2
WHERE s1.shelf_id = <sid1> AND
s1.shelf_id = s2.shelf_id AND
s1.time=<t1> AND s2.time=<t2>;
Shelf_Inventory
Shelf_id Time Item_bmap
sid1 t1 bmp1
sid1 t2 bmp2
… … …
VLDB 2005
Applications: Product Recall• Identify the stores that currently have recalled items SELECT Store_id FROM Product_Inventory P
WHERE bmap2Count(bmapAnd(P.Item_bmap,
epc2Bmap(<recall_items>))) > 0
AND P.Product_id=<recall_product_id>
AND P.Time = <current_time>;
Product_Inventory
Prod_id Store_id Time Item_bmap
p1 s1 t1 bmap1
p1 s2 t2 bmap2
… … … …
VLDB 2005
Implementation
VLDB 2005
Implementation
• Leverages Oracle’s Bitmap Index Implementation
• Uses G. Antoshenkov’s “Byte Aligned Data Compression” for creation and operation of bitmaps for rowid collections.
• A collection of EPCs is grouped into different epc_bitmap_segments, – When the previous epc_suffix_bitmap
(maximum size of RAW = 2000) can not hold any more EPCs
– When epc_prefixes are different
VLDB 2005
Persistent epc_bitmap
• Stored in a primary B+-tree structure
primary B+-tree structure
primary key columns
non-key columns
Data1 Data2 Epc_bitmap
… … …
… … set_id_I
… … …
set_id_i epc_length
epc_suffix_length
epc_prefix
epc_suffix_start
epc_suffix_end
epc_suffix_bitmap
VLDB 2005
Query Processing & Optimization
• epc_bitmap function is executed per row• Queries with predicates on bitmap columns
can be speeded up
– Using Function-based B-tree index: On pre-computed value of bmap2Count(bmap_col)
SELECT S.shelf_id
FROM Shelf_inventory S
WHERE bmap2Count(S.item_bmap) > 0 ;
– Using Summary Bitmap Index
VLDB 2005
Summary Bitmap
• Specialized index structure to speed up query to detect the presence of an epc identifier
SELECT s.shelf_id, s.time FROM Shelf_Inventory s WHERE bmapExists(s.item_bmap, <epc1>);
• Create a summary bitmap sbmap using OR operation on {bmap1, …, bmapk}:
bmapExists(sbmap)
i 1…k: bmapExists(bmapi)
VLDB 2005
Summary Bitmap Index Structure
root node
leaf nodes
branch nodes
…
…
EPC Bitmap column
Bmp1
Bmp2
<Bmp1,rowid1>
Table
A Summary Bitmap Tree
…
…Bmpn
<Bmpn,rowidn>
VLDB 2005
Summary Bitmap Index Algorithm
DFS(X, epc):X: a node in the tree index; epc: a given epc;bmp(X): the epc_bitmap associated with node X;
if ( bmapExists(bmp(X), epc)== TRUE){ if (X == leaf node) report bmp(X); else for each child node V of X do DFS(V, epc);}Complexity:O(1) , no candidate O(m log N), m: fan-out, N: no. of rows
VLDB 2005
Performance Study
VLDB 2005
Experimental Setup
• Data represents EPC-64 collections, which are randomly generated with a uniform distribution
• Data sampled every hour for 30 days(720 obs.), 300 days (7200 obs) and 3000 days (72000 obs).
• EPC collections of size 100000, 10000 and 1000 represent typical number of items in department and/or shelf
VLDB 2005
Storage Comparison
0 500 1000 1500
MBytes
7200000rows10EPCs
720000rows100EPCs
72000rows1000EPCs
7200rows10000EPCs
720rows100000EPCs
Epc_ bitmap Type
Collection Type
VLDB 2005
Bulk Load Performance
8782
2367
0
2000
4000
6000
8000
10000
Seco
nds
Collection Type Epc_bitmap Type
7200 rows with 10000 EP Cs in each row
VLDB 2005
Query 1: Enumerate Removed Items
• Enumerate the items removed from a shelf between 2 time intervals
Collection Type Query SELECT b.epc_value FROM epc_coll a, TABLE(a.epcs) b WHERE a.time = '2004-03-04 10:00 AM' MINUS SELECT b2.epc_value FROM epc_coll a2, TABLE(a2.epcs)
b2 WHERE a2.time = '2004-03-04 11:00 AM';
epc_bitmap Type Query SELECT * FROM TABLE( SELECT bmap2Epc(bmapMinus(p1.epcs,p2.epcs)) FROM epc_bmp p1, epc_bmp p2 WHERE p1.time = '2004-03-04 10:00 AM' AND p2.time = '2004-03-04 11:00 AM');
VLDB 2005
Query 1: Enumerate Removed Items
0
0.5
1
1.5
2
2.5
72000rows1000EPCs
7200rows10000EPCs
720rows100000EPCs
Seco
nds
CollectionType
Epc_bitmapType
VLDB 2005
Query 1 with Variant Datasets
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
1 10 100 1000 2500 5000 7500 10000
Number of differing items in epc collections
Sec
onds
CollectionTypeEpc_bitmapType
VLDB 2005
Query 2: Report All Observations
• Report all observations when a given EPC was present
Collection Type Query SELECT a.time FROM epc_coll a WHERE EXISTS(SELECT 1 FROM TABLE(a.epcs) b WHERE b.epc_value =
‘400003000300052A’); epc_bitmap Type Query
SELECT time FROM epc_bmp WHERE bmapExists(epcs, ‘400003000300052A’);
VLDB 2005
Query 2: Report All Observations
0
50
100
150
200
250
300
350
72000rows
1000EPCs
7200rows
10000EPCs
720rows
100000EPCs
Seco
nds
CollectionTypeEpc_bitmapType
VLDB 2005
Performance Discussion
• Storage savings of 2 to 8 times due to compressed bitmaps
• Bulk load performance faster due to smaller storage needs
• For collections of size > 1000, query performance gains of 10% to 1200%
• For smaller collection sizes, performance is same as native collections
VLDB 2005
Conclusions
• epc_bitmap can model transient and persistent EPC collections
• epc_bitmap can be used to support RFID-based item tracking applications
• An efficient primary B+-tree based storage and access mechanism
• Performance experiments validate the feasibility and benefits of the epc_bitmap
• Support can be extended to Row Identifier (rowid) and Life Science Identifier (LSID) collections
VLDB 2005
AQ&
Q U E S T I O N SQ U E S T I O N SA N S W E R SA N S W E R S