Dependently Typed Data Structures

Hongwei Xi
hwxi@ececs.uc.edu

presented by James Hook (hook@cse.ogi.edu)
Pacific Software Research Center
Oregon Graduate Institute


Hongwei’s Program

• Extend ML-like typechecking with computationally tractable mechanisms modeled on dependent type systems to get more expressive type systems without sacrificing practicality.

• Dependent ML --- his thesis at CMU

• de Caml --- a prototype implementation based on Caml


This Talk

• Apply “Hongwei’s types” to “Chris’ programs”

• Goal is to express invariants of data structures in the extended type system

• I will focus on red black trees


The Types

• ML types indexed by integers and integer expressions

datatype 'a list with nat =
    nil(0)
  | {n:nat} cons(n+1) of 'a * 'a list(n)

nil is the list of length 0

cons builds a list of length n+1 from an element and a list of length n

{}'s are explicit universal quantifiers for constraints

The parenthesized arguments, such as 0 and n+1, are index expressions


Example

let rec append = function
    ([], ys) -> ys
  | (x :: xs, ys) -> x :: append(xs, ys)
withtype {m:nat}{n:nat} 'a list(m) * 'a list(n) -> 'a list(m+n)

Given a list of length m and a list of length n, append produces a list of length m+n.
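
For readers without de Caml at hand, the length-indexed list and append can be approximated in Haskell. This is a minimal sketch that swaps in GHC's GADTs and type families for DML's integer indices and constraint solver; the names Nat, Vec, Add, and toList are assumptions made for illustration and do not come from the talk.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)

-- Type-level natural numbers, standing in for the DML index sort nat.
data Nat = Z | S Nat

-- A list whose length is tracked in its type, in the spirit of 'a list(n).
data Vec (n :: Nat) (a :: Type) where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- Type-level addition, standing in for the index expression m+n.
type family Add (m :: Nat) (n :: Nat) :: Nat where
  Add 'Z     n = n
  Add ('S m) n = 'S (Add m n)

-- The result length is the sum of the argument lengths, checked by the compiler.
append :: Vec m a -> Vec n a -> Vec (Add m n) a
append Nil         ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

-- Forget the index to inspect the contents.
toList :: Vec n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

Unlike DML, this encoding uses unary naturals and structural equations rather than a solver for integer constraints, so it illustrates the typing idea only.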


Typechecking

de Caml source -> Caml type check plus constraint generation -> Constraint Solver

Caml type check plus constraint generation reports ML-style type errors

Constraint Solver reports constraint failures


Red Black Trees

• Balanced, ordered, binary trees, values at nodes

• Every node is colored red or black

• Balance Invariant:
  – No red node has a red node as a child
  – There is an equal number of black nodes on all root/leaf paths
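
The two balance invariants can be stated as a run-time check. The sketch below is an illustration, not part of the talk: it uses the Color and RedBlackSet declarations that appear later in the deck, and the helper names rootColor, noRedRed, blackHeight, and isBalanced are assumptions. The ordering invariant is not checked here.

```haskell
data Color = R | B deriving (Eq, Show)

data RedBlackSet a = E | T Color (RedBlackSet a) a (RedBlackSet a)
  deriving (Eq, Show)

-- Color of the root; the empty tree counts as black.
rootColor :: RedBlackSet a -> Color
rootColor E           = B
rootColor (T c _ _ _) = c

-- True when no red node has a red child.
noRedRed :: RedBlackSet a -> Bool
noRedRed E = True
noRedRed (T c l _ r) =
  (c == B || (rootColor l == B && rootColor r == B))
    && noRedRed l && noRedRed r

-- Just n when every root/leaf path has n black nodes, Nothing otherwise.
blackHeight :: RedBlackSet a -> Maybe Int
blackHeight E = Just 0
blackHeight (T c l _ r) = do
  hl <- blackHeight l
  hr <- blackHeight r
  if hl == hr
    then Just (hl + (if c == B then 1 else 0))
    else Nothing

-- Both balance invariants together.
isBalanced :: RedBlackSet a -> Bool
isBalanced t = noRedRed t && blackHeight t /= Nothing
```

The rest of the talk is about moving exactly these two checks from run time into the type of the tree.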


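Example

[Figure: a red black tree holding 1, 3, 4, 5, 6, and 7, before and after inserting 8]

Insert 8

Inserting 8 as a black node violates the invariant of equal numbers of black nodes on all root/leaf paths

All insertions must be red to maintain the global invariant of equal numbers of black nodes on root/leaf paths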


Example

[Figure: the same tree (1, 3, 4, 5, 6, 7) after inserting 2 as a red node]

Insert 2

Now we have a “red red” violation

Red red violations are repairable; they are limited to the path to the root


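Repairing a Red Red Violation

[Figure: a red red configuration over keys x, y, z and subtrees a, b, c, d is rewritten to a tree rooted at y with children x and z and subtrees a, b, c, d]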


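Repairing a Red Red Violation

[Figure: all four red red configurations over keys x, y, z and subtrees a, b, c, d; each is rewritten to the same tree rooted at y with children x and z and subtrees a, b, c, d]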


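Example

Insert 2

[Figure: the repair applied to the example tree; the nodes 1, 2, 3 are matched against x, y, z and rewritten to a subtree rooted at 2 with children 1 and 3]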


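Example

Insert 2

[Figure: the repaired subtree (2 with children 1 and 3) shown within the tree over 4, 5, 6, 7, followed by the final tree rooted at 4 with children 2 and 6 and leaves 1, 3, 5, 7]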


Okasaki’s Solution

The Types Capture the Tree Structure:

data Color = R | B
data RedBlackSet a = E | T Color (RedBlackSet a) a (RedBlackSet a)


Okasaki’s Solution

The balance function does the rotation if there is a red red conflict

[Figure: the four red red configurations over keys x, y, z and subtrees a, b, c, d, each rewritten to the tree rooted at y with children x and z]

balance :: Color -> RedBlackSet a -> a -> RedBlackSet a -> RedBlackSet a
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance color a x b = T color a x b

When given two trees of equal black height, these clauses produce a tree of black height one greater
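
A small worked example may help in reading the clauses. The sketch below repeats the declarations from the slides so that it stands alone; the values leftLeft and rightLeft are hypothetical names introduced here, each feeding one red red configuration to balance.

```haskell
data Color = R | B deriving (Eq, Show)

data RedBlackSet a = E | T Color (RedBlackSet a) a (RedBlackSet a)
  deriving (Eq, Show)

balance :: Color -> RedBlackSet a -> a -> RedBlackSet a -> RedBlackSet a
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance color a x b = T color a x b

-- First clause: black 3 whose left child is red 2 with red left child 1.
leftLeft :: RedBlackSet Int
leftLeft = balance B (T R (T R E 1 E) 2 E) 3 E

-- Third clause: black 1 whose right child is red 3 with red left child 2.
rightLeft :: RedBlackSet Int
rightLeft = balance B E 1 (T R (T R E 2 E) 3 E)
```

Both values evaluate to T R (T B E 1 E) 2 (T B E 3 E): a red root 2 with black children 1 and 3, which is the common right-hand side of the four rotation clauses.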


Okasaki’s Solution

data Color = R | B
data RedBlackSet a = E | T Color (RedBlackSet a) a (RedBlackSet a)

balance :: Color -> RedBlackSet a -> a -> RedBlackSet a -> RedBlackSet a

insert :: Ord a => a -> RedBlackSet a -> RedBlackSet a
insert x s = T B a y b
  where T _ a y b = ins s
        ins E = T R E x E
        ins s@(T color a y b)
          | x < y = balance color (ins a) y b
          | x > y = balance color a y (ins b)
          | True  = s

Where do red red conflicts come from?

To answer this question we expand ins to distinguish on color, and we recall that balance only rotates trees with black roots


Okasaki’s Solution

[Figure: a node y with subtrees a and b; ins is applied to a]

balance :: Color -> RedBlackSet a -> a -> RedBlackSet a -> RedBlackSet a

insert :: Ord a => a -> RedBlackSet a -> RedBlackSet a
insert x s = T B a y b
  where T _ a y b = ins s
        ins E = T R E x E
        ins s@(T B a y b)
          | x < y = balance B (ins a) y b
          | x > y = balance B a y (ins b)
          | True  = s
        ins s@(T R a y b)
          | x < y = T R (ins a) y b
          | x > y = T R a y (ins b)
          | True  = s

ins on a black node yields a red black tree

ins on a red node may yield one red red violation at the root

insert always yields a good tree because it recolors the root
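
The expanded insert can be assembled into a module that loads as it stands. This is a sketch under stated assumptions: the local variables inside ins are renamed to l, v, r to avoid shadowing, and the helpers fromList and toList are additions for experimentation that do not appear on the slides.

```haskell
data Color = R | B deriving (Eq, Show)

data RedBlackSet a = E | T Color (RedBlackSet a) a (RedBlackSet a)
  deriving (Eq, Show)

balance :: Color -> RedBlackSet a -> a -> RedBlackSet a -> RedBlackSet a
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance color a x b = T color a x b

insert :: Ord a => a -> RedBlackSet a -> RedBlackSet a
insert x s = T B a y b            -- recolor the root black
  where
    T _ a y b = ins s             -- ins never returns E
    ins E = T R E x E             -- new nodes are always red
    ins t@(T B l v r)             -- black node: repair any violation below
      | x < v     = balance B (ins l) v r
      | x > v     = balance B l v (ins r)
      | otherwise = t
    ins t@(T R l v r)             -- red node: may leave a violation at its root
      | x < v     = T R (ins l) v r
      | x > v     = T R l v (ins r)
      | otherwise = t

-- Hypothetical helper: build a set by repeated insertion.
fromList :: Ord a => [a] -> RedBlackSet a
fromList = foldr insert E

-- Hypothetical helper: in-order traversal.
toList :: RedBlackSet a -> [a]
toList E           = []
toList (T _ l v r) = toList l ++ [v] ++ toList r
```

In GHCi, toList (fromList [6, 7, 5, 1, 4, 3, 2]) lists the keys in ascending order, and the root of any tree returned by insert is black.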


Hongwei’s Solution

data Color = R | B
data RedBlackSet a = E | T Color (RedBlackSet a) a (RedBlackSet a)

sort color == {a:int | 0 <= a <= 1};;

datatype tree with (color, nat, nat) = (* color, black height, violation *)
    E(0, 0, 0)
  | {cl:color}{cr:color}{bh:nat}
    B(0, bh+1, 0) of tree(cl, bh, 0) * key * tree(cr, bh, 0)
  | {cl:color}{cr:color}{bh:nat}
    R(1, bh, cl+cr) of tree(cl, bh, 0) * key * tree(cr, bh, 0);;

General Approach: Index the datatype to record “black height” and to detect “red red violations”

Detecting red red violations in the type indices requires an arithmetic encoding of color in the type index set, in addition to the value distinction in the program


Reading the Datatype

sort color == {a:int | 0 <= a <= 1};;

datatype tree with (color, nat, nat) = (* color, black height, violation *)
    E(0, 0, 0)
  | {cl:color}{cr:color}{bh:nat}
    B(0, bh+1, 0) of tree(cl, bh, 0) * key * tree(cr, bh, 0)
  | {cl:color}{cr:color}{bh:nat}
    R(1, bh, cl+cr) of tree(cl, bh, 0) * key * tree(cr, bh, 0);;

Convention: 0 = black, 1 = red (permits detecting red red with +)

The empty tree is black, has height 0, and no violations

A black node of height bh + 1 with no violations is constructed from two nodes of arbitrary color of height bh

A red node of height bh is constructed from two nodes of arbitrary color of height bh. The violation value of the node is the sum of the colors of its children, i.e. non-zero if either child is red. The children contain no violations.
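
A rough Haskell analogue of the indexed datatype is sketched below. It is an assumption-laden illustration, not the de Caml encoding: colors are promoted constructors (Red, Black) rather than the integers 0 and 1, the black height is a unary Nat, and there is no violation index, so a red node is simply required to have black children. It can therefore describe only violation-free trees, that is, the counterpart of tree(c, bh, 0), and not the intermediate results of ins.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat
data Color = Red | Black

-- Tree c h a: root color c, black height h, keys of type a.
data Tree (c :: Color) (h :: Nat) (a :: Type) where
  E :: Tree 'Black 'Z a
  B :: Tree cl h a -> a -> Tree cr h a -> Tree 'Black ('S h) a
  R :: Tree 'Black h a -> a -> Tree 'Black h a -> Tree 'Red h a

-- The black height is the same on every path by construction,
-- so following the left spine is enough to compute it.
blackHeight :: Tree c h a -> Int
blackHeight E         = 0
blackHeight (B l _ _) = 1 + blackHeight l
blackHeight (R l _ _) = blackHeight l

toList :: Tree c h a -> [a]
toList E         = []
toList (B l x r) = toList l ++ [x] ++ toList r
toList (R l x r) = toList l ++ [x] ++ toList r

-- A black root of black height 1 with two red children.
example :: Tree 'Black ('S 'Z) Int
example = B (R E 1 E) 2 (R E 3 E)

-- Rejected by the type checker, because a red node has a red child:
-- bad = R (R E 1 E) 2 E
```

The violation index in the de Caml version exists precisely so that the temporarily invalid trees produced by ins still have a type; expressing that here would need an additional index or a separate type for such intermediate trees.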


Hongwei’s Solution

let balance = function
    (R(R(a, x, b), y, c), z, d) -> R(B(a, x, b), y, B(c, z, d))
  | (R(a, x, R(b, y, c)), z, d) -> R(B(a, x, b), y, B(c, z, d))
  | (a, x, R(R(b, y, c), z, d)) -> R(B(a, x, b), y, B(c, z, d))
  | (a, x, R(b, y, R(c, z, d))) -> R(B(a, x, b), y, B(c, z, d))
  | (a, x, b) -> B(a, x, b)
withtype {cl:color}{cr:color}{bh:nat}{vl:nat}{vr:nat | vl+vr <= 1}
         tree(cl, bh, vl) * key * tree(cr, bh, vr) -> [c:color] tree(c, bh+1, 0);;

Recall from Okasaki’s solution:

balance :: Color -> RedBlackSet a -> a -> RedBlackSet a -> RedBlackSet a

As in the red red analysis, Hongwei only calls balance on black nodes, hence the Color argument is eliminated


Type of Balance

withtype {cl:color}{cr:color}{bh:nat}{vl:nat}{vr:nat | vl+vr <= 1}
         tree(cl, bh, vl) * key * tree(cr, bh, vr) -> [c:color] tree(c, bh+1, 0)

Given two trees of equal height bh, arbitrary color, and at most one red red violation between them, balance yields a tree with unspecified color of height bh+1 containing no violations


Hongwei’s Solution

let rec ins = function
    E -> R(E, x, E)
  | B(a, y, b) ->
      if x < y then balance(ins a, y, b)
      else if y < x then balance(a, y, ins b)
      else raise Item_already_exists
  | R(a, y, b) ->
      if x < y then R(ins a, y, b)
      else if y < x then R(a, y, ins b)
      else raise Item_already_exists
withtype {c:color}{bh:nat} tree(c, bh, 0) -> [c':color][v:nat | v <= c] tree(c', bh, v)

ins is essentially as before

The type of ins is now dramatically more expressive!

ins produces a tree of unspecified color with height equal to its input. If the root of the argument was red the tree may contain a violation. If it was black it contains no violations.


Hongwei’s Solution

let insert x t =
  let rec ins = ...
  withtype {c:color}{bh:nat} tree(c, bh, 0) -> [c':color][v:nat | v <= c] tree(c', bh, v)
  in match ins t with
       R(a, y, b) -> B(a, y, b)
     | t -> t
withtype {c:color}{bh:nat} key -> tree(c, bh, 0) -> [bh':nat] tree(0, bh', 0);;

Insert is also essentially unchanged

The type of insert now shows that both invariants are maintained by the operation. In particular, given a key and a red black tree of any height containing no violations, insert produces a tree with black root of some height containing no violations.


The Paper

• Braun Trees
  – The type of size guarantees it computes the size

• Random-Access Lists

• Binomial Heaps


Limitations

• Sometimes the programmer knows more than de Caml can figure out

• Not all integer constraints are decidable


Related Work

Refinement types (Freeman, Davies, Pfenning)

Indexed types (Zenger)

Sized types (Hughes, Pareto, Sabry)

Nested datatypes (Bird & Meertens, Okasaki, Hinze, etc.)


Contacting Hongwei

• hwxi@ececs.uc.edu

• http://www.ececs.uc.edu/~hwxi

• Tel +1 513 556 4762