Autograder
![Page 1: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/1.jpg)
Autograder
Rishabh Singh, Sumit Gulwani, Armando Solar-Lezama
![Page 2: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/2.jpg)
Feedback on Programming Assignments
• Test-case-based feedback
  • Hard to relate failing inputs to errors
• Manual feedback by TAs
  • Time-consuming and error-prone
![Page 3: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/3.jpg)
6.00 Student Feedback (2013)
"Not only did it take 1-2 weeks to grade problem, but the comments were entirely
unhelpful in actually helping us fix our errors. …. Apparently they don't read the code -- they
just ran their tests and docked points mercilessly. What if I just had a simple typo,
but my algorithm was fine? ...."
![Page 4: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/4.jpg)
Bigger Challenge in MOOCs
Scalability Challenges (>100k students)
![Page 5: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/5.jpg)
Today’s Grading Workflow
Student submission:
def computeDeriv(poly):
    deriv = []
    zero = 0
    if (len(poly) == 1):
        return deriv
    for e in range(0, len(poly)):
        if (poly[e] == 0):
            zero += 1
        else:
            deriv.append(poly[e]*e)
    return deriv
Grader's inputs: Teacher’s Solution, Grading Rubric
Feedback: replace deriv by [0]
![Page 6: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/6.jpg)
Autograder Workflow
AG takes the same student computeDeriv submission as in Today’s Grading Workflow, plus:
Teacher’s Solution
Error Model
and produces the feedback: replace deriv by [0]
![Page 7: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/7.jpg)
Technical Challenges
• Large space of possible corrections
• Minimal corrections
• Dynamically-typed language
Constraint-based synthesis to the rescue
![Page 8: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/8.jpg)
Running Example
![Page 9: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/9.jpg)
computeDeriv
Compute the derivative of a polynomial
poly = [10, 8, 2]  # f(x) = 10 + 8x + 2x^2
=> [8, 4]          # f'(x) = 8 + 4x
![Page 10: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/10.jpg)
Teacher’s solution
def computeDeriv(poly):
    result = []
    if len(poly) == 1:
        return [0]
    for i in range(1, len(poly)):
        result += [i * poly[i]]
    return result
![Page 11: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/11.jpg)
Demo
![Page 12: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/12.jpg)
Simplified Error Model
• return a → return {[0], ?a}
• range(a1, a2) → range(a1+1, a2)
• a0 == a1 → False
![Page 13: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/13.jpg)
Autograder Algorithm
![Page 14: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/14.jpg)
Algorithm
Rewriter → Translator → Solver → Feedback
Artifacts passed between stages: .py, .sk, .out
![Page 15: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/15.jpg)
Algorithm: Rewriter
Rewriter → Translator → Solver → Feedback
Input: .py
![Page 16: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/16.jpg)
Rewriting using Error Model
range(0, len(poly))
a → a+1
range({0, 1}, len(poly))
default choice: 0
![Page 17: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/17.jpg)
Rewriting using Error Model
range(0, len(poly))
a → a+1
range({0, 1}, len(poly))
![Page 18: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/18.jpg)
Rewriting using Error Model
range(0, len(poly))
a → a+1
range({0, 1}, len({poly, poly+1}))
![Page 19: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/19.jpg)
Rewriting using Error Model
range(0, len(poly))
a → a+1
range({0, 1}, {len({poly, poly+1}), len({poly, poly+1})+1})
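The step-by-step expansion on these slides can be sketched in a few lines of Python. This is only an illustration under assumptions: the tuple-based expression encoding and the helper names (`with_choices`, `rewrite_args`, `render`) are hypothetical and not taken from the Autograder tool. It applies the single rule a → a+1 bottom-up to every argument and keeps the original expression as the first (default) option of each choice set.

```python
# Expressions are nested tuples (hypothetical encoding):
#   ("const", 0), ("var", "poly"), ("call", "len", [args]),
#   ("plus1", e), ("choice", [default, alternative])

def render(e):
    """Pretty-print an expression, writing choice sets as {a, b}."""
    kind = e[0]
    if kind == "const":
        return str(e[1])
    if kind == "var":
        return e[1]
    if kind == "call":
        return f"{e[1]}({', '.join(render(a) for a in e[2])})"
    if kind == "plus1":
        return f"{render(e[1])}+1"
    if kind == "choice":
        return "{" + ", ".join(render(o) for o in e[1]) + "}"
    raise ValueError(f"unknown node {kind!r}")

def with_choices(e):
    """Rewrite sub-expressions first, then offer {e, e+1}; e stays the default."""
    if e[0] == "call":
        e = ("call", e[1], [with_choices(a) for a in e[2]])
    # Fold constants so that 0 -> {0, 1} rather than {0, 0+1}.
    alt = ("const", e[1] + 1) if e[0] == "const" else ("plus1", e)
    return ("choice", [e, alt])

def rewrite_args(call):
    """Apply the rule a -> a+1 to every argument of a call such as range(...)."""
    return ("call", call[1], [with_choices(a) for a in call[2]])

student_expr = ("call", "range",
                [("const", 0), ("call", "len", [("var", "poly")])])
print(render(rewrite_args(student_expr)))
```

The printed expression has the same shape as the final slide of this sequence: every nested argument carries its own choice set, which is why the space of candidate corrections grows quickly.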
![Page 20: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/20.jpg)
Rewriting using Error Model
def computeDeriv(poly):
    deriv = []
    zero = 0
    if ({len(poly) == 1, False}):
        return {deriv,[0]}
    for e in range({0,1}, len(poly)):
        if (poly[e] == 0):
            zero += 1
        else:
            deriv.append(poly[e]*e)
    return {deriv,[0]}
Problem: Find a program that minimizes the cost metric and is functionally equivalent to the teacher’s solution
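To make the problem statement concrete, here is a brute-force Python sketch of the search. It is not how the tool works (the talk uses constraint-based synthesis); it only illustrates the objective. Assumptions: equivalence is checked on a small bounded set of inputs, cost is the number of non-default choices, and one extra choice site is added by applying the rule a0 == a1 → False to the comparison `poly[e] == 0`, which this slide does not show.

```python
from itertools import product

def teacher(poly):
    result = []
    if len(poly) == 1:
        return [0]
    for i in range(1, len(poly)):
        result += [i * poly[i]]
    return result

def student(poly, c):
    """Student program with choice sets; c[i] == 0 selects the original text."""
    deriv = []
    first_test = (len(poly) == 1) if c[0] == 0 else False
    if first_test:
        return deriv if c[1] == 0 else [0]
    start = 0 if c[2] == 0 else 1
    for e in range(start, len(poly)):
        # Assumed extra site: a0 == a1 -> False applied to this comparison.
        is_zero = (poly[e] == 0) if c[3] == 0 else False
        if not is_zero:  # the student's unused `zero` counter is omitted
            deriv.append(poly[e] * e)
    return deriv if c[4] == 0 else [0]

def bounded_inputs(max_len=3, values=range(-2, 3)):
    """All coefficient lists of length 1..max_len over a small value range."""
    for n in range(1, max_len + 1):
        for coeffs in product(values, repeat=n):
            yield list(coeffs)

def min_cost_fix():
    """Try choice assignments in order of increasing cost; return the first match."""
    for c in sorted(product((0, 1), repeat=5), key=sum):
        if all(student(p, c) == teacher(p) for p in bounded_inputs()):
            return c, sum(c)
    return None

print(min_cost_fix())
```

Under these assumptions the cheapest equivalent program changes three sites: the first return, the start of the range, and the zero test.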
![Page 21: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/21.jpg)
Algorithm: Translator
Rewriter → Translator → Solver → Feedback
Output: .sk
![Page 22: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/22.jpg)
Sketch [Solar-Lezama et al. ASPLOS06]
Statically typed C-like language with holes
Sketch with a hole:
void main(int x){
    int k = ??;
    assert x + x == k * x;
}
Completed by the synthesizer:
void main(int x){
    int k = 2;
    assert x + x == k * x;
}
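As a rough intuition for how a hole such as `??` gets resolved, the following Python toy runs a counterexample-guided loop (the talk mentions later that Sketch uses CEGIS). Everything here is an assumption made for illustration: the bounded candidate and input ranges, and the function name `cegis`. The real solver works on SAT encodings, not on enumeration.

```python
def cegis(candidates, inputs, spec):
    """Toy counterexample-guided search for a hole value.

    candidates: possible hole values; inputs: bounded input domain;
    spec(k, x): True when the assertion holds for hole k on input x.
    """
    examples = []
    while True:
        # Synthesis step: pick any candidate consistent with the examples so far.
        cand = next((k for k in candidates
                     if all(spec(k, x) for x in examples)), None)
        if cand is None:
            return None  # no hole value works
        # Verification step: look for an input that breaks the candidate.
        cex = next((x for x in inputs if not spec(cand, x)), None)
        if cex is None:
            return cand  # holds on every bounded input
        examples.append(cex)

# The slide's sketch: assert x + x == k * x
k = cegis(range(0, 32), range(-8, 8), lambda k, x: x + x == k * x)
print(k)
```

Each counterexample rules out at least the current candidate, so the loop terminates on finite domains.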
![Page 23: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/23.jpg)
Translation to Sketch
(1) Handling Python’s dynamic types
(2) Translation of expression choices
![Page 24: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/24.jpg)
(1) Handling Dynamic Types
Each value carries a type tag (int, bool, list, …) and one field per type (ival, bval, lst, …)
struct MultiType{
    int type;
    int ival;
    bit bval;
    MTString str;
    MTList lst;
    MTDict dict;
    MTTuple tup;
}
![Page 25: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/25.jpg)
Python Constants using MultiType
1 → type = INT, ival = 1
2 → type = INT, ival = 2
[1,2] → type = LIST, lst: len=2, lVals = { *, * }
![Page 26: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/26.jpg)
Python Exprs using MultiType
x + y
operands: (type = INT, ival = 10) and (type = INT, ival = 5)
result: (type = INT, ival = 15)
![Page 27: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/27.jpg)
Python Exprs using MultiType
x + y
operands: (type = LIST, lst = [1,2]) and (type = LIST, lst = [3])
result: (type = LIST, lst = [1,2,3])
![Page 28: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/28.jpg)
Python Expressions using MultiType
x + y
operands: (type = LIST, lst = [3]) and (type = INT, ival = 5)
Typing rules are encoded as constraints
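The MultiType slides can be mirrored in Python as a tagged record plus an addition that checks the tags first. This is a hedged sketch: the names `mt_int`, `mt_list`, and `mt_add` are hypothetical, only the int and list cases from the slides are modelled, and a raised `TypeError` stands in for the constraint that the Sketch encoding would assert.

```python
from dataclasses import dataclass, field
from typing import List

INT, BOOL, LIST = 0, 1, 2  # type tags (values chosen arbitrarily)

@dataclass
class MultiType:
    type: int
    ival: int = 0
    bval: bool = False
    lst: List["MultiType"] = field(default_factory=list)

def mt_int(n):
    return MultiType(INT, ival=n)

def mt_list(items):
    return MultiType(LIST, lst=[mt_int(i) for i in items])

def mt_add(x, y):
    """x + y over MultiType values; the typing rule becomes an explicit check."""
    if x.type != y.type:
        raise TypeError("operands of + must have the same type tag")
    if x.type == INT:
        return mt_int(x.ival + y.ival)
    if x.type == LIST:
        return MultiType(LIST, lst=x.lst + y.lst)
    raise TypeError("+ is not modelled for this type tag")

print(mt_add(mt_int(10), mt_int(5)).ival)
print([v.ival for v in mt_add(mt_list([1, 2]), mt_list([3])).lst])
```

Adding a list to an int, as on the last MultiType slide, fails the tag check instead of producing a value.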
![Page 29: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/29.jpg)
(2) Translation of Expression Choices
A choice set { a, b } (the two options appear as MultiType diagrams on the slide) becomes:
MultiType modifyMTi(a, b){
    if(??) return a;        // default choice
    else {
        choicei = True;
        return b;           // non-default choice
    }
}
![Page 30: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/30.jpg)
Translation to Sketch (Final Step)
harness main(int N, int[N] poly){
    MultiType polyMT = MTFromList(poly);
    MultiType result1 = computeDeriv_teacher(polyMT);
    MultiType result2 = computeDeriv_student(polyMT);
    assert MTEquals(result1,result2);
}
![Page 31: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/31.jpg)
Translation to Sketch (Final Step)
harness main(int N, int[N] poly){
    totalCost = 0;
    MultiType polyMT = MTFromList(poly);
    MultiType result1 = computeDeriv_teacher(polyMT);
    MultiType result2 = computeDeriv_student(polyMT);
    ………………
    if(choicek) totalCost++;
    ………………
    assert MTEquals(result1,result2);
    minimize(totalCost);    // Minimum Changes
}
![Page 32: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/32.jpg)
Algorithm: Solver
Rewriter → Translator → Solver → Feedback
Input: .sk, output: .out
![Page 33: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/33.jpg)
Solving for minimize(x)
• Binary search for x – no reuse of learnt clauses
• Incremental linear search – reuse learnt clauses
• Sketch uses CEGIS – multiple SAT calls
• MAX-SAT – too much overhead
![Page 34: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/34.jpg)
Incremental Solving for minimize(x)
(P, x) → Sketch → (P1, x=7)
add {x<7} → Sketch → (P2, x=4)
add {x<4} → Sketch → UNSAT
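The loop on this slide can be sketched in Python with a toy stand-in for the solver. Assumptions: `solve` is a hypothetical placeholder that simply returns the first feasible cost below the current bound, so it shows only the control flow. The benefit named in the talk, reusing learnt clauses between calls, lives inside a real incremental solver and is not modelled here.

```python
def solve(feasible_costs, bound):
    """Toy solver call: some feasible cost below `bound`, or None for UNSAT."""
    for x in feasible_costs:
        if bound is None or x < bound:
            return x
    return None

def minimize(feasible_costs):
    """Incremental linear search: keep tightening x < best until UNSAT."""
    best, bound, trace = None, None, []
    while True:
        x = solve(feasible_costs, bound)
        if x is None:          # UNSAT: the last solution found is minimal
            return best, trace
        trace.append(x)
        best, bound = x, x     # add the constraint {x < best}

best, trace = minimize([7, 4, 9])
print(best, trace)
```

With feasible costs 7, 4 and 9, the calls return 7, then 4, then UNSAT, matching the sequence drawn on the slide.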
![Page 35: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/35.jpg)
Algorithm: Feedback
Rewriter → Translator → Solver → Feedback
Input: .out, output: feedback text
![Page 36: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/36.jpg)
Feedback Generation
Correction rules associated with Feedback Template
Extract synthesizer choices to fill templates
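One way to picture template filling is the short Python sketch below. The rule names, template wording, line numbers, and the shape of the `choices` records are all hypothetical; the slides state only that each correction rule has an associated template and that the synthesizer's choices fill it.

```python
# Hypothetical templates keyed by correction-rule name.
TEMPLATES = {
    "RETURN": "In the return statement on line {line}, replace {old} by {new}.",
    "RANGE_START": "In the expression {expr} on line {line}, increment {old} by 1.",
    "COMPARE": "In the comparison on line {line}, change {old} to {new}.",
}

def generate_feedback(choices):
    """Fill one template per non-default choice reported by the solver."""
    lines = [TEMPLATES[c["rule"]].format(**c) for c in choices]
    header = f"The program requires {len(lines)} change(s):"
    return "\n".join([header] + [f"- {line}" for line in lines])

# Illustrative solver output (line numbers are made up).
choices = [
    {"rule": "RETURN", "line": 5, "old": "deriv", "new": "[0]"},
    {"rule": "RANGE_START", "line": 6, "expr": "range(0, len(poly))", "old": "0"},
]
print(generate_feedback(choices))
```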
![Page 37: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/37.jpg)
Evaluation
![Page 38: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/38.jpg)
Autograder Tool for Python
Currently supports:
- Integers, Bool, Strings, Lists, Dictionary, Tuples
- Closures, limited higher-order functions, list comprehensions
![Page 39: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/39.jpg)
Benchmarks
Exercises from the first five weeks of 6.00x and 6.00
int: prodBySum, compBal, iterPower, recurPower, iterGCD
tuple: oddTuple
list: compDeriv, evalPoly
string: hangman1, hangman2
arrays(C#): APCS dynamic programming (Pex4Fun)
![Page 40: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/40.jpg)
Benchmark             Test Set
evalPoly-6.00         13
compBal-stdin-6.00    52
compDeriv-6.00        103
hangman2-6.00x        218
prodBySum-6.00        268
oddTuples-6.00        344
hangman1-6.00x        351
evalPoly-6.00x        541
compDeriv-6.00x       918
oddTuples-6.00x       1756
iterPower-6.00x       2875
recurPower-6.00x      2938
iterGCD-6.00x         2988
![Page 41: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/41.jpg)
Average Running Time (in s)
[Bar chart: average running time per benchmark; y-axis "Time (in s)", 0 to 35; x-axis lists the 13 benchmarks from prodBySum-6.00 to hangman2-6.00x.]
![Page 42: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/42.jpg)
Feedback Generated (Percentage)
[Bar chart: feedback percentage per benchmark; y-axis "Feedback Percentage", 0% to 90%; x-axis lists the 13 benchmarks from evalPoly-6.00x to compDeriv-6.00.]
![Page 46: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/46.jpg)
Average Performance
Test Set   Generated Feedback   Percentage   Avg Time (s)
13365      8579                 64.19%       9.91
![Page 47: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/47.jpg)
Why low % in some cases?
• Completely Incorrect Solutions
• Unimplemented Python Features
• Timeout
  • comp-bal-6.00
• Big Conceptual errors
![Page 48: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/48.jpg)
Big Error: Misunderstanding APIs
• eval-poly-6.00x
def evaluatePoly(poly, x):
    result = 0
    for i in list(poly):
        result += i * x ** poly.index(i)
    return result
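A short Python check shows why this counts as an API misunderstanding: `list.index` returns the position of the first matching element, so the submission is correct only when all coefficients are distinct. The reference function below is a hypothetical stand-in written for this comparison, not the course's solution.

```python
def evaluatePoly_student(poly, x):
    result = 0
    for i in list(poly):
        # poly.index(i) is the FIRST position holding value i,
        # not the position of the current element.
        result += i * x ** poly.index(i)
    return result

def evaluatePoly_reference(poly, x):
    """Hypothetical reference: sum of poly[k] * x**k."""
    return sum(c * x ** k for k, c in enumerate(poly))

# Distinct coefficients: both agree.
print(evaluatePoly_student([1, 2, 3], 2), evaluatePoly_reference([1, 2, 3], 2))
# Repeated coefficient: the student version reuses exponent 0 for both terms.
print(evaluatePoly_student([2, 2], 3), evaluatePoly_reference([2, 2], 3))
```

No small local rewrite of the loop body repairs this, since the fix needs the element's position, which the loop never tracks.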
![Page 49: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/49.jpg)
Big Error: Misunderstanding Spec
• hangman2-6.00x
def getGuessedWord(secretWord, lettersGuessed):
    for letter in lettersGuessed:
        secretWord = secretWord.replace(letter, '_')
    return secretWord
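The submission does the opposite of what the exercise appears to want: it hides the guessed letters instead of the unguessed ones. The Python comparison below assumes a simplified spec (guessed letters shown, every other letter replaced by a single `_`); the exact output format required by 6.00x is not given in the slides, so `getGuessedWord_assumed` is a hypothetical reference.

```python
def getGuessedWord_student(secretWord, lettersGuessed):
    for letter in lettersGuessed:
        secretWord = secretWord.replace(letter, '_')
    return secretWord

def getGuessedWord_assumed(secretWord, lettersGuessed):
    """Assumed spec: reveal guessed letters, hide the rest with '_'."""
    return ''.join(ch if ch in lettersGuessed else '_' for ch in secretWord)

print(getGuessedWord_student('apple', ['a', 'p']))
print(getGuessedWord_assumed('apple', ['a', 'p']))
```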
![Page 50: Autograder](https://reader037.fdocuments.in/reader037/viewer/2022103006/568132af550346895d9962eb/html5/thumbnails/50.jpg)
Conclusion
• A technique for automated feedback generation
  • Error models, constraint-based synthesis
• Provide a basis for automated feedback for MOOCs
• Towards building a Python Tutoring
Thanks!