Contributions of SCG to SDG
Karl Lieberherr
Northeastern University, College of Computer and Information Science, Boston, MA
Joint work with Ahmed Abdelmeged and Bryan Chadwick
Supported by Novartis
SCG = Scientific Community Game = Specker Challenge Game
Game Motto
• Want reliable software to solve a computational problem? Design a game where the winning team will create the software you want.
• (Want to teach a STEM domain? Design a game where the winning students demonstrate superior domain knowledge.)
04/22/23 Games for SD 2
SCG
• Make software development more scientific.
• Software developers
– propose claims about their software
– oppose claims made by others about their own software
  • refute claims
  • strengthen claims
• A claim is defined by its refutation protocol.
Claims and Refutation Protocol
• AliceClaim: I have a program that solves inputs in domain X with quality Q and resources R.
– AliceClaim(X,Q,R)
• Bob is critical. He prepares a tough input in X and gives it to Alice, who applies her program. Bob refutes iff Alice achieves < Q or uses > R.
– Refutation protocol
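The refutation protocol above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Claim`, `refutes`, and `alice_program` are hypothetical, and the program is assumed to report the quality it achieved and the resources it used as a pair.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Claim:
    """Hypothetical container for AliceClaim(X, Q, R)."""
    in_domain: Callable[[object], bool]  # membership test for domain X
    quality: float                       # promised quality Q
    resources: float                     # resource bound R (e.g. seconds)


def refutes(claim: Claim, tough_input: object,
            alice_program: Callable[[object], Tuple[float, float]]) -> bool:
    """Return True iff Bob refutes: Alice achieves < Q or uses > R."""
    if not claim.in_domain(tough_input):
        # Assumption: an input outside X is rejected rather than scored.
        raise ValueError("input is not in domain X")
    achieved_quality, used_resources = alice_program(tough_input)
    return achieved_quality < claim.quality or used_resources > claim.resources


# Toy usage: domain X = integers, Q = 0.9, R = 1.0.
claim = Claim(in_domain=lambda x: isinstance(x, int), quality=0.9, resources=1.0)
print(refutes(claim, 7, lambda x: (0.95, 0.5)))  # False: Alice defends
print(refutes(claim, 7, lambda x: (0.80, 0.5)))  # True: quality below Q
```

In a real game the administrator, not Alice, would measure quality and resources; the sketch leaves that out.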
Who are Alice and Bob?
• They are avatars developed by real Alice and real Bob.
• Alice and Bob compete with 10 other avatars in a full round-robin tournament.
• Who wins? The avatar with the highest reputation, i.e., the one with the strongest claims that were not successfully opposed (as in a real scientific community).
What we want
• Engage software developers
– let them produce software that models an organism that fends for itself in a real virtual world while producing the software we want. Have fun. Focus them.
– let them propose claims about the software they produce. Reward them when they
  • defend their claims successfully or
  • oppose the claims of others successfully.
Opening the development approach
• Problem to be solved: Develop the best practical algorithms for solving computational problems in domain X.
• Issue: There are probably hundreds of papers on the topic with isolated implementations. What are the best practical algorithms?
• Our solution: Use the scientific community game SCG(X) with a suitably designed claims language to compare the software. The winning avatar has the best practical algorithms/software.
Example: Independent Set
• An independent set in a graph is a set of mutually nonadjacent vertices. The problem of finding a maximum independent set in a graph is one of the most fundamental combinatorial NP-hard problems.
Example: Independent Set
• claim IndSet(n, 0.9, t(n)):
– Alice can construct graphs G with at most n vertices and she can construct a secret independent set I1 for G so that Bob, given G, size(I1) and t(n) minutes only, cannot find an independent set I2 with
  • size(I2) >= size(I1)*0.9.
Refutation Protocol
• Alice constructs graph G and deposits her secret independent set I1.
• Alice gives G as well as the size of I1 to Bob.
• Bob has 10 minutes to construct his independent set I2 which he gives to Alice.
• Alice reveals her secret set I1.
• Bob wins iff size(I2) >= size(I1)*0.9
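The final check of this protocol can be sketched as follows. The function names and the edge-list graph representation are assumptions made for illustration. The time limit is not modeled, and rejecting an invalid secret set I1 is an assumption the slides do not spell out.

```python
def is_independent_set(edges, vertices):
    """True iff no edge of the graph joins two of the given vertices."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)


def bob_wins(edges, i1, i2, factor=0.9):
    """Bob wins iff I2 is independent and size(I2) >= size(I1) * factor."""
    if not is_independent_set(edges, i1):
        # Assumption: the administrator rejects an illegal secret set.
        raise ValueError("Alice's secret set I1 is not independent")
    return is_independent_set(edges, i2) and len(i2) >= len(i1) * factor


# Toy usage on the path graph 0-1-2-3-4; Alice's secret set has size 3.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(bob_wins(path, {0, 2, 4}, {1, 3}))     # False: 2 < 3 * 0.9
print(bob_wins(path, {0, 2, 4}, {0, 2, 4}))  # True
```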
Benefits for IBM of using SCG(X)
• Teams perform know-how retrieval and integration and maybe some research.
– Participating teams try to find the best knowledge in the area.
– Claims language gives control!
• The non-opposed claims give hints about new X-specific knowledge.
• A well-tested solver for X-problems that integrates the current algorithmic knowledge in field X.
Benefits for IBM of using SCG(X)
• Also great for evaluating potential employees.
Avatars propose and oppose
[Figure: egoistic Alice (reputation 1000) has proposed claims CA1 to CA4; egoistic Bob (reputation 10) has proposed claims CB1 and CB2. (1) Bob opposes CA2. (2) Bob provides a problem. (3) Alice solves the problem, but not as well as she expected based on CA2. Bob WINS, Alice LOSES; transfer 200. Social welfare.]
Life of an avatar: (propose+ oppose+ provide* solve*)*
What is SCG(X)?
[Figure: Team Alice and Team Bob each design a problem solver, develop software, and deliver an avatar. Agent Alice: "I am the best." Agent Bob: "No!!" Administrator (SCG police): "Let’s play constructively."]
Competitive / collaborative
• Avatar Alice: claims H; loses reputation r, wins knowledge k.
• Avatar Bob: opposes H and refutes it by providing evidence for !H; wins reputation r, makes knowledge k public.
Disadvantages of SCG
• The game is addictive. After spending 4 hours fixing his avatar and still losing against Alice, Bob really wants to know why!
• Overhead to learn to define and participate in competitions.
• The administrator for SCG(X) must perfectly supervise the game. This includes checking the legality of X-problems.
– if the admin does not, cheap play
– watching over the admin
How to compensate for those disadvantages
• Warn the scholars.
• Use a gentleman’s security policy: report administrator problems, don’t exploit them to win.
• Occasionally hold a non-counting “attack the administrator” competition to find vulnerabilities in the administrator.
– both generic and X-specific vulnerabilities.
Related Work
• TopCoder
Conclusions
• SCG has many applications of potential value to IBM
– Training employees in constructive domains
– Software development process
– Hiring
– Driving innovation in constructive domains
Thank you
Software Development Governance
• Software Development Governance (SDG) is defined as:
– Establishing chains of responsibility, authority and communication to empower people within a software development organization
– Establishing measurement and control mechanisms to enable software developers, project managers and others within a software development organization to carry out their roles and responsibilities
Applications
• Develop algorithms/software for new computational domain X
– Scientific Community Game Software Development: Describe a problem domain X so that SCG(X) provides the best algorithms and their implementations for problems in X (best within the participating scientific community).
Plan
• Why is it relevant, useful?
– Larger context: Open Innovation, Wikinomics
– Applications: Netflix in the small, teaching
• What is it?
• What is new?
– Map problem domain to “second life”, find best solution there and map it back to real life.
• What do we improve: benefits of SCG
• How to use SCG
• Disadvantages
• Experience with current implementation
• Related work
• Detailed example
• Conclusions
Introduction (2)
• Scientific Community Game(X) [SCG(X)]
– Goal: Foster innovation and reliable software for solving optimization problems in some domain X
• A virtual scientific community consists of virtual scholars that propose and oppose claims to maximize their reputations
Claim
• Subdomain N
– subset of problems
• Confidence [0,1]
• Valuation [0,1]
[Figure: a claim plotted by confidence (0 to 1) and valuation (how well problems in N can be solved).]
Claim
[Figure: valuation axis from 0 to 1 with the labels "strengthening", "correct valuation", and "over strengthening".]
Hypothesis
• Hypothesis by Alice: for all problems F in niche N there exists a solution J: p(F,J)
• Bob opposes: he provides F’ to Alice; Alice cannot find J’: p(F’,J’), therefore she loses reputation.
Full Round Robin Tournaments or Swiss-Style
• Agents play SCG(X). Repeat a few times, using feedback to update the agents.
• Within the group of participating agents, the winning agent has the
– best solver for X-problems
– best supported knowledge about X
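A full round-robin schedule and the choice of winner can be sketched as below. The slides do not define the pairing rule, so this assumes that "full" means every agent meets every other agent once in each role (proposer and opposer); the function names are hypothetical.

```python
from itertools import permutations


def full_round_robin(agents):
    """All ordered (proposer, opposer) pairs of distinct agents."""
    return list(permutations(agents, 2))


def winner(reputations):
    """Agent with the highest reputation once all games are played."""
    return max(reputations, key=reputations.get)


# Toy usage with three agents: 3 * 2 = 6 ordered pairings.
schedule = full_round_robin(["alice", "bob", "carol"])
print(len(schedule))                                   # 6
print(winner({"alice": 120, "bob": 95, "carol": 85}))  # alice
```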
What is the purpose of SCG?
• The purpose of playing an SCG(X) competition is to assess the "skills" of the agents in:
– "approximating" optimization problems in domain X,
– "figuring out" the wall-clock-time-bounded approximability of niches in domain X,
– "figuring out" the hardest problems in a specific niche, and
– "being aware" of the niches in which their own solution algorithm works best.
• This multi-faceted evaluation makes SCG(X) superior to competitions based on benchmarks that only test the player's skills in approximating optimization problems. During SCG, players cross-test each other's skills.
How to use SCG
• Company A provides a problem domain description X and submits it to the SCG server. The game SCG(X) runs on the web (with human algorithm/software developers involved) and company A receives good, tested software and knowledge about problem domain X
From Benchmark-Driven to SCG-Driven Algorithm Development
• Hard to measure and detect what is fraud.
• Instead: Design a system that needs a much weaker “gentleman’s agreement” or none at all.
• The Static Benchmark Problem is ONE problem that SCG solves: Dynamic Benchmarks.
• Others: crowdsourcing management, a new software development process that engages software developers and that fosters ease of evolution (e.g., good separation of concerns, …)
Problems with Static Benchmarks
http://www.cs.kuleuven.be/~dtai/events/ASP-competition/index.shtml
Policy against special purpose solutions
The purpose of the competition is to be as informative as possible about strengths and weaknesses of … Submission of special purpose programs for solving certain benchmark problems falsifies the information that we get from the rankings and goes against the spirit of the competition. … the use of special purpose programs for certain benchmarks can rightfully be considered as scientific fraud.
We appeal to participants …
SCG-Driven Algorithm Development
• Differences to Benchmark-Driven
– You don’t rank chess players by giving them a benchmark; you let them play
– We turn the algorithms into egoistic virtual scientists that fend for themselves
– social welfare: constructive knowledge based on good algorithms
What is SCG(X)
[Figure: degree of automation used by a scholar, on a scale from 0 (no automation, human plays) through some automation (human plays) to 1 (full automation, agent plays), with "our focus" marked on the scale. Participants shown: Alice and agent Bob. Annotations: more applications: test constructive knowledge; transfer to reliable, efficient software.]
Scholars and Agents: Same rules
• Are encouraged to
1. offer results that are not easily improved.
2. offer results that they can successfully support.
3. strengthen results, if possible.
4. stay active and publish new results or oppose current results.
5. become famous!
More Applications
• Special issue editors for problem domain X: publish the top 15 submissions
• Professor teaching a software development class: students develop fighting agents for a full round-robin tournament
• Teaching constructive topics
• etc.
Soundness Theorem
• SCG is sound: The agent with the best algorithms / knowledge wins (there is no way to cheat)
– best: within the group of participating agents
– issues:
  • Does an agent win because she is good at solving? Or good at proposing, opposing and providing? Answer: proposing, opposing and providing all reduce to solving.
Justifying benefits (1)
• Benefit: competitive – collaborative
• Game component: hypotheses propose-oppose : problems provide-solve
• How this game component brings the benefit
– hypothesis by Alice: for all problems F in niche N there exists a solution J: p(F,J)
– Bob opposes: F’ to Alice, Alice cannot find J’: p(F’,J’), therefore she loses reputation.
– Alice lost but she now knows F’ where she cannot achieve what she claimed. F’ was harder than what Alice expected.
Justifying benefits (2)
• Benefit: competitive – collaborative
• Game component: hypotheses propose-oppose : problems provide-solve
• How this game component brings the benefit
– hypothesis HA by Alice: for all problems F in niche N there exists a solution J: p(F,J)
– Bob opposes by non-trivially strengthening HA to HB: HB => HA. Alice cannot discount HB. Therefore she loses reputation.
– Alice lost but she now knows that her hypothesis HA might not be the strongest.
Benefits of SCG-driven
• Focus on understanding problem domain.
– What are the niches where specialized algorithms perform well?
– What are the hard problems in a niche?
• Knowledge maintenance system
• Control of niches to be explored
Reputation Gain: Challenging (C)
[Figure: gain for A (A supporting), loss for A (B discounting).]
How to use SCG(X)
• ABB needs new ideas about how to solve optimization problems in domain X.
• Define hypothesis language for X
– X-problems
– hypotheses, includes protocol
• Submit hypothesis language definition to SCG server.
How to use SCG(X)
• Offer prize money for the winner with conditions, e.g., performance must be at least 10% higher than the performance of agent XY that ABB provides.
• 10 teams from 6 countries sign up, committing to 6 competitions. Player executables become known to other players after each competition. One team from ABB.
• The SCG server sends them the basic agent and the administrator for testing.
How to use SCG(X)
• Game histories known to all. Data mining!
• First competition is at 23.59 on day 1. Registration starts at 18.00 on same day. The competition lasts 2.5 hours.
• Repeat on days 7, 14, … 42.
• The final winner is: Team Mumbai, winning 10000 Euro. Delivers source code and design document describing winning algorithm to ABB.
Benefits for ABB of using SCG(X)
• Teams perform know-how retrieval and integration and maybe some research.
– Participating teams try to find the best knowledge in the area.
– Hypothesis language gives control!
• The non-discounted hypotheses give hints about new X-specific knowledge.
• A well-tested solver for X-problems that integrates the current algorithmic knowledge in field X.
GIGO: Garbage in / Garbage out
• If all agents are weak, no useful solver is created.
• WEAK against STRONG:
– STRONG refutes a claim that is true but WEAK cannot support it. Correct knowledge might be discounted.
– STRONG strengthens a hypothesis so much that it becomes discountable, but WEAK cannot discount it. Incorrect knowledge might be supported.
– The game rules discourage STRONG from exploiting WEAK.
Experience
• Used for 3 years in undergraduate Software Development course. Prerequisites: 2 semesters of Introductory Programming, Object-Oriented Design, Discrete Structures, Theory of Computation.
– Collect and integrate knowledge from prerequisite courses, lectures, and literature.
– Teach it to the agent.
• 30% of grade is allocated for agent performance in weekly competitions.
Mechanics of using current implementation
• We define X = MAX-CSP.
• We produce administrator and baby agent for X at beginning of course.
• Game flow:
– Agents register with administrator
– After deadline, administrator tells agents when it is their turn (1 minute), sending them all currently proposed hypotheses
– After 1 minute, agent sends back transactions.
Mechanics of using current implementation
• 3 competitions per week. Last about 12 hours each. 75% of competitions count towards grade. 1 competition: attack the administrator.
Experience MAX-CSP
• MAX-CSP Problem Decompositions
• T-Ball (one relation), Softball (several relations, one implication tree), Baseball (several relations).
• ALL, SECRET
Stages for SECRET T-Ball
• MAXCUT
– R(x,y) = (x != y)
– fair coin: ½
– maximally biased coin: ½
– semi-definite programming / eigenvalue minimization: 0.878
Stages for SECRET T-Ball
• One-in-three
– R(x,y,z) = (x+y+z=1)
– fair coin: 0.375
– optimally biased coin: 0.444
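The two coin values can be reproduced with a short calculation. The sketch assumes each constraint uses three distinct variables and each variable is set to 1 independently with probability p, so a constraint is satisfied with probability 3*p*(1-p)^2. The function name and the grid search are illustrative choices, not part of the course material.

```python
def expected_fraction_one_in_three(p):
    """Expected fraction of satisfied 1in3 constraints for bias p."""
    # Exactly one of three independent variables is 1: 3 ways, each p*(1-p)^2.
    return 3 * p * (1 - p) ** 2


fair = expected_fraction_one_in_three(0.5)
print(fair)  # 0.375

# Coarse grid search for the optimally biased coin on [0, 1].
grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=expected_fraction_one_in_three)
print(best_p, round(expected_fraction_one_in_three(best_p), 3))  # 0.333 0.444
```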
Stages for ALL Baseball
• Propose/Oppose/Provide/Solve
– based on fair coin
– optimally biased coin
  • correctly optimize polynomials
– correctly eliminate noise relations
– correctly implement weights
– …
How to model a hypothesis
• A problem space.
• A discounting predicate on the problem space.
• A protocol to set the predicate through alternating “moves” (decisions) by Alice and Bob. If the predicate becomes true, Alice wins.
How to model a hypothesis
• Proposing and challenging a hypothesis is risky: your opponent has much freedom to choose its decisions within the game rules.
• Alternating quantifiers.
• Replace “exists” by agent algorithm kept by administrator.
Hypothesis [Example]
• 1in3 example.
X = Boolean MAXCSP
• Given a sequence of Boolean constraints formulated using a set R of Boolean relations, find an assignment that maximizes the fraction of satisfied constraints.
• Niche defined by R.
1in3 niche
• Only relation 1in3 is used.
• 1in3 problem F:
  variables: v1 v2 v3 v4 v5
  1in3(v1 v2 v3)
  1in3(v2 v4 v5)
  1in3(v1 v3 v4)
  1in3(v3 v4 v5)
  secret:    1 0 0 1 0
Truth Table 1in3
  000 0
  001 1
  010 1
  011 0
  100 1
  101 0
  110 0
  111 0
Secret quality SQ = 3/4
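The secret quality can be checked mechanically. The sketch below encodes problem F and the secret assignment from the slide; the function names and the dictionary representation are assumptions made for illustration.

```python
def one_in_three(a, b, c):
    """The 1in3 relation: exactly one of the three Boolean values is 1."""
    return a + b + c == 1


def quality(constraints, assignment):
    """Fraction of constraints satisfied by the assignment."""
    satisfied = sum(
        one_in_three(*(assignment[v] for v in constraint))
        for constraint in constraints
    )
    return satisfied / len(constraints)


# Problem F and the secret assignment from the slide.
F = [("v1", "v2", "v3"), ("v2", "v4", "v5"),
     ("v1", "v3", "v4"), ("v3", "v4", "v5")]
secret = {"v1": 1, "v2": 0, "v3": 0, "v4": 1, "v5": 0}
print(quality(F, secret))  # 0.75
```

Only the third constraint, 1in3(v1 v3 v4), fails under the secret assignment, because v1 and v4 are both 1.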
1in3 Hypothesis
• 1in3 hypothesis H proposed by Alice: exists F in 1in3 niche so that for all SBob that opponent Bob searches in time t (small constant) seconds: Quality(F,SBob) < 0.4 * Quality(F,SAlice).
• H = (niche = (1in3), AR = 0.4, confidence = 0.8)
• Bob has clever knowledge that Alice does not have. He opposes the hypothesis H by challenging it using his randomized algorithm.
Bob’s clever knowledge: 4/9 for 1in3
• 4/9 for 1in3: For all F in 1in3 niche, exists S so that Quality(F,S) >= 0.444 * SQ.
• Proof: la(p) = 3*p*(1-p)^2 has the maximum 4/9.
• argmax p in [0,1] la(p) = 1/3.
• Without search, in PTIME.
• Derandomize
• Bob successfully discounts
• Alice gets a hint
– Was Bob just lucky?
Truth Table 1in3
  000 0
  001 1
  010 1
  011 0
  100 1
  101 0
  110 0
  111 0
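The maximum of la(p) can be verified with a short calculation. The derivation below assumes, as the proof sketch suggests, that each variable is set to 1 independently with probability p and that each constraint uses three distinct variables, so la(p) is the probability that one 1in3 constraint is satisfied.

```latex
\begin{align*}
la(p)  &= 3p(1-p)^2 \\
la'(p) &= 3(1-p)^2 - 6p(1-p) = 3(1-p)(1-3p) \\
la'(p) &= 0 \iff p = \tfrac{1}{3} \text{ or } p = 1 \\
la(0)  &= la(1) = 0, \qquad
la\!\left(\tfrac{1}{3}\right) = 3 \cdot \tfrac{1}{3} \cdot \left(\tfrac{2}{3}\right)^2 = \tfrac{4}{9} \approx 0.444
\end{align*}
```

By linearity of expectation the expected fraction of satisfied constraints at p = 1/3 is 4/9, so some assignment reaches at least 4/9. Since SQ is at most 1, that assignment also satisfies Quality(F,S) >= 0.444 * SQ.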
1in3 Hypothesis
• Bob does not know whether 4/9 is best possible. Should check Semidefinite Programming.
• Bob only knows that the set of 1in3 problems having a solution satisfying 4/9 + eps, eps > 0, is NP-complete.
Related Work
• Renaissance mathematicians
• Various benchmark-based competitions
• What is new?
– Software that has an ego
– Holistic software with introspection
– Evaluating software through a game
– Scientific Community Game Software Development
Conclusions
• To address a problem domain X:
– “map it to second life”: define a scientific community game for X on the web: SCG(X)
– let the game SCG(X) run a few times and choose the winner
• Benefits
– Evaluates fairly, frequently, constructively and dynamically. Encourages retrieval of state-of-the-art know-how, integration and discovery.
– Challenges humans, drives innovation, both competitive and collaborative.
– Agents point humans to what needs attention in problem solution / software.
Conclusions
• SCG(X) provides a structured process for developing software for optimization problems.
• Benefits
– Social Engineering: makes it fun through game.
– Fair: Only hard work makes you win.
– Engage a large community on one domain X.
• Tools