Bulitko Thesis Ideas

Vadim Bulitko's Thesis Ideas (September 12, 2011)

(1) True Distance Heuristics vs Pattern Databases vs Subgoals

In this project, we will compare the efficiency of true distance heuristics (Goldberg and Harrelson 2005; Sturtevant, Felner, Barrer, Schaeffer, and Burch 2009; Goldenberg, Felner, Sturtevant, and Schaeffer 2010), pattern databases (Culberson and Schaeffer 1998; Holte, Newton, Felner, Meshulam, and Furcy 2004), and subgoals (Bulitko, Lustrek, Schaeffer, Bjornsson, and Sigmundarson 2008; Bulitko, Bjornsson, and Lawrence 2010; Lawrence and Bulitko 2010).

The primary question is this: given an algorithm (e.g., LRTA* (Korf 1990) or A* (Hart, Nilsson, and Raphael 1968)) and n bytes of memory, which of these three techniques should one use to achieve the best suboptimality while also minimizing the precomputation time?

(2) Minimum Amount of Work in Real Time Search

It is known that A* expands the smallest number of states needed to guarantee an optimal solution (with certain fine print). No such guarantees exist for real-time heuristic search. This project will first review existing work in the field (e.g., Sturtevant, Bulitko, and Bjornsson 2010) and then attempt to derive a lower bound on the amount of work necessary to find a solution in real-time search. Such a lower bound should be a function of the required solution quality, the maximum amount of work allowed between actions in a real-time algorithm, and possibly other attributes of the search space. The amount of work can be expressed as the number of states expanded and/or the amount of learning (i.e., the cumulative change in the heuristic).
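To make the two work measures concrete, here is a minimal sketch that counts both on a toy problem. It assumes a depth-1 LRTA* in the style of Korf (1990), in a common variant that only ever raises heuristic values. The graph, the all-zero initial heuristic, and the names `lrta_star`, `graph`, and `heuristic` are hypothetical and chosen only for illustration.

```python
def lrta_star(neighbors, h, start, goal, max_steps=10_000):
    """Run depth-1 LRTA* and return (path, states_expanded, learning_amount).

    neighbors: dict mapping a state to a list of (next_state, edge_cost)
    h: dict of heuristic estimates, updated in place as the agent learns
    """
    state, path = start, [start]
    expanded, learning = 0, 0.0
    for _ in range(max_steps):
        if state == goal:
            return path, expanded, learning
        expanded += 1  # one expansion per action taken
        # Look one step ahead: choose the neighbor minimizing cost + h.
        best_next, best_f = min(
            ((nxt, cost + h[nxt]) for nxt, cost in neighbors[state]),
            key=lambda pair: pair[1],
        )
        # Learning step: raise h(state) if the lookahead says it was too low.
        if best_f > h[state]:
            learning += best_f - h[state]
            h[state] = best_f
        state = best_next
        path.append(state)
    raise RuntimeError("step limit reached before the goal")


# Hypothetical toy graph: D is a dead end next to the start state A.
graph = {
    "A": [("D", 1), ("B", 1)],
    "D": [("A", 1)],
    "B": [("A", 1), ("G", 1)],
    "G": [("B", 1)],
}
heuristic = {s: 0 for s in graph}  # uninformed initial heuristic
path, expanded, learning = lrta_star(graph, heuristic, "A", "G")
print(path, expanded, learning)
```

In this sketch the agent first wanders into the dead end, and the learning total records how much the heuristic had to be corrected before the agent could leave it. A lower bound of the kind proposed above would constrain such totals for any real-time algorithm, not just this one.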

(3) Best Subgoal Selection

This project will review different subgoal selection techniques for real-time heuristic search (Bulitko et al. 2008, 2010; Lawrence and Bulitko 2010; Ulloa and Baier 2011). It will then implement them in a single testbed and compare them empirically, both against each other and against an optimal (brute-force) way of selecting subgoals.

(4) Visual Path Quality vs Solution Suboptimality

Suboptimal heuristic search algorithms are judged by the suboptimality of their solutions, defined as the ratio of the cost of the produced solution to the cost of an optimal (i.e., minimal) solution. Because this measure is easy to compute, it has been used as a proxy for a true measure of solution quality: something that the end user would care about.
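The ratio can be written down directly. The function below is a small sketch; the name `suboptimality` and the input checks are assumptions added for illustration.

```python
def suboptimality(solution_cost: float, optimal_cost: float) -> float:
    """Cost of the produced solution divided by the optimal cost.

    A value of 1.0 means the solution is optimal; 1.5 means it is 50% longer.
    """
    if optimal_cost <= 0:
        raise ValueError("optimal_cost must be positive")
    if solution_cost < optimal_cost:
        raise ValueError("a solution cannot cost less than the optimal one")
    return solution_cost / optimal_cost


# Hypothetical path costs on a game map.
print(suboptimality(15.0, 10.0))
```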

The project will empirically investigate the correlation between solution suboptimality and the user's perception of solution quality in the domain of video game pathfinding. Specifically, users will be shown animations of various paths on a video game map and asked to rate each on a discrete scale. User-supplied ratings will then be correlated with solution suboptimality. The experiments will be run with several algorithms (e.g., kNN LRTA* (Bulitko et al. 2010), HCDPS (Lawrence and Bulitko 2010), LSS LRTA* (Koenig and Sun 2009), TBA* (Bjornsson, Bulitko, and Sturtevant 2009), fLRTA* (Sturtevant and Bulitko 2011), etc.) and a number of subjects.

(5) UCT for Real Time Search
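One way the correlation step could be carried out is sketched below. Because ratings on a discrete scale are ordinal, the sketch assumes Spearman's rank correlation; the proposal itself does not name a statistic. The data and all function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: one entry per path shown to a subject.
path_suboptimality = [1.0, 1.2, 1.5, 2.0]
user_rating = [5, 4, 4, 1]  # 1 = worst, 5 = best
print(spearman(path_suboptimality, user_rating))
```

A strongly negative coefficient would support using suboptimality as a proxy for perceived quality, while a value near zero would suggest that users respond to something the ratio does not capture.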

UCT (Kocsis and Szepesvari 2006) has enjoyed great success in a number of problems, including applications to video games (e.g., Balla and Fern 2009; Fern and Lewis 2011; Laviers and Sukthankar 2011). It has not, however, been used in real-time heuristic search. This project will attempt to do so and compare the resulting algorithm to fLRTA* (Sturtevant and Bulitko 2011), LSS LRTA* (Koenig and Sun 2009), and HCDPS (Lawrence and Bulitko 2010).
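For orientation, the core of UCT is the UCB1 rule it uses to choose which action to sample next at each tree node. The sketch below shows only that selection rule, not a full tree search. It assumes rewards are to be maximized, so a pathfinding application would have to supply something like negated path costs. The function name, the statistics layout, and the default exploration constant are illustrative assumptions.

```python
import math


def ucb1_select(stats, exploration=math.sqrt(2)):
    """Pick an action by UCB1.

    stats: dict mapping an action to (visit_count, total_reward)
    """
    total_visits = sum(visits for visits, _ in stats.values())
    best_action, best_score = None, -math.inf
    for action, (visits, total_reward) in stats.items():
        if visits == 0:
            return action  # sample every action once before comparing
        mean_reward = total_reward / visits
        # Exploration bonus shrinks as an action is visited more often.
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        if mean_reward + bonus > best_score:
            best_action, best_score = action, mean_reward + bonus
    return best_action


# Hypothetical node statistics: "left" looks better on average,
# but "right" has been tried far less often.
node_stats = {"left": (10, 6.0), "right": (2, 1.0)}
print(ucb1_select(node_stats))
```

With the exploration constant set to zero the rule degenerates to greedy selection on the mean reward, which is one way to see what the bonus term contributes.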

(6) An Approximate MDP planner for Real Time Heuristic Search

A recent algorithm for planning in Markov Decision Processes abstracts the original problem by randomly selecting a subset of states and inducing a smaller, simpler MDP over them (Szepesvari 2001). Solving that MDP with value iteration yields approximate solutions to the original MDP. This project will investigate the effectiveness of this algorithm when applied to heuristic search problems and compare its performance to differential heuristics (Sturtevant et al. 2009).

(7) Subgoaling Algorithms on Dynamic Graphs

Most real-time heuristic search algorithms have assumed stationary search graphs. This assumption does not hold in many applications, including video game pathfinding, where the map often changes over the course of a game. This project will investigate the degradation of real-time search performance when the search graph gains and loses edges. Different models of graph dynamics will be proposed and tried with modern algorithms (e.g., HCDPS (Lawrence and Bulitko 2010), kNN LRTA* (Bulitko et al. 2010), LSS LRTA* (Koenig and Sun 2009), and fLRTA* (Sturtevant and Bulitko 2011)). If possible, ways to repair the databases of HCDPS, kNN LRTA*, and others in response to changes in the graph will be proposed.

(8) Ants with sticks and glasses

Heuristic search algorithms by definition use heuristics to guide their search. However, heuristics are usually inaccurate, and algorithms spend substantial effort and time working around these inaccuracies (Ishida 1997). A recent project proposed search via a population of agents ranging from heuristic-trusting to heuristic-averse (Jahns 2011). When run in parallel (e.g., in CUDA), such a population has a chance to find a solution more quickly. This project will investigate this line of work further.

(9) Player Modeling for Contingency Planning in Interactive Story Telling

Interactive storytelling, in the context of a video game or an intelligent tutoring system, affords the player/trainee a certain degree of freedom. As a result, the narrative intended by the author may be broken by the player. For every possible user-caused inconsistency, the Automated Story Director (Riedl, Stern, Dini, and Alderman 2008) precomputes a contingency narrative offline. Then, online, when an inconsistency occurs and breaks the current narrative, a new narrative can be selected and executed.

This project will investigate using a player model from PaSSAGE (Thue, Bulitko, Spetch, and Wasylishen 2007; Thue, Bulitko, Spetch, and Romanuik 2011) to select a more user-tailored contingency narrative within the Automated Story Director.

(10) Simulated Evolution for RTS units

This project will investigate the suitability of simulated evolution (Ackley and Littman 1991; Yong, Stanley, Miikkulainen, and Karpov 2006; Schrum and Miikkulainen 2008; Liapis, Yannakakis, and Togelius 2011) for RTS units. Specifically, using an existing RTS engine (e.g., StarCraft with BWAPI, or Stratagus), we will set up an evolutionary environment where the agents' genomes specify their visual appearance and combat characteristics. Agent interaction will be responsible for making agents balanced. To evolve several distinct races, evolution may proceed on several separate "continents", which will later be joined to ensure inter-racial balancing.

(11) The Traveling Sales Pigeon: Machine Learning a pigeon bot

Recent work used data derived from human movements to machine learn models of human navigation for various tasks (Hladky and Bulitko 2008; Cenkner, Bulitko, and Spetch 2011). The methods used are not limited to human-produced trajectories, though. An ongoing project at the Department of Psychology is investigating how pigeons solve a traveling salesman problem (Applegate, Bixby, Chvatal, and Cook 2006). Thus, this project will attempt to machine learn a computational model of a traveling "salespigeon" from the collected pigeon trajectories.

References

(1) Ackley, D.H., and Littman, M.L. (1991). Interactions between learning and evolution. In Langton, C., Taylor, C., Farmer, J.D., and Rasmussen, S. (Eds.), Artificial Life II, vol. 10, pp. 487-509. Santa Fe Institute Studies in the Sciences of Complexity. Addison-Wesley, Redwood City, CA.

(2) Applegate, D., Bixby, R., Chvatal, V., and Cook, W. (2006). The Traveling Salesman Problem. Princeton University Press.

(3) Balla, R.K., and Fern, A. (2009). UCT for tactical assault planning in real-time strategy games. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 40-45.

(4) Bjornsson, Y., Bulitko, V., and Sturtevant, N. (2009). Time-bounded A*. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 431-436, Pasadena, California. AAAI Press.

(5) Bulitko, V., Bjornsson, Y., and Lawrence, R. (2010). Case-based subgoaling in real-time heuristic search. Journal of Artificial Intelligence Research (JAIR), 39, 269-300.

(6) Bulitko, V., Lustrek, M., Schaeffer, J., Bjornsson, Y., and Sigmundarson, S. (2008). Dynamic control in real-time heuristic search. Journal of Artificial Intelligence Research (JAIR), 32, 419-452.

(7) Cenkner, A., Bulitko, V., and Spetch, M. (2011). A generative computational model for human hide and seek behaviour. In Proceedings of the Seventh Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), p. 6, Stanford, California. AAAI Press.

(8) Culberson, J., and Schaeffer, J. (1998). Pattern databases. Computational Intelligence, 14(3), 318-334.

(9) Fern, A., and Lewis, P. (2011). Ensemble Monte-Carlo planning: An empirical case study. In International Conference on Automated Planning and Scheduling (ICAPS-2011).

(10) Goldberg, A.V., and Harrelson, C. (2005). Computing the shortest path: A* search meets graph theory. In SODA, pp. 156-165.

(11) Goldenberg, M., Felner, A., Sturtevant, N., and Schaeffer, J. (2010). Portal-based true-distance heuristics for path finding.

(12) Hart, P., Nilsson, N., and Raphael, B. (1968). A formal basis for the heuristic determination of minimum cost paths.

(13) Hladky, S., and Bulitko, V. (2008). An evaluation of models for predicting opponent positions in first-person shooter video games. In Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG), pp. 39-46, Perth, Australia.

(14) Holte, R., Newton, J., Felner, A., Meshulam, R., and Furcy, D. (2004). Multiple pattern databases. In Proceedings of the International Conference on Automated Planning and Scheduling/Artificial Intelligence Planning Systems (ICAPS/AIPS), pp. 122-131.

(15) Ishida, T. (1997). Real-Time Search for Learning Autonomous Agents. Kluwer Academic Publishers.

(16) Jahns, S. (2011). GPU solutions to pathfinding problems.

(17) Kocsis, L., and Szepesvari, C. (2006). Bandit based Monte-Carlo planning. In Furnkranz, J., Scheffer, T., and Spiliopoulou, M. (Eds.), Machine Learning: ECML 2006, vol. 4212 of Lecture Notes in Computer Science, pp. 282-293. Springer, Berlin/Heidelberg.

(18) Koenig, S., and Sun, X. (2009). Comparing real-time and incremental heuristic search for real-time situated agents. Autonomous Agents and Multi-Agent Systems, 18(3), 313-341.

(19) Korf, R. (1990). Real-time heuristic search. Artificial Intelligence, 42(2-3), 189-211.

(20) Laviers, K., and Sukthankar, G. (2011). A real-time opponent model for rush football. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).

(21) Lawrence, R., and Bulitko, V. (2010). Taking learning out of real-time heuristic search for video game pathfinding. In Proceedings of the Twenty-Third Australasian Joint Conference on Artificial Intelligence, pp. 405-414, Adelaide, Australia.

(22) Liapis, A., Yannakakis, G.N., and Togelius, J. (2011). Optimizing visual properties of game content through neuroevolution. In Proceedings of the Artificial Intelligence and Interactive Digital Entertainment (AIIDE) conference.

(23) Riedl, M.O., Stern, A., Dini, D., and Alderman, J. (2008). Dynamic experience management in virtual worlds for entertainment, education, and training. International Transactions on Systems Science and Applications, Special Issue on Agent Based Systems for Human Learning, 4(2), 23-42.

(25) Schrum, J., and Miikkulainen, R. (2008). Constructing complex NPC behaviour via multi-objective neuroevolution. In Proceedings of the Artificial Intelligence and Interactive Digital Entertainment (AIIDE) conference.

(26) Sturtevant, N., and Bulitko, V. (2011). Learning where you are going and from whence you came: h- and g-cost learning in real-time heuristic search. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI).

(27) Sturtevant, N., Bulitko, V., and Bjornsson, Y. (2010). On learning in agent-centred search. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 330-340, Toronto, Canada.

(28) Sturtevant, N.R., Felner, A., Barrer, M., Schaeffer, J., and Burch, N. (2009). Memory-based heuristics for explicit state spaces. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 609-614.

(29) Szepesvari, C. (2001). Efficient approximate planning in continuous space Markovian decision problems. AI Communications, 13(3), 163-176.

(30) Thue, D., Bulitko, V., Spetch, M., and Romanuik, T. (2011). A computational model of perceived agency in video games. In Proceedings of the 7th Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), p. 6. AAAI Press.

(31) Thue, D., Bulitko, V., Spetch, M., and Wasylishen, E. (2007). Interactive storytelling: A player modelling approach. In Proceedings of the 3rd Artificial Intelligence and Interactive Digital Entertainment (AIIDE) conference, pp. 43-48, Palo Alto, California. AAAI Press.

(32) Ulloa, C.H., and Baier, J. (2011). Real-time adaptive A* with depression avoidance. In Proceedings of the Symposium on Combinatorial Search (SoCS).

(33) Yong, C.H., Stanley, K.O., Miikkulainen, R., and Karpov, I.V. (2006). Incorporating advice into neuroevolution of adaptive agents. In Proceedings of the 2nd Artificial Intelligence and Interactive Digital Entertainment (AIIDE) conference.