TIN2010-20900-C04-04 UPM Group
Team Objectives Tasks Results Collab
Concha Bielza
Computational Intelligence Group
Departamento de Inteligencia Artificial
Universidad Politécnica de Madrid
http://cig.fi.upm.es
Granada, May 12, 2011
C. Bielza UPM-Madrid
Outline
1 Team
2 Objectives
3 Tasks and commitments
4 Results
5 Collaborations within the project
12 Members and 10 EDPs
2 Full Professors
  Concha Bielza
  Pedro Larranaga
2 foreign collaborators, Full Professors
  Tom Heskes (Nijmegen, The Netherlands)
  Qingfu Zhang (Essex, UK)
1 Supply Associate Professor: Juan A. Fernandez del Pozo
2 PostDoc Researchers
  Ruben Armananzas (Juan de la Cierva researcher)
  Roberto Santana (Cajal Blue Brain Project)
5 PhD Students
  Hanen Borchani (FPI last TIN)
  Alfonso Ibanez (Consolider)
  Hossein Karshenas (Consolider)
  Pedro L. Lopez-Cruz (FPU)
  Diego Vidaurre (Cajal Blue Brain Project)
Objectives
1. Joint probability distribution function learning
  Definition of new scores and structural algorithms for learning PGMs. Ideas based on self-similarity, regularization, multicriteria, interaction, and with complex data (noisy, missing, high-dimensional)
  Learning the parameters (densities) in models with continuous variables
2. Supervised classification
  BN classifiers in problems with an imbalanced class
  Advances in the design of well-known BN classifiers (TAN, KDB, AODE, HAODE, WAODE, FBC, multinets...)
  Development of new methods for multi-dimensional classification
  Extensions to massive data sets and data streams
  Development of new methods to convert a classification problem into regression models
  Extension of PGMs to hybrid domains (discrete and continuous variables) for their application to classification and regression
  Extension to credal classifiers (using imprecise probabilities)
  Algorithms for learning utility-based classifiers
3. Inference
  Approximate algorithms for MTE hybrid networks, credal networks, probabilistic decision graphs, precise and imprecise influence diagrams, and for BNs using fast factorization and recursive trees
  Algorithms based on query importance sampling for hybrid Bayesian networks
4. Applications
  Technological applications: evolutionary computation, mobile robotics, requirements tracing and classification
  Life sciences: biomedicine, agriculture, environment, genomics
  Social domains: bibliometry, prediction of arrival times of city buses, detection of credit card fraud
Tasks as PI
JPD LEARNING
1.1 Scores for PGM learning
1.1.1 New scores, multi-objective and Lp-regularization-based
  → Learning BNs based on Lp-regularized scores, both in the space of DAGs and of equivalence classes (already for GBNs [Vidaurre et al., 2010])
  → Learn structures by using multiobjective scores
1.2 New structure learning algorithms
1.2.1 Algorithms based on self-similarity
  → Define a BN that admits the self-similarity property (and in 3D)
  → New learning algorithms from data (score+search) and even new simulation methods
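To give a feel for what regularization does in structure learning for Gaussian BNs (task 1.1.1), here is a minimal sketch. It is not the method of [Vidaurre et al., 2010]: it assumes a known variable ordering and simply regresses each variable on its predecessors with an L1 penalty, keeping nonzero coefficients as parents. The function names (`lasso_cd`, `sparse_parents`), the penalty value and the synthetic data are all hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Partial residual that leaves out feature j's current contribution.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r / n
            # Soft-thresholding update: small correlations are set exactly to 0.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

def sparse_parents(data, order, lam):
    """Assumed setting: `order` is a valid topological ordering.
    Each variable is regressed on its predecessors; nonzero weights become arcs."""
    data = data - data.mean(axis=0)
    parents = {}
    for k, v in enumerate(order):
        preds = order[:k]
        if not preds:
            parents[v] = []
            continue
        w = lasso_cd(data[:, preds], data[:, v], lam)
        parents[v] = [u for u, c in zip(preds, w) if abs(c) > 1e-8]
    return parents

# Synthetic example (illustrative only): x2 depends on x0, x1 is independent.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = 2.0 * x0 + 0.1 * rng.normal(size=n)
data = np.column_stack([x0, x1, x2])
structure = sparse_parents(data, order=[0, 1, 2], lam=0.1)
print(structure)
```

A larger `lam` removes more arcs; searching over orderings or equivalence classes, as the task describes, is outside this sketch.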
Tasks as PI: SUPERVISED CLASSIFICATION
2.3 Multi-dimensional classification with PGMs
  → Learn general multi-dimensional Bayesian network classifiers from data
  → New type of decomposable models (max connected components) to alleviate the computational burden of MPE computation
  → Adapt the ideas of random forests to this context using tree-tree or polytree-polytree
  → Develop a stratified CV scheme in this context
  → Missing data
  → MPE provided by the consensus of partial MPEs of small components
  → Extension to logistic regression and to regression with multiple outputs
2.4 Extensions to large data sets
  → Massive databases: transform the classification into a regression and analyze the data set as (smaller) blocks
  → Data streams + some unlabeled observations: adapt the EM algorithm
  → Data streams in multidimensional classification problems
2.5 Relationship between classification and regression
  → Convert CPTs of a BN into logistic models (parametric and parsimonious), beyond BN classifiers with perfect independence graphs
  → Locally weighted regression to solve highly nonlinear and sparse problems
  → Use regularization to help in FSS and then use the regression to solve classification problems
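The reason decomposable models ease MPE computation (task 2.3) can be sketched as follows. The sketch assumes that, for a fixed feature vector, the posterior over the class vector factorizes into tables over disjoint groups of class variables (the connected components); the tables and helper names below are hypothetical. Under that assumption, maximizing each table separately gives the same answer as an exhaustive search over the whole class vector.

```python
from itertools import product
import numpy as np

def mpe_bruteforce(factors, cards):
    """Exhaustive search over every joint assignment of the class variables."""
    best, best_score = None, -1.0
    for assignment in product(*(range(c) for c in cards)):
        score = 1.0
        for vars_, table in factors:
            score *= table[tuple(assignment[v] for v in vars_)]
        if score > best_score:
            best, best_score = assignment, score
    return best

def mpe_by_components(factors, cards):
    """Assumes each factor covers a disjoint component, so each one
    can be maximized on its own and the results concatenated."""
    result = [None] * len(cards)
    for vars_, table in factors:
        idx = np.unravel_index(np.argmax(table), table.shape)
        for v, val in zip(vars_, idx):
            result[v] = int(val)
    return tuple(result)

# Four binary class variables in two components: {C0, C1} and {C2, C3}.
cards = [2, 2, 2, 2]
factors = [
    ((0, 1), np.array([[0.10, 0.40], [0.30, 0.20]])),
    ((2, 3), np.array([[0.25, 0.05], [0.10, 0.60]])),
]
mpe_fast = mpe_by_components(factors, cards)
mpe_full = mpe_bruteforce(factors, cards)
print(mpe_fast, mpe_full)
```

The exhaustive search visits every combination of class values, whereas the per-component version only scans each table once, which is where the saving comes from when components are small.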
Tasks as PI: APPLICATIONS
4.1 Technological applications
4.1.1 Evolutionary computation
  → Parameter control in evolutionary computation, with parameter values changing during the run
  → Regularization in EDAs, with many generations but with learning and simulation steps based on reduced populations
4.2 Life Sciences
4.2.1 Applications to Biomedicine
  → Predict how HIV mutations influence resistance of many HIV drugs (Hospital Carlos III, INAOE) [Task 2.3]
  → Neuroinformatics: Modelling and simulation of dendritic morphology [Task 1.2.1]
  → Neuroinformatics: Discrimination between Alzheimer’s disease patients and controls based on microarray data
4.3 Social domains
4.3.1 Applications to bibliometry
  → Relationships, redundancies and properties of different indices that measure the scientific productivity of a researcher
  → Predict how these indices evolve in time
  → Real data from Spanish researchers in CCIA, LSI and ATC areas collected to have a picture of the Spanish productivity
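For readers unfamiliar with the "learning" and "simulation" steps of an EDA mentioned under 4.1.1, the loop can be sketched with the simplest member of the family, a univariate EDA (UMDA) on the OneMax toy problem. This is an illustrative assumption, not one of the regularized EDAs proposed in the project; the function name, population sizes and clipping bounds are hypothetical.

```python
import numpy as np

def umda_onemax(n_bits=20, pop_size=100, n_select=30, n_gen=30, seed=0):
    """Univariate EDA maximizing the number of ones in a bit string."""
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)          # initial model: independent fair bits
    best, best_fit = None, -1
    for _ in range(n_gen):
        # Simulation step: sample a population from the current model.
        pop = (rng.random((pop_size, n_bits)) < p).astype(int)
        fit = pop.sum(axis=1)
        i = int(np.argmax(fit))
        if fit[i] > best_fit:
            best, best_fit = pop[i].copy(), int(fit[i])
        # Truncation selection: keep the n_select fittest individuals.
        elite = pop[np.argsort(fit)[-n_select:]]
        # Learning step: re-estimate the marginals from the selected set,
        # clipped so that no bit becomes fixed too early.
        p = np.clip(elite.mean(axis=0), 0.05, 0.95)
    return best, best_fit

best, best_fit = umda_onemax()
print(best_fit, best)
```

Shrinking `pop_size` makes the learning step cheaper but the estimated model noisier, which is the situation where regularized model learning is meant to help.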
Commitments
Publish, per group and year, ≥ 4 JCR papers + 5 papers in conference proceedings
Participation in conferences, with communications, tutorials and as organizers
Apply for patents (e.g. dendritic morphology and classification problems for Alzheimer’s disease and HIV drug resistance)
Publications
JCR (with acknowledgments to this project)
Bielza, C., Li, G., Larranaga, P. (2011) Multi-dimensional classification with Bayesian networks, International Journal of Approximate Reasoning 52, 705-727
García-Torres, M., Armananzas, R., Bielza, C., Larranaga, P. (2011) Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data, Information Sciences
Armananzas, R., Saeys, Y., Inza, I., García-Torres, M., Bielza, C., van de Peer, Y., Larranaga, P. (2011) Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms, IEEE/ACM Transactions on Computational Biology and Bioinformatics 8, 760-774
Lopez-Cruz, P., Bielza, C., Larranaga, P., Benavides-Piccione, R., DeFelipe, J. (2011) Models and simulation of 3D neuronal dendritic trees using Bayesian networks, Neuroinformatics
Ibanez, A., Bielza, C., Larranaga, P. (2011) Using Bayesian networks to discover relationships between bibliometric indices. A case study of computer science and artificial intelligence journals, Scientometrics
Conferences (with acknowledgments to this project)
Karshenas, H., Santana, R., Bielza, C., Larranaga, P. (2011) Multi-objective optimization with joint probabilistic modeling of objectives and variables, Evolutionary Multi-Criterion Optimization (EMO-2011), Lecture Notes in Computer Science, 6576, 298-312, Springer
Zaragoza, J., Sucar, E., Morales, E., Bielza, C., Larranaga, P. (2011) Bayesian chain classifiers for multidimensional classification, IJCAI-2011
Santana, R., Bielza, C., Larranaga, P. (2011) Affinity propagation enhanced by estimation of distribution algorithms, Proceedings of the 2011 Genetic and Evolutionary Computation Conference (GECCO-2011)
Santana, R., Karshenas, H., Bielza, C., Larranaga, P. (2011) Quantitative genetics in multi-objective optimization algorithms: From useful insights to effective methods, Proceedings of the 2011 Genetic and Evolutionary Computation Conference (GECCO-2011)
Santana, R., Karshenas, H., Bielza, C., Larranaga, P. (2011) Regularized k-order Markov Models in EDAs, Proceedings of the 2011 Genetic and Evolutionary Computation Conference (GECCO-2011)
Borchani, H., Bielza, C., Larranaga, P. (2011) Learning multi-dimensional Bayesian network classifiers using Markov blankets: A case study in the prediction of HIV protease inhibitors, Probabilistic Problem Solving in BioMedicine (ProBioMed’11) at AIME
Other activities
Tutorial at CAEPIA-2011 (Nov’11): “Aprendizaje Automatico y Optimizacion en Neurociencia” (Machine Learning and Optimization in Neuroscience) (Bielza, Larranaga)
2nd position at the MEG mind reading challenge within the International Conference on Artificial Neural Networks (Santana, Bielza, Larranaga)
Near future
Accepted stays for European thesis
D. Vidaurre at Nijmegen with T. Heskes (2012)
H. Borchani at Utrecht with L. van der Gaag (2012)
Proposals: Projects and contracts
European Project FET Flagship Initiative Preparatory Actions [granted: FP7-ICT-2011-FET-F-284941] during 2011
European project (Subprogramme ERASMUS within the Lifelong Learning Programme) for 1 year: “Towards a rational policy decision making on Erasmus mobility based on intelligent data analysis”
National Network Atica on Applied Computational Intelligence
National project (CDTI) with Incita on multimodal biometry
National project (Avanza-Innpacto) with Apara and Fundacion CIEN on Parkinson’s disease
Stays at UPM
Barbara Pieters (Utrecht U) Nov’10
Kangil Kim (National U of Seoul) Feb-Aug’11
Ferran Reverter (UB) Oct-Nov’11
Collaborations
Maestu-Nevado at CTB (UPM) in neuroscience
Hospital de la Santa Creu i Sant Pau (Barcelona) in neuroscience
Instituto Cajal and Columbia U. in neuroscience
P. Rudomin, Ulises Cortes, Erika Rodríguez (Mexico-UPC) in neuroscience
G. Ascoli (George Mason U) in neuroscience
CIEMAT in bioinformatics
Hospital Carlos III in HIV
4 groups collaborating
Specializations
Granada: imprecise probabilities and decision making, approx. inference algorithms
Almería: continuous variables, approx. inference algorithms
Albacete: abductive inference, machine learning algorithms and metaheuristics
Madrid: machine learning algorithms, decision making and evolutionary computation
Expected collaborations
Granada-Madrid: defining scores for structural learning, learning with imprecise probabilities and modelling with influence diagrams for decision making
Almería-Albacete: learning hybrid networks
Granada-Almería: decision making
Albacete-Madrid: multi-dimensional classification where the class vector assignment is performed by means of abductive inference
Joint expected activities
Mobility and exchange of researchers
Shared supervision of PhD theses and research works (D.E.A.)
4 workshops, one in each city (work carried out, next plans, difficulties found): months 5, 14, 23 and 31 approx.
The 4 project supervisors will hold intermediate meetings, at least one per semester
A server to make the papers, software and documentation generated accessible to all the participants
Inclusion of procedures in some open software tool (e.g. Elvira, WEKA, Mateda, ProGraMo) or making some specific routines available to the scientific community
Application results extended to conferences and journals not specific to PGMs; use of results/software by EPOs. Ours are: Atos, Instituto Cajal, Panda Security
Keywords per member
T. Heskes → regularization and neurocomputing
Q. Zhang → evolutionary computation
J.A. Fernandez del Pozo → decision analysis, evolutionary computation
R. Armananzas → evolutionary computation, classification, bioinformatics
R. Santana → evolutionary computation, neuroscience
H. Borchani → multi-dim classification, semi-supervised, data streams, missing data
A. Ibanez → classification, bibliometry
H. Karshenas → evolutionary computation
P.L. Lopez-Cruz → new Bayesian classifiers, neuroscience
D. Vidaurre → regularization, continuous variables, neuroscience
Proposals to collaborate
With Granada
Advances in known Bayesian network classifiers: PGMs for hybrid domains in classification (MOP)
Learning IDs from data
With Albacete
Abductive inference for multi-dimensional classification (constraints on the values of the class vector)
With Almería
Learning MOPs from data