A research of text mining applied to Big Data: situation analysis and forecasting of “historia del arte” descriptor

Presentation by Prof. Dr. José Pino Díaz

description

"Les Journées d'Intelligence Économique - BIG DATA MINING, 22 and 23 May 2014 at the Hôtel Mövenpick, Tangier, Morocco. The Journées d'Intelligence Économique (JIE) will reinforce the work of the international VSST colloquium, which is organized to present innovative research and industrial development work in the field of strategic scientific and technological watch systems (Veille Stratégique Scientifique et Technologique). After two congresses in Toulouse in 1995 and 1998, the colloquium was held in Barcelona in 2001, Toulouse in 2004, Marrakech in 2007, Toulouse in 2010 and Nancy in 2013. Because a three-year interval seemed too long to some, it was decided to insert a two-day seminar between two colloquia. The first seminar took place at the Université de Lille 1 in January 2006, alongside the EGC colloquium; the second was held in March 2009 at INIST-CNRS in Nancy; the third in Ajaccio in 2012, each time with more than 120 participants. A first thematic day, "the contributions of economic intelligence to corporate governance", was held at ENSIAS in Rabat on 3 March 2010 with more than 250 participants. The theme of this second edition of the days is "Big Data Mining". Papers will be submitted to the Scientific Committee for selection, and the best papers will be published in Intelligences Journal (IsJ)."

Transcript

Page 1: A research of text mining applied to Big Data: situation analysis and forecasting of “historia del arte” descriptor

Presentation by Prof. Dr. José Pino Díaz

Page 2:

Dynamic analysis of a term and its knowledge network: the case of the “historia del arte” descriptor

by

J. Pino-Díaz, A. Cruces-Rodríguez, R. Bailón-Moreno & N. Rodríguez-Ortega.

J. Pino-Díaz, Universidad de Málaga, Andalucía Tech, Departamento de Historia del Arte, «Techné, Knowledge and product engineering» and «iArtHis Lab» research groups. Campus de Teatinos. 29071 – Málaga (Spain)

A. Cruces-Rodríguez, Universidad de Málaga, Andalucía Tech, Departamento de Historia del Arte, «Techné, Knowledge and product engineering» and «iArtHis Lab» research groups. Campus de Teatinos. 29071 – Málaga (Spain)

R. Bailón-Moreno, Universidad de Granada, Departamento de Ingeniería Química, «Techné, Knowledge and product engineering» research group. Campus de Fuentenueva. 18071 – Granada (Spain)

N. Rodríguez-Ortega, Universidad de Málaga, Andalucía Tech, Departamento de Historia del Arte, «iArtHis Lab» research group. Campus de Teatinos. 29071 – Málaga (Spain)

Page 3:

Is the Spanish term “historia del arte” in regression?

Page 4:

Is the Spanish search topic “historia del arte” in regression?

Page 5:

Production of documents with the “historia del arte” descriptor (title, abstract, keywords) in the ISOC DB, 1976-2012

[Figure: number of documents (vertical axis, 0 to 1000) by year (horizontal axis, 1970 to 2015), with the intervals 1986-1989 and 2009-2012 marked.]

[Inset 1974-1990: cubic trend line, coefficients as rounded on the slide: $y = -0.103x^{3} + 612.69x^{2} - 1\times10^{6}x + 8\times10^{8}$, $R^{2} = 0.9956$.]

[Inset 1990-2013: quartic trend line, coefficients as rounded on the slide: $y = -0.0081x^{4} + 64.458x^{3} - 193325x^{2} + 3\times10^{8}x - 1\times10^{11}$, $R^{2} = 0.9998$.]
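The trend lines on this slide are polynomial least-squares fits reported with an R² value. A minimal sketch of how such a fit can be reproduced is given here, assuming NumPy. The yearly counts are made up for illustration, because the ISOC DB series itself is not reproduced in this transcript, and `fit_trend` is a hypothetical helper name. The years are centered before fitting; the very large coefficients printed on the slide are consistent with a fit on raw calendar years.

```python
import numpy as np

def fit_trend(years, counts, degree):
    """Least-squares polynomial trend on centered years; returns (coeffs, r2)."""
    years = np.asarray(years, dtype=float)
    counts = np.asarray(counts, dtype=float)
    x = years - years.mean()  # centering keeps the fit well conditioned
    coeffs = np.polyfit(x, counts, degree)
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((counts - fitted) ** 2)          # residual sum of squares
    ss_tot = np.sum((counts - counts.mean()) ** 2)   # total sum of squares
    return coeffs, 1.0 - ss_res / ss_tot

# Hypothetical yearly document counts, NOT the ISOC DB data.
years = np.arange(1976, 1990)
counts = np.array([5, 8, 12, 15, 21, 30, 38, 47, 60, 72, 90, 104, 115, 121])
coeffs, r2 = fit_trend(years, counts, degree=3)
print(f"R^2 = {r2:.4f}")
```

A high R² on a high-degree polynomial describes the observed window well but says little about years outside it, so a fit like this is better read as a description than as a forecast.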

Page 6:

“Historia del arte” network in the ISOC DB. Co-word analysis. Dynamic analysis.

Knowledge System: Techné Co-word

Documentary corpus: 873 documents

Field of analysis: Keywords - Journal - Authors.

Parameters:
• Minimal occurrence, 3
• Minimal co-occurrence, 2
• Minimal number of nodes, 4
• Maximal number of nodes, 12

Periods:
1. 1976-1983
2. 1984-1989
3. 1990-1995
4. 1996-2001
5. 2002-2007
6. 2008-2013
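The two frequency thresholds above can be illustrated with a short sketch. It counts how often each descriptor occurs and how often each pair co-occurs in the same document, then keeps only the links that pass both thresholds. The link strength used here is the equivalence index common in the co-word literature, e = c_ab² / (c_a · c_b); that choice is an assumption, since the slides do not say how Techné Co-word weights links. The function name and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

MIN_OCCURRENCE = 3     # threshold taken from the slide
MIN_COOCCURRENCE = 2   # threshold taken from the slide

def coword_links(documents, min_occ=MIN_OCCURRENCE, min_cooc=MIN_COOCCURRENCE):
    """Return {(term_a, term_b): equivalence index} for links passing both thresholds."""
    occ = Counter()
    cooc = Counter()
    for terms in documents:
        unique = sorted(set(terms))            # count each term once per document
        occ.update(unique)
        cooc.update(combinations(unique, 2))   # sorted pairs, so (a, b) is canonical
    links = {}
    for (a, b), c_ab in cooc.items():
        if c_ab >= min_cooc and occ[a] >= min_occ and occ[b] >= min_occ:
            links[(a, b)] = c_ab ** 2 / (occ[a] * occ[b])  # assumed equivalence index
    return links

# Hypothetical keyword lists, one per document (not taken from the corpus).
docs = [
    ["historia del arte", "iconografía", "museos"],
    ["historia del arte", "iconografía"],
    ["historia del arte", "museos", "terminología"],
    ["iconografía", "museos", "historia del arte"],
]
links = coword_links(docs)
```

In this toy corpus "terminología" occurs only once, so it falls below the minimal occurrence and none of its links survive.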

Page 7:

Strategic diagram and subnetwork graphs (1976-1983)

Page 8:

Strategic diagram and subnetwork graphs (1984-1989)

Page 9:

Strategic diagram and subnetwork graphs (1990-1995)

Page 10:

Strategic diagram and subnetwork graphs (1996-2001)

Page 11:

Strategic diagram and subnetwork graphs (2002-2007)

Page 12:

Strategic diagram and subnetwork graphs (2008-2013)
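A strategic diagram positions each subnetwork by its centrality and its density. The slides do not give the formulas, so the sketch below assumes one common formulation: density is the mean strength of the links inside a subnetwork, and centrality is the sum of the strengths of its links to other subnetworks. Each subnetwork is then placed relative to the median of both values. The function names and quadrant labels are hypothetical.

```python
from statistics import median

def centrality_density(cluster, links):
    """cluster: set of terms; links: {(a, b): strength}. Returns (centrality, density)."""
    internal = [w for (a, b), w in links.items() if a in cluster and b in cluster]
    external = [w for (a, b), w in links.items() if (a in cluster) != (b in cluster)]
    density = sum(internal) / len(internal) if internal else 0.0  # assumed: mean internal strength
    centrality = sum(external)                                    # assumed: total external strength
    return centrality, density

def quadrants(clusters, links):
    """Classify each named cluster against the median centrality and median density."""
    scores = {name: centrality_density(terms, links) for name, terms in clusters.items()}
    c_med = median(c for c, _ in scores.values())
    d_med = median(d for _, d in scores.values())
    labels = {
        (True, True): "central and dense",
        (False, True): "dense but peripheral",
        (True, False): "central but loosely knit",
        (False, False): "peripheral and loosely knit",
    }
    return {name: labels[(c >= c_med, d >= d_med)] for name, (c, d) in scores.items()}

# Hypothetical link strengths between six placeholder terms.
example_links = {
    ("a", "b"): 0.9, ("c", "d"): 0.2, ("e", "f"): 0.5,
    ("a", "c"): 0.4, ("b", "e"): 0.3, ("d", "f"): 0.1,
}
example_clusters = {"A": {"a", "b"}, "B": {"c", "d"}, "C": {"e", "f"}}
placement = quadrants(example_clusters, example_links)
```

Splitting at the median is only one convention; a tool may instead use the mean or rank-based axes, which changes where borderline subnetworks fall.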

Page 13:

Frequencies of the “historia del arte” descriptor in the network of each period.

(*) Only descriptors of the network linked by two or more co-occurrences are considered (minimal co-occurrence, 2).

[Figure: Descriptor "art history", ISOC DB, co-word analysis. Relative frequencies* (%) by period.]

[Figure: Descriptor "art history", ISOC DB, co-word analysis. Absolute frequencies* by period.]

Period       Absolute frequencies*   Relative frequencies* (%)
1976-1983    52                      26
1984-1989    51                      22
1990-1995    48                      21
1996-2001    103                     15
2002-2007    225                     11
2008-2013    135                     13

Page 14:

Temporal evolution of subnetworks. Dynamic analysis of “historia del arte”.

[Figure: descriptors of the subnetworks along a timeline with the period boundaries 1976, 1983, 1989, 1995, 2001, 2007 and 2013.]

Page 15:

Descriptors of the subnetworks. New / old descriptors

[Figure legend: descriptors that remain from one period to another; new descriptors.]

Page 16:

Descriptors of the subnetworks

[Figure legend: descriptors that remain from one period to another; new descriptors of the period.]
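The distinction between descriptors that remain and new descriptors can be expressed with set operations. The sketch below also derives a renewal rate as the share of new descriptors in the current period; that definition is an assumption made for illustration, since the slides mention rates of renewal without defining them. The function name and sample sets are hypothetical.

```python
def descriptor_turnover(previous, current):
    """Split the current period's descriptors into persisting and new ones."""
    previous, current = set(previous), set(current)
    persisting = current & previous   # present in both periods
    new = current - previous          # first seen in the current period
    # Assumed definition: share of the current period's descriptors that are new.
    renewal_rate = len(new) / len(current) if current else 0.0
    return persisting, new, renewal_rate

# Hypothetical descriptor sets for two consecutive periods.
period_a = {"historia del arte", "iconografía", "museos"}
period_b = {"iconografía", "museos", "terminología", "teoría del arte"}
persisting, new, rate = descriptor_turnover(period_a, period_b)
```

Applied period by period, a falling rate would correspond to the loss of renewal that the conclusions describe.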

Page 17:

Descriptor analysis of the subnetworks

Subnetworks were obtained with the simple centers algorithm.
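The slides name the algorithm but do not describe its steps. As a rough illustration only, the sketch below implements a simplified greedy procedure in the same spirit: the strongest remaining link seeds a subnetwork, the subnetwork grows by the strongest link that attaches one new node, and it is kept only if its size lies between the minimal and maximal number of nodes given earlier (4 and 12). This is not the Techné Co-word implementation, and the published algorithm has additional passes that are omitted here. The function name is hypothetical.

```python
def simple_centers(links, min_nodes=4, max_nodes=12):
    """Greedy clustering sketch. links: {(a, b): strength}. Returns a list of sets."""
    remaining = dict(links)
    clusters = []
    while remaining:
        # The strongest remaining link seeds a new cluster.
        a, b = max(remaining, key=remaining.get)
        cluster = {a, b}
        while len(cluster) < max_nodes:
            # Candidate links join exactly one outside node to the cluster.
            frontier = {
                pair: w for pair, w in remaining.items()
                if (pair[0] in cluster) != (pair[1] in cluster)
            }
            if not frontier:
                break
            x, y = max(frontier, key=frontier.get)
            cluster.update((x, y))
        if len(cluster) >= min_nodes:
            clusters.append(cluster)
        # Drop every link touching the cluster's nodes, whether it was kept or not.
        remaining = {
            pair: w for pair, w in remaining.items()
            if pair[0] not in cluster and pair[1] not in cluster
        }
    return clusters

# Hypothetical link strengths between placeholder terms.
toy_links = {
    ("a", "b"): 0.9, ("b", "c"): 0.8, ("c", "d"): 0.7,
    ("d", "e"): 0.1, ("e", "f"): 0.6, ("f", "g"): 0.5,
}
small = simple_centers(toy_links, min_nodes=3, max_nodes=4)
```

With a maximum of four nodes, the weak link between "d" and "e" is never used and the toy network splits in two; with the default limits the same links form a single subnetwork.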

Page 18:

CENTRAL GROUP

SUBNETWORK “FONDOS BIBLIOGRÁFICOS / BIBLIOTECAS ESPECIALIZADAS”

SUBNETWORK “MEMORIA ECCLESIAE (J) / PATRIMONIO ECLESIÁSTICO”

(Lines with a value lower than 1660 removed)

Subnetworks map. Strategic analysis of “historia del arte”.

Page 19:

historia del arte

SUBNETWORK “RODRÍGUEZ ORTEGA, NURIA / TERMINOLOGÍA”

Centrality of the subnetworks

Decision making, strategy for the future: act so that outlying high-density subnetworks approach the center (support interdisciplinarity)

Central group of subnetworks

Page 20:

Separate subnetworks

Line width indicates density

Page 21:

Examples of strategic subnetworks

Page 22:

Conclusions (I)

In Google Ngram and Google Trends searches, a downward trend in the absolute frequency of “Historia del Arte” is observed for the period 2000-2010. The same downward trend in absolute frequency is observed in the ISOC database.

It has been found that, in the last two periods of analysis, the descriptor “Historia del Arte” is no longer the main connector in the subnetworks of greatest centrality; it is displaced by “Iconografía” (2002-2007) and by “Teoría del Arte” (2008-2013). This coincides with the shift to a downward trend in the rates of renewal and dynamism of the subnetworks.

The regression observed in the presence of the descriptor “Historia del Arte” in research results over the last decade is explained by the decrease in research production (due to adverse economic conditions) and by changes in the dynamics of the scientific field.

Dynamic analysis of the research related to the descriptor “Historia del Arte” shows that we are at present in a period of maturity and immobility. In this context, research topics in the visual arts (History of Cinema) are developing, whereas research topics from other fields (Musicology, Libraries, Documentation, Psychology of Art, Geometry or Gender Studies) are not. Interdisciplinarity is virtually absent.

Page 23:

Conclusions (II)

These periods of immobility are usually the result of internal depletion and precede periods of expansion (see the chart of document production, with the changes that occurred in 1986-1989 and 2009-2012); they are observable in the dynamic analysis of scientific disciplines.

Prospectively, we can infer that the future development of research in art history should promote interdisciplinarity and the adoption of the new tools and methods characteristic of the knowledge society and of information and communication technology; the future lies in the expansion of the new paradigm, Digital Art History.

Page 24:

Conclusions (III)

The “Historia del Arte” network map indicates the research areas that could develop well in the future. They are (excluding the teaching of art history):

- Conservation and Restoration of the Historic-Artistic Heritage
- Museums / Collections of Museums
- Special Libraries / Digital Libraries / Digitization
- Fashion / Apparel / Clothing / Dresses
- Patronage / Artistic Patronage
- Illustration / Jesuits
- Musicology / Study of the Organ (musical instrument) / Musical Research
- Graphical Representation / History of Science
- Internet / Web / New Technologies
- Sources of Information / Access / Resources
- Historic Public Infrastructure / Town Planning
- Terminology / Text Analysis / Text Mining / Big Data
- Geometric / Architectural Ornamentation
- Silversmiths / Silver Artisans
- Iconographic Analysis / Iconographic Influences
- Gender Studies / History of Women
- Art Psychology / Art Philosophy
- Image of Spain / Travel Books

Page 25:

Questions? Comments? Thank you

Prof. Dr. José Pino Díaz

Universidad de Málaga, Andalucía Tech, Departamento de Historia del Arte

Techné (UGR) and iArtHis-Lab (UMA) research groups, Campus de Teatinos s/n, 29071 Málaga, Spain