Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Secretary for...
Spanish Language Technologies Plan
1. Introduction
2. Plan for Language Technologies
3. LT Plan Road Map
4. Governance
5. Implementation
Introduction
OPPORTUNITIES
• High potential for internationalization of the Spanish language and cooperation with Latin America.
• New public services for citizens and enterprises in strategic sectors (health, justice, tourism, security, etc.).
• Strong market growth associated with innovation and development.
STRENGTHS
• Good governance of the Spanish language (RAE, ASALE, Cervantes Institute, BNE).
• High level of NLP research, with good coordination (SEPLN).
• Linguistic resources available in the Administration as a data source for industry and research (RISP open data policy).
WEAKNESSES
• SMEs lack sufficient industrial capacity to:
  • Compete in the international market.
  • Complete the value chain in Spain.
• Difficulties in knowledge transfer from the research sector to industry.
THREATS
• Loss of economic and industrial competitiveness of Spain and Latin America relative to other countries (United States).
• Digital underdevelopment of Spanish language technologies and digital extinction of the co-official languages.
• Brain drain of researchers and specialized professionals, damaging the Spanish research sector.
Conclusions:
• The language technologies sector is an emerging cross-cutting sector with the capacity to drive growth, competitiveness and quality jobs.
• Its development is unstoppable, but if we don't take advantage of the opportunity, others will take this place.
• Spain has the means, but Public Administration actions must be driven and coordinated in collaboration with the Autonomous Communities, the European Union and Latin America.
Schedule of the development of the Plan
0. Preliminary study
1. Kick-off: meeting of SE, SS and AGE CIO (13/05)
2. Setting up the Steering Committee and Technical Secretariat (SETSI) (1/06)
3. Setting up the Experts Committee (15/06)
4. Presentation of the Experts Report (28/07)
5. Proposal of the Plan from the Steering Committee (16/9)
6. Approval of the Plan by the Steering Committee (7/10)
Development of the Plan
Plan for the Promotion of the Natural Language Industry in Spanish
Roles: executive direction of the Plan, technical direction of the Plan, coordination.
Bodies: Steering Committee, Experts Committee, Technical Secretariat (SETSI), Steering Committee + Technical Secretariat, Plan Monitoring Committee.
Actions:
• Meeting of S.E., S.G. and DTIC of the AGE (13/May/2015)
• Appointment of the Experts Committee
• Request for report
• Presentation of the draft Experts Report (15/Jul/2015)
• Presentation of the Experts Report
• Drafting of the Plan
• Approval of the Plan by the Steering Committee
Other dates on the timeline: 1/Jun/2015, 15/Jun/2015, 15/Sep/2015, 30/Oct/2015.
Language Technologies Plan: action axes
1. Linguistic infrastructure development.
2. Boost of the language technologies industry
• Improving the visibility of the sector and knowledge transfer (from academia to industry).
• Support for internationalization and commercialization of the sector.
3. Public Administration as a driver of the Language Industry
• Platforms for natural language processing and automatic translation in the public administrations.
• Linguistic resources of the public administrations and reuse policy for public sector information (open linguistic data under RISP).
4. Flagship projects
• Health
• Justice
• Education
• Tourism, sectoral monitoring, digitised heritage, etc.
http://www.agendadigital.gob.es/planes-actuaciones/Paginas/plan-impulso-tecnologias-lenguaje.aspx
Axis 1: Linguistic infrastructure development
• Linguistic infrastructure = resources + processors + evaluation campaigns.
• These are the assets of the language industry.
• Draw up and implement a general-purpose linguistic infrastructure development plan for Spanish and the co-official languages.
Budget: 30 M€.
Axis 1: Linguistic infrastructure development
Purposes:
• Boost the NLP industry in Spanish and the co-official languages.
• Improve LT innovation in the public sector and industry.
Actions:
• Draw up and implement a linguistic infrastructure development plan. Infrastructure governance and sustainability.
• Technical standards for interoperability, license policies and personal data protection mechanisms.
• Common tools for resource generation and evaluation.
• Facilitate public access to the linguistic infrastructure.
Axis 2: Boost of the language technologies industry
• Action 1: Improving visibility and knowledge transfer between academia and the industrial sector.
• Action 2: Support for internationalization and commercialization of the sector.
Budget: 2 M€.
Axis 2.1: Sector visibility and transfer
Purposes:
• Improve knowledge transfer from the academic sector to industry.
• Increase the visibility of the language technologies sector.
Actions:
• Improve training (MOOCs, training sessions for teachers). Promote studies (university master's degrees, specialised courses).
• Research support (viability of a Network of Centres of Excellence, aid programmes for research excellence).
• Promotion and detection of talent (hackathons, university sessions and youth olympiads).
• Informative sessions: general (InfoDays) and domain-specific (e.g. language technologies applied to health).
• Register of enterprises and products. Create a network of experts.
Axis 2.2: Internationalization and commercialization
Purposes:
• Improve the internationalization of Spanish enterprises in this sector.
Actions:
• Participation in international congresses and events (LT-Summit, TAUS, LREC, META-FORUM, etc.).
• Spanish participation in European organizations and research infrastructures (CEF, CLARIN, ELRA, META-NET).
• National congresses and events (SEPLN, IODC, MWC).
• ICEX missions on language technologies.
• Latin America: Ibero-American Summit events (SEGIB, AECID); collaboration with ASALE and the BNE network; domain-specific actions.
• Others: commercial missions, MOUs, OFECOMES, Invest in Spain.
Axis 3: Administration as a driver of the Language Industry
• Action 1: Development of platforms for natural language processing and automatic translation in the public administrations.
Objectives:
- Promote advanced services for citizens.
- Improve the Administration's performance.
- CORA: reuse, simplify and achieve economies of scale.
- Improve accessibility for people with special needs.
Budget: 4 M€.
Design and creation of a common platform for natural language processing and automatic translation for the Public Administration:
• Develop a scalable infrastructure built around interoperable, multi-supplier components.
• Maintain the confidentiality guarantees of public services.
• Add different components and linguistic resources to the linguistic processing flow, under various license models and execution scenarios.
• Multiple instances. Advanced distribution model of interconnected scalable components: extensible, lightweight distribution, multi-cloud.
• Action 2: Linguistic resources of the public administrations and reuse policy for public sector information.
Objective:
- Within the framework of the RISP policy, open a new line on open data of linguistic interest, to take advantage of the huge potential of public sector information for the language industry (names of people and enterprises; place names; taxonomies; glossaries; multilingual vocabularies; translation memories; etc.).
Budget: 2 M€.
Flagship projects
Objectives:
• Create new public services, or improve the capacity and quality of existing ones, by applying language technologies.
• Facilitate the work of the Administrations in handling information internally and using it to define and monitor public policies.
• Demonstrate, in Spain and abroad, the capacities and benefits of language technologies.
• Generate reusable elements for other projects.
• Immediate implementation of the Plan's cross-cutting actions, using the general linguistic infrastructure and common platforms.
Budget: 49 M€.
Selection criteria for flagship projects:
• Commitment of the bodies involved. Ensure leadership by those who know the issue and have the competences to solve it.
• Precision. Respond to already identified problems, justifying the suitability of the project and why this is the right time to start it.
• High economic and social impact.
• Generation of reusable resources.
• Establish synergies with the other actions of the Plan.
• Particular attention to acquiring experience for future projects.
Main projects WP2016 (I)
Flagship projects:
• Health 1: Electronic Medical Record processing (EMR)
• Health 2: Drug data sheets processing (FTM)
• Health 3: Phenotyping and genomics
• Justice: legal information processing
• Tourism intelligence
• Sectoral monitoring for innovation
• Digitized and online heritage
• Advanced citizen service
• Education
Cross projects
• Linguistic infrastructure
• Natural language processing platform for public administrations
• Automatic translation platform for public administrations
Other actions
• Studies and strategies
• Internationalization
• Training
• Open data of linguistic interest
Road map of the Plan
Diagram labels:
• Axis 4: new verticals, Health, Tourism, Education
• Axis 3.1: NLP and automatic translation platform
• Axis 1: linguistic infrastructure (general resources, domain resources)
• Axis 3.2: open linguistic data
• Outcomes: citizen service, innovation, research
Governance of the Plan (organisation chart)
• Steering Committee
• Experts Committee
• General Technical Bureau (OTG)
• Projects, with coordinators and work packages (WP1, WP2… WP3, WP4…): Linguistic Infrastructures, NLP Platform, Automatic Translation Platform, Sectoral Monitoring, Tourism, Health 1, Health 2
• Staff: NLP/AT experts + executives + administrative staff
Governance

Steering Committee
Members:
• Presidency: SETSI
• State Secretaries and Sub-secretaries
• MAEC, MINHAP, MECD, MINETUR, MPRE, MINECO, MSSSI, MJUS. Future: Interior, Defence …
Functions:
• Strategic planning of the Plan.
• Periodic evaluation of the progress and impact of the Plan.
• Elect and remove members of the Experts Committee.
• Supervise proposals from the OTG.

Experts Committee
Members:
• Research sector.
• Industrial sector.
• Academic and institutional sector.
• Technical staff from the public sector.
Functions:
• Technical advice to the Steering Committee.
• Mechanism of interaction with the sector.
• Facilitate collaboration and exchange of experiences and best practices.
• Disseminate the actions of the Plan.

General Technical Bureau
Members:
• Technical staff from SETSI and supporting staff:
  • Legal/administrative profile.
  • Executive profile.
  • NLP and AT technical profile.
Functions:
• Administrative and technical management of the projects. Liaison with the Steering Committee (CD).
• Defining projects with the verticals.
• Setting interoperability standards, license models, etc.