Letter of Intent - DFG
DFG form nfdi10 - 05/20 page 1 of 15
Deutsche Forschungsgemeinschaft Kennedyallee 40 ∙ 53175 Bonn, Germany ∙ Postal address: 53170 Bonn, Germany Tel.: + 49 228 885-1 ∙ Fax: + 49 228 885-2777 ∙ [email protected] ∙ www.dfg.de
Letter of Intent
1 Binding letter of intent as advance notification or non-binding letter of intent
Binding letter of intent (required as advance notification for proposals in 2020)
☐ Non-binding letter of intent (anticipated submission in 2021)
2 Formal details
Planned name of the consortium
Business, Economic and Related Data @ NFDI
Acronym of the planned consortium
BERD@NFDI
Applicant institution
University of Mannheim, Schloss, 68161 Mannheim;
Head: Prof. Dr. Thomas Puhl
Spokesperson
Prof. Dr. Florian Stahl, [email protected], Chair of Quantitative Marketing
and Consumer Analytics and Co-Director of Mannheim Center for Data Science at the
University of Mannheim
Co-spokesperson
Prof. Dr. Hartmut Höhle, [email protected], Management Analytics
Center and Chair of Enterprise Systems at the University of Mannheim
Co-applicant institution
Ludwig-Maximilians-Universität München, Ludwigstr. 33, 80539 München;
Head: Prof. Dr. rer. pol. Bernd Huber
Co-spokespersons
Prof. Dr. Bernd Bischl, [email protected], Chair of Statistical Learning &
Data Science and Director of the Munich Center of Machine Learning (MCML);
Prof. Dr. Göran Kauermann, [email protected], Chair of
Statistics - in Economics, Business Administration and Social Sciences
Co-applicant institution
University of Cologne, Albert-Magnus-Platz, 50923 Köln;
Head: Prof. Dr. Dr. h.c. Axel Freimuth
Co-spokesperson
Prof. Dr. Marc Fischer, [email protected], Chair in Marketing Science and
Analytics at the University of Cologne
Co-applicant institution
Mannheim University Library, Schloss, 68161 Mannheim;
Head: Dr. Sabine Gehrlein
Co-spokesperson
Dr. Sabine Gehrlein, [email protected], Mannheim University Library
at the University of Mannheim
Co-applicant institution
Universität Hamburg, Mittelweg 177, 20148 Hamburg;
Head: Prof. Dr. Dr. h.c. Dieter Lenzen
Co-spokesperson
Prof. Dr. Mark Heitmann, [email protected], Chair of Marketing &
Customer Insight at the Universität Hamburg
Co-applicant institution
Institute for Employment Research (IAB), Regensburger Str. 100, 90478 Nürnberg;
Head: Prof. Bernd Fitzenberger, PhD
Co-spokesperson
Prof. Dr. Frauke Kreuter, [email protected], IAB Statistical Methods group
and Professorship for Statistics and Methodology at the University of Mannheim
Co-applicant institution
Leibniz Information Center for Economics (ZBW), Düsternbrooker Weg 120, 24105 Kiel;
Head: Prof. Dr. Klaus Tochtermann
Co-spokesperson
Prof. Dr. Klaus Tochtermann, [email protected], ZBW and Digital Information
Infrastructures Group at the Christian-Albrechts-Universität zu Kiel
Co-applicant institution
GESIS - Leibniz-Institut für Sozialwissenschaften, B2,1 68159 Mannheim;
Head: Prof. Dr. Christof Wolf
Co-spokespersons
Prof. Dr. Christof Wolf, [email protected], GESIS and Chair of Social Stratification at
the University of Mannheim;
Prof. Dr. Stefan Dietze, [email protected], GESIS and Group Data & Knowledge
Engineering at the Heinrich Heine Universität Düsseldorf
Participants
o AG Informationskompetenz des BVB (AGIK BAY), Dr. Fabian Franke
o GSWG – Gesellschaft für Sozial- und Wirtschaftsgeschichte, Prof. Dr. Mark Spoerer
o GUG – Gesellschaft für Unternehmensgeschichte, Dr. Andrea H. Schneider-Braunberger
o Institut für Bank- und Finanzgeschichte e.V., Hanna Floto-Degener
o Leibniz-Rechenzentrum (LRZ) der Bayerischen Akademie der Wissenschaften, Prof.
Dr. Dieter Kranzlmüller
o Leibniz Institute for Financial Research SAFE Sustainable Architecture for Finance
in Europe – Data Center House of Finance, Goethe University Frankfurt, Prof. Dr.
Jan Pieter Krahnen, Prof. Dr. Uwe Walz
o Leibniz Institute of Ecological Urban and Regional Development (IOER), Prof. Dr.
Marc Wolfram
o Netzwerk-Informationskompetenz Baden-Württemberg (NIK-BW), Dr. Marianne Dörr
o RatSWD – Rat für Sozial- und Wirtschaftsdaten, Prof. Dr. Monika Jungbauer-Gans
o Universitäts-IT der Universität Mannheim, Dr. Alexander Pfister, Kerstin Bein
o VHB – Verband der Hochschullehrer für Betriebswirtschaft e.V. German Academic
Association for Business Research, Prof. Dr. Hans Ulrich Buhl, Tina Osteneck
o Verein für Socialpolitik – Wirtschaftshistorischer Ausschuss, Prof. Dr. Ulrich Pfister
o Rayid Ghani, Carnegie Mellon University
o Prof. Julia Lane, Ph.D., New York University
o Dr. Georg Licht, ZEW – Leibniz-Zentrum für Europäische Wirtschaftsforschung
(ZEW)
o Dr. Katrin Moeller, Historisches Datenarchiv Sachsen-Anhalt
o Dana Müller, Institute for Employment Research of the Federal Employment Agency
(IAB)
o Prof. Dr. Isabella Peters, Leibniz Information Centre for Economics (ZBW)
o Prof. Dr. Mark Spoerer, University of Regensburg
o Prof. Dr. Jochen Streb, University of Mannheim
o Prof. Dr. Heiner Stuckenschmidt, University of Mannheim
o Dr. Peter Wittenburg, formerly Max Planck Computing and Data Facility; Member of
the GO FAIR Foundation Board
3 Objectives, work programme and research environment
Research area of the proposed consortium (according to the DFG classification system):
o 12 Social and Behavioral Sciences
• 112 Economics
− 112-02 Economic Policy and Public Finance
− 112-03 Business Administration
− 112-04 Statistics and Econometrics
− 112-05 Economic and Social History
• 111 Social Sciences
− 111-02 Empirical Social Research
− 111-03 Communication Sciences
− 111-04 Political Science
• 110 Psychology
− 110-03 Social Psychology, Industrial and Organizational Psychology
• 109 Educational Research
− 109-04 Educational Research on Socialization, Welfare and Organizations
Concise summary of the planned consortium’s main objectives and task areas
Social agents are the focal study object in the social sciences. These can be individuals, as in
psychology or sociology, or larger aggregates: organizations such as companies in
business administration and political parties in political science, or even entire economies, as in
macroeconomic research. A key characteristic of the new digital era is that social agents leave
traces of their life and behavior in the form of unstructured, non-standard data from new digital
sources such as social media, Google search, or geo-satellite services. By unstructured we
refer to data (“big data”) that are available in the form of text, image, voice, or video and are
generated by data sources not primarily constructed for analytical purposes. Existing research
data infrastructures are not prepared to handle this new form of data, which is very different from
traditional structured data in terms of size and required computing capacity. Both traditional
structured data and new unstructured data are necessary for scientific progress in the social
sciences. By definition, structured data are ready to be used by empirical researchers for theory
testing and prediction models. Unstructured data, however, need to be processed and
analyzed to turn them into a structured form that serves the research question to be studied.
Researchers in business, economics and related fields need efficient processes and tools, an
efficient data infrastructure and a comprehensive implementation of pertinent methodological
knowledge for their research and teaching. BERD@NFDI is a cooperation between the
Universities of Mannheim, Cologne, Hamburg, and Munich, the Institute for Employment
Research (IAB), the ZBW – Leibniz Information Center for Economics as well as GESIS –
Leibniz Institute for the Social Sciences, and is supported by community partners, such as the
German Academic Association for Business Research (VHB).
BERD@NFDI thus brings together leading institutions in business, economics,
educational research, psychology, social science, and communication science with method
experts in the area of artificial intelligence and machine learning, who intend to contribute their
best resources in order to exploit the new types of data for evidence-based empirical research.
The consortium will be supported by leading research institutions, such as the Leibniz Institute
for Financial Research SAFE, and infrastructure organizations, such as ZEW-FDZ, ZBW,
GESIS, the Mannheim University Library, and the Leibniz Supercomputing Centre (LRZ) in
Garching. As a structural contribution to NFDI, BERD@NFDI aims to develop and disseminate
transparent, FAIR and innovative standards, methods and tools to manage, (pre-)process and
archive unstructured and non-standard data as well as to combine and connect them with
structured data in economics, business and related research fields. The intended work program
comprises seven task areas (TA, see figure 1).
TA1 – BERD@NFDI Community Involvement: As an initiative driven by researchers,
community involvement and user-centered design lie in the DNA of BERD@NFDI. Users will
be involved in project steering and requirements analysis. A user-centered design and an agile
development methodology ensure that the infrastructure is aligned with the actual needs and
requirements of the user community.
TA2 – Creating BERD: In social sciences in general and business and economics in particular,
both the amount and type of data sources have proliferated. Creating transparency of the types
of data access and collection approaches will enhance the potential to reuse prior data
collection efforts for replication, extension, and application to new research problems.
Figure 1: Work program of BERD@NFDI with seven task areas for the management of unstructured and non-standard research data.
TA3 – Processing BERD: Researchers have to deal with a variety of data and different levels
of data quality. BERD@NFDI will support the research community in offering and describing
suitable methods for (pre-)processing, e.g. to extract text and semantics from images, and
linking business and economic research data, as well as in documenting and making data
accessible. As a result, users will be able to assess the strengths and weaknesses of the
available data.
TA4 – Analyzing BERD: To investigate substantive research questions, researchers do not
only search for relevant data but also need to apply appropriate machine learning algorithms to
transform unstructured data into a form that is amenable to further (causal) analysis.
BERD@NFDI will connect research data with algorithms used in business, economics and
related areas so users can exploit the available data to its full potential, find common standards
in terms of data processing and better understand the weaknesses, strengths, and performance
characteristics of individual algorithms for applied research purposes.
TA5 – Preserving & Accessing BERD: Preserving and maintaining a sustainable degree of
accessibility for digital content is one of the main challenges in research data management.
BERD@NFDI will provide data preservation and data handling operations, metadata
standardization concepts and furthermore develop data migration strategies. The BERD@NFDI
information portal will package all necessary components and provide access to functionalities
like searching, remote access, identity management and persistent identifiers.
TA6 – Re-using BERD: Researchers generating data often face resource or legal constraints
when intending to share data. Even if the data is being shared, it is often located in archives
with only a small number of users and little impact on academic insights. BERD@NFDI will
demonstrate the value of sharing to those involved in data production and support them in
overcoming existing barriers.
Brief description of the proposed use of existing infrastructures, tools and services that are essential in order to fulfil the planned consortium’s objectives
BERD@NFDI builds on the following existing infrastructures:
o ZBW will bring in its entire portfolio for research data management, including the
technologies developed within the DFG-funded project GeRDI as well as the entire
portfolio of FAIR data implementation networks of the GO FAIR initiative. ZBW will provide
its federation and harvesting technologies for research data repositories as well as its
tools for metadata normalization. ZBW’s leading role in this initiative will ensure
access to the latest developments within the FAIR data movement.
o GESIS will bring in its entire portfolio and technologies for research data management.
Together with ZBW, GESIS will exploit the synergies with the consortium KonsortSWD.
o DFG Research Group on the impact of social media headed by the University of
Hamburg and the University of Cologne, which generates, shares and works with both
structured and unstructured data from various online channels.
o The Mannheim Center for Data Science (MCDS) will contribute the results of the
BERD@BW project regarding professional training in the area of analysis and
management of unstructured (big) data.
o OpenML (https://www.openml.org/) which is a collaborative and open platform for
machine learning where Prof. Bernd Bischl is a developer and member of the core team.
o LRZ with its world-leading high-performance computing and storage systems will provide
its infrastructure for high-performance services and software for data analytics which are
specially tailored to the needs of AI methods.
o The BMBF-funded International Program in Survey and Data Science (IPSDS)
established in collaboration between the University of Mannheim and the University of
Maryland, with inputs from the IAB, the LMU and the Bundesbank, will provide a platform
and starting point for the asynchronous professional training opportunities on the new
types of data.
o The research data center of the Mannheim University Library will provide data and
infrastructure services (see https://fdz.bib.uni-mannheim.de/), e.g. the automated text,
layout and structure extraction from digitized publications as done in the Aktienführer-
Datenarchiv (https://digi.bib.uni-mannheim.de/aktienführer/data/index.php). Furthermore,
the ontology of German firms will be employed and extended in BERD@NFDI and used
for entity identification purposes in (un-)structured data.
o The Coleridge Initiative (https://coleridgeinitiative.org/), in partnership with
BERD@NFDI’s co-spokespersons, successfully implemented a secure Administrative
Research Data Facility which holds unstructured data in addition to more structured records
as part of a joint data schema. The environment runs within AWS and has been
successfully tested to provide a training platform and to allow remote access to
researchers. This platform serves as a role model for the planned infrastructure.
Interfaces to other proposed NFDI consortia: brief description of existing agreements for collaboration and/or plans for future collaboration
KonsortSWD: KonsortSWD is an established consortium with a long history and expertise in
the integration and management of primarily structured data from standard sources in the Social
Sciences. Scientific progress in the social sciences needs a powerful and interconnected
infrastructure for handling both structured and unstructured data originating from both standard
and non-standard sources. BERD is the complement to KonsortSWD, with which we work
closely to offer integrated access to the full breadth of data. The two consortia will
build up a research data infrastructure for the future that covers all subfields of the social
sciences and is unique in the world.
NFDI4Memory: Since much historical (economic, business and social) information lies within
pictures and unstructured text, we will closely interact with NFDI4Memory. The cooperation
will comprise the exchange of relevant metadata for the Data Space (NFDI4Memory) and the
data pool (BERD@NFDI) as well as ontologies and vocabularies (e.g. the firm ontology of
BERD@NFDI, the standard thesaurus for economics of ZBW, the ontology of job titles of
NFDI4Memory). We will also work together on methods regarding the analysis of unstructured
data as needed, e.g. automated text, layout and structure recognition.
MaRDI – Mathematical Research Data Initiative: MaRDI and BERD@NFDI will cooperate on
interdisciplinary topics of machine learning led by Bernd Bischl, who is co-spokesperson in both
consortia. Both consortia aim at advancing machine learning approaches and at enabling
researchers to successfully apply them to specific research questions. While MaRDI contributes
its expertise in algorithms, their implementation and empirical benchmarks, BERD@NFDI
focuses on the application of these tools to data and research questions in economics and
social sciences. Another field of collaboration is the assessment of data quality in
heterogeneous and unstructured data.
Text+: Text+ is a consortium in the humanities that focuses on building an
infrastructure for text and speech-based data, with tools and services applied to this kind of data
in the Digital Humanities. We will strive for cooperation on topics regarding the management of
and analysis methods for text and speech data as types of unstructured data.
4 Cross-cutting topics
Please identify cross-cutting topics that are relevant for your consortium and that need to be designed and developed by several or all NFDI consortia.
BERD@NFDI supports the Berlin Declaration on NFDI Cross-Cutting Topics. All
areas mentioned in the declaration are important, with BERD@NFDI seeing a special relevance
in the following topics:
o Teaching & education: The report of the German Council for Information Infrastructures
(RfII) and the High-Level Expert Group on the EOSC have identified a clear need for
building up new competences, more capacities and new curricula for research data
management.
o User involvement: To build a useful and valuable infrastructure for research data that
users will contribute to, user involvement, including at the governance level, and user-centered
design are crucial, no matter what the underlying research discipline is.
o Legal aspects: Legal questions are also expected to arise in all consortia along the whole
data management process. This includes, among others, questions on licensing external
data from other research institutions, research data centers, companies etc. for their own
research purposes, as well as data protection, privacy issues and the proper license for
publishing data at the end of the research process.
o Quality assurance: Data quality is crucial for the (re-)usability of data in all research areas.
For the BERD@NFDI consortium this holds all the more, since sources of unstructured data
are often non-standard and not primarily research-focused. Therefore, the authenticity of
data is a major issue to be addressed.
o Standardization and harmonization of terminologies: Almost every research community
has its own terminologies. In order to gain a common understanding of data, methods and
concepts, a certain degree of standardization and harmonization of terminologies across
the consortia is essential.
o FAIR metadata: To promote interdisciplinary re-use of data from all NFDI consortia,
interoperability has to be ensured with the FAIR (meta)data principles as central
guidelines, while incorporating the specific requirements of a particular user community at
the same time.
Please indicate which of these cross-cutting topics your consortium could contribute to and how.
BERD@NFDI supports the Berlin Declaration on NFDI Cross-Cutting Topics. The
BERD@NFDI consortium will treat these and other possible cross-cutting topics in its cross-cutting
topics committees. These committees will actively cooperate with other consortia and
feed the results into the strategic planning and work program of BERD@NFDI.
BERD@NFDI will especially contribute to the following topics:
o Teaching and education: BERD@NFDI can build on the extensive experience of several
participants in the conception and implementation of online courses and workshops about
data processing and analysis in the social sciences. The concepts which have proven
successful at an international level can also inform the development of similar activities in
other disciplines.
o User involvement and user-centered design engineering play an essential role throughout the
development of BERD@NFDI’s services, both before and after implementation.
The consortium can contribute its considerable expertise in these topics to provide innovative impulses
for the NFDI as a whole.
o Concerning legal aspects, BERD@NFDI will closely interact with KonsortSWD to leverage
synergies in this area. We will bring in the issues arising in connection with unstructured
data, AI and ML algorithms. BERD@NFDI can contribute to this with its clear disciplinary
focus.
o Quality assurance: BERD@NFDI will focus on the development of standards for data
quality assessment, normalization, and pre-processing of unstructured data. Furthermore,
we will strive to foster the standardized documentation of these processes in research
publications. The adoption of such generally accepted standards could significantly
advance good scientific practice in the field of empirical economics and social sciences,
and serve as a model for other disciplines.
o BERD@NFDI will contribute to the standardization and harmonization of terminologies
because it brings in-depth knowledge of the vocabularies and ontologies used in
economics. In addition, it develops its own ontology and knowledge graph for company
data. This offers numerous opportunities to further develop the existing structures into a
semantic web for scientific data.
o FAIR metadata: For the services of BERD@NFDI, a way must be found to bring specific
requirements of a particular user community in line with the requirements of interdisciplinary
re-use and general guidelines such as FAIR principles. BERD@NFDI will contribute its
perspective to the discussion on FAIR metadata for interoperability in NFDI and beyond.