The project is funded by European Statistical System
National Statistical Institute
IMPROVEMENT OF THE USE OF ADMINISTRATIVE SOURCES (ESS.VIP ADMIN WP6: pilot studies and applications)
Subtitle: Establishing a point-based foundation for address geocoding of
statistical and administrative data
Grant agreement 07112.2017.007-2017.441
Final methodological report
Sofia, April 1, 2019
Table of contents
List of acronyms and abbreviations
Executive summary
I. Introduction and background
II. Summary of project activities and main results
III. Proposed solution for establishing a point-based foundation for address geocoding at NSI
Conclusion
References
Annex: Summary table of the analysis and conclusions on selected data sources to establish a
point-based infrastructure
List of acronyms and abbreviations:
AO Addressable Object
ATU Administrative Territorial Units
CA Geodesy, Cartography and Cadaster Agency (in short Cadastral Agency)
CLU Classificatory of Localisation Units
DG CRAS DG Civil Registration and Administrative Service (in short Civil Registration)
EKATTE Unified classificatory for administrative-territorial and territorial units
GSGF Global Statistical Geospatial Framework
SDG Sustainable Development Goals
NAR Centralized Information System “Address Register” (in short National Address Register)
NRPP National Register of Populated Places
NSDI National Spatial Data Infrastructure
NSI National Statistical Institute of Bulgaria
SEGA State e-Government Agency
SBR Register of Statistical Units (Statistical Business Register)
SPR Information System Demography (Statistical Population Register)
UN-GGIM United Nations initiative on Global Geospatial Information Management
Executive summary
The purpose of this report is to propose a methodological framework to guide the work on
establishment of a point-based geocoding infrastructure for Census 2021 and post-census
statistical production. The paper presents the use of geospatial data at NSI and proposes possible
solutions to enable consistent address geocoding at organizational level. It outlines an
implementation strategy that could be followed by executing certain tasks and activities. In
particular, the report tries to answer the following questions:
What information should be used, and how?
What institutional arrangements are needed?
How can address data be harmonised and spatially enabled?
How can geocoding to a point be made a consistent process?
What can improve the situation?
To answer these questions and build a consistent analytical approach, the project team used a
range of inputs: outcomes from the project kick-off meeting, assessment results from a specially
designed survey sent to the partner organisations and to some of the subject matter
departments of NSI, the status quo and plans for establishing and maintaining important
registers at national level, results from EFGS/GEOSTAT projects, as well as UN-GGIM papers
on integration of statistical and geospatial information, core geospatial data themes,
recommendations for data content for 2021 Population and Housing Censuses, and a number
of good practices.
The report is divided into three sections.
Section I provides an overview of geospatial data use at NSI, introduces the strategic and
legislative preconditions for the identification of needs, describes the challenges and issues in
the national context and summarises the general recommendations, taking the current national
context into account.
Section II gives a summary of the project activities and main results, marking the scope of
methodological work and the steps followed to complete the project tasks.
Section III lays out the proposed solution for establishing a point-based geocoding
framework for Census 2021. It focuses on the data sources and on expanding and improving
the use of geospatial data from the main spatial data provider (Cadastral Agency). The section
elaborates on a conceptual model to harmonise address data collected from a range of sources,
on methods for geo-enabling these data and organising geocoding, and on findings and
recommendations for the organisational set-up needed to implement the infrastructure.
Expanding and improving the use of cadastral data within NSI, together with close collaboration
with CA, was essential for the project work and remains essential for NSI to meet the
requirement for full geocoding of census data and integration of statistical and geospatial
information.
[NOTE on basic data configurations - The ESSnet KOMUSO typology] The data sources in the
focus of the project were not intended for statistical content production. The sources were
assessed specifically for their value in providing location for statistical information, that is,
as purely infrastructural data.
Acknowledgements
The project team from NSI would like to thank all the partner institutions and experts involved
in this project, namely:
o Geodesy, Cartography and Cadaster Agency
o DG Civil Registration and Administrative Service
o Municipality of Gabrovo
o State e-Government Agency
I. Introduction and background
1. Background
GIS technologies and the geospatial data concept were first introduced at NSI in 1997-1998, in
support of the pre-enumeration phase, the so-called census mapping process. At the beginning,
NSI created digital models and spatial data layers for a few settlements by scanning and
digitizing paper maps and plans and, after Census 2001, added census data to these digital
models and layers. This method of spatial data collection and processing continued until 2006,
when the Cadastral Agency provided, on a voluntary basis, the first vector data from the digital
cadastral maps. In 2009, recognising the importance of having official and more accurate spatial
data, NSI and the CA signed a bilateral agreement for data exchange to support Census 2011.
This agreement was the main precondition for cooperation and for implementing joint
activities.
2. Identifying the needs
Certain strategic and legislative drivers create the need to integrate high-resolution
geospatial data in support of statistical production at NSI.
2.1. Strategy for Development of the National Statistical System of the Republic of
Bulgaria 2013 - 2017, amended by an extension until 2020 1
The strategy highlights as a horizontal priority the process of establishing conditions for
production and integration of spatial (georeferenced) information with statistical information
by:
Using the infrastructure for spatial information in the European Community (INSPIRE),
in particular via an EU geoportal.
Integration of statistical data, when applicable, in order to establish an infrastructure
with multiple information sources for providing a spatial and temporal analysis.
Enlarging the use and dissemination of regional geo-referenced statistical information.
Furthermore, this strategic document also recognises that the process of integration of statistical
and geospatial data will be one of the main challenges that will determine the development of
the NSI until 2020.
2.2. Law on Population and Housing Census 2021 2
In 2019, the national law on population and housing census 2021 was adopted. The law
identifies certain stages and actions of Census preparations that require the use of official
geospatial data, orthophoto imagery, collaboration with relevant bodies as well as the
application of georeferencing activities in order to ensure reliable, detailed and comparable data
from the survey. The law is in line with the EU regulatory framework related to the Census
2021.
1 http://www.nsi.bg/sites/default/files/files/pages/uplf_e/Strategy2013-2017_2020.pdf
2 http://www.nsi.bg/sites/default/files/files/pages/Census2021/ZPNJF2021.pdf (BG only)
When drafting the law, NSI organized a series of public consultations, in which key users such
as policy makers (at national and sub-national level), academia and researchers, and the general
public highlighted the demand for census data at very detailed territorial levels.
The address registration process and geocoding activities will be a milestone in Census 2021
and will provide a valuable mechanism for linking individuals to housing units and for
disseminating data for the smallest possible territorial units.
2.3. COMMISSION IMPLEMENTING REGULATION (EU) 2018/1799 of 21
November 2018 on the establishment of a temporary direct statistical action for the
dissemination of selected topics of the 2021 population and housing census geocoded to a
1 km² grid 3
In 2018, EC adopted an implementing regulation on the establishment of a temporary direct
statistical action in order to develop, produce and disseminate selected topics of the 2021
population and housing census geocoded to a 1 km² grid. The action is justified by a common
need across the Union for reliable, accurate and comparable information on population
distribution with sufficient spatial resolution, founded on harmonised output requirements and
intended in particular for pan-European regional policy-making.
NSI plans to geocode the census data at point level or, in certain parts of the territory, at
least to the 1 km² grid, which will provide sufficiently precise information to fulfil this task.
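As a rough illustration of what geocoding to the 1 km² grid involves once a point location is available, the sketch below derives a grid cell label from a pair of coordinates. It assumes the point is already expressed in projected metre coordinates (for example ETRS89-LAEA, EPSG:3035) and that a cell is labelled by the kilometre coordinates of its lower-left corner in the form `1kmN....E....`. The function name and the label format are assumptions made for illustration and are not taken from the regulation text.

```python
def grid_cell_1km(easting_m: float, northing_m: float) -> str:
    """Return a 1 km grid cell label for a point in projected metre coordinates.

    Assumption: the cell is named after its lower-left corner, in kilometres.
    """
    if easting_m < 0 or northing_m < 0:
        raise ValueError("expected non-negative projected coordinates")
    # Floor division by 1000 gives the lower-left corner of the cell in km.
    east_km = int(easting_m // 1000)
    north_km = int(northing_m // 1000)
    return f"1kmN{north_km:04d}E{east_km:04d}"


# Hypothetical point; the coordinates are illustrative only.
print(grid_cell_1km(5_432_750.0, 2_345_120.5))
```

Because the cell is derived directly from the point, any point-geocoded record can be aggregated to the grid without a separate spatial join.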
2.4. Regulation (EC) No 177/2008 of the European Parliament and of the Council of 20
February 2008 establishing a common framework for business registers for statistical
purposes and repealing Council Regulation (EEC) No 2186/93 4
The regulation establishes a common framework for business registers for statistical purposes
in the Community and demands that EU Member States shall set up one or more harmonised
registers for statistical purposes, as a tool for the preparation and coordination of surveys, as a
source of information for the statistical analysis of the business population and its demography,
for the use of administrative data, and for the identification and construction of statistical units.
The regulation also specifies the information content of the business register: collecting
data on the geographical location code and address (including postcode) at the most detailed
level is mandatory for local units.
At operational level, the Member States represented in Eurostat's permanent working group on
business registers and statistical units are trying to find solutions for identifying local units
via their physical geographic location. To handle this issue, the working group proposes
several operational rules related to identification through geographical localisation:
3 https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32018R1799
4 https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32008R0177
Operational rule: Identification
For the identification of a local unit, the physical geographic location has to be identified. Such
a single physical location is normally best approximated by the postal address. Several physical
locations of the same enterprise within the same community or within the same region are to be
treated as several local units of that enterprise.
Operational rule: Physical geographic location
A physical location of a local unit may be found within a building, may correspond to one
building or may comprise more than one building. In the latter case, the various buildings do
not form separate local units if they are physically close together and have a common postal
address.
Operational rule: Local unit without postal address
A local unit may not be situated in a building at all. If in that case the other criteria are fulfilled
a separate local unit should be identified. In such a case a postal address may not exist; however,
the geographical identification could be represented by geographical coordinates or other
measures.
2.5. Sustainable Development Goals 5
The UN 2030 Agenda for Sustainable Development is another strategic driver for NSI to
establish a point-based foundation for address geocoding of statistical and administrative data.
The demand for statistical data on the SDGs is growing each year, and NSI is trying to address
these needs by combining different sources and planning concrete SDG-related activities in the
National Statistical Programme, which is elaborated on a yearly basis.
The deliverables of this project can be used to expand the current statistical production models
and provide NSI with a powerful tool for producing statistical data disaggregated to levels that
will ease and enrich national reporting on the SDGs and their targets.
The project recognizes the need for further practical application of the proposed point-based
foundation in data production for relevant indicators from the Global Indicators Framework
(GIF) or substitute proxy indicators from the national list, which is under development.
Indicators from the GIF that can be tested:
9.1.1 Proportion of the rural population who live within 2 km of an all-season road.
11.1.1 Proportion of urban population living in slums, informal settlements or inadequate
housing.
11.7.1 Average share of the built-up area of cities that is open space for public use for all, by
sex, age and persons with disabilities.
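To show how point-geocoded population data could feed an indicator such as 9.1.1, the sketch below computes the share of persons living within 2 km of a road. It is a minimal illustration under stated assumptions: coordinates are planar and in metres, the road is a simple polyline, each residence point carries a person count, and all names (`dist_point_segment`, `share_within`) and sample values are hypothetical. It is not the official indicator methodology.

```python
import math


def dist_point_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b (metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        # Degenerate segment: both ends coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def share_within(residents, road, limit_m=2000.0):
    """residents: list of ((x, y), persons); road: list of (x, y) vertices."""
    segments = list(zip(road, road[1:]))
    if not segments:
        raise ValueError("road needs at least two vertices")
    total = sum(n for _, n in residents)
    near = sum(
        n for p, n in residents
        if min(dist_point_segment(p, a, b) for a, b in segments) <= limit_m
    )
    return near / total if total else 0.0


# Illustrative data only: a straight 10 km road and three residence points.
road = [(0.0, 0.0), (10_000.0, 0.0)]
residents = [((1_000.0, 500.0), 3), ((5_000.0, 2_500.0), 1), ((12_000.0, 0.0), 4)]
print(share_within(residents, road))
```

With area-based geocodes the same calculation would require assumptions about how population is distributed inside each area, which is the practical argument for a point-based foundation.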
5 https://sustainabledevelopment.un.org/?menu=1300
3. Issues and challenges at national and organisational level
3.1. A common standard for addresses
Standardisation of address data is applied within the civil registration system, which is
regulated by the national law on civil registration. The law provides a very general definition
and a set of rules to be applied by the registration authorities.
According to the law, the address is the unambiguous description of the place where a person
lives or receives correspondence. An address in the Republic of Bulgaria shall contain the
district, municipality and settlement names. Depending on the location described, the address
may also include a localisation unit name (square, boulevard, street, residential complex,
neighborhood, etc.), street number, entrance, floor and apartment number. The locator
unit number may consist of a combination of up to four characters, the first three being
numerical digits and the last a letter. The entrance can consist of one letter or a number of up
to two digits, the floor of up to two digits and the apartment of up to three digits. The mayor
of each municipality defines the addresses in that municipality by issuing an official act.
Together, the addresses of all municipalities form the National classification of current and
permanent addresses (NCCPA).
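The format rules quoted above lend themselves to a simple automated check when address data are received. The sketch below is one possible reading of those rules, not the legal text itself: it assumes the number is one to three digits optionally followed by a single letter, and that any Unicode letter (including Cyrillic) is acceptable. The field names and the function `invalid_components` are hypothetical.

```python
import re

# Assumed patterns; [^\W\d_] matches a single letter in any script.
RULES = {
    "number": re.compile(r"\d{1,3}[^\W\d_]?"),     # up to three digits + optional letter
    "entrance": re.compile(r"[^\W\d_]|\d{1,2}"),   # one letter, or a number up to two digits
    "floor": re.compile(r"\d{1,2}"),               # up to two digits
    "apartment": re.compile(r"\d{1,3}"),           # up to three digits
}


def invalid_components(address: dict) -> list:
    """Return the names of populated components that break the assumed format."""
    return [
        name
        for name, rule in RULES.items()
        if address.get(name) and not rule.fullmatch(address[name])
    ]


# Illustrative records only.
print(invalid_components({"number": "12А", "entrance": "Б", "floor": "3", "apartment": "14"}))
print(invalid_components({"number": "1234", "floor": "123"}))
```

Components that are absent are not reported, since the rules above make everything below settlement level optional.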
The classification contains only addresses that describe units for residential purposes. On the
other hand, very few of the 265 municipalities maintain a local address register or database in
digital format. In many cases, the addresses are stored on paper and are entered into the
classification only when an update “event”, for example a property or civil registration, is
initiated by a citizen.
3.2. Delay and reset of the National Address Register project
The national strategy on the development of e-government in the Republic of Bulgaria
2014-2020 6 set as a priority the establishment of conditions for the normal functioning of the
primary electronic registers existing in the country. The NAR was included in the action plan,
as it was considered one of the foundational registers and one of the pillars for the
development of e-governance.
Initially, NAR was intended to synchronise address information in administrative registers
by providing a Unique Address Identifier (UAI), but not to serve the physical location of the
requested address in the form of geographical coordinates.
After a number of discussions at national level and recognition of user needs for a geo-enabled
national address dataset, the Cadastral Agency was appointed as the authority responsible for
the national address system and the administration of the address register. Finally, in the
middle of 2018, the NAR project was assigned to CA for implementation.
6 https://www.e-gov.bg/en/about_us
However, a fully functioning NAR with coverage suitable for census purposes is not
guaranteed to be in place in time for census activities. The need for results, driven by EU
and national legislative obligations and by identified user priorities, is another reason to
find solutions in cooperation with the CA as a partner organisation of NSI.
3.3. National spatial data infrastructure
Availability of and access to fundamental geodata is one of the main challenges at national
level. The national spatial data infrastructure is at a very immature stage of development.
Currently, access to geodata is left to bilateral agreements between providers and users. The
interoperability and availability of national reference geodata are still insufficient for
statistical purposes.
3.4. Challenges in organisational context
The current in-house production methodology for geocoding is inefficient and cannot cover
user needs for regular production of spatial statistics. Investments in technical infrastructure
and a stepwise transformation to a service-oriented architecture are needed to ensure more
effective and consistent automated processes and interoperability. The lack of organizational
integration of geospatial data and of a common platform are challenges to be worked on. The
resulting problems affect not only production capacity but also quality assurance, which
usually relies on geospatial analysis and assessment activities to detect issues.
4. Methodological and organisational recommendations
In order to cover the identified needs and to prepare a business plan for establishing a point-
based geospatial framework in NSI, the project team developed methodological and
organizational recommendations that should be followed. The recommendations are structured
in a way that answers given questions/issues.
4.1. A point-based spatial statistical framework
Use high-quality point-based location data. All features making up the infrastructure need to
be time-aware and have a start- and end date, and relevant metadata to know when and how the
geocodes were derived. The data should be regularly updated with relevant timestamps.
Presence of high-precision and standardized geocodes/identifiers at a unit record level.
Geocodes are standardised codes or identifiers usually from national classifiers or coding
systems and are used to link unit record data with location data.
Priority on address geocoding process. Geocoding is the process of assigning geocodes to
statistical unit records or to their geographical localisation attributes, such as addresses.
Addresses are collected and used in a number of statistical surveys and registers and are
currently the only option for spatially enabling statistical data that has already been collected.
Addresses can be turned into geocodes if they are linked to a geospatially referenced object,
such as a building, are managed properly and are provided with identifiers.
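The linkage described here, from a unit record through an address identifier to a geospatially referenced object, can be sketched as two look-ups. The example below is a minimal illustration that assumes in-memory dictionaries as reference data; the identifiers, coordinates and the function `geocode` are hypothetical and do not reflect the structure of any NSI or CA system.

```python
# Hypothetical reference data: address identifier -> building (addressable object) id,
# and building id -> representative point (illustrative coordinates).
ADDRESS_TO_BUILDING = {"A1": "B10", "A2": "B11"}
BUILDING_POINTS = {"B10": (42.874, 25.318), "B11": (42.880, 25.325)}


def geocode(records):
    """Split unit records into geocoded ones and ones needing manual follow-up."""
    matched, unmatched = [], []
    for rec in records:
        building_id = ADDRESS_TO_BUILDING.get(rec["address_id"])
        point = BUILDING_POINTS.get(building_id)
        if point is None:
            # No linked object or no point for it: keep the record for error handling.
            unmatched.append(rec)
        else:
            matched.append({**rec, "building_id": building_id, "point": point})
    return matched, unmatched


records = [
    {"unit_id": 1, "address_id": "A1"},
    {"unit_id": 2, "address_id": "A9"},  # address not in the reference data
]
matched, unmatched = geocode(records)
print(len(matched), len(unmatched))
```

Keeping the unmatched records, rather than dropping them, is what makes complementary geocoding and error management possible later in the process.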
Establish address location and identification system. NSI should work on establishing an
address reference dataset because there are no national address or building registers. Cadastral
data maintained by CA contain property identifiers, but integration with statistical records by
cadastral identifiers alone is not possible without an intermediate reference framework that
contains all key identifiers. In parallel, a model for maintaining collected address data in a
standardised and consistent way should be developed. A centralized address location and
identification system that can harmonise all address data across the organisation by using
permanent address identifiers is needed. The system should provide a permanent location
identifier to all datasets containing addresses through a validation service. In addition,
point-of-entry validation against the central address database should be in place by the time of
the census at the latest. Ideally, the system should in the future provide integration with the
national address register. In any case, such a system is needed because it could maintain
location descriptions that exist and are used by people but are not officially recognised.
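A validation service of the kind recommended here could, in its simplest form, normalise incoming address components and return the same permanent identifier whenever the same address is seen again. The sketch below illustrates that idea only. The class `AddressRegistry`, the identifier format `ADR…`, the choice of three key components and the sample values are all assumptions for illustration.

```python
import itertools
from typing import Optional


class AddressRegistry:
    """Toy central address list that hands out permanent identifiers."""

    def __init__(self):
        self._ids = {}
        self._counter = itertools.count(1)

    @staticmethod
    def _key(settlement_code: str, street: str, number: str):
        # Normalise case and whitespace so that spelling variants map to one key.
        return (
            settlement_code.strip(),
            " ".join(street.split()).casefold(),
            number.strip().casefold(),
        )

    def register(self, settlement_code: str, street: str, number: str) -> str:
        """Return the existing identifier, or assign a new permanent one."""
        key = self._key(settlement_code, street, number)
        if key not in self._ids:
            self._ids[key] = f"ADR{next(self._counter):08d}"
        return self._ids[key]

    def validate(self, settlement_code: str, street: str, number: str) -> Optional[str]:
        """Point-of-entry check: identifier if the address is known, else None."""
        return self._ids.get(self._key(settlement_code, street, number))


# Hypothetical settlement code and street name.
registry = AddressRegistry()
first = registry.register("00001", "ul. Primerna", "5")
print(first, registry.validate("00001", "UL.  PRIMERNA ", "5"))
```

A real service would need fuzzy matching and a review queue for near misses; exact matching on a normalised key is only the starting point.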
4.2. Institutional priorities and activities
Build the infrastructure required to answer the needs. Covering the identified needs
requires certain changes to be initiated by NSI in its corporate infrastructure management.
These changes should take into account all current issues (constraints) and challenges in the
national context, including the burden that will be generated on the production and institutional
management processes.
Ensure resources. Besides identifying and accessing new sources to feed the system of
point-based geolocation data, investments in capacity and resources are needed to answer the
challenges. The new infrastructure will come with a new architecture and new or significantly
revised data flows, which NSI has to manage together with the external data providers.
Coordination and cooperation. Building new infrastructure requires sustainable and
consistent coordination between all parties involved in maintaining the infrastructure. The
components and the overall business process have to be developed in close cooperation with all
data or service providers.
II. Overview of project activities and results
Methodology work was carried out in four stages.
Experts from the Geodesy, Cartography and Cadaster Agency (CA), DG Civil Registration and
Administrative Service (DG CRAS), the State e-Government Agency (SEGA) and the Municipality
of Gabrovo participated in the project activities.
Fig. 1: Focuses of methodological work on setting point-based geocoding infrastructure
Stage 1: Analyze and assess the potential of administrative and statistical sources for setting up
a point-based infrastructure for geocoding of data.
(1.1) Assess usefulness and accuracy of recognized geospatial data sources.
(1.2) Assess completeness and geographical coverage of the datasets for the moment of
Census enumeration planning and collecting phase.
(1.3) Assess quality and scheme of maintenance of the datasets.
(1.4) Assess temporal aspects of maintenance and temporal cohesion with statistical data.
The address data within the following datasets were examined and assessed for their
usefulness for the initial establishment, regular and complementary updating, and
standardization of the address foundation. The address data were assessed for quality of
address content (consistency and completeness):
Information System Demography (NSI)
Business register (NSI)
Census 2011 (NSI)
Survey on newly built residential buildings and dwellings (NSI)
Nomenclature of permanent and current addresses in Bulgaria (DG CRAS)
Cadastral map and cadastral registers (CA)
Local taxes (Municipality of Gabrovo)
Stage 2: Improve access to geospatial data with regard to setting up a building/dwelling and
address register for statistical purposes.
(2.1) Harmonise address location data with regard to European and international standards
and coordinate address identifiers/geocodes.
(2.2) Develop uniform approach to spatially enable address data with cadastral map
objects.
(2.3) Build formal working relationship with location data providers for sustainable
geospatial data flows and feedback managing routines.
Stage 3: Develop strategy for setting up and maintenance of point-based geocoding
infrastructure at NSI:
(3.1) Assess resources needed/ assess processing capacity.
(3.2) Set-up organization for obtaining and management of geospatial data.
(3.3) Specify geo-statistical census output.
Stage 4: Develop methodology for consistent geocoding of statistical/ administrative data.
(4.1) Develop routines for geocoding, geocoded data verification and managing
geocoding errors.
(4.2) Develop consistent methodology for complementary geocoding (for areas where
point location does not exist).
Main results
Conclusions on sources to be used and proposal for data to set up a point-based
foundation.
Methodology for enabling address data with geolocation. Spatial address data model.
Methodology for geocoding of administrative and statistical data records. Components
of consistent geocoding.
Organisational setup - key findings and recommendations for activities.
III. Proposed solution for establishing a point-based foundation for address geocoding at
NSI
“As to methods there may be a million and then some, but principles are few. The man who grasps principles
can successfully select his own methods.” ― Harrington Emerson
The Global Statistical Geospatial Framework provides a common method for statistical and
geospatial data integration and sets five high-level strategic principles to form the basis for the
statistical geospatial infrastructure development:
Principle 1: Use of fundamental geospatial infrastructure and geocoding.
Principle 2: Geocoded unit record data in a data management environment.
Principle 3: Common geographies for dissemination of statistics.
Principle 4: Statistical and geospatial interoperability – Data, Standards and Processes.
Principle 5: Accessible and usable geospatially enabled statistics.
The team followed these principles and the connected objectives, consolidating the outcomes
from GEOSTAT projects together with UN-GGIM recommendations for content of the core
data, to elaborate on how to implement the address geocoding framework in the contemporary
situation.
1. A Point-based address reference dataset
For the census, an address system for locating buildings and dwellings or identifying housing
units is needed. The address reference system is a location system in human-readable form. It
defines a set of address components and the rules for combining them into addresses.
Currently, no single source can provide an exhaustive set of addresses in the country.
1.1. Data sources and frameworks
To find the high-accuracy data appropriate for an address point-based foundation, a number of
data sources from the public domain were initially selected for assessment, following the
recommendations to use data from trusted, authoritative sources.
Addresses in statistical datasets:
Addresses of buildings and dwellings from Census 2011 survey (NSI)
Addresses of current residence of population, Statistical Population Register (NSI)
Addresses of businesses, Statistical Business Register (NSI)
Addresses of newly built residential buildings, quarterly exhaustive statistical survey
collecting data from local authorities (NSI)
Addresses in administrative sources:
Addresses of physical location of land parcels, buildings, units within buildings from
Cadastral registers (CA)
Nomenclature of permanent and current addresses in Bulgaria (DG CRAS)
Local taxes (Municipality of Gabrovo)
Location reference data from Cadastral map
Vector data of land parcels, buildings, units within buildings (CA)
The overall quality of the sources and datasets was assessed in terms of regulations and
ordinances for maintenance, schemes of collection, storage and update, usage of thematic coding
systems, etc. The quality of the thematic content of addresses/geocodes was evaluated for
completeness of attributes and consistency of coding of the address components by comparison
with the standardised national lists/nomenclatures. A summary table of the analysis can be
found in the annex to the report.
Regarding the establishment and maintenance of the spatial address reference dataset, the data
sources were classified as having the following roles in the process:
Initial – sources to establish the initial collection of addresses and addressable objects;
Updating – sources for regular update of the address foundation;
Complementing – sources for complementary updating, contributing to the address
collection by extending the thematic scope of address information;
Standardising – national registers and nomenclatures that provide a standard for address
components.
Addresses are defined as structured descriptions of a place, and often an address consists of a
number of hierarchical components and identifiers. No official national address standard is in
place, but coding systems and frameworks to help in data harmonisation do exist.
National classifiers and registers:
EKATTE – Unified classificatory for administrative-territorial and territorial units
National Register of Populated Places and Unified Classificatory of
Administrative-territorial and Territorial Units (NSI);
Register of Geographic Names (CA);
Classificatory of Localisation Units (thoroughfares) (DG CRAS);
Classificatory of Addresses – list of all addresses eligible for the registration of citizens,
defining the numbering ranges for every localisation unit (DG CRAS).
The first aim is to establish the most exhaustive possible list of trusted, valid addresses.
During the census survey in 2011, the addresses of residential buildings and dwellings were
collected in a standardised structure and coded at the time of collection. The census address
collection is the most complete and standardised address dataset and is proposed for the initial
setup of the address reference dataset.
DG CRAS maintains address information within the framework of Civil Registration System
and provides national coding system for thoroughfares and national list with addresses where
citizens could be registered for residence.
The nomenclature from DG CRAS, maintained with the help of local authorities, is proposed for
updating address components/characteristics: registering new addresses, retiring those that
are no longer in use and recording changes in address characteristics.
1.2. Address content and harmonisation
The address definition in the Cadastral and Property Act is: “Address of an immovable property”
shall be the description of its physical location comprising obligatorily the names of the
district, of the municipality and the populated place/the settlement unit, and including (as
appropriate) the name of the street, respectively square or boulevard, housing complex or
neighborhood, street number, entrance, floor, self-contained property within a building, and for
immovable properties in agricultural and forest areas, respectively the name of the locality.
Addresses differ in content and level of detail within the urban areas of settlements. Addresses
outside urban areas (in agricultural and forest areas), as well as in small villages, may have no
assigned street names, so the address is given only at populated place/village level (address
area name). Buildings in such a village share the same address and have no distinctive address.
This is problematic for identifying housing units when conducting an online census.
According to INSPIRE, five subclasses of address components are defined: administrative unit
name, address area name, thoroughfare name, address locator, and postal descriptor.
Every component represents a level of objects in this hierarchical framework and defines a level
of accuracy of addressing.
District name
Municipality name
Capital city municipality subdivision name
Settlement name (type prefixes)
Plovdiv /Varna city subdivision name
Postal area
Localisation unit name (type prefixes)
Street locator
Building locator
Entrance locator
Floor locator
Unit within building
House numbers or names are important for distinguishing one location from neighboring
addresses. This information is mandatory. It can be a systematic designator, such as a number
or a name. Addresses can have other locators, such as an entrance number or apartment/unit
number.
Two types of temporal information are recommended by the theme specification:
temporal information on when this version of the address is valid in real world
temporal information on the changes of the address record in the database or spatial
dataset.
This information provides metadata about the lifecycle of the connection between the address and the object.
The status value (Current, Retired) of address data relates to the real world address or address
component and not to the property to which the address or address component is assigned, the
addressable object. The addressable object has its own status value.
A characteristic point represents the position of the address. The address record metadata should
provide information on how this point is captured and by whom:
Captured from a cadastral addressable object, or
Created manually by NSI staff by pinpointing the position on a map, or
Created by NSI through field surveying (capturing GPS coordinates)
Address geographic position should be specified by the type of spatial object used to derive the
position. This could be: building, part of the building/ entrance for residential multifamily
buildings, land parcel, and administrative unit. Wherever possible, building or entrance is
recommended to be used, for reasons of precision.
position: a pair of geographic coordinates;
position object: the type of the object that provides the position;
position object identifier: the permanent identifier of the AO;
position level: the level of accuracy from which the position is derived;
position source: information on how the position was captured;
position GridID: the grid-cell ID in the ETRS89_1km grid net.
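The position attributes listed above can be pictured as one record structure. The following is only a minimal sketch: the class names, field names, enumeration values and the sample coordinates are assumptions made for illustration and are not taken from the NSI data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class PositionObject(Enum):
    """Type of spatial object used to derive the position (assumed value set)."""
    ENTRANCE = "entrance"
    BUILDING = "building"
    PARCEL = "land parcel"
    ADMIN_UNIT = "administrative unit"


class PositionSource(Enum):
    """How the characteristic point was captured (assumed value set)."""
    CADASTRAL = "captured from cadastral addressable object"
    MANUAL = "pinpointed on map by NSI staff"
    FIELD_SURVEY = "GPS coordinates from NSI field survey"


@dataclass(frozen=True)
class AddressPosition:
    position: Tuple[float, float]        # pair of geographic coordinates (lon, lat)
    position_object: PositionObject      # type of object that provides the position
    position_object_id: Optional[str]    # permanent identifier of the AO, if known
    position_level: str                  # accuracy level the position comes from
    position_source: PositionSource      # how the position was captured
    position_grid_id: Optional[str] = None  # 1 km grid-cell ID, filled in later


# Illustrative record only: the identifier and coordinates are made up.
example = AddressPosition(
    position=(23.3219, 42.6977),
    position_object=PositionObject.BUILDING,
    position_object_id="AO-0001",
    position_level="building",
    position_source=PositionSource.CADASTRAL,
)
```

Keeping the source and object type as enumerations makes it harder to store free-text values that later processes cannot interpret.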
Fig 2. Characteristic points captured from cadastral map objects
Address location can be determined by the following component combination/ location styles:
The address should allow the unambiguous determination of an object for purposes of
identification and location. Ambiguous naming (units on the same level of the hierarchy sharing
the same name) was found on every level of the address hierarchy except the district level. This
issue has to be tracked with metadata that documents every step of address identification.
1.3. Address Matching
There are general parts of addressing whose identification is essential for finding the
physical location:
Locate within the country: identify the settlement
Locate within the settlement: identify the localisation unit
Locate within the localisation unit: identify the building or parcel
Locate within the building: identify the dwelling/unit within the building
To identify unambiguously the settlement, "full name" is needed, which includes the settlement
type, name and administrative-territorial belonging. Three alternatives for settlement
identification exist in the datasets:
[District], [Municipality], [Settlement type], [Settlement name] or
[EKATTE code] or
[Postal code], [Settlement name].
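The three alternatives can be sketched as a single lookup that accepts whichever key is available. This is an illustration only: the settlement list, codes, names and postcodes below are placeholders, and the function name `find_settlement` is hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Settlement(NamedTuple):
    ekatte: str
    district: str
    municipality: str
    kind: str            # settlement type, e.g. "village" or "town"
    name: str
    postcodes: frozenset


# Placeholder reference data, not real codes or names.
SETTLEMENTS = [
    Settlement("00001", "District A", "Municipality A", "village", "Name X", frozenset({"1000"})),
    Settlement("00002", "District B", "Municipality B", "village", "Name X", frozenset({"2000"})),
    Settlement("00003", "District B", "Municipality B", "town", "Name Y", frozenset({"2001", "2002"})),
]


def find_settlement(
    *,
    ekatte: Optional[str] = None,
    full_name: Optional[Tuple[str, str, str, str]] = None,
    postcode: Optional[str] = None,
    name: Optional[str] = None,
) -> List[Settlement]:
    """Return candidate settlements for one of the three identification styles."""
    if ekatte is not None:
        return [s for s in SETTLEMENTS if s.ekatte == ekatte]
    if full_name is not None:
        # full_name = (district, municipality, settlement type, settlement name)
        return [s for s in SETTLEMENTS
                if (s.district, s.municipality, s.kind, s.name) == full_name]
    if postcode is not None and name is not None:
        return [s for s in SETTLEMENTS if postcode in s.postcodes and s.name == name]
    return []


by_code = find_settlement(ekatte="00001")
by_postcode = find_settlement(postcode="2000", name="Name X")
```

In this toy data the bare name "Name X" would match two settlements, which is why each alternative carries either a code or the upper-level units.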
Statistical registers and databases use EKATTE consistently to code ATUs. In administrative
datasets, EKATTE is also widely known and used. Within 3 days of an ATU change being published
in the State Gazette, NRPP and EKATTE are updated and the updates are made visible to users in a
structured way. The update could be automated, but no standard network services are available.
EKATTE is in use in cadastral registries. Some inconsistencies were found in the coding, which
may need joint expert work to resolve so that address data can be referenced properly by
code.
Postcodes are assigned to populated places/settlements. Cities are subdivided into several
postal areas. The postcodes are four-digit codes. No boundaries are available for postal areas.
The advantage of using postcodes is the unambiguous identification of a settlement without
stating the upper-level administrative units. People are usually aware of the code of the postal
area where they live, which can be useful in address collection processes. It is recommended
that postcodes be taken into account when implementing the address dataset.
To identify localisation unit within settlement the name and the type of localisation unit are
needed or localisation unit code from CLU.
[Localisation unit type], [Localisation unit name] or
[Localisation unit code]
Localisation unit can be street, boulevard, square, alley, housing complex, neighborhood,
system of small unnamed roads, allotment, settlement formations or localities, which are named
geographical areas situated in the lands of the settlement, outside urban area of the settlement.
Addresses describing the physical location of an object include different address components
depending on the location of the addressable object (they are location specific). The components
depend on whether the building/land parcel is situated in the settlement urban regulation area,
where addresses are more detailed and accurate, or outside this area.
The addressing style (specification) is determined by the type of localisation unit. Three types
of address specification can be distinguished: urban street type addressing, urban
quarter/area type addressing, and rural addressing. For each address style, the components
sufficient to identify a place unambiguously were determined.
One place can have several address descriptions. This is the case when a housing complex/
quarter and building designator are used together with a street name/code and street locator.
It happens in quarters that have named streets and where mixed address styles are in
use. A building on the corner of two streets can potentially be addressed on either street.
It is recommended that all descriptions of a place that are in use be
included in the dataset.
One object can have more than one address description; however, it should be ensured
that one address determines only one object, and only one object should be selected from
the cadastral data to represent the address position.
For addresses that give an ambiguous location, coding the address names solves the problem, but
the automated coding of text addresses remains an issue.
Ambiguous address descriptions are usually due to the merging of settlements
that was not followed by readdressing at the municipal level. DG CRAS has solved this issue
in its databases by applying unique coding of thoroughfares. Another case is localisation units
of different types that share the same name within one address area. Equal names in one
address area remain a problem when matching textual addresses.
2. Components of consistent geocoding framework
What is required for a dataset to become a spatial statistical framework?
Have standardised identifiers/geocodes
Use of high quality location data regularly updated with time stamps
Presence of high-quality geocodes at statistical unit records
Data management and documentation of processing
2.1. Persistent identifiers
To turn addresses into geocodes, an identifier system needs to be established. In the current
practice, when a task at NSI needs address matching, address components are coded, locators are
formatted, and both are combined into a structured identifier. This practice has proved to
introduce inconsistencies in address identification and matching between different datasets, and
especially between different periods. Semantic identifiers are therefore not preferred for
consistent maintenance of location identification. A unique and persistent identifier is
recommended because it is more reliable than simple coding and matching.
Centralised maintenance of an address reference dataset that supplies a unique numeric address
code, the persistent identifier for an address unit, is a key requirement. Additionally, the model can
reserve a field to hold the future national identifier from NAR.
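The difference between a semantic (structured) identifier and a persistent one can be illustrated with a small in-memory index. This is a sketch under assumptions: the class `AddressIndex`, the field names and the sample codes are hypothetical and do not describe the NSI implementation.

```python
import itertools

KEY_FIELDS = ("settlement_code", "unit_code", "locator")


def semantic_key(components):
    """Structured identifier built from coded components; changes when they change."""
    return "|".join(components[field] for field in KEY_FIELDS)


class AddressIndex:
    """Toy address index that issues persistent numeric identifiers."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._records = {}      # persistent id -> current address components
        self._key_to_id = {}    # current semantic key -> persistent id

    def register(self, components):
        key = semantic_key(components)
        if key in self._key_to_id:
            return self._key_to_id[key]
        address_id = next(self._ids)
        self._records[address_id] = dict(components)
        self._key_to_id[key] = address_id
        return address_id

    def update(self, address_id, **changes):
        """Change components of an address while keeping its persistent id."""
        record = self._records[address_id]
        del self._key_to_id[semantic_key(record)]
        record.update(changes)
        self._key_to_id[semantic_key(record)] = address_id

    def lookup(self, components):
        return self._key_to_id.get(semantic_key(components))


index = AddressIndex()
original = {"settlement_code": "00001", "unit_code": "00123", "locator": "012"}
address_id = index.register(original)
index.update(address_id, unit_code="00456")   # e.g. the thoroughfare was recoded
renamed = dict(original, unit_code="00456")
```

After the update, the semantic key of the address is different, so records that stored the old key no longer match, while records that stored `address_id` are unaffected.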
Fig 3: Sources for initial set-up, regular update and standardisation of records of the address
reference dataset (address index)
2.2. Hierarchical geocoding framework
The process of obtaining locations and geocodes for different addressing levels should use
relevant and fundamental geospatial data, which is why cadastral data is currently the choice
for location reference objects.
The level of buildings, parts of buildings and land parcels is the level that provides the highest
accuracy of geocodes.
Cadastral map and cadastral registers are produced and maintained in digital format, and pass
through a number of mandatory procedures, for quality assurance and acceptance. The approved
cadastral map and cadastral register data are maintained in information system and the date of
the entry in the system is indicated. Each cadastral object is attributed an identifier. The
structure and the content of the identifier of a real estate is prescribed by an ordinance, issued
by the Minister of Regional Development and Public Works, and is attributed by the cadastral
office.
The land parcel part contains: identifier; boundaries and area, fixed by the geodetic
co-ordinates of the points defining them; permanent purpose of use of the territory; method
of permanent use; address;
The building part contains: identifier; boundary and/or outline of the building and of
facility; built-up area determined by the geodetic co-ordinates of the defining points;
number of floors; purpose of use; building type; address. A polygon feature (the
building footprint) on the map represents building entity.
The self-contained object in a building (unit in a building) part contains: identifier;
floor; outline; number of levels in the object; area according to the documents; purpose
of use; information about individual units/dwellings; address. The dwellings have
spatial representation. The outlines of units in the building are available as polygon
features on the map.
Part of building: the cadastral map also contains helpful point information used for
labeling the map, which is not part of the official cadastral data. These point tags give
the type of the labeled object and the label itself. Where available, they could be used
to extract a characteristic point for an entrance in apartment buildings.
Entrance locators are part of the standard address description for residential multifamily
buildings with more than one entrance.
Fig 4. Tags for building entrances in cadastral dataset.
Addresses that are recorded in the cadastral registers are physical descriptions of the location of
the objects. Address data in cadastral records is collected when the immovable properties are
registered in the cadaster. The address is declared by the property owner or collected from
documents and is recorded as an attribute of the corresponding object. Currently, addresses in
the cadastral registers are not updated when address components change, for example when a
thoroughfare is renamed or designators change. The addresses in the cadastral registry need
pre-processing, including address repairing and coding, before matching with the reference
address list.
Addresses in cadastral attributes provide the path to geocode statistical data.
Coverage of the territory with cadastral parcels in the beginning of 2018 was 39%, coverage in
the beginning of 2019 was 73%.
Fig 4. Cadastral parcel coverage at the beginning of 2018 (light colour) and lands covered
with cadastral parcels between January 2018 and January 2019 (dark colour).
Although the percentage of territory covered by the digital cadastral map will be close to 100%
at the end of 2019, the remaining uncovered parts of the territory are predominantly urban areas
(Fig 5). This means that the cadastral map covers a much smaller share of buildings than of
territory.
The ‘building’ is selected as the basic addressable unit. To assure geographical coverage of this
important layer and achieve a fully geocoded census, NSI needs to fill the gaps in data
availability and collect building data in the areas where cadastral data is not complete.
For every captured building, a record should exist in the address index with its mandatory
attributes populated, so that compliant addresses are available for all buildings. NSI's objective
is for at least 95% of basic addressable units of residential type to be covered before census
collection. Human and financial resources for this task are planned within the census activities
in the pre-enumeration phase. The building data will be updated on a continuous basis from the
cadastral building dataset whenever cadastral data is produced and accepted or updated.
Around 40% of residential building positions need to be captured by NSI.
Fig 5. Cadastral data in urban areas
The location accuracy requirement for the address characteristic point is to be set at 5 meters
from the true position of the building centroid or entrance. Particular care is required to locate
the address on the correct side of the street.
The level of localisation units
The street level objects are an important and valuable dataset for address allocation and geocoding.
Unfortunately, nationwide public vector data of localisation units does not exist. Commercial
products and OpenStreetMap roads are an option, but they do not provide the national coding used in
administrative and statistical records. The coding system for street entities is provided by DG
CRAS on a monthly basis, including the status and date of change of units.
The classifier of localisation units contains: identifier, localisation unit name, type, code,
validity period and status. The identifier is a 5-digit code that identifies the unit within the
settlement.
Features for localisation units could be extracted from cadastral land parcels by filtering the
usetype (for transport) of individual land parcel and matching by street code or name. Routines
for capturing parcels and updating street features from cadastral land parcels and from tag
information were developed. OpenStreetMap is used as additional source for information.
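The filtering and matching step described above can be sketched in a few lines. The routines developed in the project are not reproduced here; the record layout, the use-type label "transport", the function name and the sample codes below are assumptions for illustration.

```python
from collections import defaultdict


def street_features(parcels, street_names, transport_use_types=frozenset({"transport"})):
    """Group transport-use parcels by street code, falling back to the street name.

    parcels: iterable of dicts with parcel_id, use_type, street_code, street_name
    street_names: dict mapping street code -> street name (from the classifier)
    """
    # A name is only usable as a fallback key if it maps to exactly one code.
    codes_by_name = defaultdict(set)
    for code, name in street_names.items():
        codes_by_name[name.lower()].add(code)

    features = defaultdict(list)
    for parcel in parcels:
        if parcel["use_type"] not in transport_use_types:
            continue
        code = parcel.get("street_code")
        if code not in street_names:
            name = (parcel.get("street_name") or "").lower()
            candidates = codes_by_name.get(name, set())
            code = next(iter(candidates)) if len(candidates) == 1 else None
        if code is not None:
            features[code].append(parcel["parcel_id"])
    return dict(features)


# Made-up sample data.
streets = {"00123": "Main", "00456": "Park"}
parcels = [
    {"parcel_id": "p1", "use_type": "transport", "street_code": "00123", "street_name": "Main"},
    {"parcel_id": "p2", "use_type": "residential", "street_code": "00123", "street_name": "Main"},
    {"parcel_id": "p3", "use_type": "transport", "street_code": None, "street_name": "main"},
    {"parcel_id": "p4", "use_type": "transport", "street_code": None, "street_name": "Unknown"},
]
result = street_features(parcels, streets)
```

Parcels that match neither by code nor by an unambiguous name are left out, so they can be reviewed manually or compared with another source.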
The type of localisation unit is important information for unambiguously distinguishing the place.
For example, a street and a square can have the same name within a settlement but define different locations.
[Figure labels: No cadastral data in settlement urban area; Cadastral data in settlement urban area and in settlement lands; No cadastral data for the entire settlement land]
Fig 6. Capturing of street address parcels for establishing settlement street reference dataset
The level of settlements
Settlements are territorial units that define the address area and are important as a feature layer
for address data referencing and management. For the purpose of addressing, they need to be
modeled as complex features: settlement urban areas and settlement lands are needed as polygon
features, and a characteristic point for the settlement needs to be provided within the
settlement's main urban area for geocoding purposes. Settlement polygons are basic territorial
units that enable the construction of the upper hierarchical levels of administrative units.
The settlement codes are provided by NSI as administrator of NRPP and EKATTE. CA
provides the boundaries of settlements. The characteristic point needs to be calculated by NSI.
The grid net framework.
Grid nets are important for spatial, grid-based statistics. No national grid system for spatial
analysis and reporting is in use, so NSI plans to apply the European grid system, which should
be reflected when setting up the production environment.
2.3. Address data management
Consistent management of address reference data and geocoding requires that standard
processes and sequences of processes be applied. Every process populates its own part
of the metadata variables in address records, which are designed for information management at
the record level.
Address Standardisation processes. Several processes can contribute to
address standardisation, depending on the input address representation (text, coded, etc.).
Parsing splits address text into address components. Cleaning removes spelling errors
and erroneous symbols. Repairing adds implicit information, fills the gaps and formats
the address.
Address Verification is the process that checks whether an address is a ‘true address’ and has all
mandatory attributes required by the specification. In short, it checks that the address is correct
and usable, though not necessarily present in the address index.
Address Validation or Identification is the process that matches a formatted address to the
index of valid addresses. An indicator of address matching confidence should be
developed in the system. The more matching addresses are found, the lower the
confidence of identification. The value could be, for example, the number of matched
addresses. If the process fails to validate the address, information on the level at which
matching failed should be provided.
Address Learning: addresses are “learned” from the spatial reference dataset (from cadastral
objects) if they are “true addresses” (provide settlement information, street and street
number) and are consistent with neighboring addresses and the neighboring thoroughfare
feature. Addresses can be learned from every dataset, if the address learning process is
switched on.
Address Allocation finds the address position among map objects from one reference level, for
example buildings. The process populates the address geographic position attribute and position
metadata of an existing address if it finds an object on the map with the same address identifier.
Finding the address location on the map includes coding and repairing the address data in
cadastral unit records and selecting the object if the address matches. The addresses within
cadastral object attributes are first standardised and validated. For every address in the
Address index
Address Management (Update, Registration, Archiving): processes are needed to manage
address registration and updating routines. They populate time and status values.
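The standardisation and validation steps can be sketched as a small pipeline. This is an illustration under assumptions: the input layout ("settlement, unit type and name, number"), the abbreviation table, the function names and the sample index are all hypothetical. The number of matches is used as the confidence indicator, as suggested above, and the level at which matching fails is reported.

```python
import re

# Hypothetical abbreviation table used by the repairing step.
ABBREVIATIONS = {"str": "street", "blvd": "boulevard", "sq": "square"}
LEVELS = ("settlement", "unit_type", "unit_name", "locator")


def parse(text):
    """Parsing: split 'settlement, unit-type unit-name, number' into components."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3:
        return None
    settlement, unit, locator = parts
    unit_type, _, unit_name = unit.partition(" ")
    return {"settlement": settlement, "unit_type": unit_type,
            "unit_name": unit_name, "locator": locator}


def clean(components):
    """Cleaning: drop stray symbols, collapse whitespace, lower-case."""
    return {key: re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", value)).strip().lower()
            for key, value in components.items()}


def repair(components):
    """Repairing: expand an abbreviated localisation unit type."""
    fixed = dict(components)
    fixed["unit_type"] = ABBREVIATIONS.get(fixed["unit_type"], fixed["unit_type"])
    return fixed


def validate(address, index):
    """Validation: narrow the index level by level and report where matching fails."""
    candidates = list(index)
    for level in LEVELS:
        narrowed = [entry for entry in candidates if entry[level] == address[level]]
        if not narrowed:
            return {"matches": [], "match_count": 0, "failed_level": level}
        candidates = narrowed
    return {"matches": candidates, "match_count": len(candidates), "failed_level": None}


# Made-up address index with two units of different type sharing one name.
ADDRESS_INDEX = [
    {"address_id": 1, "settlement": "sampletown", "unit_type": "street",
     "unit_name": "main", "locator": "12"},
    {"address_id": 2, "settlement": "sampletown", "unit_type": "square",
     "unit_name": "main", "locator": "12"},
]

standardised = repair(clean(parse("Sampletown,  str. Main , 12")))
outcome = validate(standardised, ADDRESS_INDEX)
```

Because the unit type is part of the comparison, the street and the square with the same name are told apart and exactly one match is returned; a `match_count` above 1 would signal lower confidence.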
Fig 7. “Learning” addresses from map features
Fig 8. Allocating addresses from Address Index on a building point or grid cell (from address
list to map)
Geocoding data by address is the ability to “travel” the path from the address description in data
records to the geographic coordinates on map. Preferably, the point represented by these
geographic coordinates should be the most accurate representation of the physical location of
the addressable object. The most precise location is by default recorded in position attribute in
records within address reference dataset. Every address in the Address Index should be
positioned on a building object or if not, on a grid cell.
When allocating addresses from the address index, the Allocation process populates the position
attribute with the coordinates of a characteristic point from one of the following accuracy
levels: [Unit within building], building entrance, building, parcel, street or settlement.
Ideally, these coordinates fall within the building footprint. When this is not possible,
the allocation passes to the next, coarser accuracy level and returns the
geocode of the most accurate level that is available.
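The fallback through accuracy levels can be expressed as a simple ordered search. This is a minimal sketch: the level labels, the layout of `reference_points` and the sample coordinates are assumptions for illustration.

```python
# Accuracy levels from most to least precise, following the order given in the text.
ACCURACY_LEVELS = ("unit", "entrance", "building", "parcel", "street", "settlement")


def allocate(address_id, reference_points):
    """Return the position from the most precise level that has a point for the address.

    reference_points: dict mapping level -> {address_id: (x, y)}
    """
    for level in ACCURACY_LEVELS:
        point = reference_points.get(level, {}).get(address_id)
        if point is not None:
            return {"position": point, "position_level": level}
    return None  # no level could provide a position


# Made-up reference points.
reference_points = {
    "building": {1: (10.0, 20.0)},
    "street": {1: (11.0, 21.0), 2: (30.0, 40.0)},
    "settlement": {3: (0.0, 0.0)},
}
first = allocate(1, reference_points)
second = allocate(2, reference_points)
```

Recording `position_level` together with the coordinates keeps the accuracy of each geocode visible to later users of the data.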
Every address should be geocoded at least to a 1 km ETRS89 grid cell. This will
need some manual processing and, as a final step, imputation of the gridID in the address record.
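Deriving a 1 km grid-cell identifier from a point can be sketched as follows. The sketch assumes that coordinates are already projected to ETRS89-LAEA (EPSG:3035) in metres and that the identifier follows the INSPIRE-style cell code built from the lower-left corner of the cell; the report does not state which identifier format NSI uses, and the function name is hypothetical.

```python
import math


def grid_id_1km(easting_m, northing_m):
    """Return an INSPIRE-style 1 km cell code for a point in ETRS89-LAEA metres.

    The code is built from the lower-left corner of the cell, in kilometres.
    """
    col = math.floor(easting_m / 1000)
    row = math.floor(northing_m / 1000)
    return f"1kmN{row}E{col}"


# A point anywhere inside the cell whose lower-left corner is
# X = 4695000 m, Y = 2599000 m gets the same code.
inside = grid_id_1km(4695500.0, 2599999.9)
corner = grid_id_1km(4695000.0, 2599000.0)
```

Since the code depends only on the coordinates, it can be imputed for any address that has a position at any accuracy level.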
2.4. Geocodes in statistical unit records
To provide quality address data for high-precision geocodes, it matters how address data is
entered and stored. In statistical records, address data is relatively well structured and
formatted, and standard nomenclatures are used.
Measures are recommended to ensure the accuracy of address
components in statistical records and to reduce inefficiency and duplication of work on address
management.
Quality requirements on completeness or correct spelling of address components can be
met by consistent collection mechanisms that prevent spelling errors and the entry of
incorrect values. If the collection interfaces standardise address data and check its existence in
the reference dataset, they provide point-of-entry validation.
Fig. 9: Web interface for collection of addresses for Statistical Units in SBR.
Point-of-entry address validation against the address reference dataset is recommended
for computer and internet based capture of address information. During the
e-census collection phase, such validation within online survey forms could provide a mechanism
to link the household to the frame of cadastral units/features, returning a valid geocode. Capture
of the address and location should also be considered for cases where the address is not
recognised by the address identification system. Addresses that are standardised and formatted
at the time of entry into a statistical dataset provide high quality geocodes.
Central address services would help ensure consistent standards and help control quality.
Fig. 10: Provide address services based on the established address reference dataset for
statistical registers and datasets.
3. Organisational setup - key findings and recommendations for activities.
Implementation of the foundational level of statistical spatial framework requires changes at
organisational level and that brings financial and operational challenges to the organisation.
Three points were found clearly important:
Single standard for addresses
Point-of-entry validation of address data
Development of central address processing system
Clear guidelines are needed both for staff that will maintain address reference dataset
and for users that will use address services.
3.1. Technical conditions
Automated corporate services based on one central reference address dataset are the preferred
solution to establish consistent address maintenance in statistical data and provide quality
geocoding.
NSI needs centralised management and storage of collected geospatial data. This is one of
the major steps and resources needed to enable an enterprise-wide organisation of work and
access to data and metadata.
Manual tasks take time and effort and increase the risk for errors. Automation provides
efficiency and consistency and can significantly improve geocoding workflows.
A user friendly and easy to use web application is needed for address and building data analysis
and management. It should provide tools for in-house and mobile capturing of building data.
The application should enable staff from central and regional statistical offices,
staff from local authorities, and staff hired for pre-enumeration tasks to work together on building
and address data. Furthermore, styles for visual indication of missing or incorrect data could be
applied. For instance, streets, buildings and address characteristic points with the same street
name could be highlighted in the same colour.
3.2 Proposal for a sequence of steps to establishing a point-based address infrastructure
for geocoding
The capability for address geocoding needs to be in place by the time of census enumeration.
Implementation can be planned stepwise, so the following priorities and sequence of tasks
for establishing the different levels of geocoding are proposed:
1. Enable Address Management and Address Identification.
Design spatial address reference dataset, design process metadata at record level.
Create guiding materials for address data management and addressing issues.
Implement.
Populate dataset by Registering selected initial collection of addresses (Census
2011)
Update to current
Standardise cadastral addresses and Validate/Identify
2. Enable Address Allocation.
Establish Geocoding Level of Settlements.
Allocate addresses on Settlement points and Analyze, Repair.
Establish Geocoding Level of Streets.
Allocate addresses on Streets points and Analyze, Repair.
Establish Geocoding Level of Buildings
Allocate addresses on Building points and Analyze, Repair.
[Establish Geocoding Level of Units within Building]
Allocate addresses on the level of Unit within building points and Analyze,
Repair.
3. Enable Address Geocoding.
Geocode Address index to ETRS89Grid_1km
4. Enable Point-of-entry Validation for computer and internet based capture of
address information to provide high precision address geocodes in statistical records.
3.3 Institutional arrangement required to conduct and support address geocoding framework.
Addresses from Population Register and Nomenclature of permanent and current addresses in
Bulgaria are received on a monthly basis from DG CRAS in an agreed content and in a defined
csv format in the framework of SPR. No additional action is needed.
CA provides copies of cadastral map and cadastral register to NSI, excluding the information
of ownership of immovable properties. The address data is attributed to spatial features and
address components are provided in separate fields. The copies are currently acquired once a year
and are received in a structured geodatabase, with metadata for the
content, in a predefined coordinate reference system based on the ETRS89 datum.
Two bilateral agreements were signed between NSI and CA at the end of October 2017 that
allow general access to the Cadastral information system and the Cartography fund.
There is an option to access the cadastral map as a standard web map service for a small annual
maintenance fee. It is recommended that NSI use this map service, as it contains the most up
to date information, once the technical conditions for that are in place. NSI should work on
automating the updates and streamlining the collection and
processing of cadastral data, as the volume of data is continuously increasing.
For achieving fully geocoded census in two years, it is essential to continue close collaboration
with CA and municipalities and to stay focused on the selected development priorities.
For maintaining and improving the quality of address information between cooperating
organizations the following opportunities for working groups were identified:
Clearing coding issues on settlement level
Clearing coding issues on street level
Conclusions
The main goal of the project was to identify and propose an applicable solution for a point-based
foundation for geocoding by using available and trusted address and location data maintained
in statistical and administrative datasets.
Taking into account the initial level of maturity of the national spatial data infrastructure,
finding a consistent path for high-resolution geocoding of statistical data was a real challenge.
The project benefited from the valuable guidance of GEOSTAT reports/outcomes on how to
set up and use a point-based foundation for statistics and the dozens of good practices.
Based on identified needs concrete initiatives for geocoding and address collection during the
Census 2021 were proposed and included in the Census Programme. Additional financial
resources were allocated by the central budget to Census 2021 for implementation. The role of
CA as a partner institution of NSI in this process was recognised as essential and highlighted
in the Census 2021 act and the Programme.
Work on the project improved understanding and expanded knowledge
of administrative data from cadastral registers within NSI. Additional use cases were identified,
and NSI can initiate activities to use the data not only as reference infrastructure but also as a
source for calculating statistics, such as land use statistics, land area statistics, housing
statistics and more. Integrating the cadastral identifier in the statistical framework can
facilitate linking more administrative data sources.
References
EFGS/GEOSTAT 2 (2017). A Point-based Foundation for Statistics - Final report from the
GEOSTAT 2 project.
EFGS/GEOSTAT 3 National report (v0.96_Draft). Implementing the Statistical Geospatial
Framework at Statistics Sweden.
EFGS/GEOSTAT 3
UN-GGIM: Europe (2017). Core Spatial Data Theme Address. Recommendation for Content.
Version 1.0 2017-11-10.
UN-GGIM: Europe (2017). Core Spatial Data Theme Buildings. Recommendation for Content.
Version 1.0 - 2018-06-01.
Annex
Summary table of the analysis and conclusions on selected data sources to establish a point-based infrastructure
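Table columns (in order): Quality factor | Census 2011 | Statistical Population Register | Statistical Business Register | Newly built residential buildings/dwellings | Cadastral map and cadastral registers | NCPCA | Local taxes
Statistical sources: Census 2011, Statistical Population Register, Statistical Business Register, Newly built residential buildings/dwellings
Administrative sources: Cadastral map and cadastral registers, NCPCA, Local taxes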
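Quality factor: Relevant dataset units (with attributed addresses)
Census 2011: Residential buildings, dwellings.
Statistical Population Register: Population.
Statistical Business Register: Enterprises, local units.
Newly built residential buildings/dwellings: Newly built and destroyed residential buildings and the dwellings in residential buildings.
Cadastral map and cadastral registers: Land parcels, buildings, unit of property within building.
NCPCA: Addresses for population registration, localisation units (thoroughfares).
Local taxes: Immovable properties (real estates) declared for taxation.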
Quality factor: Relevance of location represented by address
Census 2011: Addresses describe physical location of residential buildings and dwellings.
Statistical Population Register: Addresses describe physical location of current residence of population.
Statistical Business Register: Addresses describe the postal address (correspondence address) for the Enterprise. Addresses describe physical location of the place of activity for Local Units.
Newly built residential buildings/dwellings: Addresses describe physical location of newly built residential buildings.
Cadastral map and cadastral registers: Addresses describe physical location of units.
NCPCA: List of addresses currently approved for population registration, nomenclature of localisation units.
Local taxes: Addresses of physical location of real estates declared by the owners for property taxation.
Quality factor: Relevance of data source
Census 2011: Exhaustive statistical survey on population and housing as of 1st of February 2011. The dataset is the result of a one-off data collection.
Statistical Population Register: Microdata in statistical register, maintained by information system, source for demography statistics.
Statistical Business Register: Statistical register, maintained by information system. Collecting and integrating data from several administrative registers.
Newly built residential buildings/dwellings: Exhaustive statistical survey. The information on newly built and destroyed residential buildings is obtained quarterly through regular reports from all local administrations. For the period 2004-2007 annual data, since 2008 quarterly and annual data.
Cadastral map and cadastral registers: Basic data on the location, boundaries and dimensions of immovable properties (real estate) within the territory of the country, submitted and kept up to date as well as in accordance with the law. Maintained for administrative purposes.
NCPCA: Nomenclature maintained for administrative purposes in the framework of population registration.
Local taxes: Local/municipality registers of declared immovable properties for which local taxes are collected. Maintained for administrative purposes.
Quality factor: Relevance of provider
Census 2011: Internal provider, authoritative, trusted. Follows statistical production standards.
Statistical Population Register: Internal provider, authoritative, trusted. Follows statistical production standards.
Statistical Business Register: Internal provider, authoritative, trusted. Follows statistical production standards.
Newly built residential buildings/dwellings: Internal provider, authoritative, trusted. Follows statistical production standards.
Cadastral map and cadastral registers: The cadastral authority is the Agency of Geodesy, Cartography and Cadastre of the Ministry of Regional Development and Urban Development. External provider, authoritative, trusted.
NCPCA: DG CRAS, Civil Registration and Administrative Services of the Ministry of Regional Development and Urban Development. External provider, authoritative, trusted.
Local taxes: Local authorities.
Quality factor: Legal basis
Census 2011: Statistical act, Census act.
Statistical Population Register: Statistical act, EU and EC Regulations setting up common basic standards in the area of demography statistics, Civil Registration Act.
Statistical Business Register: Statistical act, EU Regulations in legal framework for business registers for statistical purposes, Guidelines on Statistical Business Registers.
Newly built residential buildings/dwellings: Statistical act, National Statistical Programme.
Cadastral map and cadastral registers: Cadaster and Property Register Act.
NCPCA: Civil Registration Act defines address content and format for address registration.
Local taxes: Local Taxes and Fees Act.
Standard identifiers:
No standard identifiers available for housing units, only census-system specific unique identifiers. Cannot provide links to administrative data for buildings/dwellings. Not suitable for use in the point-based infrastructure.
National: Personal Identity Number (PIN)
Unique Identification Code (UIC) of business entities
Regulation Plan Identifier of the property from the territory regulation plans and/or cadastral ID of the building.
Standardised by ordinance national cadastral IDs: for land properties, for buildings and for units within buildings; Not standard/Internal. Identifiers suitable for infrastructure.
Identifier of Localisation unit within settlement.
Identifier of declaration document; Regulation Plan Identifier of the property
Address resolution level/highest spatial level available:
Address resolution to the level of property location (including within building locations).
Address resolution to the level of property location (including within building locations).
For Enterprises: postal address to the highest available level, location within building. For Local Units: addresses provide accuracy to the level of settlement. A higher resolution address for Local Units is needed.
Address resolution to the level of property location/street number.
Geodetic accuracy standards for surveyed objects. No accuracy standard for related addresses of the objects, but address description formatting is set.
Resolution of addresses for citizen registration included in Classificatory of addresses. Address is to the level of property location/street number.
Address resolution to the level of property location.
Availability of spatial data representation:
No spatial data available, only address description.
No spatial data in the information source, only address description.
No spatial data available, only address description.
No spatial data available, only address description.
Vector polygons of landed properties, vector polygons of building footprints, vector polygons of schemes of property units within buildings
No spatial data available, only address description.
No spatial data available, only address description.
Address collection and storage format:
Rules for control and validation at the time of electronic collection and data entry. Formatting and standardization of address components.
Updated monthly from demographic events. System rules on validation and control. Address attributes stored in separate fields.
Collected by standardised form, stored in separate fields in the database.
Electronic form filled quarterly with information from municipality authorities. Stored in one field, comma delimited.
Addresses collected in a special semicolon-delimited format.
Address elements stored in separate fields and can be provided as separate address attributes. Addresses are defined by local authorities and reported to the regional structures of DG CRAS on a daily basis.
In declaration document, formatted text. Usually one-field text. Technical implementations differ by municipality.
Maintenance of updates of addresses, documentation of the updates and time stamps:
No updates of the dataset. Addresses are valid as of 1.02.2011
Monthly updates received from the Unified System for Civil Registration and Administrative Service of Population with information on demographic events (incl. migration, e.g. change of address of current residence).
Updated and documented at the time when the event occurs in the source administrative registers.
Addresses in the dataset are not updated. Time stamp relates to the date when the building receives its completion certificate.
Address data in cadastral records is updated upon request of the property owner. No updates of address attributes are maintained currently.
Changes reported from municipalities on a daily basis and consolidated in the national database. Period of validity and status are documented on unit level.
Declaration by the property owner. Correcting declaration to change information.
Use of address attributes coding and standardisation:
Settlement and administrative units national coding, Localisation Unit national coding, NUTS coding, controls in formatting of designators
Settlement and administrative units national coding, Localisation Unit national coding, NUTS coding, controls in formatting of designators
Settlement and administrative units national coding, Localisation Unit national coding, NUTS coding, Postal codes.
Settlement and administrative units national coding, NUTS coding.
Settlement and administrative units national coding, Localisation unit national coding, Postal codes, Registered geographic name coding.
Settlement and administrative units national coding, Localisation unit national coding.
No coding applied to address attributes. Settlement code available in the declaration document.
Consistency of address coding:
Consistent with national coding systems as of 02.2011
Consistent with current national coding systems.
Consistent with national administrative-territorial coding. Street codes not updated on every change; frequency for street code updates not set.
High percentage of unfilled street/street number attributes (addresses may not have been assigned by the local authority at the time of data collection).
Coding of administrative units is consistent. Some inconsistencies on the level of settlement were found. Localisation units are barely coded.
Provides national coding systems.
Consistency of settlement coding not checked.
Completeness of address attributes:
Around 8% of address descriptions are not geocodable (are not “true addresses”) due to missing address components.
Around 7.9% of address descriptions in the dataset are not “true addresses” due to missing street name and/or street number.
For Enterprises around 10% and for Local Units around 4% of address descriptions are not geocodable to the point.
Around 40% of units are not geocodable by address description. Cadastral ID or Regulation Plan Identifier of the property could be used for geocoding.
It was difficult to assess accuracy of addresses in the part of street name because co. Completeness of localisation unit is difficult to assess
All addresses listed in the classificatory of addresses are “true addresses”.
Address accuracy not checked for completeness of address components.
Geographic coverage (as of beginning of 2018):
National.
National.
National.
National.
39% of the territory of the country was covered by digital cadaster at the beginning of 2018 and 73% at the beginning of 2019; coverage above 93% is estimated for 2020.
All settlements registered in the country (5256) define address areas. In 2003 of these settlements no localisation units are defined, so no street addresses are formed.
All properties on the territory of municipalities within the country, excluding properties with an assessed value of less than 1680 lv.
Complexity of pre-processing, standardisation:
Easy to obtain and check the data. Easy to populate corresponding standard Address reference dataset structure.
Easy to
No parsing needed. For Local Units in SBR, more detailed addresses need to be collected from the corresponding legal units in the administrative register.
Cadastral IDs provide direct link to cadastral map and corresponding building object. Location of property by Regulation plan ID is an option where no digital cadaster is available.
CR requires time-consuming standardization of Localisation units. Clearing coding errors in settlement coding requires joint work with experts of the provider.
Easy to obtain and populate standard Address reference dataset structure.
Addresses of properties need to be parsed, normalized and standardised before use. Different formats of address. Highly resource-consuming task.
Availability and costs, conditions of access:
Internal source, information for addresses is available for use and processing and can be provided in the format needed.
Internal source, information on addresses is available for use and processing. Can be provided in XLS/XLSX, CSV, SAV or other formats.
Internal source, information for addresses is available for use and processing. Address attributes together with UIC can be provided in dbf, txt and xls formats.
Internal source, information for addresses is available for use and processing in xls format.
Full or partial copy of cadastral map in vector CAD format. Other formats by agreement. WMS available for annual fee. On-line access at http://kais.cadastre.bg for viewing, querying and downloading of parts of map.
Received monthly in an agreed CSV format, together with the information on demographic events (births, deaths, marriages, divorces, migration) in the IS Demography.
Agreements are needed with the municipalities.
Relevance for establishing Address reference dataset for the Census 2021:
Suitable for initial setup of Address reference dataset.
Suitable for updating addresses of dwellings from individual population records.
Suitable to extend the scope of address information: add new categories of addresses.
Appropriate for updates where cadastral information is not available. It is recommended that the cadastral ID be collected as an obligatory item if the region is covered by digital cadaster; otherwise the Regulation Plan ID of the property should be collected as obligatory.
Appropriate to provide position for characteristic point for building addresses. Suitable to extend the scope of address information. High quality location data: geodetic precision and accuracy. Suitable for establishing a point infrastructure.
Suitable for standardisation of address components. Regularly updated. Provides period of validity of address components and is suitable for address reference update.
Appropriate to verify address reference dataset locally, by local authorities.
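The table recommends collecting the cadastral ID where digital cadaster is available and the Regulation Plan ID otherwise, with the address description as the remaining option. A minimal Python sketch of that order of preference might look as follows. The record layout, field names and function name are hypothetical and are not defined in the report; the identifier values in the usage example are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BuildingRecord:
    # Hypothetical field names; the report does not prescribe a record layout.
    cadastral_id: Optional[str] = None
    regulation_plan_id: Optional[str] = None
    address_text: Optional[str] = None


def choose_geocoding_key(
    record: BuildingRecord, digital_cadastre: bool
) -> Tuple[str, Optional[str]]:
    """Return (method, key) for locating a record, in order of preference."""
    # 1. Cadastral ID, usable only where the area is covered by digital cadaster.
    if digital_cadastre and record.cadastral_id:
        return ("cadastral_id", record.cadastral_id)
    # 2. Regulation Plan ID of the property as the fallback identifier.
    if record.regulation_plan_id:
        return ("regulation_plan_id", record.regulation_plan_id)
    # 3. Address description, if one was recorded at all.
    if record.address_text and record.address_text.strip():
        return ("address_text", record.address_text.strip())
    # Nothing usable: the unit cannot be geocoded from this record.
    return ("unresolved", None)
```

For example, `choose_geocoding_key(BuildingRecord(regulation_plan_id="RP-1"), digital_cadastre=False)` selects the Regulation Plan ID, while a record with no identifiers and an empty address is reported as unresolved so that it can be counted among the non-geocodable units.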