General introduction to Spatial Data Infrastructures

01 | SDI Architecture and SDI components


Elaboration of Standards, Norms and Procedures in the Acquisition and Dissemination of Geospatial Information


SADL KU Leuven

WELCOME!

With the following slides and interactive material, you will discover Spatial Data Infrastructures (SDI), their components and their benefits through multiple examples and exercises.


You can navigate through the course by pressing the navigation arrows at the bottom of each slide or by using the arrow keys on your keyboard. Move horizontally (← →) to view each theme and vertically (↑↓) to view extra recommended information.


SDI Architecture and SDI components

Content

  1. Geographic Information for policy making
  2. Combining Geographical Data
  3. SDI definitions and approaches
  4. SDI Components
  5. Data specifications
  6. Services
  7. Standards
  8. Metadata
  9. Geoportals
  10. Concepts Exercise

01 | Geographic Information for policy making


    EUROSION


  1. In Context
  2. Applied Examples
  3. Combining Datasets

01 | Geographic Information for policy making


Geographic Information is needed in many policy domains

Many stakeholders and organizations produce spatial datasets, but there are problems:

  • Lack of reference and authentic data
  • Spatial data are difficult to obtain
    • Not always accessible
    • Cumbersome to obtain and process
    • Often expensive
  • Gaps in availability of spatial data and duplication of efforts
    • Sometimes low quality
  • Spatial data are not harmonized
    • Difficult to use across entities
    • Impossible to process or combine with data from other countries
  • Spatial data are difficult to interpret and understand
    • Often not documented

01 | EUROSION


“To provide the European Commission with a package of recommendations on policy and management measures to address coastal erosion in the EU. These recommendations should be based on a thorough assessment of the state of coastline and of the response options available at each level of administration.”


01 | 01 Applied Examples

Figure: Example of Ajaccio Bay

01 | 02 Combining Datasets


02 | Combining Geographical Data

  1. Common Issues
  2. Summary
  3. Exercise

02 | Common Issues:

  • A large variety of formats exists
  • Many geographical gaps still remain
  • Reference systems are not harmonized
  • Many data sources are not consistent
  • Scales are not compatible
  • Data are not interoperable
  • Costs and access restrictions

02 | Issue No. 1 - A large variety of formats exists

  • Satellite images
  • Maps
  • Aerial Photographs
  • Diagrams
  • Statistics
  • Reports
  • Databases

    02 | Issue No. 2 – Many geographical gaps still remain

    Geological data at scale 1:50,000
    (source: BRGM, France)

    Need to identify the gaps and make priorities to bridge them

    02 | Issue No. 3 - Reference systems are not harmonized

    Need to define a common terrestrial reference system for data production and processing

    02 | Issue No. 4 – Many data sources are not consistent

    Need to build pan-European “seamless” data with standard specifications

    Sources 1:

    Coastline : SABE (EuroGeographics)
    Bathymetry : TCIFMS (SHOM)
    Topography : BDTOPO (IGN)

    Sources 2:

    Coastline : SABE (EuroGeographics)
    Bathymetry : GEBCO (BODC)
    Topography : MONA PRO

    02 | Issue No. 5 - Scales are not compatible

    Need to adopt a common level of perception and representation of data

    02 | Issue No. 6 – Data not interoperable

    Figure: difference between CORINE Land Cover 1990 and the SABE Coastline, shown in three classes:

    • 0 m < Difference < 50 m
    • 50 m < Difference < 200 m
    • Difference > 200 m

    02 | Issue No. 7 – Costs and access restrictions

    • Most existing datasets are “copyrighted”: you do not buy information itself, but a right to use it (“license”)

    • Dissemination of end-products is restricted (sometimes, end-products have to be “degraded”)

    • Quality “labels” are not commonly adopted, which leaves uncertainty about the products

    02 | Issue No. 7 – Costs and access restrictions

    EUROSION database = 2 million euros



    • 26% acquisition of licensed data (e.g. Elevation)

    • 17% update of existing data (e.g. Coastal Erosion)

    • 33% production of missing data (e.g. Hydrodynamics)

    • 24% format conversion, integration, and quality control
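
As a quick check of the figures above, the percentages can be converted into euro amounts. This is only an arithmetic sketch based on the 2 million euro total and the four shares quoted on the slide; the variable names are illustrative.

```python
# Budget shares quoted on the slide for the EUROSION database (total: 2 million euros).
TOTAL_EUR = 2_000_000

shares_percent = {
    "acquisition of licensed data": 26,
    "update of existing data": 17,
    "production of missing data": 33,
    "format conversion, integration and quality control": 24,
}

# The four shares should account for the whole budget.
assert sum(shares_percent.values()) == 100

# Integer arithmetic avoids floating-point rounding in the euro amounts.
costs_eur = {item: TOTAL_EUR * pct // 100 for item, pct in shares_percent.items()}

for item, cost in costs_eur.items():
    print(f"{item}: {cost:,} EUR")
```

Licensed data plus conversion, integration and quality control account for half of the budget (26% + 24%).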

    02 | Summary

    The absence of a European spatial data infrastructure results in:



    • Higher investment costs (2 to 3 times)

    • Delayed implementation (8 to 10 months)

    • Uncertain quality

    • Dissemination constraints


    There is a need to develop an infrastructure that provides direct access to harmonized (interoperable) data and that can be shared among all those who need it.

    03 | SDI definitions and approaches

    1. Common definitions and approaches
    2. GIS and SDI

    03 | 01 SDI definitions and approaches

    Many attempts to define and conceptualize SDI


    Focus on the components (spatial data, technology, policies, standards and people)
    Recognizing its dynamic nature (evolution)
    Identifying (data) sharing as a key issue
    Describing different views of SDI hierarchy

    • Rajabifard
    • Masser
    • Chan
    • GSDI
    • ...

    03 | 02 SDI definitions and approaches


    03 | 03 SDI definitions and approaches


      • Few define SDI as a (dynamic) network (Tulloch & Harvey, 2007)
      • Few operational set-ups to analyse SDI networks (Vandenbroucke et al., 2009)

    03 | 04 SDI definitions and approaches



    03 | 05 GIS and SDI


    04 | SDI Components

    1. Principal components
    2. Data specifications

    04 | 01 SDI Components



    04 | 01 SDI Components


    Laws or voluntary agreements governing the collaboration between producers and/or users of geodata:

    • An organizational structure and mandates: who does what?
    • Mechanisms for financing the SDI and its coordination

    Laws/agreements about pricing and licensing of geodatasets:

    • Data free of charge
    • Data at distribution cost
    • Data at full cost
    • Data at market price
    • Exchange of data

    Laws/agreements about how to deal with:

    • Intellectual property rights on datasets and services
    • Confidentiality and privacy issues (cf. Google Street View)
    • Access to public sector information (e.g. the EU Public Sector Information Directive)
    • Homeland security
    • ...

    04 | 04 SDI Components

    04 | 05 SDI Components


    REFERENCE GEODATASETS:

    Required in almost every application as the geometric foundation

    • "Stable" topographic entities (roads, water courses, administrative boundaries)
    • Orthophotographs
    • "Stable" terrain characteristics (elevation, ...)

    04 | 06 SDI Components


    REFERENCE (CORE) THEMATIC GEODATASETS:

    Useful for several thematic domains

    • Land cover and land use
    • Spatial destination
    • Soils
    • Geology

    04 | 06 SDI Components


    Datasets are standardized and kept in repositories, ‘somewhere’ in the infrastructure/on the network

    • ‘Big Data’, but highly structured and standardized
    • ‘Cloud’ but highly structured

    01 | 05 Data specifications

    05 | 01 Data specifications


    Application schema
    (defines the model)
    Data harmonization
    • Getting your data in the reference (standard) model
    • Data transformations might be necessary (ETL)
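
The harmonization step can be sketched as a small transform that renames attributes to the reference model and converts units. This is a minimal illustration, not a real ETL tool: the source field names (`NAME_LOC`, `AREA_HA`, `LC_CODE`, `INTERNAL_ID`), the target model and the sample record are all hypothetical.

```python
# Hypothetical mapping from a provider's attribute names to a shared target model.
FIELD_MAP = {
    "NAME_LOC": "name",
    "AREA_HA": "area_m2",
    "LC_CODE": "land_cover_class",
}


def harmonize(record: dict) -> dict:
    """Rename attributes to the target model and convert hectares to square metres."""
    out = {}
    for source_field, value in record.items():
        target_field = FIELD_MAP.get(source_field)
        if target_field is None:
            continue  # attribute is not part of the target model: drop it
        if source_field == "AREA_HA":
            value = value * 10_000  # 1 ha = 10 000 m2
        out[target_field] = value
    return out


# Made-up source record, for illustration only.
source_record = {"NAME_LOC": "Test dune", "AREA_HA": 2.5, "LC_CODE": "dune", "INTERNAL_ID": 17}
harmonized = harmonize(source_record)
print(harmonized)
```

Real transformations are usually richer (geometry reprojection, code list mapping), but the principle of mapping every source attribute onto the standard model is the same.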

    05 | 02 Data specifications


    It is about more than the dataset alone


    This is reflected in the geospatial standards!

    06 | Services

    1. Services for SDIs
    2. View services
    3. Download services
    4. Transformation services
    5. Catalogue Service for the Web (CSW)
    6. Rights management services

    06 | 01 Technological components of SDI: services

    06 | 02 Technological components of SDI: services

    06 | 03 Services for SDIs


    MOST IMPORTANT SERVICES:

    • View services (WMS)
      • View data (as image)
    • Download services (features WFS, coverages WCS)
    • Transformation services (WPS, processing)
      • Often in the background (e.g. coordinate transformation)
    • Discovery services (CSW, Catalogue)
      • Provider publishes data
      • User finds the data and access services

    06 | 04 View Services: Web map services WMS

    • As its name implies, a WMS is a service that provides maps
    • Data only leaves the server as an image
    • The map is rendered on the server, so styling and presentation are chosen by the data provider
    • Limited client interactivity with the map
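
To make the idea of "data leaves the server as an image" concrete, here is a sketch of how a client could build a WMS 1.3.0 GetMap request. The endpoint URL, layer name and bounding box are hypothetical; only the parameter names come from the WMS standard.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
WMS_ENDPOINT = "https://example.org/geoserver/wms"


def getmap_url(layer, bbox, width, height, crs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap request using key-value parameters."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks for the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{WMS_ENDPOINT}?{urlencode(params)}"


# In WMS 1.3.0 the axis order for EPSG:4326 is latitude, longitude.
# The layer name "coastline" and the extent are made up.
url = getmap_url("coastline", (41.8, 8.5, 42.0, 8.9), 800, 400)
print(url)
```

The response to such a request is a rendered picture (here a PNG), which is why the client has little control over styling.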

    06 | 05 View Services: Web map services WMS

    Note: the source data from which the image is generated does not need to be an image; it can be a Shapefile, a PostGIS database, an Oracle Spatial database, …


    06 | 06 Download Services: Web feature services WFS

    • Provides map data (GML) to a (web) client
    • The client chooses style & presentational details
    • Geospatial features
    • Selection query for features in the request
    • Requesting predefined datasets
    • Optional: Transactional Web Feature Service (WFS-T) enables the creation, deletion, and updating of features
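
A selection query in a WFS request could look like the sketch below, which builds a WFS 2.0.0 GetFeature request with a bounding box and a feature limit. The endpoint and the feature type name `erosion:coastline` are hypothetical, and the coordinate order of the bounding box depends on the CRS used by the server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
WFS_ENDPOINT = "https://example.org/geoserver/wfs"


def getfeature_url(type_name, bbox=None, count=None):
    """Build a WFS 2.0.0 GetFeature request using key-value parameters."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
    }
    if bbox is not None:
        # Spatial selection: only features intersecting this box are returned.
        params["BBOX"] = ",".join(str(v) for v in bbox)
    if count is not None:
        # Upper limit on the number of features in the response.
        params["COUNT"] = count
    return f"{WFS_ENDPOINT}?{urlencode(params)}"


# Request a whole (predefined) dataset, or only a selection of it.
full_dataset_url = getfeature_url("erosion:coastline")
selection_url = getfeature_url("erosion:coastline", bbox=(41.8, 8.5, 42.0, 8.9), count=10)
print(selection_url)
```

Unlike a WMS, the response contains the features themselves (for example as GML), so the client decides how to style them.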

    06 | 07 Download Services: Web feature services WFS

    06 | 08 Transformation services: Web processing services WPS

    • Requests the execution of a spatial calculation
    • One of the benefits of WPS is its ability to chain processes
    • A WPS process can use as its input the output of another (previous) process
    • Many complex functions can thus be combined into a single powerful request
    • Example: transform coordinates from CRS in database to CRS requested by user
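
The chaining idea can be sketched locally with plain Python functions standing in for WPS processes; no WPS request is involved here. The sketch assumes the database stores longitude/latitude in degrees (EPSG:4326) and the user asks for spherical Web Mercator (EPSG:3857); the process names and the `chain` helper are illustrative.

```python
import math
from functools import reduce

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def to_web_mercator(points):
    """Process 1: transform (lon, lat) in degrees to Web Mercator (x, y) in metres."""
    result = []
    for lon, lat in points:
        x = EARTH_RADIUS_M * math.radians(lon)
        y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
        result.append((x, y))
    return result


def bounding_box(points):
    """Process 2: compute (min_x, min_y, max_x, max_y) of a list of points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def chain(*processes):
    """Return a process that feeds the output of each process into the next one."""
    return lambda data: reduce(lambda value, process: process(value), processes, data)


# The output of the coordinate transformation is the input of the bounding-box process.
transform_then_bbox = chain(to_web_mercator, bounding_box)
bbox = transform_then_bbox([(0.0, 0.0), (180.0, 0.0)])
print(bbox)
```

In a real SDI each function would be a remote process, and the chain would be expressed in the execute request rather than in client code.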

    06 | 09 CSW – Catalogue Service for the Web


    • A catalogue server publishes collections of descriptive information (metadata) about geospatial data
    • Defines an interface to search for metadata (so the client can ‘discover’ geospatial data)
    • Transaction: insert, update & delete of metadata (to publish metadata)
    • Harvest: pull existing metadata from other servers
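
The roles listed above can be illustrated with a toy in-memory catalogue. This is not the CSW interface itself, only a sketch of the publish, search and harvest behaviour; the class, its method names and the sample records are assumptions made for illustration.

```python
class Catalogue:
    """Toy in-memory catalogue of metadata records (not a CSW implementation)."""

    def __init__(self):
        self.records = {}  # identifier -> metadata record (dict)

    def insert(self, record):
        """Publish a record; inserting an existing identifier acts as an update."""
        self.records[record["identifier"]] = record

    def delete(self, identifier):
        """Remove a record if it exists."""
        self.records.pop(identifier, None)

    def search(self, text):
        """Return records whose title or abstract contains the text (case-insensitive)."""
        needle = text.lower()
        return [
            r for r in self.records.values()
            if needle in r["title"].lower() or needle in r.get("abstract", "").lower()
        ]

    def harvest(self, other):
        """Pull records from another catalogue, keeping local records on identifier clashes."""
        for identifier, record in other.records.items():
            self.records.setdefault(identifier, record)


# Made-up records: a regional catalogue is harvested by a national one.
regional = Catalogue()
regional.insert({"identifier": "r-1", "title": "Coastline 2004", "abstract": "Coastal erosion survey"})
regional.insert({"identifier": "r-2", "title": "Soil map"})

national = Catalogue()
national.harvest(regional)
hits = national.search("coast")
print([r["identifier"] for r in hits])
```

Harvesting is what lets a single national or European catalogue answer queries about datasets that are documented elsewhere.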

    06 | 10 Services

    Service chaining: different services combined to produce the output


    06 | 11 Services


    Publish – Find – Agree - Bind

    06 | 12 Rights management services


    The ‘agree’ part: rights management services

    Related to the Institutional and organizational aspects of SDI

    06 | 13 Service oriented architecture

    06 | 14 Service oriented architecture


    Different technical services and components are distributed over the net.

    Talk to each other through APIs

    If standardised, components from different software providers can talk to each other


    Standards are needed!

    07 | Standards

    1. Definition
    2. Standardization bodies
    3. ISO/TC211
    4. Open Geospatial Consortium OGC

    07 | 01 What are standards?


    Standards are everywhere

    07 | 02 What are standards?


    Standards are...

    “Documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics, to ensure that materials, products, processes and services are fit for their purpose”

    (ISO, 2019)

    07 | 03 What are standards?

    • Usually in the form of documents
    • They are implementation neutral
      • E.g. Adapter for the electricity net should work for any device
      • E.g. Any software should be able to use or implement standards
    “The interesting thing about Standards is that there are so many to choose from…”

    07 | 04 Standardization bodies


    Many standardization bodies exist

    07 | 05 ISO/TC211


    38 Participating Members, 31 Observing Members

    07 | 06 Open Geospatial consortium OGC

    Not-for-profit, international voluntary consensus standards organization; leading development of geospatial standards
    • Founded in 1994
    • 520+ members and growing
    • 50+ standards
    • Thousands of implementations
    • Broad user community implementation worldwide
    • Alliances and collaborative activities with ISO and many other SDOs

    07 | 07 Open Geospatial consortium OGC


    A document, established by consensus and approved by the OGC Membership, that provides rules and guidelines, aimed at the optimum degree of interoperability in a given context.

    • Community requirements
    • Member requirements
    • Market trends
    • Technology trends

    07 | 08 Types of standards


    Different types of standards: technical and semantic

    Technical standards will guarantee interaction between different (software-)systems by providing common or shared interfaces


    Semantics imply that geographic information provided by different organizations at different moments in time and for different purposes can unambiguously be interpreted and if necessary integrated. Semantic standards focus on this aspect.

    07 | 09 Types of standards


    In summary:

    technical standards will make you connected, while semantic standards will make you understood (Reuvers, 2005)

    07 | 10 Types of standards


    Different types of standards: de facto and de jure

    • De facto:
      • e.g. SQL, SHP, KML
      • Open Geospatial Consortium standards
    • De jure (legal):
      • ISO/TC 211
      • CEN/TC 287
      • Federal Geographic Data Committee (FGDC, USA)

    Sometimes a de facto standard is formally adopted (e.g. KML)

    07 | 11 Standards for SDI


    Standards for

    • Data
    • Services
    • Metadata

    07 | 12 Data related standards for SDI


    Standards for data (semantic!)


    More than exchange formats. Specify what must be included in datasets and how it must be stored

    Focus on standardised conceptual/logical data models and data specifications (UML)

    Standards and specifications related to geographical objects

    07 | 13 Service related standards


    Standards for services (How it works: technical!)


    OGC standards (often also covered by ISO) for use with all kinds of client-server combinations

    • Web Map Service (WMS, ISO 19128)
    • Web Feature Service (WFS, ISO 19142)
    • Web Coverage Service (WCS)
    • Web Processing Service (WPS)

    Similar to the services available in an SDI!

    07 | 14 Metadata related standards

    Standards for metadata define the information about data or services that is needed to search, explore, assess and connect to available resources

    08 | Metadata


    1. Definition
    2. Uses
    3. Use communities
    4. Storage of (geospatial) metadata
    5. Types of (geospatial) metadata
    6. Metadata Schema
    7. Elements

    08 | 02 What are metadata?

    Metadata are everywhere

    Metadata can be found on all consumer products in our food chain (check a Coca-Cola can), and on any other product for that matter. They tell you who produced or delivered the product and how to contact them, and they provide a description of the product, a (bar)code with a unique identifier, a ‘preview’ of the product, information on how to use the product (or not), the ingredients, information about the quality, the validity dates (the date it was produced and the date until which you can consume it), the size or volume, etc.
    The same happens in our information society: e.g. books. It happens (partially) also in the context of data, and geospatial data and Spatial Data Infrastructures in particular.

    08 | 03 What are metadata used for?

    Metadata inform the user about:
    • What is the (information) product about?
    • What is inside and what does it consist of (in the description or list of elements / ingredients)?
    • To whom can I direct my questions or complaints (name, address, e-mail)?
    • What does it look like (visuals)?
    • Is it old or current (dates)?
    • What is the quality (through the dates, the composition, …)?
    • How can I use it and what is its purpose (e.g. the soup is not good for plants)?

    Usually the metadata are mandatory and standardized!

    08 | 04 Metadata for different communities…

    Different communities, other standards
    The Dublin Core standard is used for documenting reports, archiving files, etc. It consists of several elements that help to understand the document or file:
    • Title and abstract
    • Author(s), contributors, reviewers…
    • Keywords
    • Date of initiation, date of last update
    • Information about quality assurance, adoption/acceptance
    • History of the document or file in the form of a log-table

    08 | 05 Storage of (geospatial) metadata

    Where to store metadata?
    Metadata should be stored with the data and might also be stored in a catalogue. In software such as ArcGIS or QGIS the metadata file is automatically stored in the same folder as the other files. When you take a copy of the data (to provide them to others), you will have all relevant files together.

    08 | 06 Storage of (geospatial) metadata

    A catalogue allows you to group the metadata records together and to expose the information via the web through a ‘discovery’ or catalogue service. The latter makes it possible to have all information on the datasets (and services) of the SDI in one place.

    08 | 07 Types of (geospatial) metadata

    Different types for different purposes

    Discovery - Users need to find the relevant datasets in the first place: searching datasets on a particular topic for a particular area for example. Discovery requires search mechanisms.
    Evaluation - Before using a geospatial dataset it is necessary to assess whether it is ‘fit-for-purpose’: are the quality and currency sufficient for my use case; what is the history (lineage) of the dataset, etc.
    Use - When using the (geospatial) dataset users need to know some details about the characteristics of the data elements: the type of features and how they are/were defined; the meaning of the attributes and their technical characteristics, etc.

    08 | 08 Metadata Schema


    It is recommended to start with the mandatory metadata elements which are required for discovery purposes and which are ‘easy’ to maintain: e.g. title, abstract, reference date, language, topic category. Usually metadata elements are grouped.

    Some optional elements are also recommended: e.g. responsible party, lineage, online resource.

    The table shows the so-called core elements of the ISO 19115:2003 standard.

    08 | 09 Metadata Schema

    A schema consists of metadata elements.
    Each element has one of the following obligations:

    • M - Mandatory
    • O - Optional
    • C - Conditional
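
A simple completeness check against such a schema could look as follows. The element names follow the mandatory and recommended optional elements named on the previous slide, written here as hypothetical Python keys; conditional elements are not evaluated in this sketch, and this is a simplification rather than the full ISO 19115 schema.

```python
# Obligation per element: "M" mandatory, "O" optional, "C" conditional.
SCHEMA = {
    "title": "M",
    "abstract": "M",
    "reference_date": "M",
    "language": "M",
    "topic_category": "M",
    "responsible_party": "O",
    "lineage": "O",
    "online_resource": "O",
}


def missing_mandatory(record, schema=SCHEMA):
    """Return the sorted names of mandatory elements that are absent or empty."""
    return sorted(
        name
        for name, obligation in schema.items()
        if obligation == "M" and not record.get(name)
    )


# Made-up, incomplete record.
draft = {"title": "Coastline 2004", "abstract": "Coastal erosion survey", "lineage": "Digitized"}
print(missing_mandatory(draft))
```

A catalogue could run this kind of check before accepting a record, so that every published dataset can at least be discovered.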

    08 | 10 Mandatory elements

    The core metadata elements answer the following questions:

    “Does a dataset on a specific topic exist (‘what’)?”, “For a specific place (‘where’)?”, “For a specific date or period (‘when’)?” and “A point of contact to learn more about or order the dataset (‘who’)?”.

    Using the recommended optional elements in addition to the mandatory elements will increase interoperability, allowing users to understand without ambiguity the geographic data and the related metadata provided by either the producer or the distributor. Dataset metadata profiles of this International Standard shall include this core.

    08 | 11 Identification elements

    Identification information contains information to uniquely identify the data. Identification information includes information about the citation for the resource (title), an abstract, an identifier, keyword(s), the purpose, credit, the status and points of contact. The MD_Identification entity is mandatory. It contains mandatory, conditional, and optional elements.


    Identification elements

    Title - Should be self-explanatory and not too long; do not use the filename of the resource!
    Abstract - Should be concise and explain what the dataset is about, usually one third to one half of a page of text
    Identifier - a unique code for the resource, not automatically generated but part of a PID strategy for information infrastructures
    Keyword(s) - the topic(s) covered by the dataset; might use an existing vocabulary (e.g. GEMET)
    Format - vector or raster data
    Type - it can be a dataset, a dataset series or a web service

    08 | 12 CRS and other elements

    The Coordinate Reference System element refers to the projection, the ellipsoid and the datum of the dataset, including a link to the CRS type (code list) and the projection parameters.

    The extent of the dataset is also given (as part of the identification) by four coordinates or a geographic identifier.
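
The four-coordinate extent is what allows a catalogue to answer "where" questions. A minimal sketch of such a test is shown below; it assumes extents in longitude/latitude degrees that do not cross the antimeridian, and the `Extent` type and sample values are illustrative.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    """Geographic bounding box given by four coordinates, in degrees."""
    west: float
    south: float
    east: float
    north: float


def intersects(a: Extent, b: Extent) -> bool:
    """True when the two boxes overlap or touch (no antimeridian handling)."""
    return (
        a.west <= b.east and b.west <= a.east
        and a.south <= b.north and b.south <= a.north
    )


# Made-up extents: a dataset, a search area inside it, and a search area elsewhere.
dataset_extent = Extent(8.5, 41.3, 9.6, 43.1)
search_inside = Extent(8.6, 41.8, 8.9, 42.0)
search_elsewhere = Extent(2.5, 49.5, 6.4, 51.5)

print(intersects(dataset_extent, search_inside))
print(intersects(dataset_extent, search_elsewhere))
```

A discovery service can apply this test to every record to return only datasets that cover the user's area of interest.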

    08 | 13 Data Quality elements

    Data Quality contains a general assessment of the quality of the dataset. Data Quality is an aggregate of the Lineage and the Data Quality Elements. The latter can include the following elements: completeness, logical consistency, positional accuracy, thematic accuracy and temporal accuracy. Those five entities can be further subdivided into the sub-elements of data quality.

    This package also contains information about the sources and production processes used in producing a dataset. The lineage entity is optional and contains a statement about the lineage or history of a dataset. It is an aggregate of the process steps performed on the data and the source information (starting dataset).

    08 | 14 Constraint elements

    Constraints can impose limitations on sharing and use, for varying reasons. Sensitive information such as personal or military information might need to be secured, or vulnerable species or habitats might need to be protected. There might also be legal constraints related to the existence of copyrights, patents or licenses (paying fees), or there might be restrictions on what users can do with the data.

    In INSPIRE, two elements have been imposed:
    1) the conditions for accessing the data and for use and
    2) the possible limitation for public use (since normally all geospatial data should be open for the public).

    09 | Geoportals

    1. How to find information

    09 | 01 Geoportals How to find geographic information

    Geoportals are one of the possible applications for connecting to geographical (meta)data

    09 | 02 Geoportals

    Key component of an SDI

    “[…] the medium through which the users access the available information [in SDIs]” (Van Loenen et al., 2010)

    Some definitions and characteristics:

    “A geoportal may be defined as an internet or intranet entry point with the tools for retrieving metadata, searching for GI, visualizing GI, downloading GI, disseminating GI and in some cases the ordering of GI services” (Giff et al., 2008)

    10 | Concepts Exercise

    Reference list




    • Craglia, M., & Annoni, A. (2007). INSPIRE: An innovative approach to the development of spatial data infrastructures in Europe. Research and theory in advancing spatial data infrastructure concepts, 93-105.

    • Granell, C., Gould, M., Manso, M. A., & Bernabe, M. A. (2009). Spatial data infrastructures. In Handbook of Research on Geoinformatics (pp. 36-41). IGI Global.

    • Rajabifard, A., & Williamson, I. P. (2001). Spatial data infrastructures: concept, SDI hierarchy and future directions.
    • Schade, S., Granell, C., Vancauwenberghe, G., Keßler, C., Vandenbroucke, D., Masser, I., & Gould, M. (2020). Geospatial Information Infrastructures. In Manual of Digital Earth (pp. 161-190). Springer, Singapore.

    • Williamson, I. P., Rajabifard, A., & Feeney, M. E. F. (Eds.). (2003). Developing spatial data infrastructures: from concept to reality. CRC Press.