.. _project:

Project Model

The suite manages the publication lifecycle of complex documents called Projects, i.e. documents made of :

  • Many different sets of files (PDFs, documents, images, GIS data, CSV…)

  • A custom metadata model (mixed information, data descriptions, repeatable complex fields…)

  • Common publication management metadata (author, creation date, version, publication phase, moderation messages, operation log)

Note

Each Project is associated with a Use Case Descriptor (UCD), which determines, among other things, its structure, its lifecycle and its access rules.

Project structure

A Project is represented as a JSON document with the following sections :

  • Core Information : Basic management information, e.g. id, version, UCD

  • Publication Information : Information about author, editor, access and logs

  • The Document : Custom metadata managed by clients, including references to FileSets and their manifestations

  • Lifecycle Information : Information on the status of the Project, its publication phase and related error messages

  • Relationships : List of relations to other projects

  • Identification References : List of records used for indexing and identification (e.g. centroid coordinates, temporal references, catalogue references)

Core Information

The basic information needed to manage a project can be found in the following fields:

{
  "_id": "6283af5802ad3d70e315cb6e",
  "_version": "1.0.14",
  "_profileID": "...",
  "_profileVersion": "1.0.0",
...
}
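As a minimal sketch, assuming the Project has already been obtained as JSON text (how it is retrieved from the service is not covered here), the core fields listed above could be read with Python's standard ``json`` module. The helper name ``core_information`` and the trimmed sample document are illustrative assumptions, not part of the suite.

```python
import json

# Core field names as they appear in the example above.
CORE_FIELDS = ("_id", "_version", "_profileID", "_profileVersion")


def core_information(project: dict) -> dict:
    """Return only the core management fields that are present in the project."""
    return {name: project[name] for name in CORE_FIELDS if name in project}


# Illustrative document, trimmed to the fields shown in the example.
project = json.loads("""
{
  "_id": "6283af5802ad3d70e315cb6e",
  "_version": "1.0.14",
  "_profileID": "...",
  "_profileVersion": "1.0.0",
  "_theDocument": {}
}
""")

core = core_information(project)
print(core["_id"], core["_version"])
```

Filtering on a fixed tuple of names keeps the helper tolerant of documents where some core fields are missing.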

Publication Information

Publication information contains :

  • Creation Info : user, context and time of creation

  • Last Edit Info : user, context and time of last edit operation

  • Access : policy and license ID

The following is an example of this section, which can be accessed at path $._info:

"_info": {
    "_creationInfo": {"_user": {"_username": "fabio.sinibaldi"},
      "_context": {"_id": "/gcube/devsec/devVRE","_name": "/devVRE"},
      "_instant": "2022-05-17T16:21:12.399"},
    "_lastEditInfo": {"_user": {"_username": "fabio.sinibaldi"},
      "_context": {"_id": "/gcube/devsec/devVRE","_name": "/devVRE"},
      "_instant": "2022-05-17T16:22:04.328"},
    "_access": {"_policy": "OPEN","_license": ""}
}
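The publication section can be summarised with a few dictionary lookups. The sketch below assumes that ``_instant`` values are ISO 8601 local timestamps without a time zone, as in the example; the function name ``edit_summary`` and the keys of the returned dictionary are hypothetical.

```python
from datetime import datetime

# The "_info" section from the example above, as a Python dictionary.
info = {
    "_creationInfo": {
        "_user": {"_username": "fabio.sinibaldi"},
        "_context": {"_id": "/gcube/devsec/devVRE", "_name": "/devVRE"},
        "_instant": "2022-05-17T16:21:12.399",
    },
    "_lastEditInfo": {
        "_user": {"_username": "fabio.sinibaldi"},
        "_context": {"_id": "/gcube/devsec/devVRE", "_name": "/devVRE"},
        "_instant": "2022-05-17T16:22:04.328",
    },
    "_access": {"_policy": "OPEN", "_license": ""},
}


def edit_summary(info: dict) -> dict:
    """Summarise who created and last edited the project, and the access policy."""
    created = datetime.fromisoformat(info["_creationInfo"]["_instant"])
    edited = datetime.fromisoformat(info["_lastEditInfo"]["_instant"])
    return {
        "created_by": info["_creationInfo"]["_user"]["_username"],
        "last_edited_by": info["_lastEditInfo"]["_user"]["_username"],
        "context": info["_lastEditInfo"]["_context"]["_id"],
        "policy": info["_access"]["_policy"],
        "elapsed": edited - created,  # time between creation and last edit
    }


summary = edit_summary(info)
print(summary["created_by"], summary["policy"], summary["elapsed"])
```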

The Document

Custom metadata managed by the user is stored as a JSON object at path $._theDocument. Its structure is strongly dependent on the Use Case Descriptor (UCD) associated with the Project.

Along with other custom metadata, this section contains references to registered FileSets, which in turn can present multiple Manifestations.

FileSets

Registered FileSets can be found inside the document section.

Note

FileSets are sets of files representing a single entity, which need to be handled together (e.g. a dataset and its index, a file and its CRC).

Each FileSet represents a set of files stored in the gCube Workspace, together with the following associated information :

  • uuid : unique identifier of the fileset, useful for JSON paths

  • creation info : user, context and instant of the fileset registration

  • access : policy and license

  • folder ID : gCube Workspace folder ID containing the fileset

  • payloads : list of registered files

  • Manifestations : information on generated manifestations of the fileset

Warning

FileSet fields must be defined in the Use Case Descriptor (UCD) in order to be registered.

The following is an example of a fileset with no manifestations (its _materializations list is empty):

"...":{
    "_uuid": "..",
    "_creationInfo":{"_user":{"_username": "FAKE"},
      "_context":{"_id": "/pred4s/preprod/preVRE","_name": "/preVRE"},
      "_instant": "2022-02-28T16:53:30.115"},
    "_access":{"_policy": "OPEN","_license": ""},
    "_folderID": "8d38c0de-04f2-4463-a263-edfd6cec175a",
    "_payloads":[
      {"_mimetype": "application/x-shapefile",
        "_storageID": "bc801242-aac1-4228-99c5-d2c762c24331",
        "_link": "https://data-pre.d4science.org/shub/E_MHlKeXN1SWhSd1hpY2tsaHZhZjVhV2NrWkg5QUtnVGVGUE9EWWlMMGVBbjN3ajNjTUU3M0pYUE1nTmZGV2ZVWQ==",
        "_name": "pos.shp"}],
    "_materializations":[]
}
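Because the field that holds a FileSet is named by the UCD, a client that does not know the field name in advance has to locate FileSets by their shape. The following sketch walks ``$._theDocument`` and treats any object carrying both ``_uuid`` and ``_payloads`` as a FileSet; this detection rule, the function name ``find_filesets`` and the field name ``myFileSetField`` are assumptions made for illustration.

```python
from typing import Any, Iterator, Tuple


def find_filesets(node: Any, path: str = "$._theDocument") -> Iterator[Tuple[str, dict]]:
    """Yield (json_path, fileset) for every object that looks like a FileSet."""
    if isinstance(node, dict):
        # Heuristic (assumption): a FileSet has both "_uuid" and "_payloads".
        if "_uuid" in node and "_payloads" in node:
            yield path, node
            return
        for key, value in node.items():
            yield from find_filesets(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_filesets(value, f"{path}[{index}]")


the_document = {
    "title": "Sample project",  # hypothetical custom metadata
    "myFileSetField": {  # hypothetical field name, defined by the UCD
        "_uuid": "..",
        "_folderID": "8d38c0de-04f2-4463-a263-edfd6cec175a",
        "_payloads": [
            {"_mimetype": "application/x-shapefile", "_name": "pos.shp"}
        ],
        "_materializations": [],
    },
}

# Pair each FileSet path with the names of its registered files.
found = [
    (path, [payload["_name"] for payload in fileset["_payloads"]])
    for path, fileset in find_filesets(the_document)
]
print(found)
```

A client that knows the UCD can skip the search and read the field directly.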

Manifestations

FileSets can be processed in various ways by exploiting the capabilities of gCube applications, e.g.

  • GIS datasets can be published on the gCube SDI in order to consult them as live maps

  • Images can be processed in order to generate previews

  • Datasets can be pushed to a DBMS

  • Analytics engines can be used to extract information from datasets

Whether a dataset is deployed onto a specific engine or other resources are generated by processing the fileset, the resulting information is registered as a fileset manifestation. The suite relies on plugins to deal with specific tasks (dealing with DataMiner, publishing on SDI, pushing to gCat, etc.).

A manifestation contains at least the following information :

  • type : This field instructs the consumer on the nature of the manifestation.

Warning

Since the structure of a manifestation changes from type to type in order to satisfy specific requirements, clients are expected to know the structure of the manifestation types they consume. See below for known manifestation formats.

Known Manifestation Types

The following is a list of known manifestation types, with reference examples.

gCube SDI Layer

type : gcube-sdi-layer

It represents a gCube SDI layer, accessible via various OGC links (at least WMS and WFS are expected). It contains the following fields :

  • ogcLinks : map of OGC-Protocol -> endpoint to consume the layer

  • bbox : layer bounding box

  • platform info : Platform specific info (expected types GeoServer, GeoNetwork, Thredds)

The following is an example of a layer served by a GeoServer platform:

{"_type": "gcube-sdi-layer",
 "_ogcLinks":[{"wms": "https://geoserver1-t.pre.d4science.org/geoserver/profiledconcessioni_pred4s_preprod_prevre_621ceff6eddbbb1c62c9d4ae/wms?service=WMS&version=1.1.0&request=GetMap&layers=profiledconcessioni_pred4s_preprod_prevre_621ceff6eddbbb1c62c9d4ae:pos&styles=&bbox=8.620919,40.629750,8.621179,40.630258&srs=EPSG:4326&format=application/openlayers&width=400&height=400"}],
 "_bbox":{"_maxX": 8.621178639172953,
          "_minX": 40.630257904721645,
          "_maxY": 8.62091913167495,
          "_minY": 40.62975046683799},
 "_platformInfo":[{
     "_type": "Geoserver",
     "workspace": "profiledconcessioni_pred4s_preprod_prevre_621ceff6eddbbb1c62c9d4ae",
     "layerName": "pos",
     "persistencePath": "profiledConcessioni/621ceff6eddbbb1c62c9d4ae/147add5d-aa09-418c-a5cb-aa761c7498c5/pos",
     "files":["pos.shp"],
     "storeName": "pos_store"
     }
 ]
}
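Since clients must interpret a manifestation according to its type, a consumer would typically check ``_type`` before reading type-specific fields. The sketch below does this for ``gcube-sdi-layer`` and merges ``_ogcLinks`` into a single protocol-to-endpoint map. It assumes ``_ogcLinks`` is either a list of single-entry maps (as in the example) or one map; the function name ``ogc_endpoints`` and the ``example.org`` URLs are placeholders.

```python
def ogc_endpoints(manifestation: dict) -> dict:
    """Return a {protocol: endpoint} map for a gcube-sdi-layer manifestation."""
    if manifestation.get("_type") != "gcube-sdi-layer":
        raise ValueError(f"unsupported manifestation type: {manifestation.get('_type')!r}")
    links = manifestation.get("_ogcLinks") or []
    if isinstance(links, dict):  # tolerate a single map as well as a list of maps
        links = [links]
    endpoints = {}
    for entry in links:
        endpoints.update(entry)
    return endpoints


layer = {
    "_type": "gcube-sdi-layer",
    "_ogcLinks": [
        {"wms": "https://example.org/geoserver/wms"},  # placeholder URL
        {"wfs": "https://example.org/geoserver/wfs"},  # placeholder URL
    ],
}

endpoints = ogc_endpoints(layer)
print(sorted(endpoints))
```

Raising an error on unknown types makes it explicit when a client meets a manifestation format it was not written to handle.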

Lifecycle Information

This section reports the status of the Project in relation to its configured Project Management (Lifecycle).

Note

Projects can be configured with different workflows, comprising STEPS, PHASES and related events to be triggered. See the Project Management (Lifecycle) section.

In particular it contains the following information:

  • Phase, Last Invoked Step : heavily dependent on the configured Project Management (Lifecycle)

  • Last Operation Status : valid values are OK, ERROR and WARNING

  • Error and Warning messages : expected to be non-empty if the status is not OK

  • Triggered Events : list of reports for triggered events (see Project Management (Lifecycle))

  • Notes : the user's note, if one was specified in the last operation

The following is an example of this section, which is accessible at path $._lifecycleInformation:

"_lifecycleInformation": {
   "_phase": "DRAFT",
   "_lastInvokedStep": null,
   "_lastOperationStatus": "OK",
   "_errorMessages": [],
   "_warningMessages": [],
   "_triggeredEvents": [
     {
       "event": "INIT_DOCUMENT",
       "lastOperationStatus": "OK",
       "errorMessages": null,
       "warningMessages": null
     }
   ],
   "_notes": null
}
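A client can use this section to decide whether the last operation succeeded and to gather any reported problems. The sketch below assumes that messages are plain strings and that message fields may be ``null`` (as in the triggered event of the example); the function names ``is_ok`` and ``collect_messages`` are hypothetical.

```python
def is_ok(lifecycle: dict) -> bool:
    """True if the last operation and every triggered event report status OK."""
    if lifecycle.get("_lastOperationStatus") != "OK":
        return False
    events = lifecycle.get("_triggeredEvents") or []
    return all(event.get("lastOperationStatus") == "OK" for event in events)


def collect_messages(lifecycle: dict) -> list:
    """Gather error and warning messages from the section and its events."""
    messages = []
    for key in ("_errorMessages", "_warningMessages"):
        messages.extend(lifecycle.get(key) or [])  # "or []" handles null fields
    for event in lifecycle.get("_triggeredEvents") or []:
        for key in ("errorMessages", "warningMessages"):
            for message in event.get(key) or []:
                messages.append(f"{event.get('event')}: {message}")
    return messages


# The "_lifecycleInformation" section from the example above.
lifecycle = {
    "_phase": "DRAFT",
    "_lastInvokedStep": None,
    "_lastOperationStatus": "OK",
    "_errorMessages": [],
    "_warningMessages": [],
    "_triggeredEvents": [
        {"event": "INIT_DOCUMENT", "lastOperationStatus": "OK",
         "errorMessages": None, "warningMessages": None}
    ],
    "_notes": None,
}

print(is_ok(lifecycle), collect_messages(lifecycle))
```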

Relationships

This section contains a list of JSON objects, each representing a link to another project. A relationship comprises :

  • relationship name : the user-defined name of the relationship

  • target id : the id of the target project

  • target UCD [optional] : the UCD of the target project, if different from the parent’s

Note

Relationships can be navigated by asking the service for a relationship chain.

Identification References

Identification references are a list of JSON objects containing the information needed to identify the project in indexes and collections, e.g. :

  • Temporal references identify the project on a temporal axis

  • Spatial references locate the project on 2D / 3D GIS maps

  • Catalogue references identify the project in gCube catalogues (e.g. CKAN, GeoNetwork)

The following is an example of a spatial reference:

"_identificationReferences": [
       {
           "geoJSON": {
               "type": "Point",
               "crs": {
                   "type": "name",
                   "properties": {}
               },
               "bbox": [
                   8.694061737861185,
                   39.08725274364023,
                   0.0,
                   8.687033527629803,
                   39.09227506773862,
                   0.0
               ],
               "coordinates": [
                   8.690547632745494,
                   39.08976390568942,
                   0.0
               ]
           },
           "_type": "SPATIAL REFERENCE"
       }
   ]
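To place a project on a map, a client can select the entries whose ``_type`` is ``SPATIAL REFERENCE`` and read the GeoJSON point they carry. The sketch below is a minimal illustration based only on the fields visible in the example; the function name ``spatial_points`` is hypothetical, and other geometry types are simply skipped.

```python
def spatial_points(references: list) -> list:
    """Return the coordinates of every Point found in spatial references."""
    points = []
    for reference in references:
        if reference.get("_type") != "SPATIAL REFERENCE":
            continue  # temporal, catalogue and other references are ignored
        geometry = reference.get("geoJSON") or {}
        if geometry.get("type") == "Point":
            points.append(tuple(geometry["coordinates"]))
    return points


# The "_identificationReferences" list from the example above (bbox omitted).
identification_references = [
    {
        "geoJSON": {
            "type": "Point",
            "crs": {"type": "name", "properties": {}},
            "coordinates": [8.690547632745494, 39.08976390568942, 0.0],
        },
        "_type": "SPATIAL REFERENCE",
    }
]

points = spatial_points(identification_references)
print(points)
```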