2015 January 15 | 05:08 pm

Open Services for Lifecycle Collaboration
Indexable Linked Data Provider Specification Version 2.0

Status: Stable working draft. Implementations are welcome, and work to validate this draft is ongoing.

This Version

Latest Version

Previous Version

  • This is the first version of this specification.

Authors

  • Jim des Rivieres (IBM, OSLC-Core)

Contributors

Notation and Conventions

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119. Domain name examples use RFC2606.

Introduction

The OSLC Tracked Resource Set 2.0 specification defines a general-purpose protocol for making a large set of resource URIs discoverable and for reporting ongoing changes affecting the set. This document describes how a lifecycle tool web application exposes a live feed of its linked lifecycle data via a tracked resource set, in a way that permits other tools to build and maintain live, searchable indexes based on that linked data.

A particular pool of linked data resources exposed through a particular Tracked Resource Set is referred to as an Indexable Linked Data Source. When exposing a live feed of its linked lifecycle data, a lifecycle tool server is playing the role of Indexable Linked Data Provider. An Indexable Linked Data Consumer is a client (usually another server) that works with linked lifecycle data coming from an Indexable Linked Data Source.

Terminology

This specification uses the terms “Tracked Resource Set”, “Change Log”, and “Change Event” defined by the OSLC Tracked Resource Set 2.0 specification.

This specification also defines the following terms:

Access Context - Grouping of resources with similar security requirements.

Access Context List Resource - Resource describing a list of Access Contexts.

Indexable Linked Data Source - Pool of linked data resources.

Indexable Linked Data Consumer - Client application that uses an Indexable Linked Data Source to discover a set of linked data resources and track changes to them.

Indexable Linked Data Provider - Server that implements an Indexable Linked Data Source.

Index Resource - Resource whose URI is listed in the Tracked Resource Set of an Indexable Linked Data Source.

TRS Patch - An extended Change Event in a Tracked Resource Set detailing a change to the resource’s RDF representation.

IMPORTANT NOTE TO READERS: The terminology definitions in this section are a normative portion of this specification, imposing requirements upon implementations. All the capitalized words in the text of this specification, such as “Access Context”, reference these defined terms. Whenever the reader encounters them, their definitions found in this section must be followed.

Indexable Linked Data Sources

(This section is normative.)

An Indexable Linked Data Provider offers one or more Indexable Linked Data Sources.

Each Indexable Linked Data Source has a set of linked data resources, called Index Resources. The Indexable Linked Data Provider decides which particular Index Resources are in a particular Indexable Linked Data Source at any moment. Both the set of Index Resources and the linked data contents of each Index Resource may vary over time.

Each Indexable Linked Data Source consists of a Tracked Resource Set (TRS) resource conforming to the Tracked Resource Set 2.0 specification. The resource URIs listed in a Tracked Resource Set are exactly those of the Indexable Linked Data Source’s Index Resources.

Index Resources MUST have an RDF linked data representation, and SHOULD support GET requests specifying text/turtle as the acceptable media type and returning a Turtle serialization of RDF content in response. Index Resources MAY support other RDF media types as well. The RDF content of an Index Resource is one RDF data graph representing one of the Indexable Linked Data Provider’s linked data resources.

Index Resources MAY be Linked Data Platform RDF Sources (LDP-RS) as defined by the Linked Data Platform specification, and MAY support paging as defined by the Linked Data Platform Paging specification.

By retrieving the Indexable Linked Data Source’s Tracked Resource Set, an Indexable Linked Data Consumer can discover the URIs of the Indexable Linked Data Source’s Index Resources. By retrieving an Index Resource, an Indexable Linked Data Consumer can discover the linked data representation of that Index Resource.
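
As a minimal sketch of the retrieval step, the following Python builds (but does not send) a GET request that names text/turtle as the acceptable media type. The function name is hypothetical and the URI is the example defect resource used elsewhere in this document; a real Consumer would also attach credentials and handle the response.

```python
from urllib.request import Request


def build_index_resource_request(uri: str) -> Request:
    """Build (without sending) a GET request that asks for a Turtle representation."""
    return Request(uri, headers={"Accept": "text/turtle"}, method="GET")


# Illustrative URI only; nothing is sent over the network here.
request = build_index_resource_request("https://a.example.com/defect/2314")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the retrieval; that call is left out so the sketch runs without network access.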

Indexable Linked Data Source Access Authorization

(This section is normative.)

An Indexable Linked Data Source’s Tracked Resource Set resources and the Index Resources listed in them are typically protected resources; to gain access, an Indexable Linked Data Consumer is expected to pass satisfactory credentials with its HTTP requests. In order for a lifecycle tool’s linked lifecycle data to be available to Indexable Linked Data Consumers, these protected resources SHOULD support access from trusted Indexable Linked Data Consumers made on behalf of the Consumers themselves (as opposed to on behalf of particular users).

There are several ways for an Indexable Linked Data Provider to achieve this:

  • OAuth 2.0 client credentials - An Indexable Linked Data Provider MAY support OAuth 2.0 authentication from a trusted client. The Consumer obtains an access token via an OAuth 2.0 client credentials grant, and makes requests passing the access token in a Bearer Authorization header.
  • 2-legged OAuth 1.0a - An Indexable Linked Data Provider MAY support OAuth 1.0a authentication from a trusted client. The Consumer makes requests passing an OAuth Authorization header signed with its consumer key and secret but containing no token (a so-called “2-legged” OAuth request).
  • Functional user id - An Indexable Linked Data Provider MAY support HTTP Basic authentication from a trusted client. The Consumer makes requests passing a Basic Authorization header containing the username and password credentials of a high-privilege “functional user id” with broad read access to linked data.

These protected resources MAY support other HTTP authentication methods as well. This applies to the Indexable Linked Data Source’s Tracked Resource Set resource and its various components and pages, as well as to the Indexable Linked Data Source’s Index Resources.
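
The Basic and Bearer options above both come down to an Authorization header value. The sketch below shows how a Consumer might build those two values; the function names and the credentials in the usage line are hypothetical, and obtaining the OAuth 2.0 access token itself (the client credentials grant) is out of scope here.

```python
import base64


def basic_authorization(user_id: str, password: str) -> str:
    """Authorization header value for HTTP Basic with a functional user id."""
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def bearer_authorization(access_token: str) -> str:
    """Authorization header value carrying an OAuth 2.0 access token."""
    return f"Bearer {access_token}"


# Placeholder credentials, for illustration only.
print(basic_authorization("user", "pass"))
print(bearer_authorization("example-token"))
```

Because Basic credentials are only base64-encoded, not encrypted, this option is reasonable only over TLS.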

Access Contexts

(This section is normative.)

An Indexable Linked Data Consumer that makes a copy of linked data obtained from an Indexable Linked Data Source will likely need to control access to its copy of the data. It is simple enough for an Indexable Linked Data Consumer to allow some users to access its copy, while denying access to other users.

In order to make it feasible for Indexable Linked Data Consumers to offer access control at a finer grain than the whole Indexable Linked Data Source, an Indexable Linked Data Provider can define one or more Access Contexts and associate each of its linked data resources with an Access Context. When configuring an Indexable Linked Data Consumer to work with a particular Indexable Linked Data Source, the administrator can query the Indexable Linked Data Provider for a list of Access Contexts relevant to a particular Indexable Linked Data Source. This allows the administrator to configure access control at the level of Access Contexts within an Indexable Linked Data Source, not just at the level of whole Indexable Linked Data Sources.

For its part, the Indexable Linked Data Provider associates individual resources to Access Contexts, which it reflects with a statement (triple) in the resource’s RDF representation. When the Indexable Linked Data Consumer copies a resource from an Indexable Linked Data Source, this triple brings a record of the association(s) between resource and Access Context(s). This lets the Indexable Linked Data Consumer connect access control rules expressed at the level of Access Contexts with the resource representations copied from the Indexable Linked Data Source. Adding a resource to an Access Context, or removing one from it, changes the RDF representation of the resource. Like other changes affecting the RDF representation of the resource, this change is reported as a Change Event in the Tracked Resource Set for the Indexable Linked Data Source. This lets the Indexable Linked Data Consumer work with Indexable Linked Data Sources with Access Contexts whose set of resources varies dynamically.

The set of Access Contexts within an Indexable Linked Data Source can also change over time. Adding a new Access Context to an Indexable Linked Data Source will generally require an administrator to configure the Indexable Linked Data Consumers working with that Indexable Linked Data Source to add an access control rule for dealing with the additional Access Context.

Access Context Namespace

(This section is normative.)

The namespace used for Access Context-related resources and properties defined in this specification is as follows:

  • Namespace URI: [TBD - using http://open-services.net/ns/core/acc# provisionally, but this needs to be reviewed]
  • Default Prefix: acc

Associations between Resources and Access Contexts

(This section is normative.)

The RDF acc:accessContext property is used to indicate that a resource belongs to an Access Context. The resource is the subject; the Access Context is the object.

  • Property Name: accessContext
  • Description: Access Context of the resource
  • Property URI: http://open-services.net/ns/core/acc#accessContext

For example, the RDF statement (in Turtle):

@prefix acc: <http://open-services.net/ns/core/acc#> .
<https://a.example.com/defect/2314> acc:accessContext <https://a.example.com/acclist#alpha> .

declares the resource https://a.example.com/defect/2314 to be in the Access Context https://a.example.com/acclist#alpha.

A linked data resource of an Indexable Linked Data Source that is deemed (by the Indexable Linked Data Provider) to be in an Access Context MUST use the acc:accessContext predicate in its RDF representation to assert a relation between the linked data resource (subject) and an Access Context. For example, the above RDF statement embedded in the representation of resource https://a.example.com/defect/2314 asserts that this resource is in Access Context https://a.example.com/acclist#alpha. The RDF representation of a linked data resource in several Access Contexts will have multiple such RDF statements; for a linked data resource not in any Access Context, there will be none.
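
On the Consumer side, reading these associations back out of a copied representation is a simple filter over its triples. The sketch below assumes the representation has already been parsed into (subject, predicate, object) string tuples by some RDF library; the function name and the second sample triple are hypothetical.

```python
ACC_ACCESS_CONTEXT = "http://open-services.net/ns/core/acc#accessContext"


def access_contexts_of(resource_uri, triples):
    """Return the set of Access Context URIs asserted for resource_uri."""
    return {
        obj
        for subj, pred, obj in triples
        if subj == resource_uri and pred == ACC_ACCESS_CONTEXT
    }


# Triples as they might come out of a parsed Index Resource (illustrative data).
triples = [
    ("https://a.example.com/defect/2314", ACC_ACCESS_CONTEXT,
     "https://a.example.com/acclist#alpha"),
    ("https://a.example.com/defect/2314", "http://purl.org/dc/terms/title",
     "Crash on save"),
]
print(access_contexts_of("https://a.example.com/defect/2314", triples))
```

An empty result means the resource is not in any Access Context; how a Consumer treats such resources is a policy decision that this specification does not prescribe.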

Access Context List Resource

(This section is normative.)

If an Indexable Linked Data Provider uses Access Contexts within the resources of an Indexable Linked Data Source, the Indexable Linked Data Provider MUST provide an Access Context List resource for the Indexable Linked Data Source. If an Indexable Linked Data Provider has more than one Indexable Linked Data Source, it MUST designate an Access Context List resource for each Indexable Linked Data Source; several Indexable Linked Data Sources MAY share the same Access Context List resource.

The Access Context List resource is intended to be accessed by an administrator for the purpose of configuring an Indexable Linked Data Consumer that is working with linked data obtained from that Indexable Linked Data Source. The representation of the Access Context List resource is itself linked data.

The Indexable Linked Data Provider MUST support the use of the HTTP GET method for the Access Context List resource. The Indexable Linked Data Provider SHOULD require the use of TLS when sending requests to the Access Context List resource. The Indexable Linked Data Provider SHOULD require authentication for the Access Context List resource, and SHOULD allow access only to users with administrative privileges. The Indexable Linked Data Provider’s response MUST support the JSON-LD media type (application/ld+json), and MAY support other linked data representations. The response SHOULD include an ETag header.

A client uses an HTTP GET request to retrieve a representation of the Access Context List resource, specifying JSON-LD as an acceptable format.

Non-normative example of a request:

GET https://a.example.com/acclist HTTP/1.1
Accept: application/ld+json
Authorization: Basic [missing - admin user credentials]

The response MUST be a JSON-LD format string with a node for the Access Context List along with a node for each Access Context. The response SHOULD use the simple @graph form with a default graph as shown in the example below. The response SHOULD use the @context value shown below (i.e., as a boilerplate header), and SHOULD NOT use other advanced JSON-LD features, since these can make the response more difficult to understand for human readers who only know JSON, and more difficult to process programmatically by scripts without the benefit of a full JSON-LD library. The node’s type property gives the type of the node - either acc:AccessContextList or acc:AccessContext; the node’s id property gives the Access Context URL; the title and description properties give the title and description, respectively.

Non-normative example of a response:

HTTP/1.1 200 OK
Content-Type: application/ld+json;charset=UTF-8
ETag: "68djsgg82"

{
  "@context": {
    "acc": "http://open-services.net/ns/core/acc#",
    "id": "@id",
    "type": "@type",
    "title": "http://purl.org/dc/terms/title",
    "description": "http://purl.org/dc/terms/description"
  },
  "@graph": [{
     "id": "https://a.example.com/acclist",
     "type": "acc:AccessContextList"
    }, {
     "id": "https://a.example.com/acclist#alpha",
     "type": "acc:AccessContext",
     "title": "Alpha",
     "description": "Resources for Alpha project"
    }, {
     "id": "https://a.example.com/acclist#beta",
     "type": "acc:AccessContext",
     "title": "Beta",
     "description": "Resources for Beta project"
  }]
}

The response MAY include other properties. A client MUST ignore any properties that it does not understand.
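
The following sketch shows how a script without a JSON-LD library might read such a response using plain JSON parsing. It relies on the response using the simple @graph form and the boilerplate @context recommended above; a response using other JSON-LD features would need a real JSON-LD processor. The function name and the sample body (including its extra "owner" property) are hypothetical.

```python
import json


def list_access_contexts(body: str):
    """Extract Access Context nodes from a simple @graph-form JSON-LD response."""
    document = json.loads(body)
    contexts = []
    for node in document.get("@graph", []):
        if node.get("type") != "acc:AccessContext":
            continue  # skips the AccessContextList node and anything unrecognized
        # Only the known properties are read; any others are ignored.
        contexts.append({
            "id": node["id"],
            "title": node.get("title"),
            "description": node.get("description"),
        })
    return contexts


sample_body = """
{
  "@graph": [
    {"id": "https://a.example.com/acclist", "type": "acc:AccessContextList"},
    {"id": "https://a.example.com/acclist#alpha", "type": "acc:AccessContext",
     "title": "Alpha", "owner": "ignored extra property"}
  ]
}
"""
print(list_access_contexts(sample_body))
```

Since title and description are optional, the sketch reads them with a default of None rather than assuming they are present.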

Resource: AccessContextList

(This section is normative.)

  • Name: AccessContextList
  • Description: An Access Context List
  • Type URI: http://open-services.net/ns/core/acc#AccessContextList

AccessContextList Properties

AccessContextList resources do not currently have any declared properties.

Resource: AccessContext

(This section is normative.)

  • Name: AccessContext
  • Description: An Access Context
  • Type URI: http://open-services.net/ns/core/acc#AccessContext

AccessContext Properties

(This section is normative.)

Prefixed Name | Occurs | Value-type | Description
dcterms:title | zero-or-one | String | A human-readable string describing the Access Context. RECOMMENDED.
dcterms:description | zero-or-one | String | A human-readable string describing the Access Context in more detail.

URI Stability

(This section is non-normative.)

Access Context List and Access Context resources should have stable URIs. When Access Context URIs are based on an Access Context List URI with the addition of local id in the fragment (e.g., the Access Context URI https://a.example.com/acclist#alpha is based on the Access Context List URI https://a.example.com/acclist), the Indexable Linked Data Provider should ensure that each Access Context has a stable local id that is unique within the Access Context List.
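
For the fragment-based scheme described above, the relationship between the two URIs can be sketched with the standard library as follows. The function names are hypothetical, and the uniqueness check is only an illustration of the property a Provider would want to hold.

```python
from urllib.parse import urldefrag


def split_access_context_uri(uri: str):
    """Split an Access Context URI into (Access Context List URI, local id)."""
    list_uri, local_id = urldefrag(uri)
    return list_uri, local_id


def has_unique_local_ids(access_context_uris) -> bool:
    """True if no two Access Context URIs share the same fragment (local id)."""
    local_ids = [urldefrag(uri).fragment for uri in access_context_uris]
    return len(local_ids) == len(set(local_ids))


print(split_access_context_uri("https://a.example.com/acclist#alpha"))
```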

TRS Patch

(This section is non-normative.)

An Indexable Linked Data Provider uses a Tracked Resource Set (TRS) resource to expose a live stream of events affecting a set of Index Resources with RDF representations. Each time an Index Resource is changed, the Provider adds a Change Event to the TRS’s Change Log. A typical Indexable Linked Data Consumer polls the TRS to discover new Change Events appearing in the Change Log, and uses HTTP GET to retrieve the current state (RDF representation) of the affected Index Resource.
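
The polling loop described above can be sketched as a pure function over already-retrieved Change Events. This is an illustration under stated assumptions, not a definitive Consumer: each event is modeled as a dict with hypothetical keys "order", "type", and "changed", and a larger trs:order value is taken to mean a more recent event.

```python
def plan_refresh(change_events, last_processed_order):
    """Decide which Index Resources to re-fetch or drop after a poll.

    Returns (uris_to_fetch, uris_to_drop, new_last_processed_order).
    """
    latest_type = {}
    newest = last_processed_order
    for event in sorted(change_events, key=lambda e: e["order"]):
        if event["order"] <= last_processed_order:
            continue  # already handled on an earlier poll
        latest_type[event["changed"]] = event["type"]  # the latest event per URI wins
        newest = event["order"]
    to_fetch = sorted(u for u, t in latest_type.items() if t != "Deletion")
    to_drop = sorted(u for u, t in latest_type.items() if t == "Deletion")
    return to_fetch, to_drop, newest


events = [
    {"order": 101, "type": "Creation", "changed": "https://a.example.com/r/1"},
    {"order": 102, "type": "Modification", "changed": "https://a.example.com/r/2"},
    {"order": 103, "type": "Deletion", "changed": "https://a.example.com/r/1"},
]
print(plan_refresh(events, 100))
```

Collapsing several events for the same URI into one fetch is what makes the repeated-retrieval cost, discussed next, visible: every remaining entry in the fetch list is a full GET.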

For an Index Resource that changes regularly, the typical Consumer retrieves the same Index Resource over and over again. When the current state (RDF representation) of the Index Resource is large and the differences between adjacent states can be described compactly, including additional information in the trs:Modification Change Event can allow the Consumer to infer the Index Resource’s resultant state and thereby avoid having to re-retrieve the Index Resource.

Similarly, in versioned worlds each change to a versioned resource may result in the creation of a new Index Resource representing an immutable version of the resource. The typical Consumer generally retrieves each Index Resource as it is created. The state of the new Index Resource is often quite similar to the state of the Index Resource corresponding to a previous version. When the state of one Index Resource is similar to that of another Index Resource and the differences between the two can be described compactly, including additional information in the trs:Creation Change Event can allow the Consumer to infer the new Index Resource’s resultant state from the potentially-known state of a previously-retrieved Index Resource and thereby avoid having to retrieve the new Index Resource.

This section describes an extension to Change Events allowing them to carry detailed information about modifications to resources with an RDF representation.

TRS Patch Namespace

(This section is normative.)

The namespace used for TRS Patch-related resources and properties defined in this specification is as follows:

  • Namespace URI: [TBD - using http://open-services.net/ns/core/trspatch# provisionally, but this needs to be reviewed]
  • Default Prefix: trspatch

Additional Change Event Properties

(This section is normative.)

The following additional properties are intended to be used for Change Events:

Prefixed Name | Occurs | Value-type | Description
trspatch:createdFrom | zero-or-one | String | URI of antecedent resource. Required for trs:Creation Change Events.
trspatch:rdfPatch | zero-or-one | String | LD Patch describing a modification to the resource’s RDF representation.
trspatch:beforeETag | zero-or-one | String | HTTP entity tag of resource immediately before this change.
trspatch:afterETag | zero-or-one | String | HTTP entity tag of resource immediately after this change.

The trspatch:createdFrom property, when present, identifies the antecedent resource. If omitted, the antecedent resource is the resource referenced in the trs:changed property. The antecedent resource is the one that supplies the “before” state.

The trspatch:rdfPatch property, when present, describes a patch applied to the antecedent resource’s representation. The result of applying the patch describes the representation of the resource referenced in the trs:changed property. The trspatch:rdfPatch property is used with trs:Modification and trs:Creation Change Events; it is not meaningful for trs:Deletion Change Events. The value of the trspatch:rdfPatch property is an LD Patch. The trspatch:rdfPatch property is meaningful only for resources with RDF representations.

The trspatch:beforeETag property, when present, gives the initial HTTP entity tag of the antecedent resource. This is the entity-tag value that would be returned in the HTTP ETag response header if the antecedent resource is retrieved immediately before the change.

The trspatch:afterETag property, when present, gives the final HTTP entity tag of the resource referenced in the trs:changed property. For a trs:Modification (trs:Creation) Change Event, this is the entity tag of the resource immediately after it was modified (created, respectively). This is the entity-tag value that would be returned in the HTTP ETag response header if the resource is retrieved immediately after the change.

Note that these properties can be used with any resource having both an RDF representation and an entity tag. This includes all Linked Data Platform RDF Source (LDP-RS) resources, which have both.

Note also that the trspatch:beforeETag and trspatch:afterETag properties are meaningful for any kind of resource, not just ones with RDF representations.
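
One way a Consumer might use these properties is to apply a patch only when its cached copy of the antecedent resource is in exactly the "before" state. The sketch below encodes that check; it is an assumption about Consumer behavior rather than a requirement of this specification. Change Events are modeled as dicts with hypothetical keys named after the properties, and cached_etags maps resource URIs to the entity tag of the Consumer's cached copy.

```python
def antecedent_uri(event: dict) -> str:
    """The resource supplying the 'before' state: createdFrom if present, else changed."""
    return event.get("createdFrom") or event["changed"]


def can_use_patch(event: dict, cached_etags: dict) -> bool:
    """True if the cached antecedent matches beforeETag, so the patch can be applied."""
    if event.get("rdfPatch") is None or event.get("beforeETag") is None:
        return False  # nothing to apply, or no way to confirm the starting state
    return cached_etags.get(antecedent_uri(event)) == event["beforeETag"]


event = {
    "changed": "https://a.example.com/config/a1",
    "beforeETag": "15687ds9gha6s7",
    "afterETag": "285d4h2ffgddd9",
    "rdfPatch": "...",  # patch text omitted in this sketch
}
print(can_use_patch(event, {"https://a.example.com/config/a1": "15687ds9gha6s7"}))
```

When the check fails, the Consumer falls back to retrieving the resource with HTTP GET; when it succeeds, the Consumer would record afterETag as the entity tag of its patched copy.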

LD Patch

(This section is non-normative.)

The Linked Data Patch (LD Patch) specification is currently under development by the W3C LDP WG. Our intention is to adopt the syntax and semantics of LD patches from the LD Patch specification rather than specifying our own. However, the LD Patch effort is only just beginning, and the First Public Working Draft was published on 18 September 2014.

In an effort to insulate Providers from changes to the LD Patch specification while it is being refined, this document proposes that Providers temporarily limit themselves to generating LD patches in a limited subset which we call Core format. Core format is extremely simple (no prefixes, no variables, and no Binds) but perfectly adequate for describing patches to graphs not involving blank nodes. (Core format is based on an early (and unofficial) precursor called RDF Patch.)

A Core format patch consists of a sequence of rows. A row with ‘A’ (or ‘D’) in the first column describes the addition of one RDF triple to (or the deletion of one RDF triple from) the resource’s RDF data graph. The subject, predicate, and object of the triples are described in columns 2-4 in the form of absolute IRI references enclosed between ‘<’ and ‘>’. Each row is delimited by a ‘.’ and may have white space between the various terms in a row, including newlines.

Example of a Core format patch that deletes one RDF triple (subject http://example.com/bob, predicate http://xmlns.com/foaf/0.1/knows, object http://example.com/alice) and adds another RDF triple (subject http://example.com/fred, predicate http://xmlns.com/foaf/0.1/member, object http://example.com/old-timers):

D <http://example.com/bob> <http://xmlns.com/foaf/0.1/knows> <http://example.com/alice> .
A <http://example.com/fred> <http://xmlns.com/foaf/0.1/member> <http://example.com/old-timers> .
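
A minimal sketch of reading and applying a Core format patch, as described above, might look like this in Python. It handles only rows whose three terms are IRI references in angle brackets, models a graph as a set of (subject, predicate, object) tuples, and silently ignores deletion of a triple that is not present; that last behavior and all function names are assumptions of the sketch, not part of any specification.

```python
import re

# One row: 'A' or 'D', three <IRI> terms, then '.'; white space (including
# newlines) may appear between terms.
ROW = re.compile(r"\s*([AD])\s*<([^<>\s]*)>\s*<([^<>\s]*)>\s*<([^<>\s]*)>\s*\.")


def parse_core_patch(text: str):
    """Parse patch text into a list of (operation, (subject, predicate, object))."""
    text = text.strip()
    rows, position = [], 0
    while position < len(text):
        match = ROW.match(text, position)
        if match is None:
            raise ValueError(f"not a Core format row at offset {position}")
        rows.append((match.group(1), match.groups()[1:]))
        position = match.end()
    return rows


def apply_core_patch(graph: set, rows) -> set:
    """Return a new graph with the patch rows applied in order."""
    result = set(graph)
    for operation, triple in rows:
        if operation == "A":
            result.add(triple)
        else:
            result.discard(triple)  # assumption: a missing triple is not an error
    return result


patch = """
D <http://example.com/bob> <http://xmlns.com/foaf/0.1/knows> <http://example.com/alice> .
A <http://example.com/fred> <http://xmlns.com/foaf/0.1/member> <http://example.com/old-timers> .
"""
before = {("http://example.com/bob", "http://xmlns.com/foaf/0.1/knows", "http://example.com/alice")}
print(apply_core_patch(before, parse_core_patch(patch)))
```

Applying the rows in sequence matters when the same triple is both deleted and added within one patch.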

TRS Patch Example 1

(This section is non-normative.)

Turtle representation for the resource https://a.example.com/config/a1 in state 1. Assume that when the resource is retrieved in this state, the entity tag 15687ds9gha6s7 is returned in the ETag response header:

# The following is the representation of
# https://a.example.com/config/a1
# in the state with entity tag 15687ds9gha6s7
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
<https://a.example.com/config/a1>
  a ldp:BasicContainer;
  dcterms:title "Component configuration A1";
  ldp:member <https://a.example.com/version/s/143>;
  ldp:member <https://a.example.com/version/r/577>;
  ldp:member <https://a.example.com/version/t/033>.

Turtle representation for the same resource https://a.example.com/config/a1 in state 2. Assume that when the resource is retrieved in this state, the entity tag 285d4h2ffgddd9 is returned in the ETag response header:

# The following is the representation of
# https://a.example.com/config/a1
# in the state with entity tag 285d4h2ffgddd9
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
<https://a.example.com/config/a1>
  a ldp:BasicContainer;
  dcterms:title "Component configuration A1";
  ldp:member <https://a.example.com/version/s/143>;
  ldp:member <https://a.example.com/version/r/578>;
  ldp:member <https://a.example.com/version/t/033>.

Turtle representation for a Change Event describing resource https://a.example.com/config/a1 changing from state 1 to state 2:

# The following is the representation of a change event
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix trs: <http://open-services.net/ns/core/trs#>.
@prefix trspatch: <http://open-services.net/ns/core/trspatch#>.
<urn:urn-3:a.example.com:2014-04-28T17:39:32.000Z:102>
  a trs:Modification;
  trs:changed <https://a.example.com/config/a1>;
  trs:order "102"^^xsd:integer;
  trspatch:beforeETag "15687ds9gha6s7";
  trspatch:afterETag "285d4h2ffgddd9";
  trspatch:rdfPatch
    """
     D <https://a.example.com/config/a1> <http://www.w3.org/ns/ldp#member> <https://a.example.com/version/r/577> .
     A <https://a.example.com/config/a1> <http://www.w3.org/ns/ldp#member> <https://a.example.com/version/r/578> .
    """.

TRS Patch Example 2

(This section is non-normative.)

Turtle representation for the resource https://a.example.com/sw-movie/versions/1. Assume that when the resource is retrieved in this state, the entity tag 783xhaty95 is returned in the ETag response header:

# The following is the representation of
# https://a.example.com/sw-movie/versions/1
# in the state with entity tag 783xhaty95
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
<https://a.example.com/sw-movie/versions/1>
  dcterms:isVersionOf <https://a.example.com/sw-movie> .
<https://a.example.com/sw-movie>
  a ldp:Resource;
  dcterms:title "Star Wars".

Turtle representation for the resource https://a.example.com/sw-movie/versions/2. Assume that when the resource is retrieved in this state, the entity tag 212gyysxx8 is returned in the ETag response header:

# The following is the representation of
# https://a.example.com/sw-movie/versions/2
# in the state with entity tag 212gyysxx8
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
<https://a.example.com/sw-movie/versions/2>
  dcterms:isVersionOf <https://a.example.com/sw-movie> .
<https://a.example.com/sw-movie>
  a ldp:Resource;
  dcterms:title "Star Wars: Episode IV - A New Hope".

Turtle representation for a Change Event describing the creation of the resource https://a.example.com/sw-movie/versions/2. The TRS patch describes the state of this new resource in terms of the state of resource https://a.example.com/sw-movie/versions/1:

# The following is the representation of a change event
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix trs: <http://open-services.net/ns/core/trs#>.
@prefix trspatch: <http://open-services.net/ns/core/trspatch#>.
<urn:urn-3:a.example.com:2014-11-20T13:08:00.000Z:102>
  a trs:Creation;
  trs:changed <https://a.example.com/sw-movie/versions/2>;
  trs:order "192"^^xsd:integer;
  trspatch:createdFrom <https://a.example.com/sw-movie/versions/1>;
  trspatch:beforeETag "783xhaty95";
  trspatch:afterETag "212gyysxx8";
  trspatch:rdfPatch
    """
     D <https://a.example.com/sw-movie/versions/1>  <http://purl.org/dc/terms/isVersionOf> <https://a.example.com/sw-movie> .
     A <https://a.example.com/sw-movie/versions/2>  <http://purl.org/dc/terms/isVersionOf> <https://a.example.com/sw-movie> .
     D <https://a.example.com/sw-movie> <http://purl.org/dc/terms/title> \"Star Wars\".
     A <https://a.example.com/sw-movie> <http://purl.org/dc/terms/title> \"Star Wars: Episode IV - A New Hope\".
    """.

General Guidance

(This section is non-normative.)

General Guidance for Providers

There are a number of possible ways that a lifecycle tool could go about exposing its linked lifecycle data. Here is some general guidance:

  • An Indexable Linked Data Provider should restrict itself to a small number of Indexable Linked Data Sources. When configuring an Indexable Linked Data Consumer, an administrator will typically have to select Indexable Linked Data Sources one at a time.
  • An Indexable Linked Data Provider should restrict itself to a static set of Indexable Linked Data Sources. When an Indexable Linked Data Source gets created dynamically, the administrator would be required to update the configurations of Indexable Linked Data Consumers in order to make the new Indexable Linked Data Source available to Consumers.
  • An Indexable Linked Data Provider’s Indexable Linked Data Sources should contain pairwise-disjoint sets of Index Resources. That is, a resource should not appear as an Index Resource of more than one Indexable Linked Data Source. An Indexable Linked Data Provider should document any overlap between Indexable Linked Data Sources. (Some Indexable Linked Data Consumers are unable to work with overlapping Indexable Linked Data Sources.)
  • An Indexable Linked Data Source’s Index Resources should be linked data resources under the control of the Indexable Linked Data Provider itself, rather than linked data resources of some other lifecycle tool. In other words, a lifecycle tool should expose its own resources, not those of others.
  • The RDF content of an Indexable Linked Data Source’s Index Resources should be statements about linked data resources under the control of the Indexable Linked Data Provider itself, rather than statements about linked data resources of some other lifecycle tool. In other words, the subjects of a lifecycle tool’s claims should be its own resources, as opposed to resources of some other lifecycle tool.
  • An Index Resource may be one of the Indexable Linked Data Provider’s regular linked lifecycle data resources, or it may be a resource containing an RDF data graph used specifically for exposing some linked data through an Indexable Linked Data Source.
  • An Indexable Linked Data Provider should expose all of its linked lifecycle data via Index Resources in one of its Indexable Linked Data Sources. Any information that is held back will be unavailable to Indexable Linked Data Consumers.
  • An Indexable Linked Data Provider’s combined RDF dataset should not repeat the same RDF statements.
  • An Indexable Linked Data Provider’s combined RDF dataset should not contain contradictory RDF statements.
  • It is recommended that an Indexable Linked Data Provider report changes to its linked lifecycle data (including resource creations, deletions, and modifications) within 1 second of the changes being committed. Changes are reported via the Indexable Linked Data Source’s Tracked Resource Set Change Log. This helps ensure that Indexable Linked Data Consumers are able to obtain a live feed of changes in nearly real-time.
  • It is recommended that an Indexable Linked Data Source’s Tracked Resource Set include a Base not older than 7 days. This helps ensure that Indexable Linked Data Consumers are able to initially build an index without having to process Change Events older than 7 days.
  • It is recommended that an Indexable Linked Data Source’s Tracked Resource Set Change Log retain Change Events for at least 7 days. This helps ensure that there are sufficient Change Events to allow an Indexable Linked Data Consumer tracking the Indexable Linked Data Source to incrementally update an out-of-date index after a lengthy downtime or network outage.
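
The two 7-day recommendations above can be sketched as simple checks on the Provider side. The sketch assumes the Provider keeps its own commit timestamp for each Change Event and a creation timestamp for the Base; the "committed_at" key and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)


def base_is_fresh(base_created: datetime, now: datetime) -> bool:
    """True if the Base is not older than the recommended 7 days."""
    return now - base_created <= RETENTION


def prune_change_log(events, now: datetime):
    """Keep every Change Event committed within the last 7 days."""
    cutoff = now - RETENTION
    return [event for event in events if event["committed_at"] >= cutoff]


now = datetime(2015, 1, 15, tzinfo=timezone.utc)
events = [
    {"order": 101, "committed_at": datetime(2015, 1, 1, tzinfo=timezone.utc)},
    {"order": 102, "committed_at": datetime(2015, 1, 14, tzinfo=timezone.utc)},
]
print([event["order"] for event in prune_change_log(events, now)])
```

Seven days is a lower bound for retention, so a Provider is free to keep older events; the sketch shows only the most aggressive pruning that still follows the recommendation.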

General Guidance for Consumers

What the Consumer building an index does is akin to what a Web crawler does, and most of the same considerations apply.

In its role as Indexable Linked Data Consumer, the client plays follower to the Indexable Linked Data Provider’s leader. A Consumer retrieves the TRS, Change Logs, and Base Resources, as well as some or all the Index Resources listed in the TRS. Except for the TRS Resource URI itself, the Consumer is blindly retrieving a succession of URIs that the Provider feeds to it. An insufficiently wary Consumer can come to grief when it interacts with an imperfect or untrustworthy Provider.

Most of the risks are always present: networks connecting Consumer to Provider may experience delays and outages; and Provider implementations may be imperfect (bugs in code, database corruptions). Moreover, when the Provider is untrusted - when there is a concern that the Provider could attempt something nefarious - the Consumer needs to take extra steps to prevent itself from being misused or abused.

Here are risks and general guidance for Indexable Linked Data Consumers:

  • The size and rate of change of a Data Source reflects the amount of linked data that a Provider has to make available and how often that data changes. These vary considerably and are often difficult to estimate in advance. A Consumer that is maintaining a copy of a Data Source should establish reasonable limits and monitor the size and rate of change so that a misbehaving Provider does not cause the Consumer to spend an unreasonable amount of effort (network communication and storage) in doing so. This should include caps on the number and representation sizes of resources (including TRS Change Logs, TRS Base pages, and Index Resources).
  • Since the Consumer is retrieving resources of various kinds over the network from the Provider, which takes time and network connections, the Consumer should do so in a way that does not prevent it from doing other useful work while waiting for responses. The Consumer should be tolerant of network or Provider failures and outages, and carry over important work until the blockage has been removed. On the other hand, the Consumer should not allow its queue of work to grow without bound since that may make it difficult for the Consumer to clear the backlog. The Consumer should also be polite to the Provider, and avoid making multiple requests per second and/or downloading large files that might make it hard for the Provider to keep up with its normal workload.
  • A Data Source’s Index Resources should be linked data resources under the control of the Provider itself, rather than linked data resources of some other lifecycle tool. A Consumer that does not trust a Provider in this regard should mitigate the risk by keeping a server whitelist for each Data Source and refusing to retrieve the Index Resources of a given Data Source if the server is not on the whitelist. Without something like this, a Consumer can be tricked into retrieving and indexing resources that it should not, such as other resources located on a different Provider that the Consumer also happens to have access to.
  • The RDF content of a Data Source’s Index Resources should be statements about linked data resources under the control of the Provider itself, rather than statements about linked data resources of some other lifecycle tool. In other words, the subjects of a lifecycle tool’s claims should be its own resources, as opposed to resources of some other lifecycle tool. A Consumer that does not trust a Provider in this regard should mitigate the risk by keeping a content whitelist for each Data Source and rejecting the RDF content coming from Index Resources of a given Data Source if the URIs of subjects are not all on the content whitelist. Without something like this, a Provider can pollute an index with arbitrary claims which may interfere with or contradict authoritative claims made by the legitimate owner.
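The two whitelist checks described above might be sketched as follows. This is a minimal illustration, not a prescribed implementation. It assumes that the server whitelist is a set of lower-case "scheme://authority" strings, that the content whitelist is a list of URI prefixes ending in "/", and that RDF triples are plain (subject, predicate, object) tuples of strings. All function names are hypothetical.

```python
from urllib.parse import urlsplit

def server_allowed(uri, server_whitelist):
    """True if the URI's scheme and authority are on the server whitelist.

    Assumption: whitelist entries are lower-case "scheme://host[:port]".
    """
    parts = urlsplit(uri)
    origin = f"{parts.scheme}://{parts.netloc}".lower()
    return origin in server_whitelist

def content_allowed(triples, subject_prefixes):
    """True only if every subject URI starts with a whitelisted prefix.

    Assumption: prefixes end in "/" so that "https://a.example.com/"
    does not also match "https://a.example.com.evil.example/".
    """
    return all(
        any(subject.startswith(prefix) for prefix in subject_prefixes)
        for (subject, _predicate, _object) in triples
    )

servers = {"https://a.example.com"}
prefixes = ["https://a.example.com/"]
own = [("https://a.example.com/r/1", "http://purl.org/dc/terms/title", "One")]
foreign = own + [("https://b.example.com/r/9", "http://purl.org/dc/terms/title", "X")]
```

A Consumer would call server_allowed before retrieving an Index Resource and content_allowed before indexing the retrieved RDF, rejecting the whole representation if any subject falls outside the whitelist.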

Access Context Guidance

(This section is non-normative.)

There are several things to consider when deciding how a lifecycle tool can make use of Access Contexts. Before suggesting possible designs, here are some characteristics that will help ensure a lifecycle tool will be useful to administrators tasked with configuring access for Indexable Linked Data Consumers to a tool’s Indexable Linked Data Sources:

  • Optional. An Indexable Linked Data Provider should only use Access Contexts if there are reasons why an administrator might want to impose differential access in external Indexable Linked Data Consumers.
  • Understandable. An administrator should be able to intuit from the Access Context name and description what kinds of resources are in it.
  • Useful collections. An Access Context should contain resources that can be treated similarly.
  • Same security classification. An Access Context should contain resources with the same security classification.
  • Reasonable number. The list of Access Contexts should not be so long as to overwhelm the administrator.
  • Stable. The set of Access Contexts should be more or less static. Changes to the Access Context list will generally require the administrator to update configurations of Indexable Linked Data Consumers.
  • Centralized. An Indexable Linked Data Provider should host a single Access Context List resource enumerating the Access Contexts used in any of its Indexable Linked Data Sources, unless there are reasons to do otherwise.

The following recipes suggest some of the designs that are possible.

Recipe 1: Your tool has top-level objects called workspaces. New workspaces are created infrequently, and only by administrators. Each linked data resource is associated with a single workspace. Teams of users work in the context of a single workspace. All the resources in a workspace have the same security classification.

Your tool should treat each workspace as a separate Indexable Linked Data Source, and not use Access Contexts.

An administrator can always control access to the linked data in an Indexable Linked Data Consumer one Indexable Linked Data Source at a time, and grant users access to linked data from some workspaces but not others.

Recipe 2: Your tool has top-level objects called projects. New projects are created infrequently, and only by administrators. Each linked data resource is associated with a single project. Teams of users work in the context of a set of projects. All the resources in a project have the same security classification.

Your tool should treat all projects as part of a single Indexable Linked Data Source, and automatically create Access Contexts in 1-1 correspondence with projects, taking on the name and description of the project.

An administrator can control access to the linked data in an Indexable Linked Data Consumer on a project by project basis, and grant users access to linked data from some projects but not others.
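As a rough illustration of Recipe 2, the sketch below derives one Access Context per project, copying the project's name and description. The Project and AccessContext classes and the URI layout (base URI plus project key) are assumptions made for this example and are not defined by this specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Project:
    key: str          # hypothetical stable identifier of the project
    name: str
    description: str

@dataclass(frozen=True)
class AccessContext:
    uri: str
    title: str
    description: str

def access_contexts_for(projects, base_uri):
    """Create Access Contexts in 1-1 correspondence with projects."""
    return [
        AccessContext(
            uri=f"{base_uri}/{project.key}",  # assumed URI layout
            title=project.name,
            description=project.description,
        )
        for project in projects
    ]

projects = [
    Project("alpha", "Alpha", "Engine control software"),
    Project("beta", "Beta", "Dashboard firmware"),
]
contexts = access_contexts_for(projects, "https://a.example.com/acc")
```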

Recipe 3: Your tool has resources that can be tagged as containing confidential customer information. Teams of users work in the context of your tool. In the customer’s organization, only some employees are allowed access to confidential customer information.

Your tool should have a single Indexable Linked Data Source, and automatically create an Access Context named “Confidential Customer Data” and assign all tagged resources to this Access Context. Other resources are left “loose”; i.e., not included in any Access Context.

An administrator for an Indexable Linked Data Consumer can control access to the confidential customer information separately from the regular linked data.

Recipe 4: Your tool has many resources. Teams of users work in the context of your tool. The customer’s organization has strict policies on what information can be shown to which employees.

Your tool should have a single Indexable Linked Data Source. Your tool should let an administrator define a set of custom Access Contexts. Your tool should let users (or possibly just administrators) associate resources with these Access Contexts.

An administrator can control access to the linked data in an Indexable Linked Data Consumer based on these custom Access Contexts.

TRS Patch Guidance

(This section is non-normative.)

TRS Patch Guidance for Providers

When the state of an Index Resource changes, the Provider adds a trs:Modification Change Event to the Change Log of the Indexable Linked Data Source’s Tracked Resource Set. The Change Event describes a transition between two definite states of the Index Resource. In principle, the entity tags of the two states, and the LD patch between the two RDF representations, are all well-defined. This much is true whether or not the Provider chooses to embed those pieces of information in the Change Event.

The decision as to whether to provide an LD Patch for a trs:Modification Change Event should be made on a case-by-case basis. Just because one Change Event for a resource includes an LD Patch, that does not mean that all Change Events for the same resource should also include an LD Patch.

Provider writers should remember that a Consumer wishing to discover the current state of a resource can always do so using HTTP GET to retrieve the resource. Including an LD Patch in a Change Event is an optional embellishment that allows some Consumers under the right circumstances to infer the new current state of a resource instead of re-retrieving the resource. It is up to the Provider to decide whether including an LD patch is likely to be worthwhile.

However, whenever a trs:Modification Change Event includes a trspatch:rdfPatch, it should also include accurate trspatch:beforeETag and trspatch:afterETag properties. Without all 3 pieces of information, a Consumer is unlikely to be able to do better than re-retrieving the resource to discover its updated state. Similarly, whenever a trs:Creation Change Event includes a trspatch:rdfPatch, it should also include a trspatch:createdFrom along with accurate trspatch:beforeETag and trspatch:afterETag properties.

When the RDF representation of the resource contains a large number of RDF triples and the number of rows in the LD Patch is small, including the LD Patch in the Change Event is recommended, and may improve overall system performance by allowing Consumers to avoid having to re-retrieve the resource to discover its updated state.

Conversely, when the number of affected RDF triples is large, the size of the LD Patch becomes significant. Including the LD Patch in the Change Event is not recommended because it bloats the size of Change Events in the Change Log, which may negatively impact performance. Omitting the LD patch from the Change Event is likely to give better overall performance.
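The case-by-case decision described above could be reduced to a simple size heuristic, sketched below. The thresholds (at most 50 patch rows, and at most 10% of the resource's triple count) are purely illustrative assumptions. This specification does not define numeric limits, and a Provider would tune them to its own workload.

```python
def should_embed_patch(resource_triple_count, patch_row_count,
                       max_rows=50, max_ratio=0.1):
    """Decide whether to embed an LD Patch in a Change Event.

    Embeds only when the patch is small in absolute terms and small
    relative to the resource's RDF representation. Both thresholds are
    hypothetical defaults, not values taken from the specification.
    """
    if patch_row_count > max_rows:
        return False
    return patch_row_count <= max_ratio * resource_triple_count

large_resource_small_patch = should_embed_patch(10_000, 4)
small_resource_large_patch = should_embed_patch(100, 80)
```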

TRS Patch Guidance for Consumers

A typical Indexable Linked Data Consumer is tracking the state of some or all Index Resources in the Indexable Linked Data Source. When the Consumer first discovers the Index Resource, whether through a trs:Creation Change Event in the Change Log or an entry in the Base of the Indexable Linked Data Source’s Tracked Resource Set, the Consumer uses HTTP GET to retrieve the current state of the Index Resource and gets back its RDF representation. When the response includes an entity tag for the resource in its current state, as it will when the Index Resource is a LDP-RS, the Consumer remembers both the RDF representation and entity tag as the state of that Index Resource.

When the Consumer processes a trs:Modification Change Event for the Index Resource in the Change Log of the Indexable Linked Data Source’s Tracked Resource Set, the Consumer learns that the Index Resource has changed state. This means that the Consumer’s remembered RDF representation and entity tag for the Index Resource are no longer accurate, which cues the Consumer to discard the remembered RDF representation and re-retrieve the Index Resource. However, when the Change Event includes a TRS Patch, the Consumer may have a second option. When the trspatch:beforeETag value matches the Consumer’s remembered entity tag, the Consumer can apply the trspatch:rdfPatch to the Consumer’s remembered RDF representation to compute a replacement RDF representation, which can be remembered along with the trspatch:afterETag value as the entity tag. When this happens, the Consumer can process the trs:Modification Change Event for the Index Resource without having to re-retrieve the Index Resource. It is clearly advantageous for a Consumer to behave this way whenever possible. On the other hand, if the trspatch:beforeETag value does not match the Consumer’s remembered entity tag, the Consumer cannot apply the trspatch:rdfPatch, and should treat the Change Event as if the TRS Patch were absent.

Similarly, when the Consumer processes a trs:Creation Change Event for the Index Resource in the Change Log of the Indexable Linked Data Source’s Tracked Resource Set, the Consumer learns of the existence of a new Index Resource. This cues the Consumer to retrieve the new Index Resource. However, when the Change Event includes a TRS Patch, the Consumer may have a second option. When the Consumer has previously retrieved and remembered the resource identified by trspatch:createdFrom in the state with entity tag matching trspatch:beforeETag, the Consumer can apply the trspatch:rdfPatch to the Consumer’s remembered RDF representation to compute an RDF representation of the new Index Resource, which can be remembered along with the trspatch:afterETag value as the entity tag. When this happens, the Consumer can process the trs:Creation Change Event for the Index Resource without having to retrieve the new Index Resource. It is clearly advantageous for a Consumer to behave this way whenever possible. On the other hand, if the Consumer has no remembered state for the trspatch:createdFrom resource, or the trspatch:beforeETag value does not match the Consumer’s remembered entity tag, the Consumer cannot apply the trspatch:rdfPatch, and should treat the Change Event as if the TRS Patch were absent.
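The Consumer behavior described in the preceding two paragraphs can be sketched as follows. This is a simplified model under stated assumptions: the remembered state is a set of triples plus an entity tag, the patch is an already-parsed list of ("A", triple) and ("D", triple) rows for added and deleted triples, and fetch is a caller-supplied function standing in for HTTP GET. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Remembered:
    triples: frozenset   # remembered RDF representation
    etag: str            # remembered entity tag

@dataclass
class TrsPatch:
    before_etag: str
    after_etag: str
    rows: list                          # assumed parsed rows: ("A" | "D", triple)
    created_from: Optional[str] = None  # only used for creation events

def apply_rows(triples, rows):
    """Apply add/delete rows to a set of triples and return the new set."""
    result = set(triples)
    for op, triple in rows:
        if op == "A":
            result.add(triple)
        elif op == "D":
            result.discard(triple)
        else:
            raise ValueError(f"unknown patch operation: {op!r}")
    return frozenset(result)

def process_event(store, kind, resource_uri, patch, fetch):
    """Handle a 'creation' or 'modification' Change Event.

    Returns "patched" when the TRS Patch could be applied, otherwise
    retrieves the resource via `fetch` and returns "fetched".
    """
    if kind == "creation":
        source_uri = patch.created_from if patch is not None else None
    else:
        source_uri = resource_uri
    prior = store.get(source_uri) if source_uri is not None else None
    if patch is not None and prior is not None and prior.etag == patch.before_etag:
        store[resource_uri] = Remembered(apply_rows(prior.triples, patch.rows),
                                         patch.after_etag)
        return "patched"
    # Patch absent or not applicable: treat the event as if it had no patch.
    store[resource_uri] = fetch(resource_uri)
    return "fetched"
```

For example, a Consumer that remembers a resource with entity tag "e1" can apply a patch whose before_etag is "e1" without any network request, whereas a patch whose before_etag is anything else falls back to fetch. A stricter Consumer might also fall back when a "D" row names a triple that is not present.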

Risk-wise, TRS Patches provide a way for a Provider to tamper with the RDF representations of another server’s resources in the Consumer’s index without the other server’s involvement. The mitigations covered in General Guidance for Consumers, above, will address this risk as well. The Consumer’s server whitelist for an untrusted Data Source should be used to vet trspatch:createdFrom URIs, and its content whitelist should be used to vet subjects in the results of applying TRS patches.

Compatibility with Earlier Specifications

(This section is non-normative.)

The ILDP 2.0 specification is based on the TRS 2.0 spec. It is a follow-on from a similar approach that was based on the TRS 1.0 spec. The TRS 2.0 spec is a revision of the TRS 1.0 spec, which it renders obsolete. Both specs are semantically similar, although they are syntactically incompatible because the RDF vocabularies are in different namespaces (there are other small differences).

The following guidance suggests how providers and consumers should behave in order to be compatible with both TRS 2.0 and TRS 1.0.

Guidance for Implementing TRS 1.0-Compatible Providers

  • If your application is new, its data sources should support ILDP 2.0 and TRS 2.0. Additionally, your application’s data sources should support TRS 1.0 if the data sources will need to be consumed by older consumer applications employing TRS 1.0.
  • If your application already has existing data sources based on TRS 1.0, it should be upgraded to add support to the data sources for ILDP 2.0 and TRS 2.0. Your application’s data sources should retain the existing support for TRS 1.0 so that your application can be upgraded independently without breaking the connection with pre-existing TRS 1.0-based consumers.
  • Conceptually, each data source has a single underlying tracked resource set that gets exposed in 2 different formats. The TRS 2.0 representation of the tracked resource set is exposed through one URI; the TRS 1.0 representation is exposed through a different URI. Change Log and Base resources also have different TRS version-specific URIs; however, the Change Event resources share the same URIs, as do all the Index resources in the resource set.
  • Always use distinct URIs for TRS 1.0 and TRS 2.0 endpoints. Since the TRS URIs are opaque to consumers, the TRS 1.0 and 2.0 URIs can be as (dis)similar as your application chooses.
  • Include a global configuration switch to enable the obsolete TRS 1.0 support. When disabled, TRS 1.0 endpoints should not be discoverable, and attempts to access the TRS resource should return a suitable HTTP error status code (e.g., 404 Not Found). This makes it easy for the customer to ensure that their provider instance is not in the TRS 1.0 game. TRS 1.0 support should be disabled by default for new instances of your application.
  • To ensure maximum compatibility for existing TRS 1.0 consumers, do not include TRS Patches in the TRS 1.0 representation (TRS Patches were not introduced until the ILDP 2.0 spec).
  • Use the property trs2:trackedResourceSet in the namespace trs2 = <http://open-services.net/ns/core/trs#> to advertise a TRS 2.0 resource endpoint URI.
  • Use the property trs1:trackedResourceSet in the namespace trs1 = <http://jazz.net/ns/trs#> to advertise a TRS 1.0 resource endpoint URI.
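The provider-side switch and endpoint advertisement described in the list above might look like the following sketch. The two property URIs come from this section. The endpoint paths "/trs2" and "/trs1" and the function names are assumptions chosen for illustration.

```python
TRS2_PROPERTY = "http://open-services.net/ns/core/trs#trackedResourceSet"
TRS1_PROPERTY = "http://jazz.net/ns/trs#trackedResourceSet"

# Hypothetical, distinct endpoint paths for the two representations.
TRS2_PATH = "/trs2"
TRS1_PATH = "/trs1"

def trs_endpoint_status(path, trs1_enabled=False):
    """Return the HTTP status for a request to a TRS endpoint path.

    TRS 1.0 support is disabled by default; when disabled, the TRS 1.0
    endpoint answers 404 Not Found.
    """
    if path == TRS2_PATH:
        return 200
    if path == TRS1_PATH and trs1_enabled:
        return 200
    return 404

def advertised_endpoints(base_uri, trs1_enabled=False):
    """List (property URI, endpoint URI) pairs the provider advertises."""
    endpoints = [(TRS2_PROPERTY, base_uri + TRS2_PATH)]
    if trs1_enabled:
        endpoints.append((TRS1_PROPERTY, base_uri + TRS1_PATH))
    return endpoints
```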

Guidance for Implementing TRS 1.0-Compatible Consumers

  • If your application is new, it should consume data sources that support ILDP 2.0 and TRS 2.0. Your consumer may also want to consume older data sources that support TRS 1.0.
  • If your application is already consuming data sources based on TRS 1.0, it should be upgraded to add support for consuming data sources based on ILDP 2.0 and TRS 2.0. Retain the existing support for TRS 1.0 so that instances of your application can be upgraded in the field to the new version and continue to consume data sources via TRS 1.0.
  • Include a global configuration switch to disable the obsolete TRS 1.0 support. When disabled, TRS 1.0 endpoints should not be accessed. This makes it easy for the customer to ensure that their consumer instance is not in the TRS 1.0 game. TRS 1.0 support should be disabled by default for new instances of your application.
  • When your product is configured with the URI of a TRS, your product needs to know whether this is a TRS 1.0 or TRS 2.0 endpoint URI. A resource that follows TRS 1.0 does not follow TRS 2.0, and conversely. Your product can determine the version from a configuration parameter, or by sniffing the TRS endpoint URI at configuration time.
  • Use the property trs2:trackedResourceSet in the namespace trs2 = <http://open-services.net/ns/core/trs#> to discover TRS 2.0 resource endpoints.
  • Use the property trs1:trackedResourceSet in the namespace trs1 = <http://jazz.net/ns/trs#> to discover TRS 1.0 resource endpoints.
  • For data sources advertising both TRS 2.0 and TRS 1.0 endpoint URIs, consumers should always prefer TRS 2.0 endpoints over TRS 1.0 endpoints.
  • Capitalize on the fact that the distinct TRS 2.0 and TRS 1.0 endpoint URIs for a data source are representations of the same underlying tracked resource set and change log. Make it possible for the customer to reconfigure your consumer to switch between a data source’s corresponding TRS 2.0 and TRS 1.0 endpoints without triggering a time-consuming index rebuild.
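The consumer-side selection rules in the list above can be sketched as a small function. It assumes that discovery has already produced a list of (property URI, endpoint URI) pairs. How those pairs are obtained is outside this sketch, and the function name is hypothetical.

```python
TRS2_PROPERTY = "http://open-services.net/ns/core/trs#trackedResourceSet"
TRS1_PROPERTY = "http://jazz.net/ns/trs#trackedResourceSet"

def choose_endpoint(advertised, trs1_enabled=False):
    """Pick the TRS endpoint to consume from the advertised pairs.

    Always prefers a TRS 2.0 endpoint. A TRS 1.0 endpoint is used only
    when no TRS 2.0 endpoint is advertised and TRS 1.0 support has been
    explicitly enabled. Returns (version, endpoint_uri) or None.
    """
    by_property = {}
    for prop, uri in advertised:
        by_property.setdefault(prop, uri)  # keep the first URI per property
    if TRS2_PROPERTY in by_property:
        return ("2.0", by_property[TRS2_PROPERTY])
    if trs1_enabled and TRS1_PROPERTY in by_property:
        return ("1.0", by_property[TRS1_PROPERTY])
    return None

both = [
    (TRS1_PROPERTY, "https://a.example.com/trs1"),
    (TRS2_PROPERTY, "https://a.example.com/trs2"),
]
only_v1 = [(TRS1_PROPERTY, "https://a.example.com/trs1")]
```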

Indexable Linked Data Source Discoverability

(This section is normative.)

The documentation for an Indexable Linked Data Provider MUST document its Indexable Linked Data Sources, including the URL of each Indexable Linked Data Source’s Tracked Resource Set resource and designated Access Context List resource.

In order to help an administrator of an Indexable Linked Data Consumer in configuring its access to Indexable Linked Data Sources, an Indexable Linked Data Provider MAY also make its Indexable Linked Data Sources discoverable. Discoverability is a convenience; an administrator can configure an Indexable Linked Data Consumer with a particular Indexable Linked Data Source knowing just the URLs of the Indexable Linked Data Source’s Tracked Resource Set and designated Access Context List resource. An administrator can retrieve the Access Context List resource to discover the titles and URIs of the Access Contexts being used with that Indexable Linked Data Source.

The RDF trs:trackedResourceSet property defined in Tracked Resource Set 2.0 can be used to declare the whereabouts of a Tracked Resource Set resource. The Tracked Resource Set resource is the object.

This allows the existence and location of an Indexable Linked Data Source’s Tracked Resource Set resource to be declared with an RDF statement like the following (rendered here in Turtle):

@prefix trs: <http://open-services.net/ns/core/trs#> .
<> trs:trackedResourceSet <https://a.example.com/trs1> . 

The RDF acc:accessContextList property declares the whereabouts of an Access Context List resource. The Access Context List resource is the object.

  • Property Name: accessContextList
  • Description: URL of Access Context List resource
  • Property URI: http://open-services.net/ns/core/acc#accessContextList

This allows the existence and location of an Access Context List resource to be declared with an RDF statement like the following (rendered here in Turtle):

@prefix acc: <http://open-services.net/ns/core/acc#> .
<> acc:accessContextList <https://a.example.com/acclist> . 

Where such RDF statements might be found is outside the scope of this specification.
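As an illustration only, a Provider that chooses to publish both declarations together could generate them with a helper like the one below. The function name is hypothetical, and the sketch assumes that the supplied URLs are already valid absolute IRIs. It rejects only the characters that cannot appear inside a Turtle IRI reference.

```python
TRS_NS = "http://open-services.net/ns/core/trs#"
ACC_NS = "http://open-services.net/ns/core/acc#"

# Characters that are not allowed unescaped inside a Turtle <...> IRI.
_FORBIDDEN = set('<>"{}|^`\\ ')

def discovery_turtle(trs_url, acc_list_url):
    """Render Turtle declaring a TRS and its Access Context List."""
    for url in (trs_url, acc_list_url):
        if any(ch in _FORBIDDEN or ord(ch) <= 0x20 for ch in url):
            raise ValueError(f"URL cannot be written as a Turtle IRI: {url!r}")
    return (
        f"@prefix trs: <{TRS_NS}> .\n"
        f"@prefix acc: <{ACC_NS}> .\n"
        f"<> trs:trackedResourceSet <{trs_url}> .\n"
        f"<> acc:accessContextList <{acc_list_url}> .\n"
    )

turtle = discovery_turtle("https://a.example.com/trs1",
                          "https://a.example.com/acclist")
```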

Appendix A: Samples

(This section is non-normative.)

See samples within the body of this specification.

Appendix B: Resource Shapes

Not applicable

Appendix C: Notices and References

Contributors

  • Jim des Rivieres (IBM, OSLC-Core)
  • Vivek Garg (IBM, OSLC-Core)
  • Steve Speicher (IBM, OSLC-Core Lead)
  • Arthur Ryman (IBM, OSLC-Core)
  • Nick Crossley (IBM, OSLC-Core)
  • Ian Green (IBM, OSLC-Core)

Reporting Issues on the Specification

The working group participants who author and maintain this working draft specification monitor a distribution list where issues or questions can be raised; see Core Mailing List.

Also, the issues found with this specification and their resolution can be found at Core 2.0 Issues.

License and Intellectual Property

We make this specification available under the terms and conditions set forth in the site Terms of Use, IP Policy, and the Workgroup Participation Agreement for this Workgroup.

References

Appendix D: Changes
