ADVANCED MATERIALS & PROCESSES | APRIL 2016 | 27

data standards necessary to enable discovery, access, and use of materials science and engineering data. It advocates the development of highly distributed repositories because data sharing is optimized in a many-to-many, net-centric data environment that manages data across a community of interest (COI) rather than connecting disparate data sources with brittle, point-to-point interfaces throughout the research ecosystem.

MGI also recommends community-developed standards that provide “…the format, metadata, data types, criteria for data inclusion and retirement, and protocols necessary for interoperability and seamless data transfer.” The MGI website lists 10 partners working on such standards [7], and there are many others. But these standards already exist in the W3C Semantic Web, where they are proven and in use in other research industries such as the life sciences [8] and by big-data giants including Google [9]. The W3C Linking Open Drug Data [10] effort brings together pharmaceutical companies Eli Lilly, AstraZeneca, and Johnson & Johnson in a cooperative effort to interlink openly licensed data about drugs and clinical trials to aid drug discovery and development.

How do W3C standards establish the infrastructure to meet MGI goals? Distributing data across the web requires a standard mechanism to specify the existence and meaning of connections between items described in the data. The mechanism is provided by the W3C’s Resource Description Framework (RDF) specification.

The RDF specification is built on Uniform/Internationalized Resource Identifiers (URIs/IRIs) that uniquely identify data resources (Fig. 2). RDF provides the baseline for “linking data” to form the web of data, a graph structure for representing machine-understandable information. XML provides one means of serializing RDF data (the RDF/XML syntax). The RDF query language SPARQL enables users to write unambiguous queries, which can be distributed to multiple SPARQL endpoints and evaluated there, with the results gathered afterward.
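
As a rough illustration of the ideas above, the following sketch models RDF statements as (subject, predicate, object) triples whose terms are URIs, and sends one query pattern to two separate stores before merging the answers. It is a minimal sketch in plain Python, not a real RDF or SPARQL implementation. The `http://example.org/materials/` namespace, the sample and method names, and the functions `match` and `federated_subjects` are all hypothetical and chosen only for illustration; the in-memory sets stand in for what would be remote SPARQL endpoints.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple

# Hypothetical namespace for illustration; not a published vocabulary.
EX = "http://example.org/materials/"
# The standard RDF "type" property.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]  # (subject, predicate, object), all URIs here

# Two independent repositories, each holding its own triples. In practice
# each would sit behind its own SPARQL endpoint (assumption for this sketch).
lab_a: Set[Triple] = {
    (EX + "sample/42", RDF_TYPE, EX + "Alloy"),
    (EX + "sample/42", EX + "testedBy", EX + "method/Thermal"),
}
lab_b: Set[Triple] = {
    (EX + "sample/77", RDF_TYPE, EX + "Alloy"),
    (EX + "sample/77", EX + "testedBy", EX + "method/Optical"),
}


def match(store: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for t in store:
        if ((s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)):
            yield t


def federated_subjects(stores: Iterable[Iterable[Triple]],
                       p: str, o: str) -> List[str]:
    """Send the same pattern to every store and merge the matching subjects."""
    found: Set[str] = set()
    for store in stores:
        found.update(t[0] for t in match(store, p=p, o=o))
    return sorted(found)


# Roughly analogous to the SPARQL pattern: SELECT ?s WHERE { ?s rdf:type ex:Alloy }
alloys = federated_subjects([lab_a, lab_b], RDF_TYPE, EX + "Alloy")
print(alloys)
```

Because both stores identify the class with the same URI, the merged result needs no point-to-point mapping between the two data sources.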

The Web Ontology Language (OWL) and RDF Schema (RDF-S) together provide tools for publishing the semantics of schemas. OWL provides the abstract syntax to develop expressive ontologies that describe content relative to other described entities and concepts within a domain of interest, and that assign a URI to uniquely identify data components. Together these establish semantic meaning. RDF-S provides basic logic primitives for writing lightweight ontologies that define classes of resources, organize their hierarchies, and add more intelligence to the data. The Rule Interchange Format (RIF) provides the framework to encode knowledge as first-order logic used to implement inference engines that process conditions and draw conclusions. The Unifying Logic layer applies higher-order reasoning over result sets.
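
To make the idea of class hierarchies that "add more intelligence to the data" concrete, the sketch below applies two RDF-S entailment patterns (subclass transitivity, and inheritance of class membership through `rdfs:subClassOf`) by simple forward chaining. It is a minimal sketch under stated assumptions: the `ex:` class and instance names are hypothetical, terms are written as prefixed strings rather than full URIs, and the function `rdfs_closure` is an illustrative helper, not part of any W3C specification or library.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Prefixed names used as plain strings for brevity (assumption for this sketch).
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical lightweight ontology plus one instance.
triples: Set[Triple] = {
    ("ex:ThermalTest", RDFS_SUBCLASS_OF, "ex:MaterialsTest"),
    ("ex:DSCRun", RDFS_SUBCLASS_OF, "ex:ThermalTest"),
    ("ex:run-001", RDF_TYPE, "ex:DSCRun"),
}


def rdfs_closure(asserted: Set[Triple]) -> Set[Triple]:
    """Forward-chain two RDF-S patterns until no new triple is derived."""
    inferred = set(asserted)
    while True:
        new: Set[Triple] = set()
        subclass = [(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS_OF]
        for sub, sup in subclass:
            # Subclass transitivity: A ⊑ B and B ⊑ C imply A ⊑ C.
            for sub2, sup2 in subclass:
                if sup == sub2:
                    new.add((sub, RDFS_SUBCLASS_OF, sup2))
            # Membership inheritance: x is an A and A ⊑ B imply x is a B.
            for s, p, o in inferred:
                if p == RDF_TYPE and o == sub:
                    new.add((s, RDF_TYPE, sup))
        if new <= inferred:  # nothing new: fixed point reached
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
# The instance is now also known to be a MaterialsTest, although that
# statement was never asserted directly.
print(("ex:run-001", RDF_TYPE, "ex:MaterialsTest") in closure)
```

A query for all `ex:MaterialsTest` instances over the closure would therefore return `ex:run-001`, which is the kind of conclusion an inference engine draws from the schema.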

The Proof and Trust components use provenance models, such as the W3C PROV Ontology (PROV-O), to provide explanations about results and their sources, including information about the entities, activities, and people involved. The provenance ontology supports assessments about the quality, reliability, and trustworthiness of data and results. Cryptographic security may be applied throughout the stack using industry standards to protect data and services. The W3C Semantic Web framework supports a host of user interfaces and applications to analyze, visualize, and share enriched, semantically linked data.
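
The provenance explanations described above can be sketched with three PROV-O properties: `prov:wasGeneratedBy` (entity to activity), `prov:used` (activity to entity), and `prov:wasAssociatedWith` (activity to agent). The example below walks those links backward from a result to its sources. It is a minimal sketch: the `ex:` report, data, activity, and agent names are hypothetical, the helpers `objects` and `trace` are illustrative, and the data is assumed to be acyclic.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# PROV-O property names (prov: abbreviates http://www.w3.org/ns/prov#).
WAS_GENERATED_BY = "prov:wasGeneratedBy"
USED = "prov:used"
WAS_ASSOCIATED_WITH = "prov:wasAssociatedWith"

# Hypothetical provenance record for one result.
triples: Set[Triple] = {
    ("ex:hardness-report", WAS_GENERATED_BY, "ex:analysis-run"),
    ("ex:analysis-run", USED, "ex:raw-indent-data"),
    ("ex:analysis-run", WAS_ASSOCIATED_WITH, "ex:analyst-1"),
    ("ex:raw-indent-data", WAS_GENERATED_BY, "ex:indent-test"),
    ("ex:indent-test", WAS_ASSOCIATED_WITH, "ex:technician-2"),
}


def objects(subject: str, predicate: str) -> List[str]:
    """Return, in sorted order, every object linked from subject by predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def trace(entity: str, depth: int = 0) -> List[str]:
    """Explain how an entity came to exist by walking its provenance backward.

    Assumes the provenance graph has no cycles.
    """
    lines: List[str] = []
    indent = "  " * depth
    for activity in objects(entity, WAS_GENERATED_BY):
        agents = ", ".join(objects(activity, WAS_ASSOCIATED_WITH)) or "unknown agent"
        lines.append(f"{indent}{entity} was generated by {activity} ({agents})")
        # Recurse into every input the activity consumed.
        for source in objects(activity, USED):
            lines.extend(trace(source, depth + 1))
    return lines


explanation = trace("ex:hardness-report")
print("\n".join(explanation))
```

A reader assessing the trustworthiness of the report can see which activity produced it, who was responsible, and which upstream data it depended on.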

Fig. 2 —

W3C’s Resource Description Framework (RDF) specification is built on Uniform/Internationalized Resource Identifiers that uniquely identify data resources.

[Figure labels: Elemental, Thermal, Wet Chemistry, Metallurgical, Organic, Optical]