

data standards necessary to enable discovery, access, and use of materials science and engineering data. It advocates the development of highly distributed repositories because data sharing is optimized in a many-to-many, net-centric data environment that manages data across a community of interest (COI) rather than connecting disparate data sources with brittle, point-to-point interfaces throughout the research ecosystem.
MGI also recommends community-developed standards that provide “…the format, metadata, data types, criteria for data inclusion and retirement, and protocols necessary for interoperability and seamless data transfer.” The MGI website lists 10 partners working on such standards [7], and there are many others. However, such standards already exist in the W3C Semantic Web. They are proven and in use in other research fields such as the life sciences [8], and by big-data giants including Google [9]. The W3C Linking Open Drug Data [10] effort brings together pharmaceutical companies Eli Lilly, AstraZeneca, and Johnson & Johnson in a cooperative effort to interlink openly licensed data about drugs and clinical trials to aid drug discovery and development.
How do W3C standards establish the infrastructure to meet MGI goals? Distributing data across the web requires a standard mechanism to specify the existence and meaning of connections between items described in the data. The mechanism is provided by the W3C’s Resource Description Framework (RDF) specification.
The RDF specification is built on Uniform/Internationalized Resource Identifiers (URIs/IRIs) that uniquely identify data resources (Fig. 2). RDF provides the baseline for “linking data” to form the web of data, a graph structure for representing machine-understandable information. XML provides one means of expressing data as RDF (the RDF/XML syntax). The RDF query language SPARQL enables users to write unambiguous queries, which can be distributed to multiple SPARQL endpoints and evaluated there, with the results then gathered.
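
The graph idea above can be sketched in a few lines of plain Python. This is a toy illustration only, not an RDF library or a SPARQL engine: the `http://example.org/materials/` namespace, the sample and test names, and the `match` helper are all hypothetical and chosen for the example. Each statement is a (subject, predicate, object) triple whose terms are URIs, and a query is a triple pattern in which `None` plays the role of a SPARQL variable.

```python
# Toy triple store: a sketch of the RDF data model, not a real RDF implementation.
EX = "http://example.org/materials/"  # hypothetical namespace for this example
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Every statement is a (subject, predicate, object) triple identified by URIs.
triples = {
    (EX + "sample42", RDF_TYPE, EX + "Alloy"),
    (EX + "sample42", EX + "hasTest", EX + "test7"),
    (EX + "test7", RDF_TYPE, EX + "ThermalTest"),
    (EX + "test7", EX + "performedBy", EX + "labA"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None behaves like a query variable."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Roughly equivalent in spirit to:
#   SELECT ?test WHERE { ex:sample42 ex:hasTest ?test }
tests = [o for _, _, o in match(triples, s=EX + "sample42", p=EX + "hasTest")]
print(tests)
```

Because the terms are globally unique identifiers, triples published by different repositories can be merged by simple set union, which is what makes the many-to-many sharing model workable.
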
Together, the Web Ontology Language (OWL) and RDF Schema (RDF-S) provide tools for publishing the semantics of schemas. OWL provides the abstract syntax to develop expressive ontologies that describe content relative to other described entities/concepts within a domain of interest, and assign a URI to uniquely identify data components. Together these establish semantic meaning. RDF-S provides basic logic primitives for writing lightweight ontologies that define classes of resources, organize their hierarchies, and add more intelligence to the data. The Rule Interchange Format (RIF) provides a framework for encoding knowledge as first-order logic rules, which inference engines use to process conditions and draw conclusions. The Unifying Logic layer applies higher-order reasoning over result sets.
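
To make the idea of class hierarchies and inference concrete, here is a minimal sketch of how two RDF-S entailment rules might be applied by a simple forward-chaining loop. It is an assumption-laden illustration, not a production reasoner: the `ex:` class and instance names are hypothetical, and only `rdfs:subClassOf` transitivity and `rdf:type` propagation are handled.

```python
# Sketch of RDF Schema style inference over a tiny, hypothetical class hierarchy.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

facts = {
    ("ex:ThermalTest", SUBCLASS, "ex:MaterialTest"),
    ("ex:MaterialTest", SUBCLASS, "ex:Activity"),
    ("ex:test7", TYPE, "ex:ThermalTest"),
}


def infer(graph):
    """Apply two entailment rules repeatedly until no new triples appear."""
    closed = set(graph)
    while True:
        new = set()
        for a, p1, b in closed:
            for c, p2, d in closed:
                if b != c or p2 != SUBCLASS:
                    continue
                if p1 == SUBCLASS:
                    # A subClassOf B, B subClassOf D  =>  A subClassOf D
                    new.add((a, SUBCLASS, d))
                elif p1 == TYPE:
                    # x type B, B subClassOf D  =>  x type D
                    new.add((a, TYPE, d))
        if new <= closed:
            return closed
        closed |= new


entailed = infer(facts)
print(sorted(entailed - facts))
```

A query for every `ex:Activity` would now also return `ex:test7`, even though no one asserted that fact directly. This is the sense in which a schema adds intelligence to the data.
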
The Proof and Trust components use provenance models, such as the W3C PROV Ontology (PROV-O), to provide explanations about results and their sources, including information about entities, activities, and people involved. The provenance ontology supports assessments of the quality, reliability, and trustworthiness of data and results. Cryptographic security may be applied throughout the stack using industry standards to protect data and services. The W3C Semantic Web framework supports a host of user interfaces and applications to analyze, visualize, and share enriched, semantically linked data.
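
As a rough sketch of how a provenance record supports such assessments, the snippet below walks backward from a result to the activities, inputs, and agents behind it. The property names `prov:wasGeneratedBy`, `prov:used`, and `prov:wasAssociatedWith` are PROV-O terms. Everything else, including the `ex:` resource names and the `lineage` helper, is hypothetical and exists only for this example.

```python
# Sketch of tracing provenance with PROV-O style statements (hypothetical data).
prov = {
    ("ex:meltingPointReport", "prov:wasGeneratedBy", "ex:thermalAnalysis"),
    ("ex:thermalAnalysis", "prov:used", "ex:rawDscCurve"),
    ("ex:thermalAnalysis", "prov:wasAssociatedWith", "ex:labTechnician"),
    ("ex:rawDscCurve", "prov:wasGeneratedBy", "ex:dscRun"),
    ("ex:dscRun", "prov:wasAssociatedWith", "ex:instrumentOperator"),
}


def lineage(graph, entity):
    """Walk backward from an entity, collecting activities, inputs, and agents."""
    activities, sources, agents = set(), set(), set()
    pending, seen = [entity], set()
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # guard against cycles in the provenance graph
        seen.add(current)
        for s, p, o in graph:
            if s != current:
                continue
            if p == "prov:wasGeneratedBy":
                activities.add(o)
                pending.append(o)
            elif p == "prov:used":
                sources.add(o)
                pending.append(o)
            elif p == "prov:wasAssociatedWith":
                agents.add(o)
    return activities, sources, agents


activities, sources, agents = lineage(prov, "ex:meltingPointReport")
print(sorted(activities), sorted(sources), sorted(agents))
```

A consumer could then apply its own trust policy to the returned agents and sources, for example accepting only results whose full chain traces back to accredited laboratories.
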
Fig. 2: W3C’s Resource Description Framework (RDF) specification is built on Uniform/Internationalized Resource Identifiers that uniquely identify data resources.
[Figure labels: Elemental, Thermal, Wet Chemistry, Metallurgical, Organic, Optical]