[xmlschemata] Re: Innie or Outie?

From: Eric van der Vlist <vdv@dyomedea.com>
Date: Sat Jun 22 2002 - 16:52:28 UTC

Hi Rick,

On Sat, 2002-06-22 at 11:12, Rick Jelliffe wrote:
> In Australia there is a schoolyard ontology of bellybuttons being
> "innies" or "outies".
> Schema languages seem to be the same. Schemachine is an outie and
> XVIF is (in Eric's trial implementation in RELAX NG) an innie :-)

> From: "Eric van der Vlist" <vdv@dyomedea.com>
> > Even though their current syntaxes are different, I am confident that
> > the semantics of the 3 elements composing xvif are generic enough to be
> > used as building blocks surrounded by Schemachine bells and whistles :-)
> On a logical level yes. But on an implementation level there are three
> very distinct levels:
> 1) Outside invoking someone's third party schema library
> 2) What goes on inside a third party schema library (and what
> provision they made for extensions)
> 3) What can be extracted and embedded by a third-party library.

At this stage, I don't think we should worry too much about the
implementation level. Yes, I am the one who has published an
implementation, but that's only because I wanted to be sure that what I
was proposing was possible to implement and because I wanted to
understand the implications of what I was proposing.

Now, at a logical level, there is IMO a lot of benefit to using common
vocabularies for things which have the same semantics.

For instance, moving XPath out of XSLT has been very positive since
XPath has been reused by other applications such as XPointer, W3C XML
Schema, the XPath filter or Schematron to name a few. And the benefit for
the user is the same whether implementations use common XPath libraries
or not.

I think that if we could define a common vocabulary for the inside of
your Schemachine and my xvif it would be a good thing for the users
(assuming of course that the semantics are the same) whether or not the
implementers use the same libraries.

> Figuring out how best to support this distinction is critical, because
> we don't want to make XXXX something where an implementer
> of a schema language component needs to understand the XXXX
> framework.

But, framework XXXX is probably 20 times simpler to understand and
implement than the schema language :-) ...

It's not the same level of implementation. Implementing a schema
language can be tricky, but implementing the framework (at least when
the framework is as simple as xvif) is just a matter of integrating
existing technologies and this integration may be documented and maybe
even standardized.

> In Schemachine, I decided to treat these three levels using different
> mechanisms each.
> 1) The outside level is handled by the framework, with some
> limited transformations available, such as treatment of validation
> results, stripping items, and wrapping results.
> 2) The only ways of the framework interacting with the
> library are through parameters or in-band in the document.
> 3) Embedded schemas are handled by particular validators
> that understand the format, just as any other extension.
> I think a lot of this comes down to our expectations of how things will
> pan out. I expect that a XXXX user will
> * prefer to use grammar-based constraints on structure
> ahead of Schematron initially
> * prefer to use datatyping embedded in the structure language more
> than free floating or Schematron constraints or DTD datatyping
> * prefer to use a declarative mechanism for keys and uniqueness,
> controlled vocabulary and link checking, rather than Schematron
> * use Schematron to fill in the gaps.
> * over time, as maintenance is required, make their grammars
> looser and migrate specific small-grained constraint checks
> to Schematron.
> * use a phase/stage mechanism (i.e. reify a bundle of switches)
> at the top level and within any schema components that support it
> by parameters (e.g. Schematron) for selecting versions over
> time, for progressive validation, and for black-box validation at
> various stages of an augmenting pipeline.
> So it seems to me that the essential difference is that XVIF seems
> to lead to intricately decorated schemas, and therefore difficult to
> reason about and therefore tool-requiring schemas. But I would
> expect Schemachine to grow by the grammar-based schemas
> becoming progressively looser as more constraints are moved to
> Schematron, or by fairly large chunks of schema being changed.

xvif can also lead to decorated transformations if you prefer :-)

> I think the weakness for my approach is that it does not tie into XSD's
> <import><include><redefine> mechanism or the inclusions in
> RELAX NG or DTDs. I guess the way it would have to do it is
> to specify, as a parameter on validation engines, a URL remapping,
> so that the particular schemas that are included during schema
> construction would be dependent on the phase mechanism.

Not only. That's probably very subjective, but I do prefer to keep
everything related to the definition of element "foo" together. I can
understand that other people prefer to keep things by "nature" (XSLT
with XSLT, Schematron with Schematron, RNG with RNG, ...) but I do
prefer to keep things by "subject" rather than by "nature".

The other advantage of integrating transformations within schemas is
that you don't have to repeat what you've expressed in your schema in
your transformation.

When you design a transformation for an element "foo" in a whole
document, you need to implement some knowledge of the structure of the
document in your transformation: you need to find the element "foo" in
the document, and since it may have a different model depending on its
location and on other criteria, you need to catch this in your match and
select conditions.

When you locate the template for transforming element "foo" in a schema,
this selection is done by the schema processor and you don't have to
write it twice.
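
To make this concrete, here is a minimal Python sketch of the idea. It
is not xvif and not RELAX NG: the SCHEMA dictionary, its "transform"
key and the validate_and_transform function are hypothetical names
invented for this illustration. The point is only that the walk which
validates the document is the same walk that decides which
transformation applies to which "foo", so no separate match or select
expression has to restate the document structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini "schema": each element definition lists its allowed
# children and, optionally, a transformation to apply where it matches.
SCHEMA = {
    "name": "doc",
    "children": [
        # "foo" directly under "doc": upper-case its text.
        {"name": "foo", "transform": str.upper, "children": []},
        {"name": "section", "children": [
            # "foo" under "section" has its own definition and transform.
            {"name": "foo", "transform": str.strip, "children": []},
        ]},
    ],
}

def validate_and_transform(element, definition):
    """Check element against definition, applying transforms in place.

    Returns False as soon as an unexpected element is found; elements
    visited before that point may already have been transformed.
    """
    if element.tag != definition["name"]:
        return False
    allowed = {child["name"]: child for child in definition["children"]}
    for child in element:
        child_def = allowed.get(child.tag)
        if child_def is None or not validate_and_transform(child, child_def):
            return False
    transform = definition.get("transform")
    if transform is not None:
        element.text = transform(element.text or "")
    return True

doc = ET.fromstring(
    "<doc><foo>abc</foo><section><foo>  xyz  </foo></section></doc>"
)
ok = validate_and_transform(doc, SCHEMA)
result = ET.tostring(doc, encoding="unicode")
```

Each "foo" gets the transformation attached to the definition it was
validated against, without any path expression naming its location.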

Furthermore, you can use (especially when the hosting language is RELAX
NG) the ability of the schema language to manage alternatives and
express things such as "this date is a French date if the result of the
French to ISO conversion is a valid ISO date or a German date if the
result of the German to ISO conversion is a valid ISO date".

At the end of the day, I think that this is the only way to avoid
writing the schema twice (once with XPath and once with a grammar-based
schema language)!

Thanks for your mail!

> Leigh Dodds has a nice weblog entry on this for people who want to get
> up-to-speed, at http://weblogs.userland.com/eclectic/ for June 20, 2002.
> Cheers
> Rick Jelliffe

See you in San Diego.
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
Received on Sat Jun 22 18:52:31 2002
