xmlguru.cz

Spreading the XML paradigm around

Evaluation of ECMA responses to Czech OOXML comments

2008-01-14

Today ECMA published responses to all comments submitted by national bodies during the DIS29500 ballot. Unfortunately, due to stupid ISO rules, ECMA responses are not public; only ISO member organizations can see them. As I was responsible for collecting the Czech comments, I'm also in a good position to evaluate ECMA's responses to them.

[UPDATED] I missed one last minute change made by ECMA, so there is only one unresolved editorial comment and six partially resolved comments.


The following table contains an evaluation of ECMA's responses to the Czech comments. At this time the document reflects only my personal opinions, but it will be the principal input for the final evaluation of ECMA's proposed disposition by the Czech Standards Institute.

ECMA has already provided a proposed resolution for 75 comments (out of a total of 75 Czech comments). This means that 100.00% of Czech comments were handled by ECMA.

90.67% of comments were satisfactorily resolved.

8.00% of comments were resolved only partially.

1.33% of comments were not satisfactorily resolved.
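These percentages follow from the counts in the update note: one unresolved comment and six partially resolved ones out of 75. A quick check in Python (the variable and function names are mine, chosen only for illustration):

```python
# Counts taken from the text: 75 comments in total, 1 unresolved, 6 partial.
total = 75
not_resolved = 1
partial = 6
satisfactory = total - not_resolved - partial  # the remaining 68 comments


def share(count: int, total: int) -> str:
    """Format a count as a percentage of the total with two decimals."""
    return f"{100 * count / total:.2f}%"


summary = {
    "satisfactory": share(satisfactory, total),
    "partial": share(partial, total),
    "not satisfactory": share(not_resolved, total),
}
print(summary)
```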

ID | CZ Comment | CZ Proposed Resolution | ECMA Proposed Resolution | Resolution
CZ-0001

Proposed standard DIS29500 has a big functional overlap with the existing standard ISO/IEC 26300:2006 (ODF), which was approved quite recently, in the last year. However, we think that users of office applications will benefit from having Office Open XML standardized as DIS29500 if the comments below are incorporated into the final version of the standard. This is mainly because DIS29500 has features for representing common document elements which are not yet supported by the ODF standard, and it will take several years before those features are also incorporated into the standardized ODF format. Another reason is OOXML's ability to represent a large corpus of existing documents (previously stored usually in proprietary binary formats) in an open and easy-to-process format. For each standard it is also important to gain mass adoption, otherwise its benefits are diminished. It seems that the majority of office applications (in terms of market share) will support DIS29500, which is not yet the case for ODF.

Coexistence of two very similar international standards such as ODF and OOXML is undesirable in the long term. Therefore we ask JTC1 to start work on a progressive harmonization of both formats in cooperation with the OASIS and ECMA organizations, which are the originators of these document formats.

There are many possible approaches to harmonization. For example, as the first step both formats could start to use the same unified packaging system based on OPC (as described in Part 2 of DIS29500). Moreover, OPC could be extended to support storage of alternative representations of a single object: a single file could then contain one document stored in several variants (e.g. ODF, OOXML and XHTML). Applications would then be free to choose the format which best fits their needs and capabilities.

In the long term it is recommended to carefully study both formats and then create a unified abstract document model. The ODF and OOXML formats would then serve just as alternative serializations of this data model. If experience discloses weaknesses of both the ODF and OOXML formats, it is possible to start thinking about creating a completely new serialization of the document data model.

Proposed Disposition | Resolution provided by ECMA is satisfactory.

This comment is just introductory text; we are not expecting any resolution for it.

CZ-0002

The standard is huge, and not all applications have to implement support for all document types. Splitting the standard into several smaller and more standalone parts would make it more usable.

Create separate parts for WordprocessingML, SpreadsheetML, PresentationML and shared vocabularies.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0003

Long attribute descriptions, including examples of use, are repeated on all elements supporting the attribute. This lengthens the text of the standard. Moreover, in several places the examples are not related to the element currently being defined, because the attribute description is shared.

The list of attributes for a given element should contain only the name of the attribute, its data type and a very brief description (a single line or sentence). The detailed attribute description should be provided just once and it should be referenced from all attribute instances.

Proposed Disposition | Resolution provided by ECMA is not satisfactory.

We don't think that the current editorial approach is better for any group of users, including reviewers and implementers. Repeating the same normative text is simply not useful and it is not common in ISO standards. If this editorial change is not implemented in the amended DIS after the BRM, then it should definitely be implemented in the first revision in accordance with article 13.14 of the JTC1 directives.

CZ-0004

The VML language is marked as deprecated and it is intended as a temporary solution for maintaining backwards compatibility. Therefore there is no reason for including the VML description directly in the standard.

VML specification should be published as Technical Report only.

Proposed Disposition | Resolution provided by ECMA is only partial. Additional changes in the proposed resolution are needed in order to resolve our comment.

The proposed resolution is definitely better than the original DIS. However, we still think that a separate TR is a more appropriate place for VML than an annex, especially when the VML definition occupies around 600 pages.

CZ-0005

Reference to ZIP format specification is not pointing to particular ZIP version.

Include ZIP format version into reference or state that the latest version available should be used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0006

Only language codes defined in ISO 639, ISO 3166 and ISO 15924 should be used for language identification. If there is no corresponding ISO code for some combination of language, region and script it is possible to use newer language identification mechanism defined in RFC 4646 (BCP 47).

Definition of ST_Lang type should use language identifiers as defined in BCP 47. ST_LangCode type should be completely removed and for languages which cannot be represented using BCP 47 new language and country code should be added into ISO 639 and ISO 3166, for example utilizing space reserved for local codes.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0007

It is not clear whether numbers in table are decimal or hexadecimal (text before table mentions hexadecimal numbers, but table contains decimal numbers).

Number range requires 4 hexadecimal digits, not just two as is written in the text.

The example wrongly describes number 1033 as being hexadecimal.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0008

The default date system should be “1904” because it does not suffer from the leap year bug of the “1900” system, in which the year 1900 is wrongly considered to be a leap year. All newly created documents should use the “1904” date system; the “1900” based system should be allowed only for representation of already existing documents.

“date1904” attribute should be mandatory so it is always explicitly known which date system is used. Text of the standard should recommend usage of “1904” date system. Standard should allow usage of the “1900” date system only in documents that were converted from legacy formats.

Proposed Disposition | Resolution provided by ECMA is satisfactory.

However, if the namespace for the final standard is changed (or another versioning strategy is used to differentiate it from DIS 29500), the default value of the dateCompatibility attribute should be changed to false.
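To make the leap year bug concrete, here is a minimal sketch in Python. The serial conventions are my assumptions based on how legacy spreadsheets commonly behave (serial 1 = 1900-01-01 with a phantom 1900-02-29 at serial 60 in the “1900” system; serial 0 = 1904-01-01 in the “1904” system); they are not quoted from DIS29500, and the function names are hypothetical.

```python
import calendar
from datetime import date, timedelta
from typing import Optional

# Gregorian rule: 1900 is divisible by 100 but not by 400, so it is not a leap year.
assert not calendar.isleap(1900)


def from_serial_1904(serial: int) -> date:
    """Assumed convention: serial 0 is 1904-01-01."""
    return date(1904, 1, 1) + timedelta(days=serial)


def from_serial_1900(serial: int) -> Optional[date]:
    """Assumed legacy convention: serial 1 is 1900-01-01 and
    serial 60 stands for the nonexistent 1900-02-29."""
    if serial < 1:
        raise ValueError("serial values start at 1 in this sketch")
    if serial == 60:
        return None  # the phantom leap day has no real calendar date
    # Real dates after the phantom day are shifted by one extra day.
    offset = serial - 1 if serial < 60 else serial - 2
    return date(1900, 1, 1) + timedelta(days=offset)


print(from_serial_1900(59), from_serial_1900(60), from_serial_1900(61))
print(from_serial_1904(0))
```

Under these assumptions every consumer of the “1900” system has to special-case serial 60, which is exactly the kind of baggage the “1904” system avoids.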

CZ-0009

The standard should provide facilities for representing dates prior to 1900-01-01/1904-01-01.

Either negative values should be allowed as serial value of date or a completely new date/time data type should be introduced.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0010

Definition of a spreadsheet formula language should be put into a separate standard or part to make it reusable in other standards, for example in ISO/IEC 26300:2006 (ODF).

Proposed Disposition | Resolution provided by ECMA is satisfactory.

The OASIS TC responsible for ODF does not think that the OOXML formula language is appropriate for ODF (although it technically can be used inside it), so we are no longer insisting on separating the formula language right now.

CZ-0011

The standard describes the VML format as deprecated and states that DrawingML should replace it. Because of this, DrawingML content should be allowed in all places where currently only VML content is allowed in the various vocabularies defined in DIS29500.

Allow DrawingML content in all places where VML is allowed.

In particular inside “background”, “pict” and “object” elements.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0012

Reference pointing to part 5 section 12 is not meaningful.

Fix the reference; it should probably point to section 11.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0013

Optimizing output for a particular Web browser is generally considered bad practice. If an application should ever support this feature for whatever reason, then the standard should provide more parameters for controlling this feature, and a normative list of Web browsers should not be included in the standard, as browsers are continuously evolving and adding support for new technologies.

The standard should define the following elements for describing browser capabilities: allowGIF, allowJPEG, allowPNG, allowSVG, doNotRelyOnCSS, doNotRelyOnJavascript, relyOnVML, doNotSaveWebPageAsSingleFile. The table after line 18 should be removed or marked as informative.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0014

Behavior of “autoSpaceLikeWord95” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0015

Behavior of “footnoteLayoutLikeWW8” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0016

Behavior of “lineWrapLikeWord6” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0017

Behavior of “mwSmallCaps” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0018

Behavior of “shapeLayoutLikeWW8” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0019

Behavior of “suppressTopSpacingWP” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0020

Behavior of “truncateFontHeightsLikeWP6” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0021

Behavior of “useWord2002TableStyleRules” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0022

Behavior of “useWord97LineBreakRules” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0023

Behavior of “wpJustification” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0024

Behavior of “wpSpaceWidth” element is not sufficiently defined.

Add definition of behavior for this element. Especially, the definition should list formatting differences between situations when the element is used and when it is not used.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0025

“uiCompat97To2003” parameter is related to an application behavior but not to a document and its content. As such it should not be part of the standard. If necessary applications could use custom elements defined in accordance with rules of Part 5 for storing such information.

Remove “uiCompat97To2003” element from the standard.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0026

Example uses MS-DOS/Windows file path conventions. To improve interoperability all paths should be specified as URIs.

Consistently use URIs for specifying paths in the whole standard. If a reference to a local file system is necessary, use the “file” URI scheme.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
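As an illustration of what the requested change means in practice, here is a small Python sketch that turns an MS-DOS/Windows style path into a “file” URI. The function name and the sample path are hypothetical, and the sketch handles only absolute paths with a drive letter.

```python
from urllib.parse import quote


def windows_path_to_file_uri(path: str) -> str:
    """Convert an absolute drive-letter path to a file URI.

    Backslashes become forward slashes and characters that are not
    allowed in a URI path (such as spaces) are percent-encoded.
    """
    if len(path) < 3 or path[1] != ":" or path[2] not in "\\/":
        raise ValueError("this sketch expects a path like C:\\dir\\file")
    normalized = path.replace("\\", "/")
    return "file:///" + quote(normalized, safe="/:")


example_uri = windows_path_to_file_uri(r"C:\Documents\My report.docx")
print(example_uri)  # file:///C:/Documents/My%20report.docx
```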
CZ-0027

There is no parameter for specifying type of included data in INCLUDETEXT field. It is not always possible to reliably determine type without explicit content type specification. Moreover, sometimes user might want to load data in a different way—for example he or she might want to load XML document as a plain text to show source code of this XML file.

Add two additional parameters. One for specifying MIME type of included data and second for specifying encoding of included data (to handle situations when encoding couldn't be determined from file contents).

Proposed Disposition | Resolution provided by ECMA is satisfactory.

Please also add reference to IANA registry of encoding names.

CZ-0028

MACROBUTTON field doesn't define interface for macro invocation.

Extend the description and state that macro invocation is application dependent and it is not defined in this version of the standard.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0029

Images as shown in the standard cannot be faithfully reproduced.

Attach an electronic representation of all graphical objects in an open vector format like SVG, CGM or DrawingML to the standard.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0030

The standard does not allow custom graphics to be used for artistic borders.

Allow artistic borders based on any image provided or completely remove artistic borders from the standard.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0031

Length of xs:hexBinary data type is specified using bytes not characters.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0032

Reference to definition of “chicago” numbering format is insufficient.

Specify the term which can be used to look up the numbering format definition in the Chicago Manual of Style, or include a more detailed description of this numbering format.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0033

Specifying allowed page sizes by enumeration is too restrictive.

Add new value 0 (= custom paper size) for “paperSize” attribute. Page size will be specified manually using attributes like “pageWidth” and “pageHeight” when this value is used.

Do the same modification also for the corresponding attribute of “pageSetup” element in section 5.7.2.135 (p. 4063).

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0034

The standard does not reference the QuickTime specification. Moreover, the need for a QuickTime-specific element is not justified, as there is already a generic element for embedding video data (videoFile).

Provide a better explanation of why it is necessary to have a specific QuickTime element. Add a reference to the definition of the QuickTime format.

Proposed Disposition | Resolution provided by ECMA is only partial. Additional changes in the proposed resolution are needed in order to resolve our comment.

It is still not clear why there is a need for a special QuickTime element if there is already generic element for video.

CZ-0035

Putting XML fragment into an attribute value is completely unacceptable.

Use a nested element instead of the equationxml attribute. This change will allow a mathematical equation to be represented directly in XML syntax without the need for escaping. We will not insist on this change if VML is moved into a separate Technical Report as suggested in one of the previous comments.

Proposed Disposition | Resolution provided by ECMA is satisfactory.

There is a typo in modified description of the equationxml attribute (contentType -> equationxmlContentType).

CZ-0036

Putting XML fragment into an attribute value is completely unacceptable.

Use nested element instead of gfxdata attribute for storing direct representation of XML. We will not insist on this change if VML is moved into a separate Technical Report as suggested in one of previous comments.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0037

It is not clear whether and how other formats like PNG or EPS can be used for storing clipboard data.

Modify the description in such a way that it is clear that any bitmap format can be used for the “Bitmap” type and that any metaformat can be used for the “Pict” type. Change the remaining types in the same fashion. Accompany each clipboard format type with several examples of possible image formats, for example PNG, BMP, GIF and JPEG for the “Bitmap” type and EMF, EPS and SVG for the “Pict” type.

Alternatively, consider using more general content type identification mechanism based on MIME types (like image/png).

Add example showing how to represent PNG image stored inside clipboard.

Proposed Disposition | Resolution provided by ECMA is only partial. Additional changes in the proposed resolution are needed in order to resolve our comment.

We still would like to see examples of using formats which don't have their explicit type in enumeration—for example SVG or CGM.

CZ-0038

Escape mechanism does not define escaping for “_” character.

Add escaping definition for “_” character.

Proposed Disposition | Resolution provided by ECMA is only partial. Additional changes in the proposed resolution are needed in order to resolve our comment.

The provided escaping is ambiguous. Require escaping of all underscores, not just the initial one. With the current disposition it is not clear how to encode, for example, the string "___" (three underscores in a row).
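To show why escaping every underscore removes the ambiguity, here is a minimal Python sketch. It assumes an escape of the form `_xHHHH_` (four hexadecimal digits of the code point), which is my assumption for illustration; ECMA's actual wording is not public, and the function names are hypothetical.

```python
import re

# An escape is an underscore, "x", four hex digits and a closing underscore.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def escape(text: str) -> str:
    """Escape every underscore and every C0 control character
    other than tab, line feed and carriage return."""
    out = []
    for ch in text:
        if ch == "_" or (ord(ch) < 0x20 and ch not in "\t\n\r"):
            out.append(f"_x{ord(ch):04X}_")
        else:
            out.append(ch)
    return "".join(out)


def unescape(text: str) -> str:
    """Reverse escape(): because no literal underscore survives encoding,
    every _xHHHH_ sequence in the input is known to be an escape."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


for sample in ["___", "_x0041_", "plain text", "a_b"]:
    assert unescape(escape(sample)) == sample

print(escape("___"))  # _x005F__x005F__x005F_
```

The price is a longer encoding for strings with many underscores, but decoding needs no lookahead and no special cases.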

CZ-0039

It is not clear what the purpose of the “cf” element is. Is it used for holding clipboard content or is it used only for identification of the clipboard data format? The standard does not justify the need for such an element in an interchange format like OOXML.

Clarify element definition and its usage.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0040

Text encoding specified for the “textPr” element should not use the codePage attribute, which can contain only one of the predefined codes. Encoding should be specified using character encoding names registered at IANA instead.

Replace “codePage” attribute with “encoding” attribute. Value of this attribute can be any encoding name from the corresponding IANA registry (http://www.iana.org/assignments/character-sets).

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0041

Part names are compared case insensitively but only for ASCII characters. Why is comparison not case insensitive for all Unicode/ISO 10646 characters which are available in both lowercase and uppercase variants?

Clarify this conflict or define comparison as case sensitive.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
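The difference between the two kinds of comparison is easy to demonstrate. The following Python sketch contrasts an ASCII-only case-insensitive comparison with a full Unicode one; the part names are made up for the example and the function names are hypothetical.

```python
def ascii_casefold(name: str) -> str:
    """Lowercase only the letters A-Z, leaving all other characters untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


def equal_ascii_ci(a: str, b: str) -> bool:
    """Case-insensitive comparison restricted to ASCII letters."""
    return ascii_casefold(a) == ascii_casefold(b)


def equal_unicode_ci(a: str, b: str) -> bool:
    """Case-insensitive comparison using Unicode case folding."""
    return a.casefold() == b.casefold()


# ASCII-only names: both comparisons agree.
print(equal_ascii_ci("/word/Document.xml", "/word/document.xml"))    # True
print(equal_unicode_ci("/word/Document.xml", "/word/document.xml"))  # True

# Names with Czech letters: the ASCII-only comparison treats them as different.
print(equal_ascii_ci("/word/Příloha.xml", "/word/PŘÍLOHA.xml"))      # False
print(equal_unicode_ci("/word/Příloha.xml", "/word/PŘÍLOHA.xml"))    # True
```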
CZ-0042

It should be possible to attach additional metadata like language to each keyword stored inside “keywords” element.

Change the content model of the “keywords” element to mixed content in which subelements can be used to mark up individual keywords and to attach additional text properties to each keyword.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0043

Precise algorithm for extracting custom XML markup from document is not defined.

Define algorithm for converting custom XML markup into a standalone XML document.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0044

“w” and “h” attributes are optional and it is not defined how to compute their value from value of “code” attribute.

Either “w” and “h” attributes should be required or it should be defined how to compute page size from the value of “code” attribute.

It is not clear what the purpose of “code” attribute is. Improve its description.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0045

The standard uses several different length units: for example, font size is specified in half-points (see ST_HpsMeasure Part 4/2.18.48/p. 1742), while DrawingML uses the EMU unit (see ST_Coordinate data type Part 4/5.1.12.16/p. 3694) and hundredths of a point (see ST_TextPoint data type Part 4/5.1.12.75/p. 3861). Elsewhere the twips unit (see ST_TwipsMeasure Part 4/2.18.105/p. 1836) is used. Although usage of such different units might have some benefits, like a suitable scale or elimination of rounding errors, it would be very useful if any length value could be specified using any common length unit.

Modify all length data types to support also values with specified measure unit. At least the following units should be supported: cm, mm, in, pc and pt. These units must be recognized during document loading but they do not have to be preserved during editing session. When saving a default unit for given length data type might be used.

Proposed Disposition | Resolution provided by ECMA is only partial. Additional changes in the proposed resolution are needed in order to resolve our comment.

Postponing this important change is not a real solution to the problem of inconsistent usage of measure units in OOXML.
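What we asked for is not hard to implement. Here is a minimal Python sketch of a loader that accepts a value with an explicit unit and converts it to one of the native units mentioned above. The conversion factors are the usual ones (72 pt per inch, 20 twips per pt, 12700 EMU per pt); the function name, the native unit labels and the accepted syntax are my assumptions, not anything defined in DIS29500.

```python
import re
from fractions import Fraction

# Points per unit for the common units named in the comment.
POINTS_PER_UNIT = {
    "pt": Fraction(1),
    "pc": Fraction(12),
    "in": Fraction(72),
    "cm": Fraction(7200, 254),
    "mm": Fraction(720, 254),
}

# Native units per point (labels are hypothetical).
NATIVE_PER_POINT = {
    "half_point": 2,     # font sizes
    "twip": 20,          # twentieths of a point
    "text_point": 100,   # hundredths of a point
    "emu": 12700,        # 914400 EMU per inch
}

_LENGTH = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*(pt|pc|in|cm|mm)\s*$")


def to_native(value: str, native: str) -> int:
    """Parse a length such as '2.54cm' and return it in the given native unit."""
    match = _LENGTH.match(value)
    if match is None:
        raise ValueError(f"not a length with a known unit: {value!r}")
    # Exact rational arithmetic avoids floating-point rounding surprises.
    points = Fraction(match.group(1)) * POINTS_PER_UNIT[match.group(2)]
    return round(points * NATIVE_PER_POINT[native])


print(to_native("1in", "emu"))          # 914400
print(to_native("2.54cm", "twip"))      # 1440
print(to_native("12pt", "half_point"))  # 24
```

An application could run such a conversion on load and write back the default unit on save, which is all the comment requires.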

CZ-0046

Characters are enumerated only by showing their glyph which is not always unambiguous.

Add corresponding Unicode/ISO 10646 code point to each character.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0047

Definition of “hanging punctuation” is meaningless. Punctuation is always on the same line as related text, the only difference is that hanging punctuation can be shifted out from normal printing area to gain better visual appearance.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0048

Element description should be border bottom not border between.

Correct text and all occurrences where this erroneous text is referenced.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0049

Version of SQL language which can be used for writing queries is unspecified.

Add reference to the corresponding SQL standard.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0050

“dllVersion” attribute which specifies version of grammar checker module is too platform dependent.

Use more general mechanism. Change data type of attribute to string.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0051

The standard does not define how to allocate codes for “vendorID” attribute.

Use more general mechanism. Change data type of attribute to string.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0052

Platform dependent path is used.

Specify all paths and addresses using URI syntax.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0053

Text assumes that Unicode string is represented using UCS-2 encoding where each character is stored in exactly two bytes. Nowadays Unicode contains almost 100000 characters and other encodings with full Unicode coverage like UTF-16 have to be used. In UTF-16 some characters are stored in four bytes using surrogate pairs.

Specify which encoding is used for Unicode string representation. Instead of using high and low bytes, base the description on octet positions.

Proposed Disposition | Resolution provided by ECMA is only partial. Additional changes in the proposed resolution are needed in order to resolve our comment.

It should also be specified whether UTF-16LE or UTF-16BE is used. UTF-16 can use two different byte orderings, which results in unreliable hash calculations.
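The byte order problem can be shown in a few lines of Python. SHA-1 is used here only as a stand-in hash; which hash algorithm the standard actually uses is not stated in this post, and the function name is hypothetical.

```python
import hashlib


def utf16_digests(text: str) -> dict:
    """Hash the same string serialized with both UTF-16 byte orders."""
    return {
        encoding: hashlib.sha1(text.encode(encoding)).hexdigest()
        for encoding in ("utf-16-le", "utf-16-be")
    }


# The same character yields different octet sequences...
little = "A".encode("utf-16-le")  # b'A\x00'
big = "A".encode("utf-16-be")     # b'\x00A'
print(little, big)

# ...so the hashes of the same password or text differ too.
digests = utf16_digests("heslo")
print(digests["utf-16-le"] != digests["utf-16-be"])  # True

# A character outside the Basic Multilingual Plane needs a surrogate pair,
# i.e. four octets, which UCS-2 cannot represent at all.
clef = "\U0001D11E".encode("utf-16-be")
print(len(clef))  # 4
```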

CZ-0054

The fact that “summaryLength” element contains percentage value is described only in example.

Improve the description of the corresponding data type in such a way that it is clear that the value is specified as a percentage.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0055

It is not apparent from the description of the “themeFontLang” element that it can be used together with the “bidi” and “eastAsia” attributes, nor what the meaning of those attributes is.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0056

Fill patterns are not sufficiently defined using sample images only.

Provide electronic representation of fill patterns in appendix.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0057

There is no text run property for specifying whether given piece of text should be translated during localization process. This functionality is very important in environments where texts are routinely translated to many other languages, for example in EU.

Add new property for specifying whether given run of text should be translated during document localization. Proposed mechanism should be compatible with ITS markup (http://www.w3.org/TR/its/).

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0058

There are missing quotes around attribute value.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0059

In URL forward slashes (“/”) should be used to separate path parts instead of backslashes (“\”).

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0060

There is an excessive comma before word “core”.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0061

There is an excessive second period at the end of sentence.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0062

Provided XML example is not well-formed. Several attribute values are not enclosed in quotes, there is some strange text “[3204]” in place where only attributes can occur.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0063

Text mentions “CNTS” ticker but example on the following page shows “MSFT” ticker.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0064

There is an excessive file path artifact before “<w:style>” element.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0065

In URL forward slashes (“/”) should be used to separate path parts instead of backslashes (“\”).

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0066

Correct “xpath” spelling is “XPath” (note the first two uppercased letters).

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0067

There is an error in XPath expression. “@type” should be preceded by “/” to separate it from the start of location path.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0068

There is an error in XPath expression. “@currency” should be preceded by “/” to separate it from the start of location path.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0069

“This element specifies the background information for this document.”—obviously “this document” should be replaced by the appropriate object.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0070

“all lines for this page” → “all lines of this paragraph”

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0071

Usage of terms “Chinese PRC” and “Chinese Taiwan” is not consistent with the common practice and rest of the standard. Use terms “Simplified Chinese” and “Traditional Chinese” instead.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0072

Example shows how to specify kerning value, but we are inside description of font size element. There are more instances of this error because examples for attribute with the same name (e.g. “val”) are somehow shared and reused.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0073

There is an excessive backslash at the end of sentence.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0074

Ampersand character (“&”) should not be escaped when it is not part of XML source listing.

Proposed Disposition | Resolution provided by ECMA is satisfactory.
CZ-0075

The paragraph is broken in the middle of “docume-nt” word.

Proposed Disposition | Resolution provided by ECMA is satisfactory.

In fact, I was really surprised by how many green boxes there are at the end. I was expecting that ECMA would properly address only part of our comments. The vast majority of Czech comments were addressed by ECMA, so it is time to say yes to OOXML.

Copyright © Jiří Kosek, 2006–2018