11 Feb 02:05
Re: host-meta file format comments (draft-nottingham-site-meta-01)
Thomas Roessler <tlr <at> w3.org>
2009-02-11 01:05:48 GMT
(diverting to www-talk, too...)

On 11 Feb 2009, at 01:20, Mark Nottingham wrote:

> Yeah, I'm not completely happy with it yet. The thought was that
> since blank lines don't introduce ambiguity here, they're not
> harmful. OTOH one of my goals for the format is to allow existing
> HTTP header and MIME parsers (e.g., in Python) to be used on the
> format, and they very well may barf on a blank line.

Well, they'll barf on blank lines and declare the header over;
changing that within the parser (or just restarting it on the rest
of the file) should be relatively cheap.

BTW, I notice that this draft is silent on the HTTP header syntax's
combining feature for multiple occurrences of the same field (last
paragraph of 4.2, RFC 2616); I suspect that to be one of the more
likely causes for surprises if HTTP header parsers are re-used. (No
such risk with MIME parsers.)

Finally, why disallow whitespace stuffed folding? It's pretty useful
to make long lines editable, and I suspect that we're assuming
/host-meta to be the product of some human with emacs in their hands.
Implementing it is easy, and a given if existing parsers are used.
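As a rough illustration of the folding and combining points, here is a minimal sketch that leans on Python's stock `email.parser.HeaderParser`. The sample field names and values, and the helpers `unfold` and `parse_fields`, are made up for this example and are not part of the draft. The comma-joining step imitates the RFC 2616 section 4.2 combining rule by hand, since the MIME parser itself keeps repeated fields separate.

```python
import re
from email.parser import HeaderParser

# Hypothetical host-meta content; names and values are invented.
SAMPLE = (
    "Link: <http://example.com/one>;\n"
    "  rel=\"describedby\"\n"
    "Foo: a\n"
    "Foo: b\n"
)


def unfold(value):
    """Replace each line break plus its leading whitespace with one space."""
    return re.sub(r"\r?\n[ \t]+", " ", value)


def parse_fields(text):
    """Parse header-style text, unfold values, and comma-join repeated fields."""
    msg = HeaderParser().parsestr(text)
    combined = {}
    for name in msg.keys():
        key = name.lower()  # field names are case-insensitive
        if key not in combined:
            values = msg.get_all(name)
            combined[key] = ", ".join(unfold(v) for v in values)
    return combined


if __name__ == "__main__":
    for key, value in parse_fields(SAMPLE).items():
        print(key, "=", value)
```

Here the folded `Link` line comes back as a single value, and the two `Foo` fields collapse into `a, b`. Whether that collapse is wanted is exactly what the draft would need to say.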
> So, the right thing to do might be to explicitly disallow them, both
> in BNF and prose. Eran, thoughts?
I'd just prefer to not have the BNF say "no empty lines", and then
have prose that says the opposite, but with a SHOULD.
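For what "restarting the parser on the rest of the file" could look like, here is a small sketch, again assuming Python's `email.parser.HeaderParser` as the re-used MIME parser. The sample input and the function names `parse_once` and `parse_restarting` are hypothetical.

```python
from email.parser import HeaderParser

# Hypothetical host-meta content with a blank line in the middle.
SAMPLE = "Link: <http://example.com/a>\n\nFoo: bar\n"


def parse_once(text):
    """Stock behaviour: the first blank line ends the header block."""
    msg = HeaderParser().parsestr(text)
    return msg.items(), msg.get_payload()


def parse_restarting(text):
    """Re-run the parser on whatever follows each blank line."""
    fields = []
    rest = text
    while rest.strip():
        msg = HeaderParser().parsestr(rest.lstrip("\r\n"))
        if not msg.items():
            break  # remaining text is not header-like; stop instead of looping
        fields.extend(msg.items())
        rest = msg.get_payload()  # everything after the blank line
    return fields


if __name__ == "__main__":
    print(parse_once(SAMPLE))
    print(parse_restarting(SAMPLE))
```

With the stock call, `Foo: bar` ends up in the payload rather than among the fields; the restarting loop recovers it.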
>>> 5. Minting New meta-fields
>>
>>> Applications that wish to mint new meta-fields for use in the
>>> host-meta format MUST register them in the host-meta
>>> field-registry, following the procedures in Section 7.2. Field-names
>>> MUST conform to the field-name ABNF Section 3, and field-value
>>> syntax MUST be well-defined (e.g., using ABNF, or a reference to
>>> the syntax of an existing header field-value). Field-values SHOULD
>>> use the ISO-859-1 character encoding. If a field-value applies to
>>> a scope other than the entire authority, that scope MUST be
>>> well-defined.
>>
>> Editorial nit: ISO-8859-1 is missing an 8 here.
>
> That one always gets me, thanks.
>
>> More substantially, is there any particular reason to not just go
>> with utf-8 here? After all, the content type is
>> *application*/host-meta anyway.
>
> Same as above; allowing existing parsers and serialisation libraries
> to be used. That said, there have been many arguments in HTTPbis
> that existing libraries won't harm non-ASCII characters in transit,
> but IIRC no one has actually gone out and surveyed what they do...
That suggests that it's a coin toss, unless the mythical "someone"
does that work. May I, in that event, suggest that we use a coin
biased in favor of broader internationalization, i.e., UTF-8?
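To make the trade-off concrete, a short sketch in Python (the helper names `fits_latin1` and `survives_latin1_transit` are invented for illustration). It shows which values ISO-8859-1 can represent at all, and that a library which merely treats bytes as ISO-8859-1 can still carry UTF-8 through unharmed. It says nothing about what real HTTP libraries actually do, which is the survey nobody has done.

```python
def fits_latin1(value):
    """True if every character of value is representable in ISO-8859-1."""
    try:
        value.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def survives_latin1_transit(value):
    """Simulate a Latin-1-only library passing UTF-8 bytes through untouched."""
    raw = value.encode("utf-8")
    # Decoding as ISO-8859-1 never fails: every byte maps to one code point.
    seen_by_library = raw.decode("iso-8859-1")
    restored = seen_by_library.encode("iso-8859-1").decode("utf-8")
    return restored == value


if __name__ == "__main__":
    for sample in ("cafe", "caf\u00e9", "\u65e5\u672c\u8a9e"):
        print(repr(sample), fits_latin1(sample), survives_latin1_transit(sample))
```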