Johnny Bufu | 9 Jul 06:41

Re: [OpenID] Canonical OpenID url form


On 08/07/08 03:01 PM, Andrew Arnott wrote:
> What is the canonical form of an OpenID URL? One with the %AB%CD hex 
> encoding for unicode chars in the URL or with the actual unicode chars? 
> For the purposes of displaying to the user and storing in the RP's database.
> 
> The spec doesn't seem to have anything to say on this.  

I believe it does say:

4.1.  Protocol Messages
The OpenID Authentication protocol messages are mappings of plain-text 
keys to plain-text values. The keys and values permit the full Unicode 
character set (UCS). When the keys and values need to be converted 
to/from bytes, they MUST be encoded using UTF-8 [RFC3629].

http://openid.net/specs/openid-authentication-2_0.html#anchor4

> The reason I 
> think it's not a simple automatic answer is that the Unicode chars may be 
> what the user typed in and what exists on the server, but in transit, 
> these characters are translated to %AB%CD in order to be validly escaped 
> URI strings.  

The receiving party must decode them to the original form when they are 
extracted from the transport layer.
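
As a minimal sketch of that round trip, assuming Python's standard 
urllib.parse module (the identifier below is a made-up example, not 
taken from the spec):

```python
from urllib.parse import quote, unquote

# Hypothetical identifier containing a non-ASCII character (illustration only).
identifier = "http://example.com/users/josé"

# Transport layer: the characters are encoded as UTF-8 bytes and
# percent-escaped so the result is a valid ASCII URI string.
wire_form = quote(identifier, safe=":/", encoding="utf-8")

# Receiving party: undo the escaping and decode the bytes as UTF-8,
# recovering the original Unicode form.
decoded = unquote(wire_form, encoding="utf-8")

print(wire_form)
print(decoded == identifier)
```

The escaped string only exists while the value is in transit; what gets 
compared, displayed or stored in this sketch is the decoded value.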

> So one could argue that the unicode characters are never 
> part of the protocol 

One would then be ignoring the parts of the protocol that do not deal 
with the transport layer directly.

Johnny
