1 Oct 2007 11:02
Re: u32regex_search crashes
Anjaly <anjaly <at> cdactvm.in>
2007-10-01 09:02:18 GMT
I am sorry, the last message had a mistake. I meant to say that I want to do a search that treats all the data as UTF-32, rather than UTF-8 (as I incorrectly wrote). I don't know whether I am making myself clear (I am not very good at expressing myself). What I really want to do is a Unicode search on the available data.

Anjaly G S

On Mon, 2007-10-01 at 09:42 +0100, John Maddock wrote:
> Anjaly wrote:
> > In the regex document it was said that the size of the data type of the
> > variable passed to make_u32regex determines the character
> > encoding (UTF-8, UTF-16 or UTF-32).
>
> *For construction of the regex object*.
>
> The search algorithms operate independently on any of UTF8/16/32.
>
> > I passed wchar_t (which I think has size 4) so that the buffer encoding
> > is considered as UTF-8 by u32regex_search irrespectively. Actually I am
> > trying to do a UTF-8 search.
>
> Except the data file you sent *was not valid UTF8* !
>
> It looks like it's probably UTF16LE, it's up to you in that case to decode
> the byte order mark and read the text into something that Boost.Regex can
> handle (for example platform-native UTF16). ICU should have some file IO
> routines for doing that kind of thing: for example for loading a file into a
> UnicodeString type.
>
> HTH, John.
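
For illustration, the "decode the byte order mark and read the text" step suggested above might look roughly like the sketch below. It uses only the C++ standard library rather than ICU, and the function name `decode_utf16le` is hypothetical (it is not part of Boost.Regex or ICU). It assumes the whole file has already been read into a byte buffer and that the data really is UTF-16LE, which the thread only describes as probable.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode a UTF-16LE byte buffer into UTF-32 code points.
// A leading byte order mark (bytes FF FE) is skipped if present.
// Throws std::runtime_error on malformed input.
std::u32string decode_utf16le(const std::vector<unsigned char>& bytes) {
    std::size_t i = 0;
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE) {
        i = 2;  // skip the UTF-16LE byte order mark
    }
    if ((bytes.size() - i) % 2 != 0) {
        throw std::runtime_error("UTF-16 data has an odd number of bytes");
    }

    std::u32string out;
    while (i < bytes.size()) {
        // Little-endian: low byte first, then high byte.
        char32_t unit = static_cast<char32_t>(bytes[i] | (bytes[i + 1] << 8));
        i += 2;

        if (unit >= 0xD800 && unit <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i >= bytes.size()) {
                throw std::runtime_error("truncated surrogate pair");
            }
            char32_t low = static_cast<char32_t>(bytes[i] | (bytes[i + 1] << 8));
            if (low < 0xDC00 || low > 0xDFFF) {
                throw std::runtime_error("invalid low surrogate");
            }
            i += 2;
            out.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            throw std::runtime_error("unexpected low surrogate");
        } else {
            out.push_back(unit);  // code point in the Basic Multilingual Plane
        }
    }
    return out;
}
```

The resulting `std::u32string` holds one code point per element, so its iterators could presumably be passed to `u32regex_search` as UTF-32 input. That last step is not shown here and should be checked against the Boost.Regex Unicode documentation.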