strange unicode behaviour

strange unicode behaviour

Dean Oemcke
Hi,

Just thought I'd mention some strange behaviour I have encountered when trying to insert Unicode characters within an INSERT or UPDATE statement (Hsqldb v2.2.9).

UPDATE foo SET bar = U&'\00E9'

As far as I'm aware, this should translate to the é character (U+00E9). However, it seems to be converted to the equivalent of U+000E.
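For reference, here is a minimal Java sketch of how the body of a U&'...' literal is expected to decode with the default escape character, where a backslash followed by four hex digits names one UTF-16 code unit. The class and method names are hypothetical, made up for illustration; this is not the code in org.hsqldb.Scanner.

```java
// Sketch of the expected decoding of a SQL Unicode string literal body.
// UnicodeEscapeDemo and decode are hypothetical names, not HSQLDB code.
final class UnicodeEscapeDemo {

    // Decodes the text between the quotes of U&'...' assuming the default
    // escape character '\': \XXXX (4 hex digits), \+XXXXXX (6 hex digits),
    // and a doubled backslash for a literal backslash.
    static String decode(String body) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            if (c != '\\') {
                out.append(c);
                i++;
            } else if (i + 1 < body.length() && body.charAt(i + 1) == '\\') {
                out.append('\\');
                i += 2;
            } else if (i + 1 < body.length() && body.charAt(i + 1) == '+') {
                requireLength(body, i + 8);
                // Six hex digits follow the '+' sign.
                out.appendCodePoint(Integer.parseInt(body.substring(i + 2, i + 8), 16));
                i += 8;
            } else {
                requireLength(body, i + 5);
                // Four hex digits follow the backslash; all four are significant.
                out.append((char) Integer.parseInt(body.substring(i + 1, i + 5), 16));
                i += 5;
            }
        }
        return out.toString();
    }

    private static void requireLength(String body, int end) {
        if (body.length() < end) {
            throw new IllegalArgumentException("Truncated escape in: " + body);
        }
    }

    public static void main(String[] args) {
        // The Java literal "\\00E9" is the five characters \ 0 0 E 9.
        String decoded = decode("\\00E9");
        System.out.printf("U+%04X%n", (int) decoded.charAt(0)); // prints U+00E9
    }
}
```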

It turns out that shifting the hex digits one place to the left seems to produce the desired result (although you are still forced to enter a redundant 4th digit, which seems to get ignored), i.e.:

UPDATE foo SET bar = U&'\0E90'

The above produces the desired result of é.
To me this seems a bit odd? Just thought I'd mention it in case it's a bug.
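
The two observations are consistent with only the first three hex digits after the backslash being used. The Java sketch below models that symptom next to the expected four-digit reading. It is only a guess that matches the two results above, not the actual Scanner logic, and all names in it are hypothetical.

```java
// Hypothetical model of the reported symptom, for comparison only.
// This is an assumption that fits the two observations, not HSQLDB code.
final class ShiftedEscapeModel {

    // Expected reading: all four hex digits after the backslash are used.
    static char decodeAsExpected(String escape) {
        check(escape);
        return (char) Integer.parseInt(escape.substring(1, 5), 16);
    }

    // Reported behaviour: the fourth digit is consumed but ignored,
    // so the value comes from the first three digits only.
    static char decodeAsReported(String escape) {
        check(escape);
        return (char) Integer.parseInt(escape.substring(1, 4), 16);
    }

    // Accepts exactly one escape: a backslash followed by four characters.
    private static void check(String escape) {
        if (escape.length() != 5 || escape.charAt(0) != '\\') {
            throw new IllegalArgumentException("Expected backslash plus 4 hex digits: " + escape);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"\\00E9", "\\0E90"}) {
            System.out.printf("%s -> reported U+%04X, expected U+%04X%n",
                    s, (int) decodeAsReported(s), (int) decodeAsExpected(s));
        }
    }
}
```

Under this model \00E9 comes out as U+000E and \0E90 as U+00E9, which matches what I see in 2.2.9.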

Cheers,
Dean

------------------------------------------------------------------------------
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis & visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
_______________________________________________
Hsqldb-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-user
Re: strange unicode behaviour

Fred Toussi-2
Yes. This problem was found and fixed for the next release.
 
If you need this for 2.2.9, you can try updating the code by comparing it with the latest SVN version of org.hsqldb.Scanner.
 
Fred
 
On Wed, Apr 10, 2013, at 6:21, Dean Oemcke wrote:
Hi,

Just thought I'd mention some strange behaviour I have encountered when trying to insert unicode characters within an INSERT or UPDATE statement ( Hsqldb v2.2.9).

UPDATE foo SET bar = U&'\00E9'

As far as I'm aware, this should translate to the é character. However it seems to convert to the equivalent of u000E.
 
It turns out that shifting the characters one to the left seems to produce the desired result (although you are still forced to enter a redundant 4th character which seems to get ignored). ie:
 
UPDATE foo SET bar = U&'\0E90'
 
The above produces the desired result of é
To me this seems a bit odd? Just thought I'd mention it in case it's a bug.

Cheers,
Dean