(1) On changing "byte" semantics to "char" -
If we change NLS_LENGTH_SEMANTICS to CHAR, all existing tables keep their "byte semantics"; only tables created after the change use "char semantics".
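A quick way to see this is to check the CHAR_USED column of USER_TAB_COLUMNS after switching. A minimal sketch (the table names t_old and t_new are hypothetical; t_old is assumed to have been created before the change) -

```sql
-- Switch the session to character semantics (instance-wide it would be
-- ALTER SYSTEM SET nls_length_semantics = CHAR SCOPE = BOTH;)
ALTER SESSION SET nls_length_semantics = CHAR;

-- A table created after the change picks up character semantics
CREATE TABLE t_new (name VARCHAR2(20));

-- CHAR_USED shows 'B' (byte) for the old table, 'C' (char) for the new one
SELECT table_name, column_name, char_used
  FROM user_tab_columns
 WHERE table_name IN ('T_OLD', 'T_NEW');
```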
(2) Effect of "char" semantics on named PL/SQL -
If we change NLS_LENGTH_SEMANTICS to CHAR, all named PL/SQL code uses character semantics once it is recompiled under the new setting. i.e., VARCHAR2(20) inside a *named* PL/SQL unit (as opposed to an anonymous PL/SQL block) becomes VARCHAR2(20 CHAR).
Anonymous PL/SQL blocks, on the other hand, are compiled at each execution, so they pick up the session's current semantics without an explicit recompile.
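A small sketch of the named-unit case (the procedure name demo_semantics is hypothetical) -

```sql
ALTER SESSION SET nls_length_semantics = CHAR;

-- Because this unit is compiled under CHAR semantics, v behaves as
-- VARCHAR2(20 CHAR); recompiling it under BYTE semantics would change that.
CREATE OR REPLACE PROCEDURE demo_semantics IS
  v VARCHAR2(20);  -- effectively VARCHAR2(20 CHAR) here
BEGIN
  v := RPAD('x', 20, 'x');  -- 20 characters, possibly more than 20 bytes
END;
/
```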
(3) Can VARCHAR2 store more than 4000 bytes with "char" semantics -
No. Even if we change NLS_LENGTH_SEMANTICS, we are still bound by the 4000-byte maximum of VARCHAR2.
Let's do some math here - the maximum bytes a character can take is 3 for UTF8 (Oracle 8i); 4 bytes for AL32UTF8 (Oracle 9i, which we're going to use for the database, i.e., for VARCHAR2s); and 2 bytes for AL16UTF16 (used for the NVARCHAR data types).
Thus the maximum number of characters a VARCHAR2(4000) is guaranteed to hold is 1333 in UTF8 (8i) and 1000 in AL32UTF8 (9i).
Here's a nice table I found on Oracle Metalink ("max" = characters when every character is single-byte, or two-byte for AL16UTF16; "min" = characters guaranteed even when every character takes the widest encoding) -

            UTF8 (1 to 3 bytes)   AL32UTF8 (1 to 4 bytes)   AL16UTF16 (2 bytes)
            max       min         max       min             max       min
CHAR        2000      666         2000      500             N/A       N/A
VARCHAR2    4000      1333        4000      1000            N/A       N/A
NCHAR       2000      666         N/A       N/A             1000      1000
NVARCHAR2   4000      1333        N/A       N/A             2000      2000

N/A means not possible!
However, if the declared width of the column is well below the 4000-byte maximum, the full number of characters can always be stored.
i.e., a VARCHAR2(100 CHAR) will store 100 characters, occupying at most 400 bytes with AL32UTF8 (the Unicode 3.0 encoding).
Note - PL/SQL code can handle up to 32K bytes (32767) in a VARCHAR2 variable!
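The byte-versus-character arithmetic above can be seen directly with LENGTH and LENGTHB. A sketch, assuming the database character set is AL32UTF8 -

```sql
-- LENGTH counts characters; LENGTHB counts bytes.
-- The Euro sign takes 3 bytes in AL32UTF8, so the two results differ.
SELECT LENGTH('abc€')  AS char_count,   -- 4 characters
       LENGTHB('abc€') AS byte_count    -- 6 bytes (3 x 1-byte + 1 x 3-byte)
  FROM dual;
```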
(4) What if we need to store more multi-byte characters in a VARCHAR2 field than byte semantics allows, without changing the length semantics -
We need to resort to using NVARCHAR2.
NVARCHAR2 falls under the "national character set" and *ALWAYS* stores data in "character" semantics, even if NLS_LENGTH_SEMANTICS is set to "byte".
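A minimal sketch (the table name t_national is hypothetical) -

```sql
-- nval is always measured in characters, regardless of NLS_LENGTH_SEMANTICS.
-- With AL16UTF16 as the national character set, NVARCHAR2(250) reserves
-- 250 characters = 500 bytes.
CREATE TABLE t_national (nval NVARCHAR2(250));
```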
(5) What's the easiest way to solve the problem "a VARCHAR2(250) field cannot take 250 characters because it was declared in 'byte' semantics" -
Case 1 -- Easiest for an already up-and-running database --
ALTER TABLE <table_name> MODIFY (problem_column VARCHAR2(250 CHAR));
OR
ALTER TABLE <table_name> MODIFY (problem_column NVARCHAR2(250));
Then change the database NLS_LENGTH_SEMANTICS to CHAR so that PL/SQL also handles the extra bytes for multi-byte characters.
Case 2 -- Easiest for a "fresh" install --
ALTER SYSTEM SET nls_length_semantics = CHAR;
-- then create all the tables, or import them fresh.
If the problem happens only with a few tables, I would say Case 1 is the better way to go. For a green-field database, or instance-wide multi-byte data, Case 2 is the one.
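After either fix in Case 1, the change can be verified in the dictionary. A sketch (PROBLEM_TABLE is a hypothetical name) -

```sql
-- After the ALTER TABLE, CHAR_USED flips from 'B' to 'C', and
-- DATA_LENGTH (bytes reserved) grows while CHAR_LENGTH stays 250.
SELECT column_name, data_length, char_length, char_used
  FROM user_tab_columns
 WHERE table_name  = 'PROBLEM_TABLE'
   AND column_name = 'PROBLEM_COLUMN';
```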
(6) What's the performance implication of "char" semantics?
Well, when Oracle stores strings as "characters", Oracle
(a) cannot do the simpler "byte" arithmetic for string computations. It then has to count characters, not bytes, to find start and end positions. So string-intensive operations - particularly SUBSTR(:string, 1, 40) types - are slower. I have to benchmark to find out just how much slower. I would guess the "LIKE" operator would also be slower.
But then I received an interesting piece of feedback from a co-worker -
"In regard to performance implications resulting from not being able to do byte-level arithmetic for 'short-cutting' string manipulation and comparison calculations, the detrimental effect of this is probably marginal."
(b) (obviously) increases the space requirements.
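The benchmark mentioned in (a) could be sketched like this - a rough timing loop, assuming two hypothetical copies of the same data, t_byte_copy (byte semantics) and t_char_copy (char semantics), each with a column col -

```sql
-- DBMS_UTILITY.GET_TIME returns elapsed time in centiseconds.
-- Run this once per table copy and compare the two timings.
DECLARE
  t0    NUMBER;
  dummy VARCHAR2(200);
BEGIN
  t0 := DBMS_UTILITY.get_time;
  FOR r IN (SELECT col FROM t_char_copy) LOOP
    dummy := SUBSTR(r.col, 1, 40);  -- character-counted under CHAR semantics
  END LOOP;
  DBMS_OUTPUT.put_line('centiseconds: ' || (DBMS_UTILITY.get_time - t0));
END;
/
```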