View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0005333 | JEDI Code Library | JclStrings | public | 2010-09-17 11:44 | 2010-10-11 10:52 |
Reporter | Kiriakos | Assigned To | outchy | ||
Priority | normal | Severity | minor | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Product Version | Version 2.3 | ||||
Target Version | Fixed in Version | Version 2.3 | |||
Summary | 0005333: Huge overhead of JclStrings | ||||
Description | Version 2.4 (latest) Just adding JclStrings in the uses clause increases EXE size by 600 Kbytes if compiled under Delphi >= 2009. The reason is const MaxStrCharCount = Ord(High(Char)) + 1; // # of chars in one set StrLoOffset = MaxStrCharCount * 0; // offset to lower case chars StrUpOffset = MaxStrCharCount * 1; // offset to upper case chars StrReOffset = MaxStrCharCount * 2; // offset to reverse case chars StrCaseMapSize = MaxStrCharCount * 3; // # of chars is a table var StrCaseMap: array [0..StrCaseMapSize - 1] of Char; // case mappings StrCaseMapReady: Boolean = False; // true if case map exists StrCharTypes: array [Char] of Word; Also LoadCharTypes and LoadCaseMap incur a lot of time overhead which is only justified in intensive text processing applications. Could there be some option to use slower versions of the functions that rely on StrCaseMap and StrCharTypes so that this overhead be avoided? If that is not possible then could StrCaseMap and StrCharTypes allocated on the heap so that EXE size is not affected? (e.g. dynamic char arrays) | ||||
Tags | No tags attached. | ||||
Fixed in GIT commit | |||||
Fixed in SVN revision | 3362 | ||||
IDE version | Delphi/C++Builder 2009 | ||||
2010-09-20 20:36
|
jcl_string_overhead.patch (55,155 bytes)
Index: JclAnsiStrings.pas =================================================================== --- JclAnsiStrings.pas (revision 3343) +++ JclAnsiStrings.pas (working copy) @@ -69,7 +69,8 @@ {$IFDEF HAS_UNIT_ANSISTRINGS} AnsiStrings, {$ENDIF HAS_UNIT_ANSISTRINGS} - JclBase; + JclBase, + JclSynch; // Ansi types @@ -492,10 +493,15 @@ AnsiCaseMapSize = AnsiCharCount * 3; // # of chars is a table var - AnsiCaseMap: array [0..AnsiCaseMapSize - 1] of AnsiChar; // case mappings - AnsiCaseMapReady: Boolean = False; // true if case map exists - AnsiCharTypes: array [AnsiChar] of Word; + AnsiCaseMap: array of AnsiChar; // case mappings + AnsiCaseMapFlag: Integer = FLAG_NOT_INITIALIZED; + AnsiCharTypes: array of Word; + AnsiCharTypesFlag: Integer = FLAG_NOT_INITIALIZED; +// lookup table initialization functions, can be safely called multiple times +procedure LoadAnsiCaseMap; +procedure LoadAnsiCharTypes; + {$IFDEF UNITVERSIONING} const UnitVersioning: TUnitVersionInfo = ( @@ -517,7 +523,7 @@ {$IFDEF SUPPORTS_UNICODE} RtlConsts, {$ENDIF SUPPORTS_UNICODE} - JclLogic, JclResources, JclStreams, JclSynch; + JclLogic, JclResources, JclStreams; //=== Internal =============================================================== @@ -531,11 +537,12 @@ const AnsiStrRecSize = SizeOf(TAnsiStrRec); // size of the AnsiString header rec -procedure LoadCharTypes; +procedure InitAnsiCharTypes; var CurrChar: AnsiChar; CurrType: Word; begin + SetLength(AnsiCharTypes, AnsiCharCount); for CurrChar := Low(AnsiChar) to High(AnsiChar) do begin {$IFDEF MSWINDOWS} @@ -565,51 +572,58 @@ CurrType := CurrType or C1_ALPHA; {$DEFINE CHAR_TYPES_INITIALIZED} {$ENDIF LINUX} - AnsiCharTypes[CurrChar] := CurrType; + AnsiCharTypes[Ord(CurrChar)] := CurrType; {$IFNDEF CHAR_TYPES_INITIALIZED} Implement case map initialization here {$ENDIF ~CHAR_TYPES_INITIALIZED} end; end; -procedure LoadCaseMap; +procedure LoadAnsiCharTypes; +begin + LockedInitialization(AnsiCharTypesFlag, InitAnsiCharTypes); +end; + +procedure InitAnsiCaseMap; var CurrChar, UpCaseChar, LoCaseChar, ReCaseChar: AnsiChar; begin - if not AnsiCaseMapReady then + SetLength(AnsiCaseMap, AnsiCaseMapSize); + for CurrChar := Low(AnsiChar) to High(AnsiChar) do begin - for CurrChar := Low(AnsiChar) to High(AnsiChar) do - begin - {$IFDEF MSWINDOWS} - LoCaseChar := CurrChar; - UpCaseChar := CurrChar; - Windows.CharLowerBuffA(@LoCaseChar, 1); - Windows.CharUpperBuffA(@UpCaseChar, 1); - {$DEFINE CASE_MAP_INITIALIZED} - {$ENDIF MSWINDOWS} - {$IFDEF LINUX} - LoCaseChar := AnsiChar(tolower(Byte(CurrChar))); - UpCaseChar := AnsiChar(toupper(Byte(CurrChar))); - {$DEFINE CASE_MAP_INITIALIZED} - {$ENDIF LINUX} - {$IFNDEF CASE_MAP_INITIALIZED} - Implement case map initialization here - {$ENDIF ~CASE_MAP_INITIALIZED} - if CharIsUpper(CurrChar) then - ReCaseChar := LoCaseChar - else - if CharIsLower(CurrChar) then - ReCaseChar := UpCaseChar - else - ReCaseChar := CurrChar; - AnsiCaseMap[Ord(CurrChar) + AnsiLoOffset] := LoCaseChar; - AnsiCaseMap[Ord(CurrChar) + AnsiUpOffset] := UpCaseChar; - AnsiCaseMap[Ord(CurrChar) + AnsiReOffset] := ReCaseChar; - end; - AnsiCaseMapReady := True; + {$IFDEF MSWINDOWS} + LoCaseChar := CurrChar; + UpCaseChar := CurrChar; + Windows.CharLowerBuffA(@LoCaseChar, 1); + Windows.CharUpperBuffA(@UpCaseChar, 1); + {$DEFINE CASE_MAP_INITIALIZED} + {$ENDIF MSWINDOWS} + {$IFDEF LINUX} + LoCaseChar := AnsiChar(tolower(Byte(CurrChar))); + UpCaseChar := AnsiChar(toupper(Byte(CurrChar))); + {$DEFINE CASE_MAP_INITIALIZED} + {$ENDIF LINUX} + {$IFNDEF CASE_MAP_INITIALIZED} + Implement case map initialization here + {$ENDIF ~CASE_MAP_INITIALIZED} + if CharIsUpper(CurrChar) then + ReCaseChar := LoCaseChar + else + if CharIsLower(CurrChar) then + ReCaseChar := UpCaseChar + else + ReCaseChar := CurrChar; + AnsiCaseMap[Ord(CurrChar) + AnsiLoOffset] := LoCaseChar; + AnsiCaseMap[Ord(CurrChar) + AnsiUpOffset] := UpCaseChar; + AnsiCaseMap[Ord(CurrChar) + AnsiReOffset] := ReCaseChar; end; end; +procedure LoadAnsiCaseMap; +begin + LockedInitialization(AnsiCaseMapFlag, InitAnsiCaseMap); +end; + // Uppercases or Lowercases a give AnsiString depending on the // passed offset. (UpOffset or LoOffset) @@ -618,6 +632,9 @@ P: PAnsiChar; I, L: SizeInt; begin + if AnsiCaseMapFlag <> FLAG_INITIALIZED then + LoadAnsiCaseMap; + if Str <> '' then begin UniqueString(Str); @@ -637,6 +654,9 @@ procedure StrCaseBuff(S: PAnsiChar; const Offset: SizeInt); begin + if AnsiCaseMapFlag <> FLAG_INITIALIZED then + LoadAnsiCaseMap; + if (S <> nil) and (S^ <> #0) then begin repeat @@ -2855,23 +2875,34 @@ function CharIsAlpha(const C: AnsiChar): Boolean; begin - Result := (AnsiCharTypes[C] and C1_ALPHA) <> 0; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_ALPHA) <> 0; end; function CharIsAlphaNum(const C: AnsiChar): Boolean; begin - Result := ((AnsiCharTypes[C] and C1_ALPHA) <> 0) or - ((AnsiCharTypes[C] and C1_DIGIT) <> 0); + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and (C1_ALPHA or C1_DIGIT)) <> 0; end; function CharIsBlank(const C: AnsiChar): Boolean; begin - Result := ((AnsiCharTypes[C] and C1_BLANK) <> 0); + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_BLANK) <> 0; end; function CharIsControl(const C: AnsiChar): Boolean; begin - Result := (AnsiCharTypes[C] and C1_CNTRL) <> 0; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_CNTRL) <> 0; end; function CharIsDelete(const C: AnsiChar): Boolean; @@ -2881,39 +2912,57 @@ function CharIsDigit(const C: AnsiChar): Boolean; begin - Result := (AnsiCharTypes[C] and C1_DIGIT) <> 0; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_DIGIT) <> 0; end; function CharIsFracDigit(const C: AnsiChar): Boolean; begin - Result := (C = '.') or ((AnsiCharTypes[C] and C1_DIGIT) <> 0); + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (C = '.') or ((AnsiCharTypes[Ord(C)] and C1_DIGIT) <> 0); end; function CharIsHexDigit(const C: AnsiChar): Boolean; begin + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + case C of 'A'..'F', 'a'..'f': Result := True; else - Result := ((AnsiCharTypes[C] and C1_DIGIT) <> 0); + Result := ((AnsiCharTypes[Ord(C)] and C1_DIGIT) <> 0); end; end; function CharIsLower(const C: AnsiChar): Boolean; begin - Result := (AnsiCharTypes[C] and C1_LOWER) <> 0; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_LOWER) <> 0; end; function CharIsNumberChar(const C: AnsiChar): Boolean; begin - Result := ((AnsiCharTypes[C] and C1_DIGIT) <> 0) or (C = AnsiSignMinus) or (C = AnsiSignPlus) or + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := ((AnsiCharTypes[Ord(C)] and C1_DIGIT) <> 0) or (C = AnsiSignMinus) or (C = AnsiSignPlus) or (Char(C) = {$IFDEF RTL220_UP}FormatSettings.{$ENDIF}DecimalSeparator); end; function CharIsNumber(const C: AnsiChar): Boolean; begin - Result := ((AnsiCharTypes[C] and C1_DIGIT) <> 0) or + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := ((AnsiCharTypes[Ord(C)] and C1_DIGIT) <> 0) or (Char(C) = {$IFDEF RTL220_UP}FormatSettings.{$ENDIF}DecimalSeparator); end; @@ -2924,7 +2973,10 @@ function CharIsPunctuation(const C: AnsiChar): Boolean; begin - Result := ((AnsiCharTypes[C] and C1_PUNCT) <> 0); + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := ((AnsiCharTypes[Ord(C)] and C1_PUNCT) <> 0); end; function CharIsReturn(const C: AnsiChar): Boolean; @@ -2934,12 +2986,18 @@ function CharIsSpace(const C: AnsiChar): Boolean; begin - Result := (AnsiCharTypes[C] and C1_SPACE) <> 0; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_SPACE) <> 0; end; function CharIsUpper(const C: AnsiChar): Boolean; begin - Result := (AnsiCharTypes[C] and C1_UPPER) <> 0; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := (AnsiCharTypes[Ord(C)] and C1_UPPER) <> 0; end; function CharIsValidIdentifierLetter(const C: AnsiChar): Boolean; @@ -2954,9 +3012,12 @@ function CharIsWhiteSpace(const C: AnsiChar): Boolean; begin + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + Result := (C = AnsiTab) or (C = AnsiLineFeed) or (C = AnsiVerticalTab) or (C = AnsiFormFeed) or (C = AnsiCarriageReturn) or (C =AnsiSpace) or - ((AnsiCharTypes[C] and C1_SPACE) <> 0); + ((AnsiCharTypes[Ord(C)] and C1_SPACE) <> 0); end; function CharIsWildcard(const C: AnsiChar): Boolean; @@ -2971,7 +3032,10 @@ function CharType(const C: AnsiChar): Word; begin - Result := AnsiCharTypes[C]; + if AnsiCharTypesFlag <> FLAG_INITIALIZED then + LoadAnsiCharTypes; + + Result := AnsiCharTypes[Ord(C)]; end; //=== PCharVector ============================================================ @@ -3068,16 +3132,25 @@ function CharLower(const C: AnsiChar): AnsiChar; begin + if AnsiCaseMapFlag <> FLAG_INITIALIZED then + LoadAnsiCaseMap; + Result := AnsiCaseMap[Ord(C) + AnsiLoOffset]; end; function CharToggleCase(const C: AnsiChar): AnsiChar; begin + if AnsiCaseMapFlag <> FLAG_INITIALIZED then + LoadAnsiCaseMap; + Result := AnsiCaseMap[Ord(C) + AnsiReOffset]; end; function CharUpper(const C: AnsiChar): AnsiChar; begin + if AnsiCaseMapFlag <> FLAG_INITIALIZED then + LoadAnsiCaseMap; + Result := AnsiCaseMap[Ord(C) + AnsiUpOffset]; end; @@ -3103,6 +3176,9 @@ function CharIPos(const S: AnsiString; C: AnsiChar; const Index: SizeInt): SizeInt; begin + if AnsiCaseMapFlag <> FLAG_INITIALIZED then + LoadAnsiCaseMap; + if (Index > 0) and (Index <= Length(S)) then begin C := CharUpper(C); @@ -3865,10 +3941,8 @@ Result := AnsiCompareNatural(S1, S2, True); end; +{$IFDEF UNITVERSIONING} initialization - LoadCharTypes; // this table first - LoadCaseMap; // or this function does not work -{$IFDEF UNITVERSIONING} RegisterUnitVersion(HInstance, UnitVersioning); finalization Index: JclResources.pas =================================================================== --- JclResources.pas (revision 3343) +++ JclResources.pas (working copy) @@ -1751,6 +1751,11 @@ RsMetSectInvalidParameter = 'An invalid parameter was passed to the constructor.'; RsMetSectInitialize = 'Failed to initialize the metered section.'; RsMetSectNameEmpty = 'Name cannot be empty when using the Open constructor.'; + RsLockedResSeqFinalizing = 'Sequence error: initialization attempt of a finalizing resource.'; + RsLockedResSeqFinalized = 'Sequence error: initialization attempt of a finalized resource.'; + RsLockedResSeqInitializing = 'Sequence error: finalization attempt of an initializing resource.'; + RsLockedResInitError = 'Locked resource initialization error.'; + RsLockedResInternalError = 'Internal error in locked resource access.'; //=== JclSysInfo ============================================================= resourcestring Index: JclStrings.pas =================================================================== --- JclStrings.pas (revision 3343) +++ JclStrings.pas (working copy) @@ -71,7 +71,8 @@ Classes, SysUtils, JclAnsiStrings, JclWideStrings, - JclBase; + JclBase, + JclSynch; // Exceptions type @@ -584,10 +585,15 @@ StrCaseMapSize = MaxStrCharCount * 3; // # of chars is a table var - StrCaseMap: array [0..StrCaseMapSize - 1] of Char; // case mappings - StrCaseMapReady: Boolean = False; // true if case map exists - StrCharTypes: array [Char] of Word; + StrCaseMap: array of Char; // case mappings + StrCaseMapFlag: Integer = FLAG_NOT_INITIALIZED; + StrCharTypes: array of Word; + StrCharTypesFlag: Integer = FLAG_NOT_INITIALIZED; +// lookup table initialization functions, can be safely called multiple times +procedure LoadStrCaseMap; +procedure LoadStrCharTypes; + {$IFDEF UNITVERSIONING} const UnitVersioning: TUnitVersionInfo = ( @@ -609,7 +615,7 @@ {$IFDEF SUPPORTS_UNICODE} StrUtils, {$ENDIF SUPPORTS_UNICODE} - JclLogic, JclResources, JclStreams, JclSynch; + JclLogic, JclResources, JclStreams; //=== Internal =============================================================== @@ -620,11 +626,12 @@ end; PStrRec = ^TStrRec; -procedure LoadCharTypes; +procedure InitStrCharTypes; var CurrChar: Char; CurrType: Word; begin + SetLength(StrCharTypes, MaxStrCharCount); for CurrChar := Low(CurrChar) to High(CurrChar) do begin {$IFDEF MSWINDOWS} @@ -654,51 +661,58 @@ CurrType := CurrType or C1_ALPHA; {$DEFINE CHAR_TYPES_INITIALIZED} {$ENDIF LINUX} - StrCharTypes[CurrChar] := CurrType; + StrCharTypes[Ord(CurrChar)] := CurrType; {$IFNDEF CHAR_TYPES_INITIALIZED} Implement case map initialization here {$ENDIF ~CHAR_TYPES_INITIALIZED} end; end; -procedure LoadCaseMap; +procedure LoadStrCharTypes; +begin + LockedInitialization(StrCharTypesFlag, InitStrCharTypes); +end; + +procedure InitStrCaseMap; var CurrChar, UpCaseChar, LoCaseChar, ReCaseChar: Char; begin - if not StrCaseMapReady then + SetLength(StrCaseMap, StrCaseMapSize); + for CurrChar := Low(Char) to High(Char) do begin - for CurrChar := Low(Char) to High(Char) do - begin - {$IFDEF MSWINDOWS} - LoCaseChar := CurrChar; - UpCaseChar := CurrChar; - Windows.CharLowerBuff(@LoCaseChar, 1); - Windows.CharUpperBuff(@UpCaseChar, 1); - {$DEFINE CASE_MAP_INITIALIZED} - {$ENDIF MSWINDOWS} - {$IFDEF LINUX} - LoCaseChar := Char(tolower(Byte(CurrChar))); - UpCaseChar := Char(toupper(Byte(CurrChar))); - {$DEFINE CASE_MAP_INITIALIZED} - {$ENDIF LINUX} - {$IFNDEF CASE_MAP_INITIALIZED} - Implement case map initialization here - {$ENDIF ~CASE_MAP_INITIALIZED} - if CharIsUpper(CurrChar) then - ReCaseChar := LoCaseChar - else - if CharIsLower(CurrChar) then - ReCaseChar := UpCaseChar - else - ReCaseChar := CurrChar; - StrCaseMap[Ord(CurrChar) + StrLoOffset] := LoCaseChar; - StrCaseMap[Ord(CurrChar) + StrUpOffset] := UpCaseChar; - StrCaseMap[Ord(CurrChar) + StrReOffset] := ReCaseChar; - end; - StrCaseMapReady := True; + {$IFDEF MSWINDOWS} + LoCaseChar := CurrChar; + UpCaseChar := CurrChar; + Windows.CharLowerBuff(@LoCaseChar, 1); + Windows.CharUpperBuff(@UpCaseChar, 1); + {$DEFINE CASE_MAP_INITIALIZED} + {$ENDIF MSWINDOWS} + {$IFDEF LINUX} + LoCaseChar := Char(tolower(Byte(CurrChar))); + UpCaseChar := Char(toupper(Byte(CurrChar))); + {$DEFINE CASE_MAP_INITIALIZED} + {$ENDIF LINUX} + {$IFNDEF CASE_MAP_INITIALIZED} + Implement case map initialization here + {$ENDIF ~CASE_MAP_INITIALIZED} + if CharIsUpper(CurrChar) then + ReCaseChar := LoCaseChar + else + if CharIsLower(CurrChar) then + ReCaseChar := UpCaseChar + else + ReCaseChar := CurrChar; + StrCaseMap[Ord(CurrChar) + StrLoOffset] := LoCaseChar; + StrCaseMap[Ord(CurrChar) + StrUpOffset] := UpCaseChar; + StrCaseMap[Ord(CurrChar) + StrReOffset] := ReCaseChar; end; end; +procedure LoadStrCaseMap; +begin + LockedInitialization(StrCaseMapFlag, InitStrCaseMap); +end; + // Uppercases or Lowercases a give string depending on the // passed offset. (UpOffset or LoOffset) @@ -707,6 +721,9 @@ P: PChar; I, L: SizeInt; begin + if StrCaseMapFlag <> FLAG_INITIALIZED then + LoadStrCaseMap; + L := Length(Str); if L > 0 then begin @@ -728,6 +745,9 @@ var C: Char; begin + if StrCaseMapFlag <> FLAG_INITIALIZED then + LoadStrCaseMap; + if S <> nil then begin repeat @@ -2642,22 +2662,30 @@ function CharIsAlpha(const C: Char): Boolean; begin - Result := (StrCharTypes[C] and C1_ALPHA) <> 0; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and C1_ALPHA) <> 0; end; function CharIsAlphaNum(const C: Char): Boolean; begin - Result := ((StrCharTypes[C] and C1_ALPHA) <> 0) or ((StrCharTypes[C] and C1_DIGIT) <> 0); + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and (C1_ALPHA or C1_DIGIT)) <> 0; end; function CharIsBlank(const C: Char): Boolean; begin - Result := ((StrCharTypes[C] and C1_BLANK) <> 0); + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := ((StrCharTypes[Ord(C)] and C1_BLANK) <> 0); end; function CharIsControl(const C: Char): Boolean; begin - Result := (StrCharTypes[C] and C1_CNTRL) <> 0; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and C1_CNTRL) <> 0; end; function CharIsDelete(const C: Char): Boolean; @@ -2667,39 +2695,51 @@ function CharIsDigit(const C: Char): Boolean; begin - Result := (StrCharTypes[C] and C1_DIGIT) <> 0; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and C1_DIGIT) <> 0; end; function CharIsFracDigit(const C: Char): Boolean; begin - Result := (C = '.') or ((StrCharTypes[C] and C1_DIGIT) <> 0); + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (C = '.') or ((StrCharTypes[Ord(C)] and C1_DIGIT) <> 0); end; function CharIsHexDigit(const C: Char): Boolean; begin + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; case C of 'A'..'F', 'a'..'f': Result := True; else - Result := ((StrCharTypes[C] and C1_DIGIT) <> 0); + Result := ((StrCharTypes[Ord(C)] and C1_DIGIT) <> 0); end; end; function CharIsLower(const C: Char): Boolean; begin - Result := (StrCharTypes[C] and C1_LOWER) <> 0; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and C1_LOWER) <> 0; end; function CharIsNumberChar(const C: Char): Boolean; begin - Result := ((StrCharTypes[C] and C1_DIGIT) <> 0) or (C = '+') or (C = '-') or + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := ((StrCharTypes[Ord(C)] and C1_DIGIT) <> 0) or (C = '+') or (C = '-') or (C = {$IFDEF RTL220_UP}FormatSettings.{$ENDIF}DecimalSeparator); end; function CharIsNumber(const C: Char): Boolean; begin - Result := ((StrCharTypes[C] and C1_DIGIT) <> 0) or + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := ((StrCharTypes[Ord(C)] and C1_DIGIT) <> 0) or (C = {$IFDEF RTL220_UP}FormatSettings.{$ENDIF}DecimalSeparator); end; @@ -2710,7 +2750,9 @@ function CharIsPunctuation(const C: Char): Boolean; begin - Result := ((StrCharTypes[C] and C1_PUNCT) <> 0); + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := ((StrCharTypes[Ord(C)] and C1_PUNCT) <> 0); end; function CharIsReturn(const C: Char): Boolean; @@ -2720,12 +2762,16 @@ function CharIsSpace(const C: Char): Boolean; begin - Result := (StrCharTypes[C] and C1_SPACE) <> 0; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and C1_SPACE) <> 0; end; function CharIsUpper(const C: Char): Boolean; begin - Result := (StrCharTypes[C] and C1_UPPER) <> 0; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := (StrCharTypes[Ord(C)] and C1_UPPER) <> 0; end; function CharIsValidIdentifierLetter(const C: Char): Boolean; @@ -2772,7 +2818,9 @@ function CharType(const C: Char): Word; begin - Result := StrCharTypes[C]; + if StrCharTypesFlag <> FLAG_INITIALIZED then + LoadStrCharTypes; + Result := StrCharTypes[Ord(C)]; end; //=== PCharVector ============================================================ @@ -2867,16 +2915,25 @@ function CharLower(const C: Char): Char; begin + if StrCaseMapFlag <> FLAG_INITIALIZED then + LoadStrCaseMap; + Result := StrCaseMap[Ord(C) + StrLoOffset]; end; function CharToggleCase(const C: Char): Char; begin + if StrCaseMapFlag <> FLAG_INITIALIZED then + LoadStrCaseMap; + Result := StrCaseMap[Ord(C) + StrReOffset]; end; function CharUpper(const C: Char): Char; begin + if StrCaseMapFlag <> FLAG_INITIALIZED then + LoadStrCaseMap; + Result := StrCaseMap[Ord(C) + StrUpOffset]; end; @@ -2906,6 +2963,9 @@ function CharIPos(const S: string; C: Char; const Index: SizeInt): SizeInt; begin + if StrCaseMapFlag <> FLAG_INITIALIZED then + LoadStrCaseMap; + if (Index > 0) and (Index <= Length(S)) then begin C := CharUpper(C); @@ -5150,14 +5210,10 @@ Result := CompareNatural(S1, S2, True); end; +{$IFDEF UNITVERSIONING} initialization - LoadCharTypes; // this table first - LoadCaseMap; // or this function does not work - {$IFDEF UNITVERSIONING} RegisterUnitVersion(HInstance, UnitVersioning); - {$ENDIF UNITVERSIONING} -{$IFDEF UNITVERSIONING} finalization UnregisterUnitVersion(HInstance); {$ENDIF UNITVERSIONING} Index: JclSynch.pas =================================================================== --- JclSynch.pas (revision 3343) +++ JclSynch.pas (working copy) @@ -330,7 +330,24 @@ function ValidateMutexName(const aName: string): string; +// locked initialization and finalization of ressources +type + EJclLockedResourceError = class(EJclWin32Error); + TJclInitializationProcedure = procedure; + TJclFinalizationProcedure = procedure; +const + FLAG_NOT_INITIALIZED = 0; + FLAG_INITIALIZING = 1; + FLAG_INITIALIZED = 2; + FLAG_FINALIZING = 3; + FLAG_FINALIZED = 4; + FLAG_INITFAILED = 5; + FLAG_INTERROR = 6; + +procedure LockedInitialization(var InitFlag: Integer; InitProc: TJclInitializationProcedure); +procedure LockedFinalization(var InitFlag: Integer; FinitProc: TJclFinalizationProcedure); + {$IFDEF UNITVERSIONING} const UnitVersioning: TUnitVersionInfo = ( @@ -1722,7 +1739,129 @@ Result := StrReplaceChar(Result, '\', '_'); end; +procedure LockedInitialization(var InitFlag: Integer; InitProc: TJclInitializationProcedure); +begin + case LockedCompareExchange(InitFlag, FLAG_INITIALIZING, FLAG_NOT_INITIALIZED) of + FLAG_NOT_INITIALIZED: + // the ressource is not initialized yet, going to initialize now! + begin + try + InitProc; + if LockedCompareExchange(InitFlag, FLAG_INITIALIZED, FLAG_INITIALIZING) <> FLAG_INITIALIZING then + raise EJclLockedResourceError.CreateRes(@RsLockedResInternalError); + except + InitFlag := FLAG_INITFAILED; + raise; + end; + end; + FLAG_INITIALIZING: + // the ressource is being initialized in an other thread, wait until it is initialized + begin + while (True) do + begin + Sleep(1); + case InitFlag of + FLAG_INITIALIZING: ; // loop while other thread has initialized + FLAG_INITIALIZED: // done in the other thread, the resource is initialized + Break; + FLAG_FINALIZING: + raise EJclLockedResourceError.CreateRes(@RsLockedResSeqFinalizing); + FLAG_FINALIZED: + raise EJclLockedResourceError.CreateRes(@RsLockedResSeqFinalized); + else + //FLAG_NOT_INITIALIZED + //FLAG_INTERROR + //FLAG_INITFAILED + raise EJclLockedResourceError.CreateRes(@RsLockedResInternalError); + end; + end; + end; + FLAG_INITIALIZED: ; + // nothing to do the ressource is already initialized + FLAG_FINALIZING: + // sequence error: trying to initialize a ressource that is being finalized + raise EJclLockedResourceError.CreateRes(@RsLockedResSeqFinalizing); + FLAG_FINALIZED: + // sequence error: trying to initialize a ressource that is already finalized + raise EJclLockedResourceError.CreateRes(@RsLockedResSeqFinalized); + FLAG_INITFAILED: + // the resource could not be initialized + raise EJclLockedResourceError.CreateRes(@RsLockedResInitError); + else + // FLAG_INTERROR and others + raise EJclLockedResourceError.CreateRes(@RsLockedResInternalError); + end; +end; +procedure LockedFinalization(var InitFlag: Integer; FinitProc: TJclFinalizationProcedure); +begin + case LockedCompareExchange(InitFlag, FLAG_FINALIZING, FLAG_INITIALIZED) of + FLAG_NOT_INITIALIZED: ; + // the ressource was not initialized, no need to finalize it + FLAG_INITIALIZING: + // the ressource is being initialized in an other thread, wait until it is initialized and raise a sequence error + begin + try + while (True) do + begin + Sleep(1); + case LockedCompareExchange(InitFlag, FLAG_FINALIZING, FLAG_INITIALIZED) of + FLAG_INITIALIZING: ; + // loop while other thread has initialized + FLAG_INITIALIZED: + // done in the other thread, the resource is initialized, just destroy it and raise an error + FinitProc; + else + //FLAG_FINALIZING + //FLAG_FINALIZED + //FLAG_NOT_INITIALIZED + //FLAG_INTERROR + Break; + end; + end; + finally + // finally raise an exception since the resource was initializing while being finalized + InitFlag := FLAG_INTERROR; + raise EJclLockedResourceError.CreateRes(@RsLockedResSeqInitializing); + end; + end; + FLAG_INITIALIZED: + // the ressource is being finalized here! + begin + try + FinitProc; + if LockedCompareExchange(InitFlag, FLAG_FINALIZED, FLAG_FINALIZING) <> FLAG_FINALIZING then + raise EJclLockedResourceError.CreateRes(@RsLockedResInternalError); + except + InitFlag := FLAG_INTERROR; + raise; + end; + end; + FLAG_FINALIZING: ; + // the resource is being finalized in an other thread, nothing to do + FLAG_FINALIZED: ; + // the resource is already finalized, nothing to do + FLAG_INITFAILED: + // the resource is being finalized here + begin + if LockedCompareExchange(InitFlag, FLAG_FINALIZING, FLAG_INITFAILED) = FLAG_INITFAILED then + begin + // the resource is being finalized here + begin + try + FinitProc; + finally + if LockedCompareExchange(InitFlag, FLAG_INTERROR, FLAG_FINALIZING) <> FLAG_FINALIZING then + raise EJclLockedResourceError.CreateRes(@RsLockedResInternalError); + end; + end; + end; + end; + else + // FLAG_INTERROR and others + raise EJclLockedResourceError.CreateRes(@RsLockedResInternalError); + end; +end; {$IFDEF UNITVERSIONING} initialization Index: JclUnicode.pas =================================================================== --- JclUnicode.pas (revision 3343) +++ JclUnicode.pas (working copy) @@ -1348,7 +1348,7 @@ type EJclUnicodeError = class(EJclError); -// functions to load Unicode data from resource +// functions to load Unicode data from resource, can be safely called multiple times procedure LoadCharacterCategories; procedure LoadCaseMappingData; procedure LoadDecompositionData; @@ -1422,11 +1422,6 @@ // used to negate a set of categories ClassAll = [Low(TCharacterCategory)..High(TCharacterCategory)]; -var - // As the global data can be accessed by several threads it should be guarded - // while the data is loaded. - LoadInProgress: TJclCriticalSection; - function OpenResourceStream(const ResName: string): TJclEasyStream; var ResourceStream: TStream; @@ -1501,11 +1496,10 @@ TCategoriesArray = array of array of TCharacterCategories; var - // character categories, stored in the system's swap file and mapped on demand - CategoriesLoaded: Boolean; - Categories: array [Byte] of TCategoriesArray; + CategoriesFlag: Integer = FLAG_NOT_INITIALIZED; + Categories: array of TCategoriesArray; -procedure LoadCharacterCategories; +procedure InitCharacterCategories; // Loads the character categories data (as saved by the Unicode database extractor, see also // the comments about JclUnicode.res above). var @@ -1516,58 +1510,54 @@ First, Second, Third: Byte; J, K: Integer; begin - // Data already loaded? - if not CategoriesLoaded then - begin - // make sure no other code is currently modifying the global data area - LoadInProgress.Enter; - try - CategoriesLoaded := True; - Stream := OpenResourceStream('CATEGORIES'); - try - while Stream.Position < Stream.Size do + SetLength(Categories, 256); + + Stream := OpenResourceStream('CATEGORIES'); + try + while Stream.Position < Stream.Size do + begin + // a) read which category is current in the stream + Category := TCharacterCategory(Stream.ReadByte); + // b) read the size of the ranges and the ranges themself + Size := Stream.ReadInteger; + if Size > 0 then + begin + SetLength(Buffer, Size); + for J := 0 to Size - 1 do begin - // a) read which category is current in the stream - Category := TCharacterCategory(Stream.ReadByte); - // b) read the size of the ranges and the ranges themself - Size := Stream.ReadInteger; - if Size > 0 then + Buffer[J].Start := StreamReadChar(Stream); + Buffer[J].Stop := StreamReadChar(Stream); + end; + + // c) go through every range and add the current category to each code point + for J := 0 to Size - 1 do + for K := Buffer[J].Start to Buffer[J].Stop do begin - SetLength(Buffer, Size); - for J := 0 to Size - 1 do - begin - Buffer[J].Start := StreamReadChar(Stream); - Buffer[J].Stop := StreamReadChar(Stream); - end; + Assert(K < $1000000, LoadResString(@RsCategoryUnicodeChar)); - // c) go through every range and add the current category to each code point - for J := 0 to Size - 1 do - for K := Buffer[J].Start to Buffer[J].Stop do - begin - Assert(K < $1000000, LoadResString(@RsCategoryUnicodeChar)); - - First := (K shr 16) and $FF; - Second := (K shr 8) and $FF; - Third := K and $FF; - // add second step array if not yet done - if Categories[First] = nil then - SetLength(Categories[First], 256); - if Categories[First, Second] = nil then - SetLength(Categories[First, Second], 256); - Include(Categories[First, Second, Third], Category); - end; + First := (K shr 16) and $FF; + Second := (K shr 8) and $FF; + Third := K and $FF; + // add second step array if not yet done + if Categories[First] = nil then + SetLength(Categories[First], 256); + if Categories[First, Second] = nil then + SetLength(Categories[First, Second], 256); + Include(Categories[First, Second, Third], Category); end; - end; - // Assert(Stream.Position = Stream.Size); - finally - Stream.Free; end; - finally - LoadInProgress.Leave; end; + // Assert(Stream.Position = Stream.Size); + finally + Stream.Free; end; end; +procedure LoadCharacterCategories; +begin + LockedInitialization(CategoriesFlag, InitCharacterCategories); +end; + function CategoryLookup(Code: Cardinal; Cats: TCharacterCategories): Boolean; overload; // determines whether the Code is in the given category var @@ -1576,7 +1566,7 @@ Assert(Code < $1000000, LoadResString(@RsCategoryUnicodeChar)); // load property data if not already done - if not CategoriesLoaded then + if CategoriesFlag <> FLAG_INITIALIZED then LoadCharacterCategories; First := (Code shr 16) and $FF; @@ -1598,84 +1588,80 @@ // An array for all case mappings (including 1 to many casing if saved by the extraction program). // The organization is a sparse, two stage matrix. // SingletonMapping is to quickly return a single default mapping. - CaseDataLoaded: Boolean; - CaseMapping: array [Byte] of TCaseArray; + CaseDataFlag: Integer = FLAG_NOT_INITIALIZED; + CaseMapping: array of TCaseArray; -procedure LoadCaseMappingData; +procedure InitCaseMappingData; var Stream: TJclEasyStream; I, J, Code, Size: Integer; First, Second, Third: Byte; begin - if not CaseDataLoaded then - begin - // make sure no other code is currently modifying the global data area - LoadInProgress.Enter; + SetLength(CaseMapping, 256); - try - CaseDataLoaded := True; - Stream := OpenResourceStream('CASE'); - try - // the first entry in the stream is the number of entries in the case mapping table - Size := Stream.ReadInteger; - for I := 0 to Size - 1 do - begin - // a) read actual code point - Code := StreamReadChar(Stream); - Assert(Code < $1000000, LoadResString(@RsCasedUnicodeChar)); + Stream := OpenResourceStream('CASE'); + try + // the first entry in the stream is the number of entries in the case mapping table + Size := Stream.ReadInteger; + for I := 0 to Size - 1 do + begin + // a) read actual code point + Code := StreamReadChar(Stream); + Assert(Code < $1000000, LoadResString(@RsCasedUnicodeChar)); - // if there is no high byte entry in the first stage table then create one - First := (Code shr 16) and $FF; - Second := (Code shr 8) and $FF; - Third := Code and $FF; - if CaseMapping[First] = nil then - SetLength(CaseMapping[First], 256); - if CaseMapping[First, Second] = nil then - SetLength(CaseMapping[First, Second], 256); + // if there is no high byte entry in the first stage table then create one + First := (Code shr 16) and $FF; + Second := (Code shr 8) and $FF; + Third := Code and $FF; + if CaseMapping[First] = nil then + SetLength(CaseMapping[First], 256); + if CaseMapping[First, Second] = nil then + SetLength(CaseMapping[First, Second], 256); - // b) read fold case array - Size := Stream.ReadByte; - if Size > 0 then - begin - SetLength(CaseMapping[First, Second, Third, ctFold], Size); - for J := 0 to Size - 1 do - CaseMapping[First, Second, Third, ctFold, J] := StreamReadChar(Stream); - end; - // c) read lower case array - Size := Stream.ReadByte; - if Size > 0 then - begin - SetLength(CaseMapping[First, Second, Third, ctLower], Size); - for J := 0 to Size - 1 do - CaseMapping[First, Second, Third, ctLower, J] := StreamReadChar(Stream); - end; - // d) read title case array - Size := Stream.ReadByte; - if Size > 0 then - begin - SetLength(CaseMapping[First, Second, Third, ctTitle], Size); - for J := 0 to Size - 1 do - CaseMapping[First, Second, Third, ctTitle, J] := StreamReadChar(Stream); - end; - // e) read upper case array - Size := Stream.ReadByte; - if Size > 0 then - begin - SetLength(CaseMapping[First, Second, Third, ctUpper], Size); - for J := 0 to Size - 1 do - CaseMapping[First, Second, Third, ctUpper, J] := StreamReadChar(Stream); - end; - end; - Assert(Stream.Position = Stream.Size); - finally - Stream.Free; + // b) read fold case array + Size := Stream.ReadByte; + if Size > 0 then + begin + SetLength(CaseMapping[First, Second, Third, ctFold], Size); + for J := 0 to Size - 1 do + CaseMapping[First, Second, Third, ctFold, J] := StreamReadChar(Stream); end; - finally - LoadInProgress.Leave; + // c) read lower case array + Size := Stream.ReadByte; + if Size > 0 then + begin + SetLength(CaseMapping[First, Second, Third, ctLower], Size); + for J := 0 to Size - 1 do + CaseMapping[First, Second, Third, ctLower, J] := StreamReadChar(Stream); + end; + // d) read title case array + Size := Stream.ReadByte; + if Size > 0 then + begin + SetLength(CaseMapping[First, Second, Third, ctTitle], Size); + for J := 0 to Size - 1 do + CaseMapping[First, Second, Third, ctTitle, J] := StreamReadChar(Stream); + end; + // e) read upper case array + Size := Stream.ReadByte; + if Size > 0 then + begin + SetLength(CaseMapping[First, Second, Third, ctUpper], Size); + for J := 0 to Size - 1 do + CaseMapping[First, Second, Third, ctUpper, J] := StreamReadChar(Stream); + end; end; + Assert(Stream.Position = Stream.Size); + finally + Stream.Free; end; end; +procedure LoadCaseMappingData; +begin + LockedInitialization(CaseDataFlag, InitCaseMappingData); +end; + function CaseLookup(Code: Cardinal; CaseType: TCaseType; var Mapping: TUCS4Array): Boolean; // Performs a lookup of the given code; returns True if Found, with Mapping referring to the mapping. // ctFold is handled specially: if no mapping is found then result of looking up ctLower @@ -1686,7 +1672,7 @@ Assert(Code < $1000000, LoadResString(@RsCasedUnicodeChar)); // load case mapping data if not already done - if not CaseDataLoaded then + if CaseDataFlag <> FLAG_INITIALIZED then LoadCaseMappingData; First := (Code shr 16) and $FF; @@ -1772,89 +1758,86 @@ Leaves: TUCS4Array; end; TDecompositions = array of array of TDecomposition; - TDecompositionsArray = array [Byte] of TDecompositions; + TDecompositionsArray = array of TDecompositions; var // list of decompositions, organized (again) as three stage matrix // Note: there are two tables, one for canonical decompositions and the other one // for compatibility decompositions. - DecompositionsLoaded: Boolean; + DecompositionsFlag: Integer = FLAG_NOT_INITIALIZED; CanonicalDecompositions, CompatibleDecompositions: TDecompositionsArray; -procedure LoadDecompositionData; +procedure InitDecompositionData; var Stream: TJclEasyStream; I, J, Code, Size: Integer; First, Second, Third: Byte; begin - if not DecompositionsLoaded then - begin - // make sure no other code is currently modifying the global data area - LoadInProgress.Enter; + SetLength(CanonicalDecompositions, 256); + SetLength(CompatibleDecompositions, 256); - try - DecompositionsLoaded := True; - Stream := OpenResourceStream('DECOMPOSITION'); - try - // determine how many decomposition entries we have - Size := Stream.ReadInteger; - for I := 0 to Size - 1 do - begin - Code := StreamReadChar(Stream); + Stream := OpenResourceStream('DECOMPOSITION'); + try + // determine how many decomposition entries we have + Size := Stream.ReadInteger; + for I := 0 to Size - 1 do + begin + Code := StreamReadChar(Stream); - Assert((Code and not $40000000) < $1000000, LoadResString(@RsDecomposedUnicodeChar)); + Assert((Code and not $40000000) < $1000000, LoadResString(@RsDecomposedUnicodeChar)); - // if there is no high byte entry in the first stage table then create one - First := (Code shr 16) and $FF; - Second := (Code shr 8) and $FF; - Third := Code and $FF; + // if there is no high byte entry in the first stage table then create one + First := (Code shr 16) and $FF; + Second := (Code shr 8) and $FF; + Third := Code and $FF; - // insert into the correct table depending on bit 30 - // (if set then it is a compatibility decomposition) - if Code and $40000000 <> 0 then - begin - if CompatibleDecompositions[First] = nil then - SetLength(CompatibleDecompositions[First], 256); - if CompatibleDecompositions[First, Second] = nil then - SetLength(CompatibleDecompositions[First, Second], 256); + // insert into the correct table depending on bit 30 + // (if set then it is a compatibility decomposition) + if Code and $40000000 <> 0 then + begin + if CompatibleDecompositions[First] = nil then + SetLength(CompatibleDecompositions[First], 256); + if CompatibleDecompositions[First, Second] = nil then + SetLength(CompatibleDecompositions[First, Second], 256); - Size := Stream.ReadByte; - if Size > 0 then - begin - CompatibleDecompositions[First, Second, Third].Tag := TCompatibilityFormattingTag(Stream.ReadByte); - SetLength(CompatibleDecompositions[First, Second, Third].Leaves, Size); - for J := 0 to Size - 1 do - CompatibleDecompositions[First, Second, Third].Leaves[J] := StreamReadChar(Stream); - end; - end - else - begin - if CanonicalDecompositions[First] = nil then - SetLength(CanonicalDecompositions[First], 256); - if CanonicalDecompositions[First, Second] = nil then - SetLength(CanonicalDecompositions[First, Second], 256); + Size := Stream.ReadByte; + if Size > 0 then + begin + CompatibleDecompositions[First, Second, Third].Tag := TCompatibilityFormattingTag(Stream.ReadByte); + SetLength(CompatibleDecompositions[First, Second, Third].Leaves, Size); + for J := 0 to Size - 1 do + CompatibleDecompositions[First, Second, Third].Leaves[J] := StreamReadChar(Stream); + end; + end + else + begin + if CanonicalDecompositions[First] = nil then + SetLength(CanonicalDecompositions[First], 256); + if CanonicalDecompositions[First, Second] = nil then + SetLength(CanonicalDecompositions[First, Second], 256); - Size := Stream.ReadByte; - if Size > 0 then - begin - CanonicalDecompositions[First, Second, Third].Tag := TCompatibilityFormattingTag(Stream.ReadByte); - SetLength(CanonicalDecompositions[First, Second, Third].Leaves, Size); - for J := 0 to Size - 1 do - CanonicalDecompositions[First, Second, Third].Leaves[J] := StreamReadChar(Stream); - end; - end; + Size := Stream.ReadByte; + if Size > 0 then + begin + CanonicalDecompositions[First, Second, Third].Tag := TCompatibilityFormattingTag(Stream.ReadByte); + SetLength(CanonicalDecompositions[First, Second, Third].Leaves, Size); + for J := 0 to Size - 1 do + CanonicalDecompositions[First, Second, Third].Leaves[J] := StreamReadChar(Stream); end; - Assert(Stream.Position = Stream.Size); - finally - Stream.Free; end; - finally - LoadInProgress.Leave; end; + Assert(Stream.Position = Stream.Size); + finally + Stream.Free; end; end; +procedure LoadDecompositionData; +begin + LockedInitialization(DecompositionsFlag, InitDecompositionData); +end; + function UnicodeDecomposeHangul(Code: UCS4): TUCS4Array; // algorithmically decomposes hangul character var @@ -1879,7 +1862,7 @@ Assert((Code and not $40000000) < $1000000, LoadResString(@RsDecomposedUnicodeChar)); // load decomposition data if not already done - if not DecompositionsLoaded then + if DecompositionsFlag <> FLAG_INITIALIZED then LoadDecompositionData; Result := nil; @@ -1927,69 +1910,65 @@ var // canonical combining classes, again as two stage matrix - CCCsLoaded: Boolean; - CCCs: array [Byte] of TClassArray; + CCCsFlag: Integer = FLAG_NOT_INITIALIZED; + CCCs: array of TClassArray; -procedure LoadCombiningClassData; +procedure InitCombiningClassData; var Stream: TJclEasyStream; I, J, K, Size: Integer; Buffer: TRangeArray; First, Second, Third: Byte; begin - // make sure no other code is currently modifying the global data area - LoadInProgress.Enter; + SetLength(CCCs, 256); + Stream := OpenResourceStream('COMBINING'); try - if not CCCsLoaded then + while Stream.Position < Stream.Size do begin - CCCsLoaded := True; - Stream := OpenResourceStream('COMBINING'); - try - while Stream.Position < Stream.Size do + // a) determine which class is stored here + I := Stream.ReadByte; + // b) determine how many ranges are assigned to this class + Size := Stream.ReadByte; + // c) read start and stop code of each range + if Size > 0 then + begin + SetLength(Buffer, Size); + for J := 0 to Size - 1 do begin - // a) determine which class is stored here - I := Stream.ReadByte; - // b) determine how many ranges are assigned to this class - Size := Stream.ReadByte; - // c) read start and stop code of each range - if Size > 0 then - begin - SetLength(Buffer, Size); - for J := 0 to Size - 1 do - begin - Buffer[J].Start := StreamReadChar(Stream); - Buffer[J].Stop := StreamReadChar(Stream); - end; + Buffer[J].Start := StreamReadChar(Stream); + Buffer[J].Stop := StreamReadChar(Stream); + end; - // d) put this class in every of the code points just loaded - for J := 0 to Size - 1 do - for K := Buffer[J].Start to Buffer[J].Stop do - begin - // (outchy) TODO: handle in a cleaner way - Assert(K < $1000000, LoadResString(@RsCombiningClassUnicodeChar)); - First := (K shr 16) and $FF; - Second := (K shr 8) and $FF; - Third := K and $FF; - // add second step array if not yet done - if CCCs[First] = nil then - SetLength(CCCs[First], 256); - if CCCs[First, Second] = nil then - SetLength(CCCs[First, Second], 256); - CCCs[First, Second, Third] := I; - end; + // d) put this class in every of the code points just loaded + for J := 0 to Size - 1 do + for K := Buffer[J].Start to Buffer[J].Stop do + begin + // (outchy) TODO: handle in a cleaner way + Assert(K < $1000000, LoadResString(@RsCombiningClassUnicodeChar)); + First := (K shr 16) and $FF; + Second := (K shr 8) and $FF; + Third := K and $FF; + // add second step array if not yet done + if CCCs[First] = nil then + SetLength(CCCs[First], 256); + if CCCs[First, Second] = nil then + SetLength(CCCs[First, Second], 256); + CCCs[First, Second, Third] := I; end; - end; - // Assert(Stream.Position = Stream.Size); - finally - Stream.Free; end; end; + // Assert(Stream.Position = Stream.Size); finally - LoadInProgress.Leave; + Stream.Free; end; end; +procedure LoadCombiningClassData; +begin + LockedInitialization(CCCsFlag, InitCombiningClassData); +end; + function CanonicalCombiningClass(Code: Cardinal): Cardinal; var First, Second, Third: Byte; @@ -1997,7 +1976,7 @@ Assert(Code < $1000000, LoadResString(@RsCombiningClassUnicodeChar)); // load combining class data if not already done - if not CCCsLoaded then + if CCCsFlag <> FLAG_INITIALIZED then LoadCombiningClassData; First := (Code shr 16) and $FF; @@ -2019,56 +1998,52 @@ end; var + NumberFlag: Integer = FLAG_NOT_INITIALIZED; // array to hold the number equivalents for specific codes NumberCodes: array of TCodeIndex; // array of numbers used in NumberCodes Numbers: array of TUcNumber; -procedure LoadNumberData; +procedure InitNumberData; var Stream: TJclEasyStream; Size, I: Integer; begin - // make sure no other code is currently modifying the global data area - LoadInProgress.Enter; - + Stream := OpenResourceStream('NUMBERS'); try - if NumberCodes = nil then - begin - Stream := OpenResourceStream('NUMBERS'); - try - // Numbers are special (compared to other Unicode data) as they utilize two - // arrays, one containing all used numbers (in nominator-denominator format) and - // another one which maps a code point to one of the numbers in the first array. + // Numbers are special (compared to other Unicode data) as they utilize two + // arrays, one containing all used numbers (in nominator-denominator format) and + // another one which maps a code point to one of the numbers in the first array. - // a) determine size of numbers array - Size := Stream.ReadByte; - SetLength(Numbers, Size); - // b) read numbers data - for I := 0 to Size - 1 do - begin - Numbers[I].Numerator := Stream.ReadInteger; - Numbers[I].Denominator := Stream.ReadInteger; - end; - // c) determine size of index array - Size := Stream.ReadInteger; - SetLength(NumberCodes, Size); - // d) read index data - for I := 0 to Size - 1 do - begin - NumberCodes[I].Code := StreamReadChar(Stream); - NumberCodes[I].Index := Stream.ReadByte; - end; - Assert(Stream.Position = Stream.Size); - finally - Stream.Free; - end; + // a) determine size of numbers array + Size := Stream.ReadByte; + SetLength(Numbers, Size); + // b) read numbers data + for I := 0 to Size - 1 do + begin + Numbers[I].Numerator := Stream.ReadInteger; + Numbers[I].Denominator := Stream.ReadInteger; end; + // c) determine size of index array + Size := Stream.ReadInteger; + SetLength(NumberCodes, Size); + // d) read index data + for I := 0 to Size - 1 do + begin + NumberCodes[I].Code := StreamReadChar(Stream); + NumberCodes[I].Index := Stream.ReadByte; + end; + Assert(Stream.Position = Stream.Size); finally - LoadInProgress.Leave; + Stream.Free; end; end; +procedure LoadNumberData; +begin + LockedInitialization(NumberFlag, InitNumberData); +end; + function UnicodeNumberLookup(Code: UCS4; var Number: TUcNumber): Boolean; // Searches for the given code and returns its number equivalent (if there is one). // Typical cases are: '1/6' (U+2159), '3/8' (U+215C), 'XII' (U+216B) etc. @@ -2077,7 +2052,7 @@ L, R, M: Integer; begin // load number data if not already done - if NumberCodes = nil then + if NumberFlag <> FLAG_INITIALIZED then LoadNumberData; Result := False; @@ -2114,55 +2089,51 @@ end; var + CompositionFlag: Integer = FLAG_NOT_INITIALIZED; // list of composition mappings Compositions: array of TComposition; MaxCompositionSize: Integer; -procedure LoadCompositionData; +procedure InitCompositionData; var Stream: TJclEasyStream; I, J, Size: Integer; begin - // make sure no other code is currently modifying the global data area - LoadInProgress.Enter; - + Stream := OpenResourceStream('COMPOSITION'); try - if Compositions = nil then + // a) determine size of compositions array + Size := Stream.ReadInteger; + SetLength(Compositions, Size); + // b) read data + for I := 0 to Size - 1 do begin - Stream := OpenResourceStream('COMPOSITION'); - try - // a) determine size of compositions array - Size := Stream.ReadInteger; - SetLength(Compositions, Size); - // b) read data - for I := 0 to Size - 1 do - begin - Compositions[I].Code := StreamReadChar(Stream); - Size := Stream.ReadByte; - if Size > MaxCompositionSize then - MaxCompositionSize := Size; - SetLength(Compositions[I].Next, Size - 1); - Compositions[I].First := StreamReadChar(Stream); - for J := 0 to Size - 2 do - Compositions[I].Next[J] := StreamReadChar(Stream); - end; - Assert(Stream.Position = Stream.Size); - finally - Stream.Free; - end; + Compositions[I].Code := StreamReadChar(Stream); + Size := Stream.ReadByte; + if Size > MaxCompositionSize then + MaxCompositionSize := Size; + SetLength(Compositions[I].Next, Size - 1); + Compositions[I].First := StreamReadChar(Stream); + for J := 0 to Size - 2 do + Compositions[I].Next[J] := StreamReadChar(Stream); end; + Assert(Stream.Position = Stream.Size); finally - LoadInProgress.Leave; + Stream.Free; end; end; +procedure LoadCompositionData; +begin + LockedInitialization(CompositionFlag, InitCompositionData); +end; + function UnicodeCompose(const Codes: array of UCS4; out Composite: UCS4): Integer; // Maps the sequence of Codes (up to MaxCompositionSize codes) to a composite // Result is the number of Codes that were composed (at least 1 if Codes is not empty) var L, R, M, I, HighCodes, HighNext: Integer; begin - if Compositions = nil then + if CompositionFlag <> FLAG_INITIALIZED then LoadCompositionData; Result := 0; @@ -7189,8 +7160,6 @@ procedure PrepareUnicodeData; // Prepares structures which are globally needed. begin - LoadInProgress := TJclCriticalSection.Create; - if (Win32Platform and VER_PLATFORM_WIN32_NT) <> 0 then @WideCompareText := @CompareTextWinNT else @@ -7200,7 +7169,7 @@ procedure FreeUnicodeData; // Frees all data which has been allocated and which is not automatically freed by Delphi. begin - FreeAndNil(LoadInProgress); + // all dynamic arrays are automatically freed end; initialization |
|
Could you give a look at the attached patch file? It can be applied on current trunk head revision in jcl/source/common. |
|
It appears to work fine. Thanks for the quick fix. |
|
How much is the size reduction for your applications? |
|
There was no size reduction!! I did some anlysis with project analyzer on three cases. Simple project without JclStrings Simple project with old JclStrings Simple project with new JclStrings Here are the results: File size No Jcl Units: 69, Forms:1, Code:1126998, ICode: 3454, Data: 9937, Bss: 21318, Resources: 60416 5,8 Mbytes New JclStrings Units: 86, Forms: 1, Code: 1149706, ICode: 3773, Data: 16653, Bss: 21463, Resources: 237568 6,44 Mbytes Old JclStrings Units: 86, Forms:1, Code:1149826, ICode: 3785, Data: 16657, Bss: 545743, Resources: 237568 6,44 Mbytes So just adding JclStrings and not using it adds about 0,64 Mbytes to the executable, but this was not due to StrCaseMap etc. The old JclStrings added 524.292 bytes of unitialized data (BSS) but these do not increase file size. The big difference between using JclStrings and not using it comes in resources. Is this due to Unicode data getting included? Here is the list of units that adding JclStrings pulls in: AnsiStrings 36 CODE AnsiStrings 20 ICODE AnsiStrings 8 DATA AnsiStrings 4 BSS ComConst 32 CODE ComObj 4.419 CODE ComObj 85 ICODE ComObj 48 DATA ComObj 17 BSS JclAnsiStrings 176 CODE JclAnsiStrings 8 ICODE JclAnsiStrings 12 BSS JclBase 431 CODE JclBase 19 ICODE JclBase 8 BSS JclCharsets 10.157 CODE JclCharsets 8 ICODE JclCharsets 2.416 DATA JclCharsets 4 BSS JclFileUtils 857 CODE JclFileUtils 8 ICODE JclFileUtils 48 DATA JclFileUtils 4 BSS JclMath 132 CODE JclMath 8 ICODE JclMath 8 DATA JclMath 4 BSS JclRegistry 1.032 CODE JclRegistry 20 ICODE JclRegistry 168 DATA JclRegistry 4 BSS JclResources 824 CODE JclShell 23 CODE JclShell 12 ICODE JclShell 4 DATA JclShell 4 BSS JclStrings 237 CODE JclStrings 8 ICODE JclStrings 8 DATA JclStrings 12 BSS JclSysInfo 2.602 CODE JclSysInfo 29 ICODE JclSysInfo 2.132 DATA JclSysInfo 16 BSS JclSysUtils 264 CODE JclSysUtils 21 ICODE JclSysUtils 16 DATA JclSysUtils 8 BSS JclUnicode 11.434 CODE JclUnicode 54 ICODE JclUnicode 1.860 DATA JclUnicode 40 BSS JclWin32 6 CODE Snmp 26 CODE Snmp 19 ICODE Snmp 4 BSS |
|
Delphi 2009+ includes Character.pas which covers pretty much the same functionality as StrCaseMap etc. and loads a character database as a resource (this has the advantage of saving the CPU time to compute the database). This unit is loaded by VCL and duplicating that functionality in JclStrings is wasteful. Ideally JclStrings and JclUnicode should use Character.pas with IFDEFs. Right now and even with the new version just adding JclStrings to the uses clause increase exe size by 600K, eats up 500 kbytes of heap memory, and uses quite a bit of CPU time to compute the StrCaseMap. All in all the current situation is not ideal. |
|
This increase in size is a consequence of JclUnicode.pas. This source file includes JclUnicode.res and this database is quite big. However, the JCL database contains more data than VCL's one. The data is more up-to-date (it is based on Unicode Character Database 5.2.0 http://www.unicode.org/ucd/ ) while the VCL is lagging behind. |
2010-09-22 03:12
|
JclWideStrings.pas (65,236 bytes) |
2010-09-22 03:13
|
JclStrings.pas (148,213 bytes) |
|
I have uploaded two modified files. a) JclWideStrings This file was using JclUnicode in just two functions StrICompW and StrLICompW. I have decoupled JclUnicode from JclWideStrings. b) JclStrings For Delphi 2009 and above I used the Character unit instead of StrCharTypes etc. The change was against the SVN version and not the patched version. The reason is that the overhead of StrCharTypes is tiny in the ANSI version so in that case the patch is meaningless. For the unicode Delphi versions the modified unit doesn't use StrCharTypes etc. at all. |
|
This change sounds promising :) Is the real name in your Mantis account right so I can add you to the author list? |
|
Yes |
2010-09-22 13:14
|
JclStrings_polished.pas (148,318 bytes) |
2010-09-22 13:14
|
JclWideStrings_polished.pas (65,378 bytes) |
|
I polished your modifications. We really prefer {$IFDEF}{$ELSE}{$ENDIF} over {$IFNDEF}{$ELSE}{$ENDIF}. The next change is to introduce an option to switch between JCL's character database and RTL's one. I don't want to have an automatic switch (like the $IFDEF SUPPORTS_UNICODE you use) since RTL's database is not as good as the JCL's one. This option will be added to the JCL installer. |
|
Fine thanks. However note that the alternative to the use of Character.pas is not the JclUnicode database but the on the fly generated StrCharTypes etc. which is based on the Windows function GetStringTypeEx. I suppose that Borland's character database is created in a similar way. For example, there is no guarantee that JclUnicode's UnicodeIs... functions agree 100% with JclString's Is... functions anyway. So even before the introduction of Character.pas in JclString, two sets of functions existed in JCL (one in JclUnicode and JclStrings) that use different approaches. |
|
This is committed in revision 3362. |
Date Modified | Username | Field | Change |
---|---|---|---|
2010-09-17 11:44 | Kiriakos | New Issue | |
2010-09-17 11:44 | Kiriakos | IDE version | => Delphi/C++Builder 2009 |
2010-09-20 20:36 | outchy | File Added: jcl_string_overhead.patch | |
2010-09-20 20:37 | outchy | Note Added: 0017686 | |
2010-09-21 00:30 | Kiriakos | Note Added: 0017687 | |
2010-09-21 06:15 | outchy | Note Added: 0017688 | |
2010-09-21 18:21 | Kiriakos | Note Added: 0017691 | |
2010-09-21 19:33 | Kiriakos | Note Added: 0017692 | |
2010-09-21 20:20 | outchy | Note Added: 0017694 | |
2010-09-22 03:12 | Kiriakos | File Added: JclWideStrings.pas | |
2010-09-22 03:13 | Kiriakos | File Added: JclStrings.pas | |
2010-09-22 03:21 | Kiriakos | Note Added: 0017695 | |
2010-09-22 06:35 | outchy | Note Added: 0017696 | |
2010-09-22 08:18 | Kiriakos | Note Added: 0017697 | |
2010-09-22 13:14 | outchy | File Added: JclStrings_polished.pas | |
2010-09-22 13:14 | outchy | File Added: JclWideStrings_polished.pas | |
2010-09-22 13:20 | outchy | Note Added: 0017700 | |
2010-09-22 14:30 | Kiriakos | Note Added: 0017701 | |
2010-10-05 15:24 | outchy | Note Added: 0017753 | |
2010-10-05 15:24 | outchy | Assigned To | => outchy |
2010-10-05 15:24 | outchy | Status | new => feedback |
2010-10-11 10:52 | outchy | Fixed in revision | => 3362 |
2010-10-11 10:52 | outchy | Status | feedback => resolved |
2010-10-11 10:52 | outchy | Fixed in Version | => Version 2.3 (Subversion repository/Daily zips) |
2010-10-11 10:52 | outchy | Resolution | open => fixed |