View Issue Details

IDProjectCategoryView StatusLast Update
0005735JEDI VCL00 JVCL Componentspublic2013-12-16 17:13
ReporterLast RomanticAssigned ToAHUser 
PrioritynormalSeverityminorReproducibilityalways
Status resolvedResolutionfixed 
Product Version3.45 
Target VersionFixed in Version3.46 
Summary0005735: TJvStrHolder saves Cyrillic characters incorrect
DescriptionAfter writing the Cyrillic characters in Strings property in the editor i save the project and reopen it. After that, I see instead of the Cyrillic characters are various other characters.

The Sample app in attach.

P.S. In TJvMultiStringHolder everything is ok.
Additional InformationDelphi 2010
Windows 7 Prof x32
TagsNo tags attached.

Activities

2011-12-08 12:45

 

project1.zip (4,573 bytes)

Arioch

2011-12-09 00:01

developer   ~0019191

D2010 is unicode-based AFAIR ? Type string = UnicodeString rather than AnsiString ?

Arioch

2011-12-09 00:04

developer   ~0019192

  object JvStrHolder1: TJvStrHolder
    Capacity = 4
    Macros = <>
    Left = 320
    Top = 40
    InternalVer = 1
    StrData = (
      ''
      '42354142'
      '313233')
  end
  
  It seems there is no cyrillic characters there at all, neither broken, nor real.
  
  Can u put same few russian and latin strings in both TJvMultiStringHolder and TJvStrHolder to compare.
  
  PS: i wonder if there is reason to have both those components, not choose most flexible one and remove more restricted one.

Last Romantic

2011-12-09 04:39

reporter   ~0019193

Well, I wrote this:

????
test
123

After saving in dfm we see it
  object JvStrHolder1: TJvStrHolder
    Capacity = 4
    Macros = <>
    Left = 320
    Top = 40
    InternalVer = 1
    StrData = (
      ''
      '42354142'
      '74657374'
      '313233')
  end

Take a look at TJvMultiStringHolder component and write the same
"????
123"

In dfm it looks like this:
  object JvMultiStringHolder1: TJvMultiStringHolder
    MultipleStrings = <
      item
        Name = 'str1'
        Strings.Strings = (
          0001090#1077#1089#1090
          'test'
          '123')
      end>
    Left = 448
    Top = 104
  end

Yes, D2010 is Unicode-based.

I have attached a new sample app

2011-12-09 04:39

 

project11.zip (4,680 bytes)

Arioch

2011-12-09 10:36

developer   ~0019194

can u compare those components, what is common, what are differences ?
can they be merged, or one maybe can be just removed and only most flexible one remain ?

Arioch

2011-12-09 10:38

developer   ~0019195

i wonder if http://sourceforge.net/projects/rxlib/ this project has StrHolder fixed for Unicode ;)

Arioch

2011-12-09 13:12

developer   ~0019196

quick fix would be i believe just to replace string with ansistring, pchar with pansichar and so one

real fix would probably take merging two components

Last Romantic

2011-12-09 16:40

reporter   ~0019197

The difference between these components is that the first of these uses encryption strings in the writer class and second not.
Yes, the encryption function does not support Unicode strings. Replacing the String with AnsiString in this function solves the problem.
Thanks a lot.

Arioch

2011-12-09 18:35

developer   ~0019199

DFM also shows the 1st using some Macro's, pehaps substitutions like %TEMP% or whatever.

And 2nd probavbly features 2-level storage, which is maybe mroe flexible but i can not quickly imagine the workflow, that was worth making a special component for it. But surely author had one.

Last Romantic

2011-12-12 06:07

reporter   ~0019203

This is the fixed functions. They working without replacing String with AnsiString.

function XorEncode(const Key, Source: string): string;
var
  I, C: Integer;
begin
  Result := '';
  for I := 1 to Length(Source) do
  begin
    C := Ord(Source[I]);
    if Length(Key) > 0 then
      C := Ord(Key[1 + ((I - 1) mod Length(Key))]) xor C;
    Result := Result + LowerCase(IntToHex(C, 4));
  end;
end;

function XorDecode(const Key, Source: string): string;
var
  I, C: Integer;
begin
  Result := '';
  for I := 0 to Length(Source) div 4 - 1 do
  begin
    C := StrToIntDef('$' + Copy(Source, I * 4 + 1, 4), 0);
    if Length(Key) > 0 then
      C := Ord(Key[1 + (I mod Length(Key))]) xor Ord(C);
    Result := Result + Char(C);
  end;
end;

obones

2012-02-24 17:04

administrator   ~0019541

Thanks for that, but I can't apply your changes directly.
When taking a DFM from a previous version, the Strings property gets corrupted because it is decoded by a different algorithm.
The fix must take care of this situation

obones

2012-06-11 17:10

administrator   ~0019826

Any news?

AHUser

2012-08-19 20:01

developer   ~0020121

Fixed in svn revision 13404.

Instead of writing UTF-16 hex values the component writes UTF-8 hex values to save space. The streaming format version was incremented to support loading older DFM data.

Arioch

2013-10-10 10:50

developer   ~0020659

AS. UTF8 is hardly space-saver: for all not-Latin languages it takes more space than UCS2 and for extended-ASCII (central-European) is is the same. It only is more space-efficient with regard to 7-bit ASCII chars. Maybe we could use explicitly specified encoding in extra property, which could provide for seamless backward compatibility and for saving space...

---

Now, something wrong was with encoder. I think it mistakes AnsiChar for WideChar...

Opening DFM

  object StrHNoBuncruptcy: TJvStrHolder
    Capacity = 8
    Macros = <>
    Left = 16
    Top = 93
    InternalVer = 1
    StrData = (
      ''
      
        'cde0ebe8f7e8e520eff0e8e7ede0eaeee220eff0e5e4ede0ece5f0e5ededeee3' +
        'ee20e1e0edeaf0eef2f1f2e2e020ede520e2fbffe2ebe5edee2e20c220f1ebf3' +

I get

 object StrHNoBuncruptcy: TJvStrHolder
    Capacity = 8
    Macros = <>
    Left = 16
    Top = 93
    InternalVer = 2
    StrData = (
      ''
      
        'c38dc3a0c3abc3a8c3b7c3a8c3a520c3afc3b0c3a8c3a7c3adc3a0c3aac3aec3' +
        'a220c3afc3b0c3a5c3a4c3adc3a0c3acc3a5c3b0c3a5c3adc3adc3aec3a3c3ae' +

Which is broken and 0xC38D stands for 0xcd only in windows 1250 or 1252 codepages effectively killing the rest.

Funny thing - if i delete InternalVer=1 then conversion is even weirder - i wonder what happens then, what is treated the input...

object StrHBuncruptcy: TJvStrHolder
    Capacity = 8
    Macros = <>
    Left = 16
    Top = 125
    StrData = (
      ''
      
        'cde020efe5f0e2eeec20fdf2e0efe520e2fbffe2ebe5ede8ff20eff0e8e7ede0' +
        'eaeee220eff0e5e4ede0ece5f0e5ededeee3ee20e1e0edeaf0eef2f1f2e2e020' +


is converted into

  object StrHBuncruptcy: TJvStrHolder
    Capacity = 8
    Macros = <>
    Left = 16
    Top = 125
    InternalVer = 2
    StrData = (
      ''
      
        '6364653032306566653566306532656565633230666466326530656665353230' +
        '6532666266666532656265356564653866663230656666306538653765646530' +

Arioch

2013-10-10 11:02

developer   ~0020660

I admit the test project is quite bad, and probably was of no use for you with win-1250 OS codepage...

Arioch

2013-10-10 16:50

developer   ~0020664

@AHUser - look here! http://sourceforge.net/mailarchive/forum.php?forum_name=jvcl-checkins&max_rows=25&offset=16&style=nested&viewmonth=200803

XorDecode already was fixed to work on AnsiStrings !!!

What the hell? Why it was broken again to WideChar and UnicodeString ????

Arioch

2013-10-10 17:07

developer   ~0020665

function was broken on

???????: 3f4129e99c053ed2946c897f8ec43033d4439548
SVN rev.
?????: obones
????: 10.09.2008 0:45:14
?????????: Changes required for D2009/C2009 support, merged from the 3.35 branch



Sorry, pals, but dealing with XOR and forgetting that WideChar is not Byte but Word - that is quite weird IMHO !

And now can we fix the broken XOR functions or anyone who used them from 2008 till today would be screwed ?

THEY ARE BROKEN BY DESIGN now.

* If we fix them - just revert to 7dbe4b484c1d061a8a621b1967c9a855ee445563 - we can break any code, that used this broken design.
* If we would left them - we would provoke more and more code developments as we already did for last 5 years!

In both options, they are dangerous and data-destructing in one or another environment.

It was good idea to keep them for compatibility, but that merge failed it.

I am for immediate removal of XorDecode/XorEncode.

Those, who used them, would be forced to know, that those functions were broken and check their data and plan data salvation with open eyes.

Meanwhile i am moving fixed XorDecode inside the JvStrHolder

2013-10-10 17:46

 

Test_1251.zip (3,504 bytes)

2013-10-10 17:48

 

StrHolder.patch (2,367 bytes)
diff --git "a/C:\\Users\\burov\\AppData\\Local\\Temp\\TortoiseGit\\JvS8E9B.tmp\\JvStringHolder-cea178a-left.pas" "b/D:\\DelphiProjects\\Libs\\JediVCL\\jvcl\\run\\JvStringHolder.pas"
index 9992cbb..e5b513c 100644
--- "a/C:\\Users\\burov\\AppData\\Local\\Temp\\TortoiseGit\\JvS8E9B.tmp\\JvStringHolder-cea178a-left.pas"
+++ "b/D:\\DelphiProjects\\Libs\\JediVCL\\jvcl\\run\\JvStringHolder.pas"
@@ -120,6 +120,7 @@ type
     procedure SetMacros(Value: TJvMacros);
     procedure RecreateMacros;
     procedure SetMacroChar(Value: Char);
+    class function XorDecode(const Key, Source: AnsiString): AnsiString; static;
   protected
     procedure AssignTo(Dest: TPersistent); override;
     procedure DefineProperties(Filer: TFiler); override;
@@ -773,6 +774,44 @@ begin
   Result := FStrings.Duplicates;
 end;
 
+class function TJvStrHolder.XorDecode(const Key, Source: AnsiString): AnsiString;
+// the one from JvJCLUtils was BROKEN on 10.09.2008 and should
+//     be AVOIDED to prevent data damage
+var
+  RI, LS, SI, LK, KI: Integer;
+  B: Byte;
+  HotSpot: String;
+begin
+  Result := '';
+  HotSpot := '$00';
+  LS := Length(Source); LK := Length(Key);
+  RI := 1; KI := 1; SI := 1;
+  SetLength(Result, (LS + 1) div 2 );
+
+  while SI <= LS do
+  begin
+    HotSpot[2] := Char(Source[SI]); Inc(SI);
+
+    if SI > LS // odd length of source
+       then SetLength(HotSpot, 2)
+       else HotSpot[3] := Char(Source[SI]);
+    Inc(SI);
+
+    B := StrToIntDef(HotSpot, 32);
+
+    if LK > 0 then begin
+       if KI > LK then KI := 1;
+       B := B xor Byte(AnsiChar(Key[KI]));
+       Inc(KI);
+    end;
+
+    Result[RI] := AnsiChar(B);
+    Inc(RI);
+  end;
+
+  SetLength(Result, RI - 1);
+end;
+
 procedure TJvStrHolder.ReadStrings(Reader: TReader);
 var
   Tmp: string;
@@ -791,9 +830,10 @@ begin
         if FReserved >= XorVersion then
           Strings.Add(XorDecodeString(KeyString, Tmp))
         else
-          {$WARNINGS OFF} // XorDecode is deprecated, but we need it for backward compatibility, so hide the warning
-          Strings.Add(XorDecode(KeyString, Tmp));
-          {$WARNINGS ON}
+          Strings.Add(string(XorDecode(
+                         AnsiString(KeyString),
+                         AnsiString(Tmp)
+          )));
       end
       else
         Strings.Add(string(XorString(ShortString(KeyString), ShortString(Tmp))));
StrHolder.patch (2,367 bytes)

Arioch

2013-10-10 19:20

developer   ~0020666

In my project text STILL is broken in the following scenario:

1) Form2 is inherited from Form1
2) Form1 contains StrHolder with InternalVer=1 and no text
3) Form2 contains copy of that StrHolder with no InternalVer and text in Ver=1 mode

obones

2013-12-13 11:48

administrator   ~0020804

So what, do we do here?

Arioch

2013-12-13 12:18

developer   ~0020810

My idea is told above and at http://newsportal.delphi-jedi.org/article.php?id=1213&group=jedi.jvcl

I did not looked into form's inheritance issue, as we had that problem only in single place and fixed it in DFMs.

I am for removal of potentially dangerous functions and forcing developers to realize the problem and explicitly think what would they do with it, rather then playing Russian Roulette with users data.

obones

2013-12-16 17:13

administrator   ~0020869

Ok, the "deprecated" warning is in place, there will be a release with it and once it is done, the functions will be removed altogether (if possible, they are used in the JVCL)

Issue History

Date Modified Username Field Change
2011-12-08 12:45 Last Romantic New Issue
2011-12-08 12:45 Last Romantic File Added: project1.zip
2011-12-09 00:01 Arioch Note Added: 0019191
2011-12-09 00:04 Arioch Note Added: 0019192
2011-12-09 04:39 Last Romantic Note Added: 0019193
2011-12-09 04:39 Last Romantic File Added: project11.zip
2011-12-09 10:36 Arioch Note Added: 0019194
2011-12-09 10:38 Arioch Note Added: 0019195
2011-12-09 13:12 Arioch Note Added: 0019196
2011-12-09 16:40 Last Romantic Note Added: 0019197
2011-12-09 18:35 Arioch Note Added: 0019199
2011-12-12 06:07 Last Romantic Note Added: 0019203
2012-02-22 14:57 obones Status new => acknowledged
2012-02-24 17:04 obones Note Added: 0019541
2012-02-24 17:04 obones Status acknowledged => feedback
2012-06-11 17:10 obones Note Added: 0019826
2012-08-19 20:01 AHUser Note Added: 0020121
2012-08-19 20:01 AHUser Status feedback => resolved
2012-08-19 20:01 AHUser Fixed in Version => Daily / SVN
2012-08-19 20:01 AHUser Resolution open => fixed
2012-08-19 20:01 AHUser Assigned To => AHUser
2012-09-10 14:15 obones Fixed in Version Daily / SVN => 3.46
2013-10-10 10:50 Arioch Note Added: 0020659
2013-10-10 10:50 Arioch Status resolved => feedback
2013-10-10 10:50 Arioch Resolution fixed => reopened
2013-10-10 11:02 Arioch Note Added: 0020660
2013-10-10 16:50 Arioch Note Added: 0020664
2013-10-10 17:07 Arioch Note Added: 0020665
2013-10-10 17:46 Arioch File Added: Test_1251.zip
2013-10-10 17:48 Arioch File Added: StrHolder.patch
2013-10-10 19:20 Arioch Note Added: 0020666
2013-12-13 11:48 obones Note Added: 0020804
2013-12-13 12:18 Arioch Note Added: 0020810
2013-12-16 17:13 obones Note Added: 0020869
2013-12-16 17:13 obones Status feedback => resolved
2013-12-16 17:13 obones Resolution reopened => fixed