Project JEDI - Issue Tracker
Mantis Bugtracker

Viewing Issue Simple Details Jump to Notes ] View Advanced ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0003377 [JEDI VCL] 00 JVCL Components major always 2005-12-22 12:03 2006-01-08 08:29
Reporter Christer Fahlgren View Status public  
Assigned To obones
Priority normal Resolution fixed  
Status resolved   Product Version 3.00
Summary 0003377: JVSimpleXML doesn't preserve preceding whitespace for text nodes
Description As discussed in the newsgroups:
When reading an xml-file with JVSimpleXML, the preceding whitespace for text nodes is removed. So a node like <test>___Hello!_</test> becomes the equivalent of <test>Hello!_</test> (exchange _ for a space).

I believe the preceding whitespace should be preserved.

I have attached a file with my proposed fix.

Best regards,
Christer
Additional Information
Tags No tags attached.
Attached Files zip file icon jvsimplexml.zip [^] (16,410 bytes) 2005-12-22 12:03
zip file icon JvSimpleXml.pas and patch.zip [^] (17,495 bytes) 2006-01-08 04:35

- Relationships

-  Notes
(0008267)
hasse42g (reporter)
2006-01-03 05:39

Maybe the XML specification on white space handling can be of help:

http://www.w3.org/TR/2004/REC-xml-20040204/#sec-white-space [^]
(0008290)
obones (administrator)
2006-01-05 08:08

But then, what if you have this:

<root>
__<test>
____Hello_
__</test>
</root>

That's the main problem I see.
(0008298)
elahn (developer)
2006-01-05 08:25

I never considered a text element would be structured like that... It doesn't make sense to me to use "formatting" for Text elements. Do you use or know of anything that uses a format like this?

Since only the preceeding whitespace is being ignored - the trailing whitespace is being included - it seems neither "use case" is being correctly supported.
(0008302)
obones (administrator)
2006-01-05 08:47

Well, for instance, the TJvFormStorage may store a TEdit value. Now, the user entered ' hello ' into it. The XML will contain this:

<EditText>_hello_</EditText>

TJvSimpleXML does not format the text, just like many other XML "pretty printers" do. With your changes "on", then the first space is lost. I haven't looked yet, but is it implemented as an option?
(0008303)
elahn (developer)
2006-01-05 09:08

They're not my changes...personally I don't have a vested interest, as I don't use Text elements at all - I use properties and encode/decode CRLF values.

Currently <EditText>_hello_</EditText> would result in the string 'hello '. I believe the idea of the changes is to have it result in ' hello '.

With the changes "on", then the first space is preserved. Currently, the first space is lost.

No, I don't think it's implemented as an option. I've only glanced at the code, but I think it could be optimized (just a little bit) before inclusion. I'm waiting for agreement on the change (and a little spare time) before actually doing it though. :)
(0008304)
Christer Fahlgren (reporter)
2006-01-05 14:06

As the filer of this report (and having an application that depend on this change) I just thought I'd reiterate my reasons to "fix" this issue:

* Consistency
Currently whitespace is handled inconsistent for Text elements, i.e. only preceding whitespace is removed. If there would be an option to whitespace handling for text elements, I believe the option should be either to remove all whitespace (preceding and trailing as well as *extra* whitespace between words) as HTML handles whitespace. The other option should be to preserve all whitespace. The current behaviour is IMHO not logical nor useful.

* Conformance with W3C recommendations
The specification says that whitespace should be passed up to an XML processor (which JVSimpleXML is).


The argument against this change that I can see would be that there might be some applications that depend on the current behaviour.

I think we can handle this in a couple of different ways:

1. Leave code as is - which in my opinion would make JVSimpleXML less useful than it can be
2. Change to preserve preceding whitespace in text nodes (the attached change)
3. Restructure the change to provide an option - preservewhitespacefortextnodes - which enabled would do option 2 and disabled would implement the HTML interpretation for whitespace (which would be a change to the current behaviour as well)

What option would you prefer? (Or can you think about another option)

I'd be willing to help create option 3 if desired.

Best regards,
Christer
(0008305)
elahn (developer)
2006-01-05 22:49

I don't see any benefit of implementing a HTML version of whitespace handling - I think this behaviour should be implemented by an application, not by TJvSimpleXML.

I believe that whitespace should be preserved in Text elements.

IMO, there is only one question: Should an option be added to keep the current inconsistent (and wrong?) behaviour. My vote is no.
(0008310)
obones (administrator)
2006-01-06 05:34

Well, my vote is yes. But turn it off by default. So the situation would be this:

Add sxoTrimPrecedingTextWhitespace option. If turned on, it disables the new code that does not trim.
Do not put this new option in the default value.
Add a line in changelog.txt stating clearly that should users want the old behaviour they should use the new option.

With all this, I think we can satisfy everyone.
(0008312)
elahn (developer)
2006-01-06 07:02

Fair enough, that will satisfy everyone. :)
(0008326)
elahn (developer)
2006-01-08 04:45

I've tweaked Christer's implementation and added that option. It compiles fine inside Delphi 7 IDE, but the JVCL installer crashes with an AV upon starting - i.e. it finishes compiling but instead of the installer appearing, the AV appears.
(0008329)
obones (administrator)
2006-01-08 06:50

Well, of course it crashes as there are MANY additional Text elements created by the new code. Those unrequired text elements don't even have a name. Basically, with this change we get a new text elemen for every indentation that is matched. The value even includes the carriage return, which to me is not acceptable.
Note that this is the reason why the installer crashes as it does not expect to find text element in the Items collection for the "models" node.
(0008330)
obones (administrator)
2006-01-08 07:03

Well, more precisely, it creates a text element for every white space found in between two closing elements. This should not happen. I'm investigating a solution that should come around in a few hours.
(0008331)
obones (administrator)
2006-01-08 08:29

This is now in CVS.

- Issue History
Date Modified Username Field Change
2005-12-22 12:03 Christer Fahlgren New Issue
2005-12-22 12:03 Christer Fahlgren File Added: jvsimplexml.zip
2005-12-22 12:10 elahn Issue Monitored: elahn
2006-01-03 05:39 hasse42g Note Added: 0008267
2006-01-03 05:42 hasse42g Issue Monitored: hasse42g
2006-01-05 08:08 obones Note Added: 0008290
2006-01-05 08:08 obones Status new => feedback
2006-01-05 08:25 elahn Note Added: 0008298
2006-01-05 08:47 obones Note Added: 0008302
2006-01-05 09:08 elahn Note Added: 0008303
2006-01-05 14:06 Christer Fahlgren Note Added: 0008304
2006-01-05 22:49 elahn Note Added: 0008305
2006-01-06 05:34 obones Note Added: 0008310
2006-01-06 07:02 elahn Note Added: 0008312
2006-01-08 04:35 elahn File Added: JvSimpleXml.pas and patch.zip
2006-01-08 04:45 elahn Note Added: 0008326
2006-01-08 06:50 obones Note Added: 0008329
2006-01-08 07:03 obones Note Added: 0008330
2006-01-08 08:29 obones Status feedback => resolved
2006-01-08 08:29 obones Resolution open => fixed
2006-01-08 08:29 obones Assigned To => obones
2006-01-08 08:29 obones Note Added: 0008331
2006-01-12 11:22 elahn Issue End Monitor: elahn


Mantis 1.1.6[^]
Copyright © 2000 - 2008 Mantis Group
Powered by Mantis Bugtracker