A: The parser is just that: a parser. There is nothing there to write XML. (But hey, you can read XML quite fast ;-)
As XML is so simple, my preferred method for writing XML is Writeln.
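A minimal sketch of the Writeln approach. The one thing Writeln will not do for you is escape markup characters, so the sketch adds a small helper for that. EscapeXml, the element names and the sample data are made up for this illustration and are not part of the parser.

```pascal
program WriteXmlDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper (not part of the parser): replaces the characters
// that must not appear literally in text content or attribute values.
function EscapeXml (const S : string) : string;
begin
  // '&' has to be replaced first, otherwise the '&' of the other
  // entities would be escaped a second time.
  Result := StringReplace (S,      '&', '&amp;',  [rfReplaceAll]);
  Result := StringReplace (Result, '<', '&lt;',   [rfReplaceAll]);
  Result := StringReplace (Result, '>', '&gt;',   [rfReplaceAll]);
  Result := StringReplace (Result, '"', '&quot;', [rfReplaceAll]);
end;

begin
  Writeln ('<?xml version="1.0"?>');
  Writeln ('<addresses>');
  Writeln ('  <address name="', EscapeXml ('Smith & "Sons"'), '">',
           EscapeXml ('1 < 2'), '</address>');
  Writeln ('</addresses>');
end.
```

The sample data is plain ASCII. If you write other characters, you also have to make sure that the bytes you write match the encoding of the document.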
A: They are handled as special parts:
TXmlParser will report them as a ptEmptyTag part type.
TXmlScanner will report them as an OnEmptyTag event.
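As a sketch, a TXmlParser loop that tells the three tag types apart could look like this. It assumes that the class lives in a unit named LibXmlParser and that the methods LoadFromFile and StartScan and the property CurName are available. The file name is taken from the command line just for the demonstration.

```pascal
program EmptyTagDemo;
{$APPTYPE CONSOLE}

uses
  LibXmlParser;   // assumption: the unit that declares TXmlParser

var
  XP : TXmlParser;
begin
  XP := TXmlParser.Create;
  try
    XP.LoadFromFile (ParamStr (1));   // file name passed on the command line
    XP.StartScan;
    while XP.Scan do
      case XP.CurPartType of
        ptStartTag : Writeln ('start tag: ', XP.CurName);   // e.g. <item>
        ptEmptyTag : Writeln ('empty tag: ', XP.CurName);   // e.g. <item/>
        ptEndTag   : Writeln ('end tag:   ', XP.CurName);   // e.g. </item>
      end;
  finally
    XP.Free;
  end;
end.
```

If your application treats an empty tag like a start tag followed directly by an end tag, call the same code from both the ptStartTag and the ptEmptyTag branch.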
A: TXmlParser has a property named CurAttr and TXmlScanner passes an Attributes parameter, both of which are of type TAttrList. You can access the attributes by name or index from there:
Edit1.Text := CurAttr.Value ('name'); // To get the value of the 'name' attribute
Name := CurAttr.Name (0); // To get the name of the first attribute
Value := CurAttr.Value (0); // To get the value of the first attribute
The number of attributes can (of course) be found in the .Count property of TAttrList.
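Putting Count together with the index-based access gives a simple loop over all attributes. This is only a sketch: DumpAttributes is a hypothetical helper, and the unit name LibXmlParser is an assumption.

```pascal
uses
  LibXmlParser;   // assumption: the unit that declares TAttrList

// Hypothetical helper: prints all attributes of the current tag.
// Pass XP.CurAttr (TXmlParser) or the Attributes parameter (TXmlScanner).
procedure DumpAttributes (Attrs : TAttrList);
var
  i : Integer;
begin
  for i := 0 to Attrs.Count - 1 do    // indexes are zero-based
    Writeln (Attrs.Name (i), ' = "', Attrs.Value (i), '"');
end;
```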
A: Everything in XML is case sensitive by definition. XML is meant to be used for Unicode applications (XML 1.1 extended this even further), so all XML element, entity and attribute names are case sensitive. Many non-Latin scripts have no concept of upper and lower case, so a case-insensitive comparison of two strings would be hard to define.
(I know that this is hard for Delphi language programmers, especially on a PC. But that's the XML way ...)
A: TXmlParser is a Delphi CLASS for easy parsing of an XML file. You have to create an instance of this class and use its methods and properties to read your XML.
TXmlScanner is a VCL wrapper for TXmlParser. So you have a non-visual component which represents the parser and you have events like "OnStartTag" which are fired when there is a start tag found in the XML. To start parsing, you have to call the Execute method.
Advantage of TXmlScanner: Very easy to use: put component on your form, fill out events, call .Execute
Advantage of TXmlParser: you have a local loop, so you can handle everything in local variables.
To sum it up: Use TXmlParser for serious work and TXmlScanner for "quick hacks".
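To show what the event-driven style means in practice, here is a sketch that counts start tags with TXmlScanner created at run time. The unit names, the LoadFromFile method and the exact parameter list of the OnStartTag handler are assumptions, so check them against the source of the component. TTagCounter is a hypothetical helper class.

```pascal
program ScannerDemo;
{$APPTYPE CONSOLE}

uses
  LibXmlParser, LibXmlComps;   // assumption: units declaring TAttrList and TXmlScanner

type
  // Hypothetical helper class: event handlers have to be methods, so any
  // state they need lives in fields instead of local variables.
  TTagCounter = class
  public
    StartTags : Integer;
    procedure HandleStartTag (Sender : TObject; TagName : string;
                              Attributes : TAttrList);   // assumed signature
  end;

procedure TTagCounter.HandleStartTag (Sender : TObject; TagName : string;
                                      Attributes : TAttrList);
begin
  Inc (StartTags);
end;

var
  Scanner : TXmlScanner;
  Counter : TTagCounter;
begin
  Counter := TTagCounter.Create;
  Scanner := TXmlScanner.Create (nil);
  try
    Scanner.OnStartTag := Counter.HandleStartTag;
    Scanner.LoadFromFile (ParamStr (1));   // assumption: method exists on TXmlScanner
    Scanner.Execute;                       // fires the events while parsing
    Writeln ('start tags: ', Counter.StartTags);
  finally
    Scanner.Free;
    Counter.Free;
  end;
end.
```

With TXmlParser the same counter would simply be a local variable incremented inside the Scan loop, which is the "local loop" advantage mentioned above.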
C#/.NET: No -- there's too much specific code inside (this is why it's so fast ...)
There is nothing in the parser to find out the line number. So you'll have to code that on your own.
(Note that the XML specification requires all line breaks to be converted to linefeed (#x0A) characters before parsing. The parser does not do this, so your own code has to cope with CR, LF and CR+LF line breaks.)
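One way to code this on your own is to count the line breaks between the start of the document and the current position. The following is a sketch only. LineNumberAt is a hypothetical helper that is not part of the parser, and it treats CR, LF and CR+LF as one line break each because the buffer is not normalized.

```pascal
program LineNumberDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper: returns the 1-based line number of Position,
// counted from DocStart. Both pointers must point into the same
// null-terminated buffer.
function LineNumberAt (DocStart, Position : PChar) : Integer;
var
  P : PChar;
begin
  Result := 1;
  P := DocStart;
  while (P < Position) and (P^ <> #0) do begin
    if P^ = #10 then
      Inc (Result)                               // LF, or the LF of a CR+LF pair
    else if (P^ = #13) and ((P + 1)^ <> #10) then
      Inc (Result);                              // lone CR
    Inc (P);
  end;
end;

var
  S : string;
begin
  // Four lines, separated by CR+LF, LF and a lone CR
  S := 'a'#13#10'b'#10'c'#13'd';
  Writeln (LineNumberAt (PChar (S), PChar (S) + 7));   // position of the 'd'
end.
```

Inside a parsing loop, the call would be LineNumberAt (XP.DocBuffer, XP.CurStart). As this rescans the document from the beginning on every call, use it for error messages rather than for every part.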
There's nothing there for this. You'll have to code this on your own. I don't think that this is really bad, because then you can code it in a way that suits your application best.
A for Version 2: Just parse your file.
A for Version 1: The virtual method TXmlParser.TranslateEncoding is responsible for transcoding from the source character set of your XML to the destination character set of your application.
The default method tries to translate UTF-8 to Windows-1252, which is not a good idea if you use characters outside the Windows-1252 range. You should override TranslateEncoding with a method that just passes UTF-8 through:
FUNCTION TMyOwnXmlParser.TranslateEncoding (CONST Source : STRING) : STRING; // OVERRIDE;
BEGIN
Result := Source;
END;
In this case, your application must be prepared to process UTF-8 strings.
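For completeness, a sketch of the class declaration that goes with such an override and of how the derived class is used in place of TXmlParser. The unit name LibXmlParser, the StartScan call and the visibility of TranslateEncoding in the base class are assumptions. The file name comes from the command line just for the demonstration.

```pascal
program Utf8PassThroughDemo;
{$APPTYPE CONSOLE}

uses
  LibXmlParser;   // assumption: the unit that declares TXmlParser

type
  TMyOwnXmlParser = class (TXmlParser)
  public
    // Declared public here because raising the visibility of an
    // inherited method is always allowed; adapt it to the base class.
    function TranslateEncoding (const Source : string) : string; override;
  end;

function TMyOwnXmlParser.TranslateEncoding (const Source : string) : string;
begin
  Result := Source;   // hand the UTF-8 data through untouched
end;

var
  XP : TMyOwnXmlParser;
begin
  XP := TMyOwnXmlParser.Create;   // create the derived class, not TXmlParser
  try
    XP.LoadFromFile (ParamStr (1));
    XP.StartScan;
    while XP.Scan do
      if XP.CurPartType = ptContent then
        Writeln (XP.CurContent);  // still UTF-8 encoded at this point
  finally
    XP.Free;
  end;
end.
```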
You as an application programmer who reads XML can treat them the same.
CDATA sections are there to help with text content that has a lot of characters which would have to be escaped otherwise. However, as there is the character sequence ]]> which is not allowed in CDATA sections (because it terminates them), you'll have to be careful, too. So, for me, there's no use for CDATA sections when you write XML.
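If you do want to write CDATA sections, the usual workaround for the ]]> problem is to end the section in the middle of that sequence and immediately open a new one. A minimal sketch follows. CDataSection is a hypothetical helper and not part of the parser.

```pascal
program CDataDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper: wraps S in a CDATA section. Every ']]>' inside S
// is split so that ']]' ends one section and '>' starts the next one.
function CDataSection (const S : string) : string;
begin
  Result := '<![CDATA[' +
            StringReplace (S, ']]>', ']]]]><![CDATA[>', [rfReplaceAll]) +
            ']]>';
end;

begin
  Writeln (CDataSection ('if (a[b[0]]>1) then x := "<&>";'));
end.
```

A reader that concatenates adjacent text and CDATA parts gets the original string back, which is one more reason to treat both part types the same when reading.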
A: There are three important member variables for this:
DocBuffer (PChar) points to the first character of your XML
CurStart (PChar) always points to the first character of the part you are currently scanning
CurFinal (PChar) always points to the last character of the part you are currently scanning
So, for example, for a ptStartTag, CurStart points to the opening angle bracket (<) and CurFinal points to the closing angle bracket (>) of your tag.
You can use the difference between CurFinal and DocBuffer to show the progress of parsing:
var
  DocSize  : integer; // Document Size in bytes
  Progress : integer; // Progress in percent
begin
  XP.LoadFromFile (...);
  DocSize := StrLen (XP.DocBuffer);
  while XP.Scan do begin
    case XP.CurPartType of
      ...
    end;
    Progress := Trunc ((XP.CurFinal - XP.DocBuffer) / DocSize * 100.0);
  end;
The progress is calculated using PChar arithmetic: subtracting DocBuffer from CurFinal yields an integer value giving the "distance" between the two characters. This value is divided by DocSize and multiplied by 100. (The 100 is written as "100.0" so that no integer arithmetic is involved in this real type expression.)
A: Yes, there is no limitation and no royalty. You are not obliged to publish your source code or your work.
The XML parser is subject to my own DSL Licence, which says that you can do practically everything with my code.