+
+As of May 2005, Valgrind can produce its output in XML form. The
+intention is to provide an easily parsed, stable format which is
+suitable for GUIs to read.
+
+
+Design goals
+~~~~~~~~~~~~
+
+* Produce XML output which is easily parsed
+
+* Have a stable output format which does not change much over time, so
+ that investments in parser-writing by GUI developers is not lost as
+ new versions of Valgrind appear.
+
+* Have an extensive output format, so that future changes to the
+ format do not break backwards compatibility with existing parsers of
+ it.
+
+* Produce output in a form which suitable for both offline GUIs (run
+ all the way to the end, then examine output) and interactive GUIs
+ (parse XML incrementally, update display as we go).
+
+* Put as much information as possible into the XML and let the GUIs
+ decide what to show the user (a.k.a provide mechanism, not policy).
+
+* Make XML which is actually parseable by standard XML tools.
+
+
+How to use
+~~~~~~~~~~
+
+Run with flag --xml=yes. That`s all. Note however several
+caveats.
+
+* At the present time only Memcheck is supported. The scheme extends
+ easily enough to cover Helgrind if needed.
+
+* When XML output is selected, various other settings are made.
+ This is in order that the output format is more controlled.
+ The settings which are changed are:
+
+ - Suppression generation is disabled, as that would require user
+ input.
+
+ - Attaching to GDB is disabled for the same reason.
+
+ - The verbosity level is set to 1 (-v).
+
+ - Error limits are disabled. Usually if the program generates a lot
+ of errors, Valgrind slows down and eventually stops collecting
+ them. When outputting XML this is not the case.
+
+ - VEX emulation warnings are not shown.
+
+ - File descriptor leak checking is disabled. This could be
+ re-enabled at some future point.
+
+ - Maximum-detail leak checking is selected (--leak-check=full).
+
+
+The output format
+~~~~~~~~~~~~~~~~~
+For the most part this should be self descriptive. It is printed in a
+sort-of human-readable way for easy understanding. You may want to
+read the rest of this together with the results of "valgrind --xml=yes
+memcheck/tests/xml1" as an example.
+
+All tags are balanced: a <foo> tag is always closed by </foo>. Hence
+in the description that follows, mention of a tag <foo> implicitly
+means there is a matching closing tag </foo>.
+
+Symbols in CAPITALS are nonterminals in the grammar and are defined
+somewhere below. The root nonterminal is TOPLEVEL.
+
+The following nonterminals are not described further:
+ INT is a 64-bit signed decimal integer.
+ TEXT is arbitrary text.
+ HEX64 is a 64-bit hexadecimal number, with leading "0x".
+
+Text strings are escaped so as to remove the <, > and & characters
+which would otherwise mess up parsing. They are replaced respectively
+with the standard encodings "<", ">" and "&" respectively.
+Note this is not (yet) done throughout, only for function names in
+<frame>..</frame> tags-pairs.
+
+
+TOPLEVEL
+--------
+
+The first line output is always this:
+
+ <?xml version="1.0"?>
+
+All remaining output is contained within the tag-pair
+<valgrindoutput>.
+
+Inside that, the first entity is an indication of the protocol
+version. This is provided so that existing parsers can identify XML
+created by future versions of Valgrind merely by observing that the
+protocol version is one they don`t understand. Hence TOPLEVEL is:
+
+ <?xml version="1.0"?>
+ <valgrindoutput>
+ <protocolversion>INT<protocolversion>
+ PROTOCOL
+ </valgrindoutput>
+
+Valgrind versions 3.0.0 and 3.0.1 emit protocol version 1. Versions
+3.1.X and 3.2.X emit protocol version 2. 3.4.X emits protocol version
+3.
+
+
+PROTOCOL for version 3
+----------------------
+Changes in 3.4.X (tentative): (jrs, 1 March 2008)
+
+* There may be more than one <logfilequalifier> clause.
+
+* Some errors may have two <auxwhat> blocks, rather than just one
+ (resulting from merge of the DATASYMS branch)
+
+* Some errors may have an ORIGIN component, indicating the origins of
+ uninitialised values. This results from the merge of the
+ OTRACK_BY_INSTRUMENTATION branch.
+
+
+PROTOCOL for version 2
+----------------------
+Version 2 is identical in every way to version 1, except that the time
+string in
+
+ <time>human-readable-time-string</time>
+
+has changed format, and is also elapsed wallclock time since process
+start, and not local time or any such. In fact version 1 does not
+define the format of the string so in some ways this revision is
+irrelevant.
+
+
+PROTOCOL for version 1
+----------------------
+This is the main top-level construction. Roughly speaking, it
+contains a load of preamble, the errors from the run of the
+program, and the result of the final leak check. Hence the
+following in sequence:
+
+* Various preamble lines which give version info for the various
+ components. The text in them can be anything; it is not intended
+ for interpretation by the GUI:
+
+ <preamble>
+ <line>Misc version/copyright text</line> (zero or more of)
+ </preamble>
+
+* The PID of this process and of its parent:
+
+ <pid>INT</pid>
+ <ppid>INT</ppid>
+
+* The name of the tool being used:
+
+ <tool>TEXT</tool>
+
+* OPTIONALLY, if --log-file-qualifier=VAR flag was given:
+
+ <logfilequalifier> <var>VAR</var> <value>$VAR</value>
+ </logfilequalifier>
+
+ That is, both the name of the environment variable and its value
+ are given.
+ [update: as of v3.3.0, this is not present, as the --log-file-qualifier
+ option has been removed, replaced by the %q format specifier in --log-file.]
+
+* OPTIONALLY, if --xml-user-comment=STRING was given:
+
+ <usercomment>STRING</usercomment>
+
+ STRING is not escaped in any way, so that it itself may be a piece
+ of XML with arbitrary tags etc.
+
+* The program and args: first those pertaining to Valgrind itself, and
+ then those pertaining to the program to be run under Valgrind (the
+ client):
+
+ <args>
+ <vargv>
+ <exe>TEXT</exe>
+ <arg>TEXT</arg> (zero or more of)
+ </vargv>
+ <argv>
+ <exe>TEXT</exe>
+ <arg>TEXT</arg> (zero or more of)
+ </argv>
+ </args>
+
+* The following, indicating that the program has now started:
+
+ <status> <state>RUNNING</state>
+ <time>human-readable-time-string</time>
+ </status>
+
+* Zero or more of (either ERROR or ERRORCOUNTS).
+
+* The following, indicating that the program has now finished, and
+ that the wrapup (leak checking) is happening.
+
+ <status> <state>FINISHED</state>
+ <time>human-readable-time-string</time>
+ </status>
+
+* SUPPCOUNTS, indicating how many times each suppression was used.
+
+* Zero or more ERRORs, each of which is a complaint from the
+ leak checker.
+
+That's it.
+
+
+ERROR
+-----
+This shows an error, and is the most complex nonterminal. The format
+is as follows:
+
+ <error>
+ <unique>HEX64</unique>
+ <tid>INT</tid>
+ <kind>KIND</kind>
+ <what>TEXT</what>
+
+ optionally: <leakedbytes>INT</leakedbytes>
+ optionally: <leakedblocks>INT</leakedblocks>
+
+ STACK
+
+ optionally: <auxwhat>TEXT</auxwhat>
+ optionally: STACK
+ optionally: ORIGIN
+
+ </error>
+
+* Each error contains a unique, arbitrary 64-bit hex number. This is
+ used to refer to the error in ERRORCOUNTS nonterminals (see below).
+
+* The <tid> tag indicates the Valgrind thread number. This value
+ is arbitrary but may be used to determine which threads produced
+ which errors (at least, the first instance of each error).
+
+* The <kind> tag specifies one of a small number of fixed error
+ types (enumerated below), so that GUIs may roughly categorise
+ errors by type if they want.
+
+* The <what> tag gives a human-understandable description of the
+ error.
+
+* For <kind> tags specifying a KIND of the form "Leak_*", the
+ optional <leakedbytes> and <leakedblocks> indicate the number of
+ bytes and blocks leaked by this error.
+
+* The primary STACK for this error, indicating where it occurred.
+
+* Some error types may have auxiliary information attached:
+
+ <auxwhat>TEXT</auxwhat> gives an auxiliary human-readable
+ description (usually of invalid addresses)
+
+ STACK gives an auxiliary stack (usually the allocation/free
+ point of a block). If this STACK is present then
+ <auxwhat>TEXT</auxwhat> will precede it.
+
+
+KIND
+----
+This is a small enumeration indicating roughly the nature of an error.
+The possible values are:
+
+ InvalidFree
+
+ free/delete/delete[] on an invalid pointer
+
+ MismatchedFree
+
+ free/delete/delete[] does not match allocation function
+ (eg doing new[] then free on the result)
+
+ InvalidRead
+
+ read of an invalid address
+
+ InvalidWrite
+
+ write of an invalid address
+
+ InvalidJump
+
+ jump to an invalid address
+
+ Overlap
+
+ args overlap other otherwise bogus in eg memcpy
+
+ InvalidMemPool
+
+ invalid mem pool specified in client request
+
+ UninitCondition
+
+ conditional jump/move depends on undefined value
+
+ UninitValue
+
+ other use of undefined value (primarily memory addresses)
+
+ SyscallParam
+
+ system call params are undefined or point to
+ undefined/unaddressible memory
+
+ ClientCheck
+
+ "error" resulting from a client check request
+
+ Leak_DefinitelyLost
+
+ memory leak; the referenced blocks are definitely lost
+
+ Leak_IndirectlyLost
+
+ memory leak; the referenced blocks are lost because all pointers
+ to them are also in leaked blocks
+
+ Leak_PossiblyLost
+
+ memory leak; only interior pointers to referenced blocks were
+ found
+
+ Leak_StillReachable
+
+ memory leak; pointers to un-freed blocks are still available
+
+
+STACK
+-----
+STACK indicates locations in the program being debugged. A STACK
+is one or more FRAMEs. The first is the innermost frame, the
+next its caller, etc.
+
+ <stack>
+ one or more FRAME
+ </stack>
+
+
+FRAME
+-----
+FRAME records a single program location:
+
+ <frame>
+ <ip>HEX64</ip>
+ optionally <obj>TEXT</obj>
+ optionally <fn>TEXT</fn>
+ optionally <dir>TEXT</dir>
+ optionally <file>TEXT</file>
+ optionally <line>INT</line>
+ </frame>
+
+Only the <ip> field is guaranteed to be present. It indicates a
+code ("instruction pointer") address.
+
+The optional fields, if present, appear in the order stated:
+
+* obj: gives the name of the ELF object containing the code address
+
+* fn: gives the name of the function containing the code address
+
+* dir: gives the source directory associated with the name specified
+ by <file>. Note the current implementation often does not
+ put anything useful in this field.
+
+* file: gives the name of the source file containing the code address
+
+* line: gives the line number in the source file
+
+
+ORIGIN
+------
+ORIGIN shows the origin of uninitialised data in errors that involve
+uninitialised data. STACK shows the origin of the uninitialised
+value. TEXT gives a human-understandable hint as to the meaning of
+the information in STACK.
+
+ <origin>
+ <what>TEXT<what>
+ STACK
+ </origin>
+
+
+ERRORCOUNTS
+-----------
+This specifies, for each error that has been so far presented,
+the number of occurrences of that error.
+
+ <errorcounts>
+ zero or more of
+ <pair> <count>INT</count> <unique>HEX64</unique> </pair>
+ </errorcounts>
+
+Each <pair> gives the current error count <count> for the error with
+unique tag </unique>. The counts do not have to give a count for each
+error so far presented - partial information is allowable.
+
+As at Valgrind rev 3793, error counts are only emitted at program
+termination. However, it is perfectly acceptable to periodically emit
+error counts as the program is running. Doing so would facilitate a
+GUI to dynamically update its error-count display as the program runs.
+
+
+SUPPCOUNTS
+----------
+A SUPPCOUNTS block appears exactly once, after the program terminates.
+It specifies the number of times each error-suppression was used.
+Suppressions not mentioned were used zero times.
+
+ <suppcounts>
+ zero or more of
+ <pair> <count>INT</count> <name>TEXT</name> </pair>
+ </suppcounts>
+
+The <name> is as specified in the suppression name fields in .supp
+files.
+