pp.1-2:
Unlike the textual representation of linguistic structure, which includes keywords, identifiers, operators, and punctuation, documentary structure consists of those textual aspects explicitly defined to be not part of the language: white space (new lines, spaces, tabs), comments, and choice of names.
Viewed differently, documentary structure is what programmers add to source code for the sole purpose of aiding the human reader. This is of enormous importance because of the central role of reading during software development [14]. Programmers clearly understand this: they arrange code carefully, complain about inadequate comments, and argue passionately about the exact placement of braces in code (purely a matter of white space in most languages).
It is almost tautological that documentary structure is outside the formal language. It is a much more subtle fact that documentary structure is mostly orthogonal to language structure. An important consequence is that compiler-oriented tools do not represent documentary structure adequately. Compilers discard this information freely because it is not needed: humans seldom read compiler output. For other language-based tools, however, losing documentary structure violates the tool builder's equivalent of the physician's oath to "first do no harm."
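
A small illustration of that last point (not from the paper, just a sketch using Python's standard `ast` and `tokenize` modules, Python 3.9 or later): round-tripping source through a syntax tree keeps the linguistic structure but drops the comments and blank lines, while the lexer still sees the comments. The sample source and the helper names `roundtrip_through_ast` and `comments_in` are made up for the example.

```python
import ast
import io
import tokenize

# Hypothetical sample source: two comments and one blank line of
# "documentary structure" around an otherwise ordinary loop.
SOURCE = (
    "total = 0  # running sum of prices\n"
    "\n"
    "for p in prices:\n"
    "    total += p   # accumulate\n"
)


def roundtrip_through_ast(source: str) -> str:
    """Parse to a syntax tree and regenerate text, as a compiler-style tool might."""
    return ast.unparse(ast.parse(source))


def comments_in(source: str) -> list[str]:
    """Collect comment tokens from the lexer, which still sees them."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


if __name__ == "__main__":
    print(roundtrip_through_ast(SOURCE))  # regenerated code, comments gone
    print(comments_in(SOURCE))            # the comments the tree never stored
```

The tree-based round trip is the "harm" in question: a refactoring or formatting tool built only on the syntax tree would silently erase what the programmer wrote for human readers, so such tools need some other way to carry comments and layout along.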
Cheers,
Scott.