In the Lexical Notes on page 414 of The Dylan Reference Manual, add the following two paragraphs:
Whitespace can be one or more contiguous space characters, horizontal tab characters, and/or newline indications. Implementations can define additional whitespace characters.
A printing character (including space) is one of the characters denoted by ASCII codes 32 through 126 inclusive. Implementations can define additional printing characters.
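
As an illustration only (not part of the text to be added to the manual), the two definitions above can be sketched as predicates. The function names are hypothetical, and the sketch assumes that whatever reads the source has already normalized each newline indication to a single "\n"; implementation-defined additions are not modeled.

```python
# Sketch of the two definitions above. The names are illustrative and do not
# come from The Dylan Reference Manual.

# Assumption: each newline indication has already been normalized to "\n"
# by whatever reads the source file.
WHITESPACE_CHARS = {" ", "\t", "\n"}


def is_whitespace_char(ch: str) -> bool:
    """True for space, horizontal tab, or a (normalized) newline indication."""
    return ch in WHITESPACE_CHARS


def is_printing_char(ch: str) -> bool:
    """True for ASCII codes 32 through 126 inclusive (space included)."""
    return 32 <= ord(ch) <= 126


def is_whitespace(text: str) -> bool:
    """True if text is one or more contiguous whitespace characters."""
    return len(text) > 0 and all(is_whitespace_char(ch) for ch in text)
```

Note that space satisfies both predicates, while tab and newline are whitespace but not printing characters under this minimal definition.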
Replace the table of control characters generated by backslash escapes at the top of page 19 of The Dylan Reference Manual with the following:
| Backslash Character | Meaning | ASCII code |
| --- | --- | --- |
| a | alarm | 7 |
| b | backspace | 8 |
| e | escape | 27 |
| f | form feed | 12 |
| n | newline | - |
| r | carriage return | 13 |
| t | tab | 9 |
| 0 | null | 0 |
- ASCII codes in the above table are used only to identify the characters. They do not specify the value returned by as(<integer>, character). There is no ASCII code for the newline character because newline indication is implementation dependent.
- Note:
- This proposal makes no attempt to specify what characters are available to a Dylan program at runtime, i.e. the complete set of instances of the class <character>. However, this clearly must include at least all characters that can appear in a character or string literal, i.e. the printing characters (including space), the eight control characters representable by backslash escapes, and any Unicode characters (represented by backslash angle bracket hexadecimal escapes) that the implementation is able to represent. Because '\n' is an instance of <character>, there must be a runtime character for newline indication, even if a newline indication in files is something other than a single character.
- This proposal is a clarification. It might be regarded as an incompatible change for programs written in an extended character set, but such programs were already non-portable before this proposal and can continue to work in whichever implementations they worked in before this proposal.
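
For readers who prefer to see the table of backslash escapes as data, here is a minimal sketch. The names `CONTROL_ESCAPES` and `identify_escape` are hypothetical. The codes only identify the characters, as the note above says, and newline is deliberately left without a code.

```python
# The eight control-character escapes from the table above, keyed by the
# character that follows the backslash. The values are ASCII codes used only
# to identify the characters; they are not the result of as(<integer>, ...).
CONTROL_ESCAPES = {
    "a": 7,     # alarm
    "b": 8,     # backspace
    "e": 27,    # escape
    "f": 12,    # form feed
    "n": None,  # newline: no ASCII code, the indication is implementation dependent
    "r": 13,    # carriage return
    "t": 9,     # tab
    "0": 0,     # null
}


def identify_escape(letter: str):
    """Return the identifying ASCII code for a control escape, or None for newline.

    Raises KeyError for a letter that is not in the table.
    """
    return CONTROL_ESCAPES[letter]
```

Every code in this sketch is below 32, so none of the escaped characters falls in the printing range 32 through 126.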
RATIONALE:
- Only source code in Dylan Interchange Format is portable, so only that format is addressed here.
- I say "newline indication" instead of "newline character" because the encoding of newline in a source file might not be a single character (e.g. CR-LF in MS-DOS), or might not be in the form of characters at all (e.g. a record-structured file).
- Implementation-defined characters are mentioned in several places, even though they cannot appear in a standard Dylan Interchange Format file, to clarify where implementors have freedom in their own file format (which could be an extension of Dylan Interchange Format). This information is only a hint to implementors, as it has no effect on programmers writing portable programs.
- I made the two definitions of whitespace on page 16 and page 451 consistent.
- I allowed horizontal tab as whitespace because of current practice in Dylan, C, and other languages. I was a bit reluctant about this, as there is no universal agreement on the formatting effect of horizontal tab, but it seemed safest to conform to current practice.
- The rationale for removing newpage as a whitespace character is the weak one that it is only mentioned in one place and seems unnecessary. Is the newpage character mentioned on page 451 the same as the form feed character mentioned on page 19, except the former is in source code and the latter is at runtime? If someone thinks newpage as a whitespace character is important, it should be defined as a character denoted by an ASCII code and added uniformly.
- Allowing comments as whitespace (p.16) was an error. For example, abc/*def*/ghi is a single Dylan word, not two words separated by a comment. This is clear from the remark on page 16 about a comment blending with a token.
- I used the minimum possible definition of "printing character", based on my belief that if this had been intended to include control characters such as tab, backspace, or newline, they would have been mentioned explicitly (as space is).
- The ASCII codes in the table of backslash escapes are intended to reflect universal current practice, and not to change anything.
- This proposal doesn't really make anything significantly easier or harder; it just clarifies what is permissible. It might decrease unnecessary portability problems.
- This proposal has no effect on speed or safety.
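
The remark above that abc/*def*/ghi is a single word can be checked with a rough sketch. It assumes a simplified version of the name syntax: a word beginning with a letter may continue with letters, digits, the graphic characters `! & * < > | ^ $ % @ _`, and the special characters `- + ~ ? / =`. These sets and the helper `leading_word` are assumptions for illustration; the manual's grammar has more cases and remains authoritative.

```python
import re

# Assumption: simplified name-character sets, covering only words that begin
# with a letter. The full grammar in the manual has additional cases.
ALPHABETIC = "A-Za-z"
NUMERIC = "0-9"
GRAPHIC = re.escape("!&*<>|^$%@_")
SPECIAL = re.escape("-+~?/=")

WORD = re.compile(f"[{ALPHABETIC}][{ALPHABETIC}{NUMERIC}{GRAPHIC}{SPECIAL}]*")


def leading_word(text: str) -> str:
    """Return the longest word-like prefix of text, or "" if there is none."""
    match = WORD.match(text)
    return match.group(0) if match else ""
```

Because `/` and `*` are themselves word characters in this sketch, `leading_word("abc/*def*/ghi")` consumes the whole string, whereas in `"abc /*def*/ ghi"` the space ends the word after `abc`.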
EXAMPLES:
Not applicable.
COST TO IMPLEMENTORS:
- This has no cost to implementors, unless they choose either to write their own code to be portable, or to provide a tool to check portability of code, which ought to check these character set restrictions. The costs in those cases would not be significant.
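
A portability-checking tool of the kind mentioned above could be quite small. The following is a minimal sketch, assuming the restriction to enforce is that a portable source file contains only the printing characters (ASCII 32 through 126) and whitespace (space, horizontal tab, newline indication). The function name `find_nonportable` is hypothetical, and treating LF, CR-LF, and a lone CR as newline indications is an assumption, since the proposal leaves that to the implementation.

```python
def find_nonportable(source: str):
    """Return (line, column, code) for each character outside the portable set.

    Lines and columns are 1-based. Assumption: LF, CR-LF, and a lone CR all
    count as newline indications; a real implementation defines its own.
    """
    normalized = source.replace("\r\n", "\n").replace("\r", "\n")
    problems = []
    for line_no, line in enumerate(normalized.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            code = ord(ch)
            # Horizontal tab is whitespace; 32..126 are the printing characters.
            if ch != "\t" and not (32 <= code <= 126):
                problems.append((line_no, col_no, code))
    return problems
```

For instance, a form feed or an accented letter in the source would be reported with its position, while tabs and any of the assumed newline encodings pass silently.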
RELATED PROPOSALS:
None.
REVISION HISTORY:
- Version 1, 7-January-1997, by Dave Moon (not published)
- Version 2, 8-January-1997, by Dave Moon
STATUS:
- Open 9-January-1997