1. Tokens
Understanding PostScript’s lexical structure and token types.
1.1. Overview
A token is the basic lexical unit of PostScript syntax. The PostScript scanner recognizes several types of tokens, each with specific syntax rules.
1.2. Token Types
1.2.1. Numbers
Integer Numbers:
42 % Decimal integer
-17 % Negative integer
0 % Zero
+100 % Positive sign optional
Real (Floating-Point) Numbers:
3.14 % Real with decimal point
-0.5 % Negative real
.25 % Digit before the decimal point is optional
6.02e23 % Scientific notation
1.0E-10 % Exponent can be E or e
Radix Numbers (Base 2-36):
8#177 % Octal: 127 decimal
16#FF % Hexadecimal: 255 decimal
2#1010 % Binary: 10 decimal
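As a quick check of the number syntax above, the following sketch compares radix literals with their decimal equivalents and asks the interpreter for the type of a few tokens. It assumes an interactive interpreter (for example Ghostscript) in which the = operator prints to standard output.

```postscript
% Radix literals are ordinary integers once scanned.
16#FF 255 eq =      % true
8#177 127 eq =      % true
2#1010 10 eq =      % true
+100 100 eq =       % true

% The scanner decides integer vs. real from the token's spelling.
42 type =           % integertype
3.14 type =         % realtype
6.02e23 type =      % realtype
```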
1.2.2. Names
Names are identifiers that can be used literally or as executable objects.
Literal Names (preceded by /):
/Times-Roman % Font name
/x % Variable name
/myproc % Procedure name
Executable Names (no / prefix):
add % Operator
moveto % Command
myvar % User-defined name
Immediately Evaluated Names (preceded by //, LanguageLevel 2 and later):
//add % Replaced at scan time by the current value of add
//gsave % Replaced at scan time by the current value of gsave
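The difference between an executable name and an immediately evaluated name shows up most clearly inside a procedure. The sketch below is illustrative only: the names n, p1 and p2 are hypothetical, and the // syntax requires a LanguageLevel 2 or later interpreter.

```postscript
/n 1 def            % literal name /n is pushed, then bound to 1
/p1 { n } def       % n is looked up each time p1 runs
/p2 { //n } def     % n is replaced by its value (1) while p2 is scanned
/n 2 def            % rebind n afterwards

p1 =                % 2  (late lookup sees the new value)
p2 =                % 1  (value was captured at scan time)
```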
1.2.3. Strings
Regular Strings (parentheses):
(Hello, World!) % Simple string
(String with (nested) parens) % Balanced parens
(Line 1\nLine 2) % Escape sequences
(Tab\there) % Tab character
Hexadecimal Strings (angle brackets):
<48656C6C6F> % "Hello" in hex
<4142> % "AB"
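Both string notations produce the same kind of object, which a short comparison can confirm. This is a minimal sketch assuming an interactive interpreter where = prints its operand.

```postscript
<48656C6C6F> (Hello) eq =   % true  (same bytes, different notation)
<4142> (AB) eq =            % true
(a(b)c) length =            % 5     (balanced parentheses are part of the string)
```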
1.2.4. Arrays
Literal Arrays:
[1 2 3] % Integer array
[/name 42 (text)] % Mixed types
[] % Empty array
Procedures (executable arrays):
{ add } % Simple procedure
{ dup mul } % Multiple operations
{ % Multi-line procedure
gsave
0.5 setgray
fill
grestore
}
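One way to see the difference between the two bracket styles: the elements between [ and ] are executed while the array is being built, whereas the elements between { and } are stored unevaluated until the procedure is executed. The sketch below assumes an interactive interpreter.

```postscript
[1 2 add] length =      % 1  (add already ran; the array holds one element)
[1 2 add] 0 get =       % 3
{ 1 2 add } length =    % 3  (three stored elements: 1, 2, add)
{ 1 2 add } exec =      % 3  (evaluated only when executed)
```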
1.3. Escape Sequences in Strings
| Sequence | Meaning |
|---|---|
| \n | Newline (line feed) |
| \r | Carriage return |
| \t | Horizontal tab |
| \b | Backspace |
| \f | Form feed |
| \\ | Backslash |
| \( | Left parenthesis |
| \) | Right parenthesis |
| \ddd | Octal character code (1-3 digits) |
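Each escape sequence contributes a single character to the string, which can be checked with length and eq. The lines below are a small sketch under the assumption of an interactive interpreter.

```postscript
(Line 1\nLine 2) length =   % 13  (6 + 1 newline + 6)
(\101\102) (AB) eq =        % true (octal 101 and 102 are A and B)
(\(\)) length =             % 2   (escaped parentheses count as characters)
(a\\b) length =             % 3   (one backslash between a and b)
```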
1.4. Comments
Single-Line Comments:
% This is a comment
42 % Comment after code
Document Structure Comments (DSC):
%%Title: My Document
%%Creator: PostScript Guide
%%Pages: 10
1.5. Whitespace
Whitespace Characters:
- Space (ASCII 32)
- Tab (ASCII 9)
- Newline (ASCII 10)
- Carriage return (ASCII 13)
- Form feed (ASCII 12)
- Null (ASCII 0)
Whitespace separates tokens but is otherwise ignored, except inside strings, where it is part of the string's value.
1.6. Token Delimiters
Delimiter Characters:
( ) < > [ ] { } / %
These characters terminate tokens and have special meaning.
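Because delimiters end the preceding token on their own, no whitespace is needed around them. The following sketch shows a few tightly packed lines that still scan as intended; it is meant as an illustration, not as a recommended style.

```postscript
(abc)length =       % 3  (the closing parenthesis ends the string token)
[/a/b]length =      % 2  (each / starts a new literal name)
1 2 add%comment starts right after add
=                   % 3
```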
1.7. Name Syntax Rules
Valid Characters in Names:
- Any printable ASCII character except delimiters
- Special characters: !"#$&'*,.:;=?@^_`|~
Invalid in Names:
- Whitespace
- Delimiter characters: ( ) < > [ ] { } / %
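To illustrate these rules, the sketch below defines a few unusual but valid names. The names themselves ($total!, a.b-c_d, 23A) are hypothetical examples chosen only to exercise the syntax.

```postscript
/$total! 7 def
$total! =           % 7

/a.b-c_d 1 def
a.b-c_d =           % 1

% A token that starts with digits but is not a valid number is a name.
/23A type =         % nametype
```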
1.8. Token Parsing
The token operator can be used to parse tokens from strings or files:
(42 /name) token % Returns: (/name) 42 true (remainder, token, success flag)
( ) token % Returns: false (no tokens)
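Since token leaves the unscanned remainder on the stack, it can be applied repeatedly to walk through a string. The sketch below counts the tokens in a string; count-tokens is a hypothetical helper name, not a built-in operator.

```postscript
/count-tokens {               % string -> int
  0 exch                      % stack: n str
  {
    token                     % n post any true  |  n false
    { pop exch 1 add exch }   % discard the token, increment n, keep post on top
    { exit }                  % nothing left to scan: leave n on the stack
    ifelse
  } loop
} def

(1 2.5 /name (str) {proc}) count-tokens =   % 5
```

On success, token pushes the remainder first, then the scanned object, then true, which is why the loop body pops the object before updating the counter.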
1.9. Common Patterns
1.10. Best Practices
- Use Meaningful Names: Choose descriptive names for procedures and variables.
- Comment Complex Code: Use comments to explain non-obvious logic.
- Consistent Formatting: Use consistent indentation and spacing for readability.
- Name Length Limits: While PostScript supports long names, keep them reasonable for maintainability.