flat assembler, the next generation

Artifact bbcd0f24e63a71707d332bf5b2b860b1025ea0c7:


fasm g design notes


The list of special characters is extended: ".", "?" and "!" are now special and can no longer be part of names.


The basic symbols are defined just like in fasm 1.

	label:
	constant = 1

There is a single hash tree for all kinds of symbols (including [macro-]instructions).
This means that a given symbol can have only one meaning; trying to use the same symbol for different purposes causes an error, with a few exceptions. Macroinstructions and structures are allowed to have the same identifier, and they can also share a name with any internal operator.

[Allocation of a leaf in tree is permanent. A pointer to the specific symbol is going to be unchanged for all the passes of assembly.]

Every kind of symbol can be forward-referenced.

	dd size
	size = $

	one
	macro one
		db 1
	end macro

The definition of a symbol can refer to its own value. This is a special case of forward-referencing that translates to self-dependence (in case of macroinstructions it is recursion).

	x = (x-1)*(x+2)/2-2*(x+1)
	
	macro countdown n
		if n>0
			db n
			countdown n-1
		end if		
	end macro

Some types of symbols, like numeric values and macroinstructions, allow the value to be re-defined (defined more than once in the source).
If this happens, they can no longer be forward-referenced.

	a = 1
	db a
	a = 1000
	dw a

The new variant of numeric constant definition, ":=", forces the value to be constant and thus prohibits any re-definitions.

If macroinstruction is this kind of variable and cannot be forward-referenced, it can still use its own previous value.

	macro one
		db 1
	end macro
	macro one
		one
		display "debug: one"
	end macro

If a symbol that is an internal instruction of the assembler gets defined as a macroinstruction, this instantly forces the macro into variable mode, disabling forward-referencing (and recursion) and allowing the original instruction to be used in the body of the macro. So there is no difference in this respect between an internal instruction and an instruction defined as a macro.

[

The symbol structure located in the tree leaf contains a pointer to the value definition structure, which in turn contains pointer to the value. The value definition pointer changes only when there is an additional value assigned to the symbol (the new value definition contains pointer to the previous value definition). The value itself may be movable in memory, and it may be shared among a few synonymous symbols (like the addressing space labels). For this reason every reference to the value is through the pointer to the value definition structure, and the value definition contains the reference counter. This counter tracks number of references to value - when it goes down to zero, the memory block containing value may be reused or freed.

Even in case of constant symbols, which have only one value and cannot be redefined, their chain of value definitions may be longer than one element. In case of values that are built over time, the first value definition may be flagged as "in construction" one, and then the previous value (which is most likely the value from previous pass) has to be used still, until the new value is completed.

Once the value (the contents of movable memory block) is completed, it becomes immutable.

{ The "element" type symbols always have exactly one value definition, as they are not allowed to be variable symbols, their values are not built over time, and their reference count always stays at one, so every new value is written into the same value definition. Pointer to the value definition of element is used in polynomial values and it does not affect the reference counter, because it is a reference to value definition itself, not to the current value that is stored there (it contains element's metadata, which may vary between passes). }

The value definition has field containing the number of pass in which the value was defined. The symbol structure has field with the number of most recent pass in which the value was accessed, and it also contains the register of predictions for the "defined" and "used" states (including the number of most recent pass in which the prediction happened).

{ The value definition also holds a count of times when value changed from pass to pass. After the "grace period" of initial few passes, the non-variable symbols that changed value too frequently in relation to the total number of passes are marked as "unstable". This mark is permanent (like the status of variable) and it might be testable in logical expressions with a special operator. This should help to choose the safe encoding for an instruction that may be self-dependently optimized. }

]

Every symbol in tree can have its private sub-tree, called its namespace. The "." operator can be used to refer to a symbol in the namespace of other symbol.

	space:
	space.x = 1
	space.y = 2
	space.color:
	space.color.r = 0
	space.color.g = 0
	space.color.b = 0

If the "." operator is used with empty argument for the namespace, it refers to the namespace of the latest label that was defined in the current namespace. This partly emulates the behavior of fasm 1.

	space:
	.x = 1
	.y = 2
	.color:
	.color.r = 0
	space.color.g = 0
	space.color.b = 0

The "." operator with no argument at all refers to that latest label directly.

	address:
	dd .

To change the current namespace (that is: switch the tree inside which all new symbols are created) the "namespace" directive should be used.

	space:
	namespace space
		x = 1
		y = 2
		color:
		.r = 0
		.g = 0
		.b = 0
	end namespace

The namespaces can be nested indefinitely.

	space:
	namespace space
		x = 1
		y = 2
		color:
		namespace .		; same as "namespace color"
			r = 0
			g = 0
			b = 0
		end namespace
	end namespace

{ 

If the name is followed by a single dot and nothing more, this refers to the recognized namespace symbol itself. This makes it possible to re-define symbols from parent namespaces when such already exist, because such an identifier used in a definition does not automatically create a local symbol, but first looks whether such a symbol already exists.

	a equ 0
	b equ 0
	namespace space
		a equ 1			; defines "space.a"
		b. equ 1		; re-defines global "b"
	end namespace

	X? equ 0
	Y? equ 0
	x equ 1				; defines case-sensitive symbol
	y. equ 1			; re-defines case-insensitive symbol

 }

The new variant of numeric variable definition, "=:", allows to retain the previous value so that it can be later restored with "restore" directive.

	a =: 1
	a =: 2
	restore a

Since symbolic and numeric variables are the same class of symbols, "restore" directive can be applied to both of them. Symbolic variables always keep the stack of previous values, while numeric ones require the "=:" syntax to stack new value upon the previous one.

	while defined variable
		restore variable
	end while

{ The above snippet should work well with numeric variables, not necessarily with symbolic - because the symbolic variable is replaced with its value before the "defined" operator is processed. }
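
The stacking behavior described above can be modelled roughly as follows. This is only a sketch in Python, not the assembler's actual data structure: the class name "Variable" and its methods are hypothetical, and it assumes that plain "=" overwrites the current value while "=:" pushes a new one.

```python
class Variable:
    """Hypothetical model of a variable symbol and its stack of values."""

    def __init__(self):
        self.values = []              # the current value is the last entry

    def assign(self, value):
        """Models "a = value": replace the current value, keep older ones."""
        if self.values:
            self.values[-1] = value
        else:
            self.values.append(value)

    def push(self, value):
        """Models "a =: value": stack a new value upon the previous one."""
        self.values.append(value)

    def restore(self):
        """Models "restore a": drop the current value, uncover the previous."""
        if self.values:
            self.values.pop()

    @property
    def defined(self):
        return bool(self.values)

    @property
    def current(self):
        return self.values[-1]


a = Variable()
a.push(1)        # a =: 1
a.push(2)        # a =: 2
a.restore()      # restore a
print(a.current)
```

After the "restore" the symbol holds 1 again, and a loop like "while defined" with "restore" inside eventually leaves it with no value at all.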

{ Macroinstructions and structures still need separate directives for this purpose, "purge" and "restruc". A macro can remove its own definition with "purge". This is different from fasm 1, where it only affected the previous definitions. }

{ Additional directives like "reequ" and "redefine" may be introduced to allow replacement of symbolic values analogous to what plain "=" does. }

If a name of symbol is followed by "?" character, it is a case-insensitive name (only with respect to English alphabet). The value of such symbol will be used only for the case-sensitive variants that have not been defined.

	tester? = 0
	tester = 1
	TESTER = 2
	db tester	; 1
	db Tester	; 0
	db TESTER	; 2
	db tester?	; 0
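
The fallback rule shown in this example can be sketched in Python. The "Namespace" class below is hypothetical and models only a single namespace with two dictionaries; the real tree, hashing and symbol classes are left out. Case folding is limited to the English alphabet, as stated above.

```python
def fold(name):
    """Lower-case only the letters A-Z, leaving all other characters alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


class Namespace:
    """Hypothetical single namespace with case-sensitive and folded names."""

    def __init__(self):
        self.exact = {}      # symbols defined as "name"
        self.folded = {}     # symbols defined as "name?"

    def define(self, name, value):
        if name.endswith("?"):
            self.folded[fold(name[:-1])] = value
        else:
            self.exact[name] = value

    def lookup(self, name):
        if name.endswith("?"):               # explicit case-insensitive access
            return self.folded[fold(name[:-1])]
        if name in self.exact:               # exact variant wins when defined
            return self.exact[name]
        return self.folded[fold(name)]       # otherwise use the "?" symbol


space = Namespace()
space.define("tester?", 0)
space.define("tester", 1)
space.define("TESTER", 2)
print([space.lookup(n) for n in ("tester", "Tester", "TESTER", "tester?")])
```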

The "?" operator works everywhere, also inside the arguments for "." operator. Both these operators are special in that they do not ignore whitespace. There must be no whitespace between them and the names they apply to. The whitespace is treated as the beginning of a separate identifier.

	space:
		.tester? = 0
		.tester?.x = 1
	db space.TESTER.x	; valid, while space.TESTER.X is not

[Nomenclature note: a symbol name is a name token that defines symbol relative to some namespace; a symbol identifier is the complete identifier expression composed (optionally) with "." and "?" operators. A symbol identifier may of course consist of just a symbol name and nothing more.]

{ A lone "?" operator, not preceded by a symbol identifier, is an idiom used for uninitialized data statements. }

In the definition of macroinstruction the symbol identifier can be also followed by "!" operator, and then it defines the unconditional macroinstruction, which gets recognized and processed even when assembler would otherwise skip it (like inside the false condition "if" block, definition of another macro, etc.).

	macro endif_with_fireworks!		
		end if
		; here put the fireworks
	end macro
	if 0
		regular macros not processed here
	endif_with_fireworks

The "struc" macro is a variant of macroinstruction that always has to be preceded by a name. The name of such macro takes the role of symbol accessible with "." operator not preceded by a namespace name. The value of this symbol does not get defined automatically though. To define it as a label, the ".:" construction needs to be used.

	struc POINT
		.:
		.x dd ?
		.y dd ?
	end struc

[

This encourages creating new syntax variants:

	macro struct? definition&
		struc definition
		.:
		namespace .
	end macro

	macro ends?!
		end namespace
		end struc
	end macro

	struct POINT vx:?,vy:?
		x dd vx
		y dd vy
	ends
]

{ The name of structure macro becomes the "latest label" entry in the local namespace of this macro (even though it is a symbol from a different namespace). The "." operator first looks for the "latest label" entry in the local namespace, and only when there is none, it uses the one from the current (regular) namespace. }

The prioritized symbolic constant may be defined by following its name with "!" operator (?).
{ It may not be allowed to define such constant directly. But assembler is going to define and use them internally. }
Such symbol gets replaced with its value before any other processing of source line is started. It can never be forward-referenced, it is always a variable - from now on it will be called a parameter variable.

[

There are four classes of symbols: parameters, instructions, structures, and expression elements (labels, numeric variables, etc.). When trying to recognize a symbol, assembler ignores symbols of class other than one expected for a given context. However, when no instruction or structure is found when searching for one, the expression element will be found instead.

A linked list in a tree leaf consists of name entries with the same hash. Each name entry contains a pointer to the child namespace, and a pointer to the first symbol entry in the linked list of symbols sharing the same name. For a given name there is always at most one symbol of any given class.
For a symbol to get found, in addition to being the right class it has to have a value definition from current or previous pass (when it is from previous pass, the prediction for being defined is noted down in symbol structure).
The tree scanner, with the class filtering in mind, starts from the root namespace and looks for symbol in all the namespaces on the path leading down to the current one. If there is a macro namespace, it scans it even before the root namespace. If the scan is for "any case" symbol, and nothing was found, it repeats the whole scan searching for case-insensitive symbol this time. If the search was for case-insensitive symbol from the start, it stops after the first scan.
The search inside a specific namespace (when "." operator is used) replaces the scan of multiple namespaces starting from root with a scan of just one namespace provided.
When nothing is found, finally an undefined no-class symbol entry is added in the last namespace that was scanned (either the current namespace, or a specified namespace). If the "?" operator was applied to the name, this newly added symbol is in a case-insensitive name entry, otherwise it is always in case-sensitive one.

{ If the case-sensitive symbols get to be disallowed to share the name with a case-insensitive symbol, the scan is for case-insensitive values first, and only then the case-sensitive ones. }

{ When the tree scanner finds a name equal to the one it searches for, it replaces its pointer to the name with the one used by the tree. This way the following comparisons with other symbol entries will be faster, because they may (and should) use the same pointer to the name, and when pointers are equal, no string comparison is needed. }

In the chain of value definitions all the entries need to be of the same type. Most of the types are allowed only for the expression class. The instructions, structures and parameters have always the symbolic value.

	symbolic		allowed for: instruction, structure, expression, parameter
	numeric			allowed for: expression
	element			allowed for: expression
	string			allowed for: expression
	floating point		allowed for: expression
	area			allowed for: expression
	numeric operator	allowed for: expression
	logical operator	allowed for: expression
	size operator		allowed for: expression

Any of these types can be additionally marked as "internal", in such case there is no pointer to a regular value, but some specific information that identifies internal instruction or variable. The operator classes are always internal. {Perhaps except for the size operators?}

{ Any symbol with an internal value is automatically a variable symbol. For internal symbols forward-referencing is meaningless anyway, and making them variable allows to overload them with macros. }

The internal value cannot be removed, "restore" directive does not affect it.

{ If all values of variable symbol are removed with "restore" directive, is it possible to define a value with different type for a given class? It should not be a problem. }

The "area" type contains the value of an address of addressing space, and its complete contents, no matter whether it is virtual or not. The output of the assembly is a concatenation of all the non-virtual "area" values from the last pass. An "area" value does also contain additional size of uninitialized data at the end of it. Only when an initialized data has to be appended at the end, the uninitialized part gets allocated (and zeroed). Load from uninitialized part returns zero without it needing to be allocated.
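
A minimal Python sketch of an "area" value with a lazily allocated uninitialized tail follows. The class "Area" and its method names are hypothetical; it only illustrates the rule that the tail is allocated and zeroed when initialized data gets appended, and that a load from the tail returns zero without any allocation.

```python
class Area:
    """Hypothetical model of an "area" value."""

    def __init__(self, base):
        self.base = base            # address of the start of the addressing space
        self.data = bytearray()     # initialized contents
        self.reserved = 0           # size of uninitialized data at the end

    def reserve(self, count):
        """Models an uninitialized data definition of `count` bytes."""
        self.reserved += count

    def emit(self, payload):
        """Models appending initialized data."""
        if self.reserved:
            # Only now the uninitialized part gets allocated (and zeroed).
            self.data.extend(bytes(self.reserved))
            self.reserved = 0
        self.data.extend(payload)

    def load(self, address, count):
        """Read `count` bytes; the uninitialized tail reads as zero."""
        offset = address - self.base
        if offset < 0 or offset + count > len(self.data) + self.reserved:
            raise ValueError("address outside of the area")
        chunk = bytes(self.data[offset:offset + count])
        return chunk + bytes(count - len(chunk))


area = Area(0x100)
area.emit(b"\x01\x02")
area.reserve(3)
print(area.load(0x101, 3), len(area.data))
```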

The "area" values for non-virtual blocks are kept in an output list (it increases the reference count for each one of them). All the blocks from output list are written into an output file when assembly is successful and complete. Every "area" value may have an additional attribute {a flag in value definition?} deciding whether the uninitialized data should get written into output or not. The "org" directive creates a new addressing space with this attribute (so uninitialized data from previous block is written in form of zeroes into output file), and the "section" directive creates a new addressing space without this attribute (it has the same syntax as "org"). Just like in fasm 1, "$$" and "$" are special internal values evaluated to the starting address of current addressing space and current address in this space, respectively. The new "$%" symbol computes current offset in output file (taking into consideration sizes and attributes of all the addressing spaces in the output list), assuming that some initialized data would be generated at current position (if used inside uninitialized data definitions it may land outside of the boundaries of file or section).

{ 

The "load" directive can access the predicted area value, but only when it is used with an area label (one defined with "::" symbol). When "load" is used with a simple address, it is limited to the data already generated in current pass, just like it was with fasm 1.

When the area is modified with "store", the complete addressing space is marked as unpredictable. It is not allowed to have predicted "load" from such area, all loads have to be limited to the data already generated in current pass. The "store" is always limited this way. Nonetheless, the "store" and subsequent "load" outside the maximum boundary of already closed addressing space is a recoverable error, and assembler may extend the reserved memory block for the area value to store this value for later "load" accesses. For this reason the reserved portion of area memory block has to be always initially filled with zeroes (resizes need to zero-fill the added portion).

This could be less restrictive, like marking area as unpredictable only when there is a "store" after some value was already loaded from this area. But it is a similar situation to allowing to forward-reference variable symbols (where the last value from previous pass would get used) - while technically possible, it would be too confusing and thus it is more practical to restrict it more.

Some additional syntax variant, like "load final", may be devised to allow loading the value from previous pass. But it would only be for performance reasons - because just like forward-referencing the final value of variable is possible through the intermediate definition of a constant at the end of source, the "unpredictable" area can be duplicated into a "predictable" one with looped "load" and "db" combination.

}

Since values in general have variable size, a numeric value can have any size, too. This allows memory usage to be optimized - small constants may use up just one byte for numeric value. The calculations are performed, if possible, as on infinite 2-adic values sign-extended from the value chunks. The resulting value is then fit into the shortest possible string of bytes that produces the same value after being sign-extended.
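
Fitting a value into the shortest sign-extended string of bytes can be illustrated in Python. This is a sketch under two assumptions that the notes do not state: the bytes are stored little-endian, and even zero occupies one byte. The function names are hypothetical.

```python
def shortest_bytes(value):
    """Shortest little-endian two's complement encoding of an integer."""
    length = 1
    while True:
        try:
            return value.to_bytes(length, "little", signed=True)
        except OverflowError:
            length += 1          # does not fit yet, try one byte more


def sign_extend(data):
    """Recover the integer that the byte string sign-extends to."""
    return int.from_bytes(data, "little", signed=True)


for number in (0, 127, 128, -1, -128, -129):
    print(number, shortest_bytes(number).hex())
```

Note that 128 needs two bytes (a lone 80h would sign-extend to -128), while -128 fits into one.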

{ It should use recursive Karatsuba algorithm for multiplication, and perhaps something nice for division with remainder, too. }
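
For reference, recursive Karatsuba multiplication can be sketched in Python as below. Python integers are already unlimited in length, so this only illustrates the splitting scheme; the function name and the "threshold_bits" cutoff are hypothetical choices, and the cutoff should stay at 8 or more so that the recursion always shrinks.

```python
def karatsuba(x, y, threshold_bits=64):
    """Multiply two integers with recursive Karatsuba splitting."""
    if x < 0 or y < 0:
        sign = -1 if (x < 0) != (y < 0) else 1
        return sign * karatsuba(abs(x), abs(y), threshold_bits)
    bits = max(x.bit_length(), y.bit_length())
    if bits <= threshold_bits:
        return x * y                          # small operands: direct product
    half = bits // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask          # x = x_hi * 2**half + x_lo
    y_hi, y_lo = y >> half, y & mask          # y = y_hi * 2**half + y_lo
    low = karatsuba(x_lo, y_lo, threshold_bits)
    high = karatsuba(x_hi, y_hi, threshold_bits)
    # Both cross terms are obtained from a single additional multiplication.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, threshold_bits) - low - high
    return (high << (2 * half)) + (cross << half) + low


a = 3**200 + 7
b = -(5**150) - 11
print(karatsuba(a, b) == a * b)
```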

Since the numerical values are going to be theoretically unlimited in length, it should be possible to store a complete string there, too. But a separate type for string values is needed, so that directive like DB can distinguish whether it should signal overflow, or just use a string input.

]

Some of the features of language are going to be implemented as global parameter variables, for example the prioritized "%" symbol with varying value reflecting current number of repetition of "repeat", "while", "irp", etc. The special parameters of "repeat", "irp", "match" and similar directives are also going to be implemented this way, with value discarded when the directive block gets closed.

{ The "%%" parameter should evaluate to a total number of repetitions inside a block like "repeat" or "irp". Inside "while" block it has undefined value. }

The "#" operator concatenates the name tokens, but only if there is no whitespace next to it.

	repeat 4
		label#%:
	end repeat

	i = 1
	while i<256
		repeat 1, p:i		; well, it has to have quirks like this,
			label#p:	;  it is a member of fasm family after all
		end repeat
		i = i*2
	end while

{ 

The "#" is processed at the time when the symbol or value has to be recognized and used, and "#" is never actually removed from source. 

	macro tester suf
		pre#suf:
	end macro

During the definition of the above macro the assembler is going to look for a prioritized macro called "presuf". If no such macro is found, the "pre#suf:" line becomes part of the macro and the concatenation may be later performed when macro is called, with the replaced "suf" value.

This demonstrates that the prioritized macros are an advanced feature that should be used with care.

}

The "`" operator is now a modifier that applies to the name of parameter symbol, and it converts entire contents of the parameter value into a quoted string. When it is not followed by a name of known parameter, it is left untouched and can later be matched and processed just like any other special character.

Every time a macro is called it gets a special unique namespace. It does not become the "current" namespace, but while processing the macro the symbols from this namespace will overshadow any regular symbols of the same name. The parameters of macro are defined to be parameter variables inside this special namespace. The "local" directive declares a symbol in that special namespace of a macro, without giving it an actual value - it can be defined as anything, and will only be accessible inside this macro call.

	macro tester
		local label
		label:
		dd label
	end macro

The "local" directive has the same meaning no matter where in macro it is put. For clarity it is recommended to put it in the beginning of macro.

{ The name of macro's local namespace must be generated in such a way that it is not only unique, but also, as far as possible, stable between the consecutive passes. }

The "esc" directive causes the instruction that follows it to become a part of the currently defined macro. This can be used to suppress evaluation of an unconditional macro/directive, in particular to use "macro" or "end macro" commands without proper nesting:

	macro begin name
		esc macro name
	end macro

{ Note that the contents of line following "esc" is still preprocessed (that is: the parameters are replaced with their values). The "\" character may be made to suppress preprocessing of the name that follows it if needed, but not implementing this feature may leave "\" operator to be recognized in custom syntaxes, and this might be preferable. }

The symbolic constant is more or less an exact textual substitution, but if it contains any symbol identifiers, they preserve the symbol recognition context in which the value was defined. Such symbolic context has two elements: the current namespace and the overshadowing namespace of macroinstruction local symbols.

	define g		; declare global
	namespace a
		x = 1
		g.numeric = x
		g.symbolic equ x
		x = 2
	end namespace
	namespace b
		x = 3
		dd g.numeric	; 1
		dd g.symbolic	; 2, it is equivalent to a.x
	end namespace

The evaluation of symbolic constants is performed at the time of parsing the expressions [this is when the context of "operator" symbol class is used], and it is a "deep" replacement - if the text composed from the already performed replacements refers to another symbolic value, it is expanded again (with a potential for unwanted infinite recursion - but only in case of symbolic constants, not symbolic variables, as the latter are similar to macros: when nested, they refer to previous value). The macroinstruction calls and some other directives (like "restore") which do not evaluate expressions in the arguments, do not perform such symbolic replacements. In case of this "deep" replacement any symbol identifier should always be completely contained within boundaries of a single symbolic value - for identifiers crossing boundaries the behavior of assembler is undefined and may be implementation-dependent.

The "define" defines symbolic constants or variables without the value evaluation.

{ The "match" directive in its default mode behaves much like in fasm 1, and it replaces the symbolic variables with their values in the matched string. There should also exist a simplified variant of "match" (perhaps the "match!" idiom may be used for this purpose) that would operate on unprocessed text. Since it would work on raw text, it would also not preserve the recognition context for the matched pieces of text. }

{ The "irps" directive in a form analogous to the one from fasm 1 is not needed, since the "match" combined with control directives is able to perform the same tasks while better handling the whitespace nuances. The implementation of "irps" may thus be postponed until it becomes clear what purpose it should serve. }

The definition of macro arguments may use "*", ":" and "&" characters in the same way as in fasm 1. There is no argument grouping with square brackets, though. Such processing is performed with "irp" directive instead.

	irp <name,value>, a,1, b,2, c,3
		name = value
	end irp

{

A new advanced directive, "indx", may be introduced to allow moving iterator index freely (while not affecting the "%" value and the total number of repeats). This should serve as a replacement for "forward"/"reverse" directives from fasm 1 and at the same time open many other ways of traversing the list of values.

	irp <lo,hi:0>, 1,2,3,4,5
		indx 1+%%-%
		db PUSH_OPCODE
		dw hi shl 8 + lo
	end irp

	irp str, 'alpha','beta','gamma'
		repeat %%
			dd offset#%
		end repeat
		repeat %%
			indx %
			offset#% db str
		end repeat
		break
	end irp

	macro call proc*,args&
		irp arg,args
			indx 1+%%-%
			db PUSH_OPCODE
			dw arg
		end irp
		db CALL_OPCODE
		dw proc
	end macro

}

The use of "?" or "!" operator may "force" the macro definition to use an empty name for a macro. It is allowed, and such macro will be applied to every single line that follows, with the complete content of line becoming macro arguments.
The empty name is assumed to be an internal instruction (reflecting the default line parser), so this macro is always a variable symbol, not allowing recursion.
This allows to recognize and process unusual syntax variants.

	macro ? line&
		match .tail, line
			dot_#tail
		else
			line
		end match
	end macro

The regular variant of this macro ("macro ?") is called only when a line does not call an unconditional instruction (like "end if" or unconditional macro). The unconditional variant of this special macro ("macro !") overrides all the processing and it should be used with care.
{ If "?" was replaced with "!" in the above example, such macro would effectively disable any control directives in the code that follows, because of the disallowed nesting that would occur if "line" parameter contained any control directive. }

Every numeric value (label, numeric variable, etc.) can be in fact a linear polynomial. The "element" directive defines symbol as a variable for polynomials. The "relativeto" operator, just like in fasm 1, determines whether the difference of two values consists only of a constant term.

	element bx
	label alpha at bx+1
	if alpha relativeto bx
		db alpha-bx
	end if

The "element" and "scale" used as operators allow to extract polynomial terms from any value, the second argument is the index of term. Index 0 always refers to the constant term, thus "value element 0" is always 1, while "value scale 0" is equal to the value of constant term. If index value is equal to the number of terms within polynomial (or higher) the "value element index" returns 0.
The "element" directive allows to assign a value (which can be polynomial itself) to the variable. This value can later be retrieved with "metadata" operator.

	element r16
	element ax: r16+0
	element bx: r16+1
	value = ax+bx
	i = 1
	while value scale i > 0
		if value metadata i relativeto r16
			db (value metadata i)-r16, value scale i
		else
			err
		end if
		i = i + 1
	end while
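
The indexing rules of the "element" and "scale" operators can be modelled with a small Python class. Everything here is a hypothetical sketch: elements are represented by plain strings, terms are kept in the order in which they were added, and "scale" with an out-of-range index is assumed to return 0 (the loop above relies on that, but the notes state it only for "element").

```python
class Poly:
    """Hypothetical linear polynomial: a constant plus scaled elements."""

    def __init__(self, constant=0, terms=None):
        self.constant = constant
        # Terms with a zero scale vanish from the polynomial.
        self.terms = {e: s for e, s in (terms or {}).items() if s != 0}

    def __add__(self, other):
        merged = dict(self.terms)
        for element, scale in other.terms.items():
            merged[element] = merged.get(element, 0) + scale
        return Poly(self.constant + other.constant, merged)

    def __neg__(self):
        return Poly(-self.constant, {e: -s for e, s in self.terms.items()})

    def __sub__(self, other):
        return self + (-other)

    def scale(self, index):
        """Index 0 gives the constant term, higher indices the term scales."""
        if index == 0:
            return self.constant
        scales = list(self.terms.values())
        return scales[index - 1] if index <= len(scales) else 0

    def element(self, index):
        """Index 0 always gives 1, an index past the last term gives 0."""
        if index == 0:
            return 1
        names = list(self.terms)
        return names[index - 1] if index <= len(names) else 0

    def relativeto(self, other):
        """True when the difference consists only of a constant term."""
        return not (self - other).terms


bx = Poly(0, {"bx": 1})
alpha = bx + Poly(1)                 # label alpha at bx+1
print(alpha.relativeto(bx), (alpha - bx).constant)
```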

[

The source context is a stack of pointers to the current place in source. Each entry in stack is either a pointer to a line in source file, or a pointer to a value definition of macro and a line inside it. When macro or a file ends, assembler pops its entry from source context and continues processing the previous entry.
The repeating directives ("repeat", "while", "irp") preserve the source context of where they were started and restore it for every repetition. This may mean re-entering the macro that was already processed to the end - and disabling its value definition again (every time a variable macro is called, its current value is disabled so that the previous values are going to be used).

All blocks related to repeating directives or conditional directives ("if" and "match") must be properly nested, just like in fasm 1. This is required, because when skipping such block without processing (like when "if" or "while" condition is false, or count for "repeat" is zero) assembler wants to count all the nested blocks to recognize the "end" directive for the current block in a useful manner. The remaining types of blocks may overlap with any other freely. But if a macro gets defined inside such block, any directives controlling the flow of assembly (including "end if", etc.) inside the macro definition are going to be ignored, up until the "end macro". It has a practical effect such that macro definition inside a control block must be completely contained within it. On the other hand, the macro definition itself can contain partial or malformed control blocks.

]

As in fasm 1, the labels may have size attached to them. This value can be retrieved with "metadata 0" operator.

	label a byte at 417h
	if a metadata 0 = 1
		display 'true'
	end if

{ 

It may be allowed to define a label with custom size (non-polynomial). For example "label b:100 at 0". 
In fact, size operators may not be defined internally at all, simple definitions like "byte? := 1" may be used instead.

}

The "eq" and similar operators are no longer available. This is to discourage use of "if" directive with macro parameters, since they may contain symbols that could interfere with the correct processing of condition. When macro argument is expected to be a valid expression, it can be sanitized by passing through "=" definition, for general dismantling of custom syntaxes the directives like "match" should be used instead.

{ The "eqtype" may be re-introduced, maybe under a different name, to compare whether expression arguments give results of the same type (integer, string, float). }

{ 

Because ".", "?" and "!" are now special characters, the "match" directive should allow matching them like any other. However the "?" should have an added special meaning in the "match" pattern, so "=?" pattern is going to be necessary to match the "?" character literally. If "?" character follows immediately (without whitespace) the name symbol in "match" pattern, it defines a case-insensitive parameter inside "match" block - this is analogous to defining parameters with other block directives like "irp".

Since assembler is now whitespace-aware, the whitespace in "match" pattern can also be matched against text. The whitespace between two exactly matched elements in pattern matches any or no whitespace in value, so "+ +" matches both "++" and "+ +" (while "++" does not match "+ +"). The "=" followed by whitespace means that the whitespace is required, so "+= +" matches "+ +" but not "++". Any plain whitespace (not modified by preceding "=") that is not between two exactly matched elements (eg. next to the wildcard name) is completely ignored.

}

{

The source text is converted into tokens. All the symbolic values (including macros) are stored as sequences of tokens, not in text form.

	Name token:
		byte 1Ah
		dword N
		N bytes of name
		dword hash of name
		dword hash of case-normalized name

	String token:
		byte 22h
		dword N
		N bytes of string

	Broken string token:
		byte 27h
		dword N
		N bytes of string
	(a string that was missing the ending quote)

	Whitespace token:
		byte 20h

	Punctuation token:
		any byte equal to one of the special characters

	End of line:
		byte 0Ah
	(any line break is converted to this token)

	End of file:
		byte 0

Every source file gets tokenized the first time it is accessed and it is cached in this form indefinitely.
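
A rough Python sketch of producing the first-level token stream listed above is given below. Several details are assumptions not fixed by these notes: the exact set of special characters ("SPECIAL"), the hash function (32-bit FNV-1a is used as a stand-in), little-endian dwords, collapsing a run of spaces and tabs into one whitespace token, and the absence of any comment or escaped-quote handling.

```python
import struct

SPECIAL = set("+-*/=<>()[]{}:,|&~#`.?!\\;")    # assumed list of special characters


def fnv1a(data):
    """Stand-in 32-bit hash; the real hash function is not specified here."""
    value = 0x811C9DC5
    for byte in data:
        value = ((value ^ byte) * 0x01000193) & 0xFFFFFFFF
    return value


def dword(number):
    return struct.pack("<I", number)


def fold(name):
    """Case normalization restricted to the English alphabet."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in name)


def tokenize(source):
    source = source.replace("\r\n", "\n").replace("\r", "\n")
    out = bytearray()
    i = 0
    while i < len(source):
        c = source[i]
        if c == "\n":                               # end of line
            out.append(0x0A)
            i += 1
        elif c in " \t":                            # whitespace token
            while i < len(source) and source[i] in " \t":
                i += 1
            out.append(0x20)
        elif c in "'\"":                            # string or broken string
            end = i + 1
            while end < len(source) and source[end] not in (c, "\n"):
                end += 1
            closed = end < len(source) and source[end] == c
            text = source[i + 1:end].encode()
            out.append(0x22 if closed else 0x27)
            out += dword(len(text)) + text
            i = end + 1 if closed else end
        elif c in SPECIAL:                          # punctuation token
            out.append(ord(c))
            i += 1
        else:                                       # name token
            end = i
            while end < len(source) and source[end] not in SPECIAL \
                    and source[end] not in " \t\n'\"":
                end += 1
            name = source[i:end].encode()
            out.append(0x1A)
            out += dword(len(name)) + name
            out += dword(fnv1a(name)) + dword(fnv1a(fold(name)))
            i = end
    out.append(0)                                   # end of file
    return bytes(out)


print(tokenize("db 'x'\n").hex())
```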

{ The binary files should be cached too, in their original form. Perhaps only the part that was actually read, possibly extended within cache when more of it is needed in the future. This is needed to ensure that there are no external inputs that may change during the resolving process and disturb it. For the same reason "%t" symbol should reflect the timestamp of when the assembly started, constant throughout the whole process. }

During the assembly, the tokenized source text is preprocessed (including replacement of parameters) into a second-level tokens language. This language is also used for all the symbolic values, including macroinstructions.

	Name token:
		byte 1Ah
		dword pointer to the contents of name token (length, name, two hashes)

	String token:
		byte 22h
		dword pointer to the contents of string token (length, data)

	Broken string token:
		byte 27h
		dword pointer to the contents of string token (length, data)
	(a string that was missing the ending quote)

	Whitespace token:
		byte 20h

	Punctuation token:
		any byte equal to one of the special characters

	End of line:
		byte 0

	Internal number:
		byte 30h
		dword N
		N bytes of numeric data

	Name context switch:
		byte 40h
		dword pointer to namespace
		dword pointer to current label
		dword pointer to local (macro) namespace

The "internal number" points to a binary number. When treated with operator like "#" or "`", this token is converted into a textual representation of number, otherwise it is used directly.
The "name context switch" token can only occur in the values of "symbolic" type, it temporarily changes the context of name recognition.
If the pointers are zero, it restores the context to normal operation. When a new symbolic value is defined, a token with context from the time of definition is prepended to the new value, and the context restoring token is attached at the end. Any context restoring tokens inside the value (inherited from possible parameter replacements) are replaced with the context token of the new variable.

	; in context A
	a equ 1	; defines sequence [ context:A name:1 context:0 ]
	
	; in context B
	b equ a+a ; defines sequence [ context:B context:A name:1 context:B punctuation:+ context:A name:1 context:B context:0 ]

The ending "context:0" may be implicit and omitted in value.
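
The wrapping rule can be checked against the example with a few lines of Python. Tokens are modelled as hypothetical (kind, payload) tuples, "RESTORE" stands for "context:0", and the body passed in is assumed to be the text after the symbolic variables have already been replaced with their values.

```python
RESTORE = ("context", None)          # stands for the "context:0" token


def define_symbolic(context, body):
    """Wrap an already preprocessed token list into a new symbolic value."""
    own = ("context", context)
    # Restoring tokens inherited from replaced values now switch back
    # to the context of the value being defined.
    inner = [own if token == RESTORE else token for token in body]
    return [own] + inner + [RESTORE]


a = define_symbolic("A", [("name", "1")])
b = define_symbolic("B", a + [("punctuation", "+")] + a)
for token in b:
    print(token)
```

The printed sequence for "b" has the same shape as the one given in the example: context B, context A, the name, context B, the "+", context A, the name, context B, and the final restoring token.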

Because namespaces are elements of the tree, they are persistent pointers that never change after they are allocated. Therefore they may be safely used as a part of symbolic source because they do not prevent these values from stabilizing during the assembly.

}

{

"eval" generates a sequence of bytes, like "db" or "display", but instead of outputting them, it parses them like a source line and executes the resulting line.

	eval "db 1"

}

{

Internal structure of any numeric (polynomial) value:

		dword N
		N bytes of sign-extended binary number (constant term)
	
		dword pointer to element
		dword M
		M bytes of sign-extended binary number (scale)

		...		

		dword 0
		
		dword S
		S bytes of sign-extended binary number (size of labelled data)

	(the block starting with pointer to element is repeated for every non-constant term of polynomial; the block starting with S is optional)
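
The layout above can be exercised with a short Python serializer. It is a sketch only: dwords are assumed to be little-endian, an element pointer is modelled as a plain non-zero integer, and the function names are hypothetical.

```python
import struct


def pack_number(value):
    """A dword length followed by the shortest sign-extended byte string."""
    length = 1
    while True:
        try:
            data = value.to_bytes(length, "little", signed=True)
            break
        except OverflowError:
            length += 1
    return struct.pack("<I", len(data)) + data


def pack_polynomial(constant, terms=(), size=None):
    """Serialize a constant term, (element pointer, scale) pairs and a size."""
    out = pack_number(constant)
    for element_pointer, scale in terms:
        if element_pointer == 0:
            raise ValueError("a zero pointer would end the list of terms")
        out += struct.pack("<I", element_pointer) + pack_number(scale)
    out += struct.pack("<I", 0)          # terminator of the list of terms
    if size is not None:
        out += pack_number(size)         # optional size of labelled data
    return out


print(pack_polynomial(1, [(0x1000, 2)], size=4).hex(" "))
```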

}