Target Files

This page tries to explain the different behaviours of the SB-Assembler with regard to target files. It's not intended to explain the different individual target file types, because I have already done that in this corner of my web site.

You can specify what target file type you want to use with the .TA directive. This same directive also allows you to set the maximum line sizes in your target file.
I think it goes without saying that you should select a target file type which your programming device understands. Otherwise you'll end up with a program which cannot be programmed into your system.

The SB-Assembler distinguishes between two main types of target files: formatted and unformatted files. Then there is the difference in maximum code size between the target file types. Finally there can also be a difference between processor families. Most of the 8-bit processors, for instance, store their code in units of one byte. However, the modern ones, like the AVRs and the PICs, need to store 2 or even more bytes per instruction word. This changes the relationship between the address counter and the target addresses.

Unformatted Target Files

The unformatted target files are the easiest. Only two of the supported target file formats are unformatted: Binary and Hex files. What sets the unformatted target files apart from the formatted ones is that unformatted files do not contain address information.

Binary And Hex

This means that these files only contain data. Binary files contain just plain bytes, which are nothing more than groups of 8 bits stored one after the other.
Hex files, on the other hand, contain data which is encoded as ASCII characters. Each byte of data is encoded as a pair of ASCII characters from 0 to 9 and A to F. A certain number of these character pairs are stored on each line, unless there are not enough left to fill the last line completely. A line length of 16, which is the default for the SB-Assembler, holds 16 pairs of hexadecimal digits, which represent 16 real bytes of data.
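To make the Hex layout concrete, here is a small Python sketch. Python is used for illustration only and has nothing to do with how the SB-Assembler is written. The sketch assumes the digit pairs are written back to back without separators, and to_hex_lines is a made-up helper name.

```python
def to_hex_lines(data, line_length=16):
    """Encode bytes as lines of upper-case hex digit pairs, without addresses."""
    lines = []
    for start in range(0, len(data), line_length):
        chunk = data[start:start + line_length]
        # Each byte becomes two ASCII characters from 0-9 and A-F.
        lines.append(chunk.hex().upper())
    return lines

# 20 bytes with the default line length of 16: one full line, one short line.
lines = to_hex_lines(bytes(range(20)))
for line in lines:
    print(line)
```

The first line holds 16 pairs (32 characters), the last line only the 4 pairs that were left over.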

No Address Fields

Because these unformatted files do not hold any address information, the data bytes are simply stored in the order in which they are translated by the SB-Assembler. This means that you cannot simply change the address of part of your program, as there is no way to make that apparent in the target file. The only way you can change the address for part of your program is to fill all the unused space with some data, any data.
The .NO directive will do that for you.
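The idea of filling the gap can be sketched in a few lines of Python. This is only a model of the principle, not the actual behaviour of the .NO directive: the helper name pad_to, the fill value $FF and the error on moving backwards are all assumptions made for this example.

```python
def pad_to(image, base, new_address, fill=0xFF):
    """Extend an address-less image until its next byte lands on new_address.

    image       -- bytearray holding the bytes written so far
    base        -- address of the first byte in the image
    new_address -- address the next byte should end up at
    fill        -- value used for the unused space (arbitrary choice here)
    """
    current = base + len(image)
    if new_address < current:
        # A file without address fields has no way to express this.
        raise ValueError("cannot move backwards in an unformatted file")
    image.extend([fill] * (new_address - current))

image = bytearray(b"\x01\x02")      # two bytes stored from address $1000
pad_to(image, 0x1000, 0x1008)       # continue at $1008
image += b"\x03"                    # this byte now sits at offset 8
```

After the call the image holds the two original bytes, six fill bytes and then the new byte, so the position in the file again matches the intended address.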

Maximum Code Size

The lack of any address fields has the advantage that your target file may grow as large as your operating system allows it to. However, I have set a sensible maximum size of 4GB because the SB-Assembler internally works with 32-bit numbers.

Formatted Target Files

Formatted files do have an address field. This means that the SB-Assembler can tell where the generated code/data is supposed to go in the memory of your target system. Whether that target memory is simply internal Flash memory, or an external EPROM is irrelevant.

Another thing all formatted target files have in common is one or more checksums per data line. This way the programming device can check whether the data is still valid in each data line.
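As an illustration of an address field and a checksum on one line, here is a Python sketch that builds a single Intel Hex data record. It follows the publicly documented Intel Hex layout (byte count, 16-bit address, record type, data, checksum); the function name is made up, and the sketch says nothing about how the SB-Assembler generates its records internally.

```python
def intel_hex_data_record(address, data):
    """Build one Intel Hex data record (record type 00) for a 16-bit address."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address does not fit in 16 bits")
    if len(data) > 255:
        raise ValueError("too many bytes for one record")
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, 0x00]) + data
    # The checksum is the two's complement of the low byte of the sum,
    # so that all bytes of the record add up to zero modulo 256.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()

payload = bytes.fromhex("214601360121470136007EFE09D21901")
record = intel_hex_data_record(0x0100, payload)
print(record)
```

A programming device can add up all the bytes of the record, checksum included, and reject the line if the low byte of the total is not zero.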

Maximum Code Size

Many different target file formats exist. In the past just about every memory manufacturer used to have their own proprietary file format.
Some of these target file formats are restricted to 16-bit address fields. In that case the target address cannot exceed the 64k byte limit. Please note that this doesn't mean that your code can always grow as large as 64k bytes, because if your program begins at address $C000 for instance, there's only 16k left before the address counter overflows to $0000 again. Version 3 of the SB-Assembler will give you an "Address in target file wrapped back" warning when that happens.
Other file formats have a 24-bit or even a 32-bit address field, which allows these files to grow up to 16M bytes or even 4G bytes.
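The arithmetic behind the wrap-around is easy to check with a short Python sketch. Both function names are invented for this example, and the sketch only models the address counter, not the warning mechanism itself.

```python
def room_before_wrap(start, address_bits=16):
    """Number of bytes that fit between start and the end of the address range."""
    return (1 << address_bits) - start

def next_address(address, address_bits=16):
    """Advance the address by one byte; report whether it wrapped back to 0."""
    limit = 1 << address_bits
    following = address + 1
    return following % limit, following >= limit

room = room_before_wrap(0xC000)           # program starting at $C000
wrapped_case = next_address(0xFFFF)       # last address of a 16-bit field
normal_case = next_address(0xC000)
```

Starting at $C000 leaves $4000 (16k) bytes, and the byte after $FFFF ends up at $0000 again. With a 24-bit field the same start address leaves a little under 16M bytes.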

Nowadays you would best choose the Intel or Motorola S37 file formats if you intend to write programs which grow larger than 64k bytes. The FPC file format can also store up to 4G bytes of code/data, but not many programming devices support this target file format.
And let me remind you that you can use as many target files as you like for your program. So basically there is no limit to how large your program can grow.

Where The Code Goes

The presence of an address field allows the SB-Assembler to change the target address to any value supported by the target file type. You may even change the address as often as you want. You can even set the address back to a value that has already been used. Be careful though, because programming devices usually don't like that. This would result in a program which can't be programmed into your target system, which makes the program useless.

Line Length

Each line in any of the formatted target files has a certain maximum line length. You can set this line length to any value between 8 and 32 code/data bytes. All the lines will have an address field somewhere at the beginning of the line. Where it is depends on the particular file type, but it will always be there. This address tells the programming device where the first code/data byte of the line should go. The next byte will go in the next location, etc., until all the bytes of the line have been stored.

It may happen that there are not enough code/data bytes left to fill the last line of the file. In this case the line is simply shorter and has fewer code/data bytes. All file types provide for a byte counter on each line, which tells the programming device how many bytes are to be expected.
It may also happen that a particular line is not full yet when you change the target address. In such a case the SB-Assembler will simply flush all the code/data bytes it hasn't stored yet and then end the current line. Then a new line is started with the new address of the next piece of code.
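The line handling described above can be modelled with a small Python class. This is a sketch under my own assumptions, not the SB-Assembler's implementation: LineWriter and its methods are invented names, and a line is represented as a plain (address, bytes) pair instead of a real file format.

```python
class LineWriter:
    """Collects code/data bytes into (address, bytes) lines."""

    def __init__(self, line_length=16):
        if not 8 <= line_length <= 32:
            raise ValueError("line length must be between 8 and 32 bytes")
        self.line_length = line_length
        self.lines = []                 # finished lines
        self._start = 0                 # address of the first pending byte
        self._pending = bytearray()     # bytes not yet written to a line

    def set_address(self, address):
        """Change the target address; an unfinished line is flushed first."""
        self.flush()
        self._start = address

    def emit(self, byte):
        self._pending.append(byte)
        if len(self._pending) == self.line_length:
            self.flush()

    def flush(self):
        """End the current line, even if it is shorter than the maximum."""
        if self._pending:
            self.lines.append((self._start, bytes(self._pending)))
            self._start += len(self._pending)
            self._pending.clear()

writer = LineWriter(line_length=8)
writer.set_address(0x1000)
for value in range(10):                 # one full line plus 2 bytes
    writer.emit(value)
writer.set_address(0x2000)              # forces the 2-byte line out
for value in (0xAA, 0xBB, 0xCC):
    writer.emit(value)
writer.flush()                          # end of file: last short line

for address, data in writer.lines:
    print(f"{address:04X} {len(data):02X} {data.hex().upper()}")
```

The result is three lines: a full one at $1000, a short one of 2 bytes at $1008, and a short one of 3 bytes at $2000. The byte count printed in the second column plays the role of the byte counter mentioned above.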

Different Target Pointers

The SB-Assembler uses quite a few different target pointers. Normally you don't have to worry about them, as the SB-Assembler will take care of them automatically. But at other times it is good to understand what is going on behind the scenes. This way you may recognize potential pitfalls. Knowing how the target pointers relate to each other may also be valuable information when you want to create your own Cross Overlay.
The information below applies to Version 3 of the SB-Assembler. Version 2 lacks some of the pointers. Because Version 2 is obsolete I won't bother describing the target pointers for Version 2 now.

Program Pointer

This is the main Program Pointer. You set the address of this pointer with the .OR directive. On processors with 8-bit instructions this address is incremented once for each byte which is saved to the target file.
Usually the Program Pointer is listed at the beginning of each line in the list file, unless an .EQ or .SE directive is found on that program line. A new label will also get this address assigned to it, again unless an .EQ or .SE directive is found on that program line.

On processors with 16-bit instructions (AVR or PIC for instance), this address is incremented once for each 2-byte word which is saved to the target file.

Patch Pointer

Normally the address of the Patch Pointer is equal to and remains in sync with the Program Pointer. So if you set a new Program Pointer using the .OR directive, the Patch Pointer will get the same address assigned to it.
Each time a byte (or a word) is saved to the target file the Patch Pointer is incremented, just like the Program Pointer would.

The fun starts when you have changed the Patch Pointer using the .PH directive. Now the Patch Pointer differs from the Program Pointer, but will still remain in sync. Each byte (or word) saved to the target file will increment both pointers.

The .EP directive will set the Patch Pointer to the same value as the Program Pointer again. This effectively ends the Patch mode.

Originally (in the Apple ][ version and in Version 2) this trick was used to pretend to assemble a piece of your code for a totally different address range, while storing the generated code/data bytes in a consecutive order.

One typical use of this Patch mode was to store your code from address $0000 and upwards into an EPROM, while the assembler thinks it is building code which starts at address $C000. You do that by defining an .OR of $0000, followed by a .PH of $C000. This causes the code to be stored from address $0000 in memory, while all generated labels get address $C000 and upwards assigned to them.
In Version 3 of the SB-Assembler the .TA directive turns this around in a more comprehensible way. Therefore new programs should use the .TA directive instead of the .PH method of changing the Target Address.
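The relationship between the Program Pointer and the Patch Pointer in the $C000 example can be modelled in Python as follows. The class and method names are invented; the sketch only tracks the two pointers for a byte-sized processor and does not generate any code.

```python
class PatchModel:
    """Tracks the Program and Patch Pointers for a byte-sized processor."""

    def __init__(self):
        self.program = 0
        self.patch = 0

    def org(self, address):             # stands in for .OR
        self.program = address
        self.patch = address

    def patch_start(self, address):     # stands in for .PH
        self.patch = address

    def patch_end(self):                # stands in for .EP
        self.patch = self.program

    def emit(self, count=1):
        # Every stored byte advances both pointers by the same amount.
        self.program += count
        self.patch += count

model = PatchModel()
model.org(0x0000)                       # bytes are stored from $0000
model.patch_start(0xC000)               # labels pretend to live at $C000
model.emit(3)
during_patch = (model.program, model.patch)
model.patch_end()
after_patch = (model.program, model.patch)
```

While Patch mode is active the two pointers differ by a constant $C000, and after the .EP step they are equal again.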

Target Pointer

Here's the third and final pointer for you. This pointer is also set by the .OR directive. It is usually set to the same value as the Program Pointer and will keep in sync with it after each code/data byte is saved.

You can change the address of this Target Pointer (Version 3 only) using the .TA directive. This way you can now easily tell the SB-Assembler that it should continue assembling as it is used to, but direct the output data to a totally different address location in the target file.
This way of saving code/data is easier to comprehend I think.

The Target Pointer also existed in the older versions of the SB-Assembler in one way or the other. The only difference is that in Version 3 you can specifically change it now.

As I said the .OR directive usually assigns the same value to the Program Pointer and the Target Pointer. This implies that there are other situations, doesn't it? In case of processors which store 2-byte instructions, like the AVR and PIC processors for instance, the Target Pointer is set to double the value of the Program Pointer.
This assures that the next code/data byte will be stored in the right location in the target file. Fortunately these modern processors don't usually require you to change either the Patch Pointer or the Target Pointer, because doing so can be quite confusing. Normally you would only set the .OR to $0000 on these processors, never to touch the .OR, .PH, or .TA directive again.
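Expressed as a formula, the Target Pointer set by .OR is the Program Pointer multiplied by the number of bytes per instruction word. A tiny Python sketch, with a made-up function name, shows the two cases:

```python
def target_for_origin(origin, bytes_per_instruction=1):
    """Target Pointer value that belongs to a given .OR address."""
    return origin * bytes_per_instruction

byte_sized = target_for_origin(0x0800)                            # 8-bit instructions
word_sized = target_for_origin(0x0800, bytes_per_instruction=2)   # AVR or PIC style
```

For a byte-sized processor both pointers are the same, while for a word-sized processor an .OR of $0800 corresponds to byte offset $1000 in the target file.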

Memory Segments

Version 3 of the SB-Assembler knows 3 different memory segments, while older versions of the SB-Assembler didn't know the concept of memory segments. Per default everything is targeted to the Code memory segment, which makes Version 3 backwards compatible with the older versions.
You can change the memory segment to any of the 3 available segments using the .SM directive. Each segment has its own pointers and, where applicable, its own target file.

Having more than one memory segment allows you to mix Code, RAM and EEPROM related blocks as you go. You don't have to concentrate these definitions in one block of your program any more. If, for instance, you are working on a piece of code which requires some bytes in EEPROM, you can simply switch to EEPROM memory, define some bytes there and then switch back to Code memory to continue your program, without worrying about where the EEPROM data went, or where you were in your code.
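The effect of switching segments can be sketched in Python. Everything here is hypothetical: the class names, the segment keys "code", "ram" and "eeprom", and the list that stands in for a target file. The only point is that each segment remembers its own position.

```python
class Segment:
    def __init__(self, name):
        self.name = name
        self.program = 0        # every segment keeps its own Program Pointer
        self.stored = []        # (address, byte) pairs, stands in for a target file

class SegmentModel:
    def __init__(self):
        self.segments = {n: Segment(n) for n in ("code", "ram", "eeprom")}
        self.current = self.segments["code"]    # Code is the default segment

    def select(self, name):                     # stands in for .SM
        self.current = self.segments[name]

    def emit(self, byte):
        seg = self.current
        seg.stored.append((seg.program, byte))
        seg.program += 1

asm = SegmentModel()
asm.emit(0xA9)
asm.emit(0x00)              # two bytes of code
asm.select("eeprom")
asm.emit(0x55)              # one configuration byte
asm.select("code")
asm.emit(0x60)              # code continues where it left off

code = asm.segments["code"]
eeprom = asm.segments["eeprom"]
```

After switching back, the third code byte lands at address 2, directly behind the first two, and the EEPROM byte sits at EEPROM address 0.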

Code Segment

The Code Segment is the default segment to work in. Despite its name, it can hold a mix of code and data. This means that it is perfectly legal to have instructions and data mixed together in this memory segment.

The Code segment has its own separate set of Program, Patch and Target Pointers. It also has its own target file or even multiple target files. Whenever you are in Code memory and change any of the pointers (using .OR, .PH, or .TA) or the target file (using .TF), it will only affect the Code memory segment. Pointers and target files for other memory segments are not affected.

The output of the Code segment is to be sent to at least one Target file, which can later be used to program the code memory of your target system.

RAM Segment

The RAM segment cannot hold any data, simply because RAM is volatile by nature. The only reason why this segment exists is to allow you to define RAM space for all sorts of variables you need in your program. Simply define labels here by skipping the required number of bytes for each label.

The RAM segment has its own Program Pointer. Other pointers are not implemented, which means that Patch mode (.PH) or Target Address mode (.TA) are not allowed while in the RAM memory segment.
It is also not possible to open a Target file (.TF) while in RAM memory. A "RAM memory can't have a target file" error will be reported if you do.

You can generate data while in RAM memory, but all generated data will get lost because you can't open a Target File while in RAM memory. The only effect it would have is that the defined number of bytes are added to the RAM Program Pointer, which allows you to define a new label on the next available memory location again.
It is not possible to generate program code while in RAM mode though. This means that you cannot use Mnemonics while in RAM memory.
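Defining RAM variables by skipping bytes boils down to handing out consecutive addresses. A minimal Python sketch of that bookkeeping, with invented names and an arbitrary start address of $0100, could look like this:

```python
class RamModel:
    """Hands out RAM addresses; nothing is ever written to a target file."""

    def __init__(self, origin):
        self.program = origin       # the RAM segment's own Program Pointer
        self.labels = {}

    def reserve(self, label, size):
        # The label gets the current address, then the pointer skips ahead.
        self.labels[label] = self.program
        self.program += size

ram = RamModel(0x0100)
ram.reserve("counter", 1)
ram.reserve("buffer", 16)
ram.reserve("flags", 1)
```

Here counter ends up at $0100, buffer at $0101 and flags at $0111, and the next free location is $0112.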

EEPROM Segment

EEPROM memory is intended to define locations and data to be stored into an EEPROM device, which usually holds configuration data. By nature EEPROM memory is not volatile, so you can save actual data to this segment. You should open a separate Target File, while you're in the EEPROM memory segment, which must be a different Target File than the one your Code memory goes to. Then you can switch back and forth between Code memory and EEPROM memory to store code and data bytes in Code memory, or data bytes in EEPROM memory.

The EEPROM memory segment has its own separate Program Pointer, which can be set by the .OR directive. The EEPROM memory segment does not have a Patch or Target Pointer, so the directives .PH and .TA won't work while in the EEPROM memory segment.

You can only store data into EEPROM memory. Code can't be stored there, which means that Mnemonics can't be used while you're in the EEPROM memory segment.

The output of the EEPROM segment should be sent to at least one target file, which can then be used to program your EEPROM device.

Multi Byte Instructions

This is where it may get a bit complicated when you're only used to processors which store everything in byte size increments. In such cases every byte stored in the Target File will increment the Program Pointer, the Patch Pointer and the Target Pointer. Things are different for processors with multi-byte instructions though, such as the AVR and PIC processors.
Mind you, RAM memory and EEPROM memory are always byte sized. So all of this will only apply to the Code memory segment.

Program Pointer vs. Target Pointer

Let's concentrate on Word (2 bytes) sized instructions only, even though the SB-Assembler can handle even more bytes per instruction. Currently none of the Cross Overlays use more than 2 bytes per instruction. Every Word is stored in its own memory location, which is 16 bits wide. As usual the Program Pointer (and the Patch Pointer) is incremented only once per Word that is stored in the target file.
This creates a slight challenge though, because all target file formats are based on bytes only. No problem, we'll simply save two bytes per instruction then. But wait a minute, this means that the Target Pointer will grow twice as fast as the Program (and Patch) Pointer. That's exactly what will happen here. If you save a Word, the Program Pointer (and Patch Pointer) will be incremented by one, while the Target Pointer is incremented by 2, because 2 bytes were saved.

The SB-Assembler Does Most Of The Work

Normally this process is rather transparent for the user. If you set the .OR of the program to $1000 for instance, the Program and Patch Pointers are both set to $1000 (as you might expect), but the Target Pointer is set to $2000 instead, which is twice as high as the other two pointers.
If you skip 200 bytes of memory, using the .BS directive, the Program and Patch Pointers will grow by half of that (100), while the Target Pointer grows by 200, like you would (or should) expect.
If you want to skip ahead to address $2000 by using the .NO directive, the Program and Patch Pointers are set to $2000, whereas the Target File is filled up to a point where the Target Pointer will be $4000.
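The three examples above can be replayed with a small Python model. It is a sketch only: the class and method names are invented, the byte count given to the skip is assumed to be even, and no actual fill bytes are produced, only counted.

```python
BYTES_PER_WORD = 2

class WordPointers:
    def __init__(self, origin):
        # Like .OR: Program and Patch take the origin, Target twice that.
        self.program = origin
        self.patch = origin
        self.target = origin * BYTES_PER_WORD

    def skip_bytes(self, count):
        """Stands in for .BS; assumes an even number of bytes."""
        words = count // BYTES_PER_WORD
        self.program += words
        self.patch += words
        self.target += count

    def fill_to(self, address):
        """Stands in for .NO; returns how many fill bytes the file receives."""
        new_target = address * BYTES_PER_WORD
        fill = new_target - self.target
        if fill < 0:
            raise ValueError("cannot fill backwards")
        self.patch += address - self.program
        self.program = address
        self.target = new_target
        return fill

p = WordPointers(0x1000)
after_org = (p.program, p.patch, p.target)
p.skip_bytes(200)
after_skip = (p.program, p.patch, p.target)
fill = p.fill_to(0x2000)
after_fill = (p.program, p.patch, p.target)
```

After the skip the Program Pointer is $1064 (100 words further) and the Target Pointer is $20C8 (200 bytes further). Filling up to $2000 then takes 7992 bytes and leaves the Target Pointer at $4000.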

But what should the SB-Assembler do when you've set the Program Pointer to $1000 and the Patch Pointer to $3000 while still using a one byte per instruction Cross Overlay and then switch over to a multi-byte per instruction Cross Overlay? Fortunately this is not very likely to happen, because it is a very extreme use case. And the SB-Assembler will warn you that it gets confused. You should simply reset all pointers to a known state using the .OR directive before you continue in such an extreme case. Better safe than sorry.

Boundary Sync

So far so good. But what happens if you generate an odd number of bytes, using one of the data generating directives? Mind you, an instruction should never be able to generate an odd number of bytes. Here's Boundary Sync to the rescue.

Boundary Sync is an automated process which is performed by many parts of the SB-Assembler. Boundary Sync ensures that the next Word is not stored on an odd Target location. Should the Target Pointer be an odd numbered address when the SB-Assembler wants to store the next Word in an even location, a padding byte with the value of 0 will be pushed to the Target file first, to make the Target Pointer point to an even address again.

Most of the time Boundary Sync is fully automated by the SB-Assembler. A label, for instance, can only be assigned to an even Target Address. Mind you, it can be assigned to any Program Pointer address, but the matching Target Pointer is always double the value of the Program Pointer, which makes it always even. So if you are assigning a label while the Target Pointer has an odd value, a padding byte will be saved to the Target File first.

All instructions of a multi-byte processor will start on an even Target address. If the Target Pointer is odd, a padding byte will automatically be added first.

Most directives will also perform a Boundary Sync. Some don't though. For instance data generating directives don't, which allows you to have multiple data generating directives following one after the other to create a large data block, without the danger of the inclusion of unwanted padding bytes.
Whether a directive does or does not do Boundary Sync can be found in the description of the directive in the directives chapter.

And what if the program ends in a data block with an odd number of data bytes? Well, the Cross Overlay cleaning function should add one last padding byte for you in this case.
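Boundary Sync can be modelled in Python as shown below. This is a sketch with invented names. It assumes 2-byte instruction words stored low byte first, which is an assumption of the example and not something this page states.

```python
class SyncModel:
    def __init__(self):
        self.target = 0
        self.data = bytearray()

    def _store(self, payload):
        self.data += payload
        self.target += len(payload)

    def boundary_sync(self):
        # Push a padding byte of 0 if the Target Pointer is odd.
        if self.target % 2:
            self._store(b"\x00")

    def data_bytes(self, payload):
        # Data generating directives do not sync, so blocks can be chained.
        self._store(payload)

    def instruction(self, word):
        # Instructions always start on an even target address.
        self.boundary_sync()
        self._store(word.to_bytes(2, "little"))   # byte order is an assumption

    def finish(self):
        # End of program: make sure the file holds whole words only.
        self.boundary_sync()

out = SyncModel()
out.data_bytes(b"\x01")
out.data_bytes(b"\x02\x03")     # 3 data bytes in total, no padding in between
out.instruction(0x1234)         # padding byte inserted before the word
out.data_bytes(b"\x09")
out.finish()                    # final padding byte
```

The two data blocks are stored back to back, one padding byte precedes the instruction, and a last padding byte makes the total length even.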

Potential Pitfalls

Suppose you set the .OR of a program to $1000, and only then set the Cross Overlay to AVR. Not that this is a useful thing to do, because an AVR processor wants its program to start at $0000, but who cares.
This will fail silently. No warnings are given, but something goes wrong here anyway. Because per default the instruction size is 1 byte per instruction, the .OR will set the Target Pointer to $1000, just like it is supposed to do. Loading the Cross Overlay will not change that. So your Target Pointer is not doubled, as it should have been.
Therefore it is best to define the required Cross Overlay as soon as possible in your program, which is a good idea anyway.
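The order dependency is easy to see in a Python model. The names are made up, and the model only contains the two steps that matter here: loading a Cross Overlay, which changes the instruction size but leaves the pointers alone, and setting the origin.

```python
class OrderModel:
    def __init__(self):
        self.bytes_per_instruction = 1      # default before any Cross Overlay
        self.program = 0
        self.target = 0

    def load_overlay(self, bytes_per_instruction):
        # Existing pointer values are left untouched.
        self.bytes_per_instruction = bytes_per_instruction

    def org(self, address):                 # stands in for .OR
        self.program = address
        self.target = address * self.bytes_per_instruction

wrong = OrderModel()
wrong.org(0x1000)               # origin first, still 1 byte per instruction
wrong.load_overlay(2)

right = OrderModel()
right.load_overlay(2)           # overlay first
right.org(0x1000)
```

In the wrong order the Target Pointer stays at $1000, in the right order it becomes $2000, while the Program Pointer is $1000 in both cases.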

Be careful with the .PH and .TA directives when using a multi-byte Cross Overlay. Careless use of these directives may cause your program to fail, even though there are no errors in your logic. For instance setting the Target Pointer to an odd address will most likely result in a useless piece of code.
Fortunately the modern multi-byte per instruction processors have less use for these two directives anyway because their ROM memory is on chip and always starts at address $0000.