PIC Mid-range Cores Introduction

        .cr     pic14   To load this cross overlay

This page describes the cross overlay for the PIC Mid-range family of cores. These cores store all their instructions in 14-bit words in program memory. I have used the Code memory architecture to name the cross overlays, so the pic14 cross overlay is for the Mid-range family, which uses 14-bit instruction words.

According to Microchip all these controllers are easy to learn because there are only some 30 to 40 instructions to learn. Unfortunately not everything is as easy as it may seem. I must admit that the base models are quite easy to comprehend. But things get complicated very quickly when larger controllers start working with memory paging and register bank switching.

SB-Assembler Version Differences

This page describes the PIC Mid-range cross overlay for SB-Assembler Version 3 only. Because it differs so much from SB-Assembler Version 2 it deserves a page of its own. Please note that Version 2 of the SB-Assembler is no longer maintained and therefore lacks certain instructions which were added to later models by Microchip.

Programming Model

I only include a little summary about the features of the PIC Mid-range family. All these family members have a 14-bit instruction size in common. It is not my intention to make the original documentation obsolete, so please refer to the original documentation for further details.

PIC14 programming model

The register file can be addressed by a 7-bit pointer, giving a total addressing range of 128 bytes. The first 32 bytes of this range contain up to 4 banks of special function registers. Those registers are used to set options, control peripherals, and address the I/O ports.
The 96 remaining addresses of the register file are general purpose registers. The main part of this area can also be bank switched with up to 4 banks, depending on the device. Only the registers $070 to $07F are never bank switched and are accessible from within any bank. This means that addresses $070 to $07F map to the same registers as addresses $0F0 to $0FF, $170 to $17F and $1F0 to $1FF. Some devices may implement unused Special Function Registers as normal general purpose registers.
All registers can be accessed directly or indirectly. Addresses from bank 0 range from $000 to $07F, bank 1 ranges from $080 to $0FF, bank 2 ranges from $100 to $17F, and finally bank 3 ranges from $180 to $1FF. The bank switching itself is done by setting 2 bits in the Status register appropriately.
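
As a minimal sketch of bank switching, the fragment below selects bank 1, clears a register there and returns to bank 0. The names STATUS, RP0 and TRISB are ordinary labels declared with .EQ, and the address $086 for TRISB is only an assumption that holds for many devices, so please check the data sheet of your device.

STATUS      .EQ    $003          Status register
RP0         .EQ    5             Bank select bit 0 in the Status register
TRISB       .EQ    $086          Assumed bank 1 register, check your data sheet

            BSF    STATUS,RP0    Select bank 1 (RP1 is assumed to be 0)
            CLRF   TRISB         Only the 7 lowest bits ($06) end up in the opcode
            BCF    STATUS,RP0    Back to bank 0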

I will only describe the most important control registers that are present on almost all devices here.

W

The Working register W is not part of the register file and is used as an operand in most instructions. On other micro processors this register would have been called the Accumulator.

f$00 INDF (Indirect Data Addressing)

This is not a physical register. It is used to indirectly address any of the file registers, even the ones which can't be addressed directly. Whenever this register is the source or destination of an operation the contents of the FSR register, combined with the IRP bit, are used to point to the actual file register to be used for the operation.

f$02 PCL

This is the low byte of the program counter. Normally the entire program counter is incremented after fetching each instruction. The program counter can access up to 8 k words of memory. After reset the program counter is initialized to all 0's, effectively starting the program at the lowest available address. Only the low byte of the program counter can be read and written under program control. All other bits of the program counter can only be written to by means of the PCLATH f$0A register.
Writing to the PCL register in any way effectively causes a computed jump. Bits b12..b8 of the program counter will get the value of the PCLATH register, effectively loading the program counter with a 13-bit new value. Devices with less memory will mirror their normal ROM memory into the unoccupied areas (unimplemented bits of the program counter are ignored).
GOTO and CALL instructions can only change the bits b10..b0 directly. Bits b12 and b11 will get their value from the corresponding bits (b4 and b3) of the PCLATH register, when implemented in the device.
It is the programmer's responsibility to set PCLATH properly before any GOTO or CALL instruction.

PCL is never bank switched and can always be reached at addresses $002, $082, $102 and $182.

The Mid-range series of cores has a 13-bit wide, eight level deep hardware stack for return addresses. This means that subroutines may be nested 8 levels deep (including one level for a pending interrupt) before the stack overflows, which loses the oldest return address. The stack cannot be manipulated in any other way than by the CALL, RETLW, RETURN and RETFIE instructions or by an interrupt.

f$03 Status word register

This register contains 8 system flags.

        Bit 7   IRP   Indirect Register bank select bit
        Bit 6   RP1   Select Register Bank bit 1
        Bit 5   RP0   Select Register Bank bit 0
        Bit 4   TO    Time Out Flag
        Bit 3   PD    Power down Flag
        Bit 2   Z     Zero Flag
        Bit 1   DC    BCD Digit Carry Flag
        Bit 0   C     Carry Flag

The C, DC and Z flags reflect the status of previous instructions. The PD and TO bits can be used to find the reason why the device was restarted and are read-only bits. The bits RP1 and RP0 are used to select one of the 4 possible register file banks during direct addressing. The IRP bit is the ninth bit (b8) during indirect addressing of the register file, while the FSR register holds the other 8 addressing bits.

STATUS is never bank switched and can be reached at addresses $003, $083, $103 and $183.

f$04 FSR

The File Select Register is used as a data pointer for indirect addressing of any of the registers in the file. The FSR register holds 8 bits of the 9 bits which are needed to address up to 4 banks of 128 registers each. The ninth bit (b8) is the IRP bit in the Status register.
Reading or writing to register f$00 will effectively read or write to the address pointed to by the contents of the FSR register and the IRP bit in the Status register.

FSR is never bank switched and can be reached at addresses $004, $084, $104 and $184.
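
The fragment below is a small sketch of indirect addressing: it clears the registers $20 to $2F by stepping FSR through them and writing to INDF. All names are labels of my own choosing declared with .EQ, and the range $20 to $2F is only assumed to be general purpose RAM on your device.

STATUS      .EQ    $03           Status register
IRP         .EQ    7             Ninth address bit for indirect addressing
INDF        .EQ    $00           Indirect data register
FSR         .EQ    $04           File Select Register

            BCF    STATUS,IRP    Point into banks 0 and 1
            MOVLW  $20           First register to clear
            MOVWF  FSR
CLEAR       CLRF   INDF          Clear the register FSR points to
            INCF   FSR,F         Advance the pointer
            BTFSS  FSR,4         Bit 4 becomes 1 when FSR reaches $30
            GOTO   CLEAR         Not done yet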

f$0A PCLATH

This register is used as a buffer for the 5 most significant address bits.
The bits b4 and b3 of this register are copied to the program counter whenever a GOTO or CALL instruction is executed.
The bits b4..b0 of this register are copied to the 5 most significant bits of the program counter whenever the PCL register is directly written to.

It's the programmer's responsibility to set PCLATH before any displacement operation.

PCLATH is never bank switched and can be reached at addresses $00A, $08A, $10A and $18A.

Please note that none of the above mentioned register names are pre-defined by the SB-Assembler. You can use normal labels in combination with the .EQ directive to declare names for all the file registers, including the special function registers.
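
A possible set of declarations for the registers and flags described above could look like the sketch below. The label names are just a suggestion that follows Microchip's naming; you are free to pick any other names, as long as you avoid the reserved words W and F.

INDF        .EQ    $00           Indirect data addressing
PCL         .EQ    $02           Program counter low byte
STATUS      .EQ    $03           Status word register
FSR         .EQ    $04           File Select Register
PCLATH      .EQ    $0A           Buffer for the high program counter bits

C           .EQ    0             Carry flag
DC          .EQ    1             Digit carry flag
Z           .EQ    2             Zero flag
RP0         .EQ    5             Register bank select bit 0
RP1         .EQ    6             Register bank select bit 1
IRP         .EQ    7             Indirect register bank select bit

            BSF    STATUS,RP0    The labels can now be used as operands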

Reserved Words

The SB-Assembler PIC14 Mid-range cross overlay has only 2 reserved words, W and F. Avoid assigning labels with these two names, and you're safe.

Target Files

Storing 14-bit instruction words in an 8-bit oriented target file requires special treatment. You may choose any target file format you like. Most PIC programming devices require you to supply an Intel HEX file though, so I advise you to use only the Intel format for your target files when writing programs for the PICmicro families.

Microchip recommends 2 methods of storing your target files. The first method is to split the 14-bit instruction words and store the low bytes in one target file and the remaining bits in a second target file. The SB-Assembler doesn't support this method, which is not such a big problem because the second method is accepted by all PIC programming devices I have seen so far.
The second method that is recommended by Microchip is to store all instruction words in one target file, low byte first, and the remaining bits in the next byte. This method is supported by the SB-Assembler.

Please note that the PIC14 cross overlay of the SB-Assembler will always store pairs of bytes so that all source addresses are doubled in the target file. So if you write an instruction at word address $0123 it will end up at address $0246 and $0247 in the target file.
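
As a worked example of this doubling, take a single instruction at word address $0123. The sketch assumes that MOVLW $55 assembles to the opcode $3055, which is the usual encoding with the don't care bits left 0.

            .OR    $0123         Word address $0123
            MOVLW  $55           Opcode $3055: $55 at target $0246, $30 at $0247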

In case a device has built in EEPROM memory, this memory is stored in the target file directly following the program memory. So if a device has 1024 words of program memory, its EEPROM memory will start at byte address 2048 (double the program word size). EEPROM data bytes are stored as words, leaving the high byte 0.

Every PIC processor allows you to write 4 ID words. These ID words are usually stored at word addresses $2000 to $2003 (byte addresses $4000 to $4007). These ID words can be used to identify your software versions, or may contain any other information you like. The ID words cannot be read under program control, they are only accessible by programming devices. Each ID word occupies 2 bytes, but can contain only 7 bits.
Please refer to the new .ID directive for more information on how to include the ID words in your program.

Every device also has a Config word, allowing you to select the type of oscillator and some other configuration options. This Config word can be included in the target file too and is usually written at word address $2007 (which is byte address $400E and $400F). Please refer to the new .CW directive for more information.

Special Features

I've added a few new directives specifically tailored to suit the PIC cross overlay. All new directives are explained later on this page.

Altered behaviour of Directives

Every data byte written to program memory has to be translated to the RETLW instruction. Therefore all data generating instructions will write every generated byte as a RETLW instruction, as long as we're still writing to program memory. Data bytes stored beyond the program memory are written as words, with the high byte being $00, to the target file. Usually these data bytes will end up in the device's EEPROM memory.
The .MS (Memory Size) directive is used to set the size of the program memory. This is important for the behaviour of the assembler when it comes to writing data bytes.
The above only applies to Code memory. It does not apply to the RAM and EEPROM memory segments. Please note that the use of EEPROM segments will write the data as bytes to a separate file, not as words. This can still be useful if you are using an external EEPROM device.
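
The sketch below illustrates the difference for a device with 1024 words of program memory. It assumes that the standard .DA directive with a # prefix generates one data byte per expression, as it does in the other cross overlays.

            .MS    $0400         Device with 1024 words of program memory
            .OR    $0100         Inside program memory
            .DA    #$12,#$34     Stored as RETLW $12 and RETLW $34
            .OR    $0400         First word address beyond program memory
            .DA    #$56          Stored as the word $0056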

File Register addressing

File registers are numbered from $000 to $1FF, while the actual addressing range of an instruction runs from $00 to $7F. The SB-Assembler doesn't check the address when addressing any file register, although the actual instruction keeps only the 7 lowest bits. This allows you to access the file registers by their absolute addresses, but you must manually take care of the proper addressing of the upper two bits of the file address.

Immediate prefix is optional

Immediate data is called literal data in Microchip's documentation. All instructions that can handle immediate data end with the letters LW, which indicates that a literal value is to be used. The immediate prefixes #, /, = or \ may be used to identify an immediate value. The assembler will assume the default # prefix if no prefix symbol precedes the operand value, effectively using the LSB of the 32-bit value as operand.
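
A short sketch of the effect of the prefixes is shown below. It relies on the general SB-Assembler convention that # selects the lowest byte of the value and / selects the second byte.

            MOVLW  $1234         No prefix, treated as #: W becomes $34
            MOVLW  #$1234        Lowest byte: W becomes $34
            MOVLW  /$1234        Second byte: W becomes $12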

RETLW

Data bytes are stored as RETLW instruction words in the program memory. If you want to read some data from program memory you should CALL to the desired location. The RETLW instruction at that location will immediately return to the calling routine with a literal value loaded in W.
You may add multiple expressions behind a single RETLW instruction, effectively generating multiple RETLW instructions, each with its own return value.

You may also omit the data byte completely after a RETLW instruction, effectively making it a RETURN instruction. Keep in mind that the RETLW instruction without return value will return with the W register cleared to 0. Fortunately the Mid-range cores have a real RETURN instruction, which doesn't affect the W register.
If you do want to omit the data value, make sure any comments will follow the RETLW instruction after at least 10 spaces, otherwise the comment will be interpreted as return value.

Examples

        RETLW  $12,$34,$56
        RETLW          Comment after at least 10 spaces

CALL and GOTO address range

Some restrictions apply to the destination addresses of CALL and GOTO instructions due to the nature of the memory organization of the PIC Mid-range family of processors. Memory consists of up to 4 banks of 2048 instruction words each.

CALL and GOTO instructions may cause a call or jump to any location within the 8k word program memory, even though only 11 bits of the destination address can be supplied by the instruction itself. The upper two address bits are copied from bits 4 and 3 of the PCLATH register every time a CALL or GOTO instruction is executed. It's the programmer's responsibility to set these bits appropriately before executing the CALL or GOTO instruction.

The assembler does not check the address of the CALL or GOTO even though only the lowest 11 bits are actually used. This allows you to treat the 4 program memory banks as a linear address space.
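
A minimal sketch of a call into the second bank, with PCLATH set by hand, could look like this. The device size of 4096 words and the labels PCLATH, DONE and FARSUB are assumptions made for the example only.

PCLATH      .EQ    $0A           Buffer for the high program counter bits

            .MS    $1000         Assumed device with 4096 words
            .OR    $0000
            BSF    PCLATH,3      Destination lies in $0800 to $0FFF
            CALL   FARSUB        Only the lowest 11 bits end up in the opcode
            BCF    PCLATH,3      Needed before the next local CALL or GOTO
DONE        GOTO   DONE          Local jump within the first bank

            .OR    $0800
FARSUB      RETURN

The return itself needs no help from PCLATH, because the stack holds the complete 13-bit return address.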

Page crossing

The PIC processor's address counter will wrap around to 0 when it reaches past word address 2047. The SB-Assembler will not warn you if this happens.
Likewise, the SB-Assembler will not warn you if a data table spans across two 256 words sized pages of program memory. On the Mid-range processors this is not necessarily a problem, but you should make special arrangements to access such a split data table in program memory.

You can use the .OT and .CT directives in both cases to warn you about page crossings if you like.

Destination flag

Some instructions allow you to store the result in the W register or in the register file. You tell that to the assembler by adding ,W or ,F behind the operand of the instruction.
If you don't specify the destination flag the default destination will be F.

You can also use an expression as destination flag. The W register will be the destination if the expression evaluates to 0, and the register file will be the destination if the expression evaluates to 1. Any other value will generate an Out of range error.
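
The possible notations are summarized in the sketch below. COUNT is a hypothetical label for a general purpose register at $20.

COUNT       .EQ    $20           Assumed general purpose register

            INCF   COUNT,W       Result in W, COUNT itself is unchanged
            INCF   COUNT,F       Result in COUNT
            INCF   COUNT         No flag, so the destination is F
            INCF   COUNT,0       Expression 0 selects W
            INCF   COUNT,1       Expression 1 selects F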

Pseudo instructions

Microchip recommends a set of pseudo instructions. I have implemented them all. Please refer to the opcode test file to see them all.

Two of the pseudo instructions require a bit more explanation though, LCALL and LGOTO. These two instructions help you to reach the entire program memory more easily. You can view the entire 8k word of program memory as a linear address space, even though the PIC divides that space into 4 banks. If you want to jump or call to the first bank, both bits 4 and 3 of the PCLATH register have to be cleared. If you want to jump or call to the second bank, only bit 3 of PCLATH has to be set. Jumps or calls to the third bank require you to set only bit 4, while both bits 4 and 3 have to be set if you want to jump or call to the fourth bank.
That is exactly what the LCALL and LGOTO instructions will do for you. But they do it in a clever way. On devices with up to 2048 words of program memory, there is no need to set or clear any bits, so the LCALL and LGOTO instructions won't. On devices with up to 4096 words of program memory, only bit 3 of PCLATH needs to be set or cleared, and again that is exactly what the LCALL and LGOTO instructions will do. Finally on devices with more than 4096 words of program memory, both flags need to be set or cleared.
It goes without saying that the SB-Assembler needs to know the program memory size of your device in order to make the right decisions. You do that with the .MS directive.
One last note on how the bits of PCLATH are set and cleared. PCLATH holds other bits, which are used as high byte when PCL is written to. This register can hold any value when the instruction LCALL or LGOTO is found. Therefore the entire PCLATH register will be cleared, before bit 4 or 3 are set if required.
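
Going by the description above, an LCALL to the fourth bank on an 8k word device behaves roughly like the hand-written sequence that follows it in the sketch below. This is an illustration only; the exact instructions generated may differ, and the opcode test file remains the reference. PCLATH, DONE and FARSUB are labels made up for the example.

PCLATH      .EQ    $0A           Buffer for the high program counter bits

            .MS    $2000         Device with 8k words of program memory
            .OR    $0000
            LCALL  FARSUB        Let the assembler take care of PCLATH

            CLRF   PCLATH        Roughly the same by hand: clear all bits
            BSF    PCLATH,3      Then set bit 3
            BSF    PCLATH,4      And bit 4 for the fourth bank
            CALL   FARSUB

            CLRF   PCLATH        Back to the first bank for local jumps
DONE        GOTO   DONE

            .OR    $1800
FARSUB      RETURN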

Please keep in mind that some of the Pseudo instructions are composed of multiple real instructions. This means that you cannot use such Pseudo instructions immediately following one of the Skip instructions! Such a Skip instruction will only skip the first instruction of a multi-instruction Pseudo instruction. This is hardly ever what you really want.

Extra Directives

The PIC Mid-range cross overlay requires some extra directives. These include directives to set some options and to check if a program memory page boundary has been crossed.

.CT     Close Table

Syntax:

        .CT

Function:

This directive signals the end of your table in memory. It will present a Table crossed page boundary error if this directive is located on a different page than the one stored by the previous .OT directive.

Explanation:

Tables of data will usually not allow the high byte of the program counter to change in between. That's why the .OT and .CT directives are used to signal the beginning and the end of a table, verifying that a page crossing hasn't occurred in between.

The .OT directive should be placed at the beginning of the table. It will memorize the memory page of the beginning of the table. At the end of the table the closing .CT directive should still be on the same memory page. There's only one exception to this rule, and that is when the .CT is at the first location of the next page, because then the table ended just in time on the previous page.
No error message is reported when both pages are equal. A Table crossed page boundary error is reported when they do differ during pass 2 of the assembly process. The error message is only reported during pass 2 to enable you to find out at what location the table crossed the page boundary in order to fix the problem by either moving the table or moving away some code in front of the table.
No error or warning is shown when there was no matching .OT directive found.

Example

The example below shows a typical lookup table routine. It converts a decimal digit to a seven segment pattern. I think it clearly demonstrates the use of the .OT and .CT directives.

            .OT                   Mark the beginning
SEGMENTS    RETLW #%0011.1111     0  Return with pattern for '0'
            RETLW #%0000.0110     1  Return with pattern for '1'
            RETLW #%0101.1011     2  Return with pattern for '2'
            RETLW #%0100.1111     3  Return with pattern for '3'
            RETLW #%0110.0110     4  Return with pattern for '4'
            RETLW #%0110.1101     5  Return with pattern for '5'
            RETLW #%0111.1101     6  Return with pattern for '6'
            RETLW #%0000.0111     7  Return with pattern for '7'
            RETLW #%0111.1111     8  Return with pattern for '8'
            RETLW #%0110.1111     9  Return with pattern for '9'
            .CT                   End of table, checking page
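
One possible way to read this table is a computed jump through PCL, as sketched below. The routine expects the digit (0 to 9) in W and returns with the pattern in W. The labels PCL, PCLATH, INDEX and GETSEG are my own, and INDEX at $70 assumes your device has the shared registers $70 to $7F.

PCL         .EQ    $02           Program counter low byte
PCLATH      .EQ    $0A           Buffer for the high program counter bits
INDEX       .EQ    $70           Assumed shared general purpose register

GETSEG      MOVWF  INDEX         Save the digit passed in W
            MOVLW  /SEGMENTS     High byte of the table address
            MOVWF  PCLATH
            MOVF   INDEX,W       Get the digit back
            ADDLW  #SEGMENTS     Add the low byte of the table address
            MOVWF  PCL           Computed jump to the matching RETLW

The addition may not produce a carry, which is exactly the condition that .OT and .CT verify for you.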

.CW     Config Word

Syntax:

        .CW  #expression

Function:

This directive is used to set the Config Word flags of the device which is used to select the oscillator type and other settings.

Explanation:

The Config Word is used to select the oscillator type, code protection, watchdog settings and various other options, depending on the device type you're using. So please consult the data sheet of your device to find out what bits to set or clear for your required options.
The Config word is usually located at word address $2007 (byte address $400E and $400F). You may put the .CW directive anywhere outside the program memory, but Microchip recommends you to put it at address $2007 so it can be understood by all chip programmers.

The .CW directive is # symbol tolerant. This means that you may precede your bit pattern by a # symbol if you like, but you don't have to.
The bit pattern you enter may not exceed 14 bits. Entering larger numbers, or negative numbers, will result in an Out of range error.

The .CW directive is only allowed outside the program memory space, which is set-up by the .MS directive. If you do try to use the .CW directive while in program memory space you'll get a Directive only allowed beyond program memory error.

Examples

          .OR    $2007               Put CW in recommended address
          .CW    %0010.1010.100.0100 Just an arbitrary pattern of bits

.ID     ID words

Syntax:

        .ID  expression

Function:

This directive allows you to set-up the four 7-bit ID words of the device.

Explanation:

Every device contains four 7-bit ID words, which can be loaded with any value you like. They serve the purpose of identifying programmed parts. The ID words can only be read during program/verify mode and are not accessible during normal operation. Microchip recommends placing the 4 ID words at word addresses $2000 to $2003 (byte addresses $4000 to $4007). The SB-Assembler will allow you to place the ID words anywhere outside the program memory.

You'll get a Directive only allowed beyond program memory error message if you try to use the .ID directive somewhere in program memory.
You should use the .ID directive only once in your program to set-up the ID words. No error is reported if you do have more occurrences of the .ID directive in your program though.

The directive will accept from 1 up to 4 expressions, which all should evaluate to a 7 bit value. If you don't supply all 4 ID words the remaining ID words will be written with 0. The range of the expression is not checked, the assembler will simply use the 7 least significant bits of each word.

Example

        .OR    $2000               Recommended location
        .ID    $12,$34,$56,$67
        .ID    #$12,#$34,#$56,#$67 Directive is # tolerant
        .ID    $12,$34             Not all words are required

.MS     Memory Size

Syntax:

        .MS  expression

Function:

This directive is used to set the expected program memory size of your particular device.

Explanation:

The .MS directive is preferably used before any code is generated, although the SB-Assembler is perfectly happy if you use the directive just in time, before you actually run out of the default memory size of $0400 words.

There are several reasons why you are required to specify the program memory size (in words, not in generated bytes). The most important reason is to be able to determine where data bytes must be translated to RETLW instructions. Another reason is to verify if the .ID and .CW directives were not used in program memory space. It also allows the SB-Assembler to determine the proper translation for LCALL and LGOTO pseudo instructions. Finally it gives the SB-Assembler the opportunity to warn you if you try to use more memory than is actually available.

The .MS directive will accept any expression resulting in a value up to $2000 (i.e. 8k words). An Out of range error will be reported if the expression evaluates to a value above $2000 or below 256. The expression may not contain forward referenced labels.
Any other size is accepted, even silly ones.

The default memory size is $0400 words.
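
Example

A typical start of a source file could look like the sketch below. The size of 2048 words is only an assumption, so use the size given in the data sheet of your device.

            .CR    pic14         Load the Mid-range cross overlay
            .MS    $0800         Assumed device with 2048 words of program memory
            .OR    $0000         Program starts at the reset address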

.OT     Open Table

Syntax:

        .OT

Function:

This directive signals the beginning of your table in memory.

Explanation:

.OT saves the current memory page address internally. Later on it should match the memory page address of the next .CT directive.

The .OT directive will automatically call the .CT function if you previously started a table without closing it properly with a .CT directive. After closing the previous table this way the new table will start at the current location as if you had inserted a .CT directive yourself. This way you can concatenate several tables after each other.
Remember though that it becomes more difficult to find a piece of memory large enough to hold all your concatenated tables if you rely on this automatic closing of previously opened tables.

Please note that the .OT directive would never generate the Table crossed page boundary error by itself if it wasn't for the automatic calling of the .CT routine when a previous table wasn't closed.
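
Example

The sketch below concatenates two small tables and relies on the automatic closing described above. The table names and values are made up for the example.

            .OT                   Open the first table
ONES        RETLW  1,2,3          Three entries
            .OT                   Checks ONES like .CT would, then opens TENS
TENS        RETLW  10,20,30       Three more entries
            .CT                   Close the last table, checking page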

Extra Error Messages

A few extra error messages are added to the standard repertoire of error messages.

Table crossed page boundary

This error is generated by the .CT directive and sometimes by the .OT directive.
The error is only generated if the .CT directive is on a different memory page than the previous .OT directive. If the error occurs it will be reported during pass 2 of the assembly process only. This is to simplify the search for a suitable new place for your table. You would have no way of telling at what address the page was crossed if the error was reported during pass 1 instead.

Directive only allowed within Code memory

This error is triggered because you tried to generate instruction words beyond the end of the program memory space. You're only allowed to save data, ID words and a Config word beyond the end of the program memory space.
The program space boundary can be set by the .MS directive.

Directive only allowed beyond program memory

This error is reported because you tried to use the .ID or .CW directive while still being somewhere in program memory space.
Remember to set the program memory size (.MS) correctly!

Overlay Initialization

Five things are set while initializing the PIC14 overlay every time it is loaded by the .CR directive.

  • Little endian mode is selected for the data generating directives. This means that words or long words are stored with their low byte first.
  • Default memory size is set to $400.
  • The target factor is set to 2, meaning that the target address will always be twice as high as the program counter.
  • An internal pointer is changed so that generated data bytes will be stored as RETLW instructions.
  • Some extra error messages are added to the list of standard error messages.

Differences Between Other Assemblers

There are some differences between the SB-Assembler and other assemblers for the PIC14 family of processors. These differences require you to adapt existing source files before they can be assembled by the SB-Assembler. This is not too difficult though, and is the (small) price you have to pay for having a very universal cross assembler.

  • The handling of ID words and the Config word may differ from other assemblers.
  • Not all assemblers translate data bytes into RETLW instructions automatically.
  • Not all assemblers will allow multiple literal values, or none at all, following one RETLW mnemonic.
  • The obvious differences in notation of directives and radixes common to all SB-Assembler crosses.
  • None of the SFR registers and bit names are pre-defined. They can be declared using normal labels though.