GBDK 2020 Docs  4.3.0
API Documentation for GBDK 2020
Coding Guidelines

Learning C / C fundamentals

Writing games and other programs with GBDK will be much easier with a basic understanding of the C language. In particular, understanding how to use C on "Embedded Platforms" (small computing systems, such as the Game Boy) can help you write better code (smaller, faster, less error prone) and avoid common pitfalls.

General C tutorials

Embedded C introductions

Game Boy games in C

Understanding the hardware

In addition to understanding the C language, it's important to learn how the Game Boy hardware works: what it is capable of doing, what it isn't able to do, and what resources are available to work with. A good way to do this is by reading the Pandocs and checking out the awesome_gb list.

Writing optimal C code for the Game Boy and SDCC

The following guidelines can result in better code for the Game Boy, even though some of the guidance may be contrary to typical advice for general purpose computers that have more resources and speed.

Tools

GBTD / GBMB, Arrays and the "const" keyword

Important: The old GBTD/GBMB fails to include the const keyword when exporting to C source files for GBDK. That causes arrays to be created in RAM instead of ROM, which wastes RAM, uses a lot of ROM to initialize the RAM arrays and slows the compiler down a lot.

Use of toxa's updated GBTD/GBMB is highly recommended.

If you wish to use the original tools, you must add the const keyword every time the graphics are re-exported to C source files.
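
As a minimal sketch of the difference, the two declarations below hold the same tile bytes. The array names and the tile data are made up for illustration and are not actual GBTD output.

```c
#include <stdint.h>

/* Without const: on SDCC/GBDK this array is placed in RAM, and a copy of
   the data is also kept in ROM to initialize it at startup. */
uint8_t tile_data_ram[16] = {
    0x7E, 0x7E, 0x81, 0x81, 0xA5, 0xA5, 0x81, 0x81,
    0xBD, 0xBD, 0x99, 0x99, 0x81, 0x81, 0x7E, 0x7E
};

/* With const: the array stays in ROM and uses no RAM. */
const uint8_t tile_data_rom[16] = {
    0x7E, 0x7E, 0x81, 0x81, 0xA5, 0xA5, 0x81, 0x81,
    0xBD, 0xBD, 0x99, 0x99, 0x81, 0x81, 0x7E, 0x7E
};
```

Only the const keyword differs, so fixing a file exported by the original tools is a one-word change per array.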

Avoid Reading from VRAM

In general, avoid reading from VRAM, since that memory is not accessible at all times. If a GBDK API function which reads from VRAM (such as get_bkg_tile_xy()) is called during a video mode when VRAM is not accessible, the call will block until VRAM becomes accessible again. This can cause unnecessary slowdowns when running programs on the Game Boy. Reading from VRAM is also not supported by GBDK on the NES platform.

Instead it is better to store things such as map data in general purpose RAM which does not have video mode access limitations.
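
One possible way to do this is to keep a shadow copy of the map in RAM and read from that copy. The sketch below is an illustration only: MAP_WIDTH, MAP_HEIGHT, map_shadow, map_set_tile() and map_get_tile() are hypothetical names, and the code assumes x and y are always inside the map.

```c
#include <stdint.h>

#define MAP_WIDTH  32u  /* hypothetical map size in tiles */
#define MAP_HEIGHT 32u

/* Shadow copy of the background map, kept in general purpose RAM. */
static uint8_t map_shadow[MAP_WIDTH * MAP_HEIGHT];

/* Update the shadow copy. On GBDK the same tile would also be written to
   VRAM at this point, for example with set_bkg_tile_xy(x, y, tile). */
void map_set_tile(uint8_t x, uint8_t y, uint8_t tile) {
    map_shadow[(uint16_t)y * MAP_WIDTH + x] = tile;
}

/* Read a tile back from RAM instead of calling get_bkg_tile_xy(). */
uint8_t map_get_tile(uint8_t x, uint8_t y) {
    return map_shadow[(uint16_t)y * MAP_WIDTH + x];
}
```

The cost is extra RAM for the copy, so this trade-off fits best when the map is read often, for example for collision checks.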

For more information about video modes and VRAM access see the Pandocs:

https://gbdev.io/pandocs/STAT.html#stat-modes

Variables

  • Use 8-bit values as much as possible. They will be much more efficient and compact than 16 and 32 bit types.
  • Prefer unsigned variables to signed ones: the code generated will be generally more efficient, especially when comparing two values.
  • Use explicit types so you always know the size of your variables. int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t and bool. These are standard types defined in stdint.h (#include <stdint.h>) and stdbool.h (#include <stdbool.h>).
  • Global and local static variables are generally more efficient than local non-static variables (which go on the stack and are slower to access).
    • An exception to this is when there are a small number of local variables (one or two) and the code is not complex. Then the compiler may allocate those variables to CPU registers instead, which may be faster.
    • Functions which use global or static local variables will lose re-entrancy. In most cases this is not a problem, but it is important to keep in mind.
    • In particular avoid putting big arrays on the stack, consider static local or global.
  • Keep the number of arguments passed to functions small (ideally one or two arguments at most). When there are a large number of arguments they get pushed onto the stack and result in more overhead for function calls. See the Calling Conventions in the SDCC compiler manual for details.
  • const keyword: use const for arrays, structs and variables with read-only (constant) data. It will reduce ROM, RAM and CPU usage significantly. Non-const values are loaded from ROM into RAM inefficiently, and there is no benefit in loading them into the limited available RAM if they aren't going to be changed.
  • Here is how to declare const pointers and variables:
  • For calculated values that don't change, pre-compute results once and store the result. Using lookup-tables and similar approaches can improve speed and reduce code size. Macros can sometimes help. It may be beneficial to do the calculations with an outside tool and then include the result as C code in a const array.
  • Use an advancing pointer (someStruct->var = x; someStruct++) to loop through arrays of structs instead of indexing each time in the loop (someStruct[i].var = x).
  • When modifying variables that are also changed in an Interrupt Service Routine (ISR), wrap the relevant code block in a __critical { } block. See http://sdcc.sourceforge.net/doc/sdccman.pdf#section.3.9
  • When using constants and literals the U, L and UL postfixes can be used.
    • U specifies that the constant is unsigned
    • L specifies that the constant is long.
    • NOTE: In SDCC 3.6.0, the default for char changed from signed to unsigned. The manual says to use --fsigned-char for the old behavior; this option flag is included by default when compiling through lcc.

  • A fixed point type (fixed) is included with GBDK when precision greater than whole numbers is required for 8 bit range values (since floating point is not included in GBDK).

See the "Simple Physics" sub-pixel example project.

Code example:

  fixed player[2];
  ...
  // Modify player position using its 16 bit representation
  player[0].w += player_speed_x;
  player[1].w += player_speed_y;
  ...
  // Use only the upper 8 bits for setting the sprite position
  move_sprite(0, player[0].h, player[1].h);
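
Two of the guidelines in this section, const declarations and advancing pointers, can be sketched together as follows. This is an illustration under assumptions: enemy_t, ENEMY_COUNT, start_x and reset_enemies() are hypothetical names that are not part of GBDK.

```c
#include <stdint.h>

#define ENEMY_COUNT 4u  /* hypothetical number of enemies */

typedef struct {
    uint8_t x;
    uint8_t y;
} enemy_t;

static enemy_t enemies[ENEMY_COUNT];

/* Read-only data: const keeps it in ROM on GBDK. */
static const uint8_t start_x[ENEMY_COUNT] = { 8u, 24u, 40u, 56u };

/* Non-const pointer to const data: the pointer may change, the data may not. */
const uint8_t *ptr_to_const = start_x;
/* Const pointer to non-const data: the pointer is fixed, the data may change. */
enemy_t *const first_enemy = enemies;
/* Const pointer to const data: neither may change. */
const uint8_t *const const_ptr_to_const = start_x;

/* Walk both arrays with advancing pointers instead of indexing with [i]. */
void reset_enemies(void) {
    enemy_t *e = enemies;
    const uint8_t *sx = start_x;
    uint8_t i;
    for (i = 0u; i != ENEMY_COUNT; i++) {
        e->x = *sx;
        e->y = 16u;
        e++;
        sx++;
    }
}
```

Whether the pointer version is actually faster depends on the SDCC version and flags, so it is worth comparing the generated ASM for both forms.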

Code structure

  • Do not #include .c source files into other .c source files. Instead create .h header files for them and include those. https://www.tutorialspoint.com/cprogramming/c_header_files.htm
  • Instead of using a blocking delay() for things such as sprite animations/etc (which can prevent the rest of the game from continuing) many times it's better to use a counter which performs an action once every N frames. sys_time may be useful in these cases.
  • When processing for a given frame is done and it is time to wait before starting the next frame, vsync() can be used. It uses HALT to put the CPU into a low power state until processing resumes. The CPU will wake up and resume processing at the end of the current frame when the Vertical Blanking interrupt is triggered.
  • Minimize use of multiplication, modulo with non-powers of 2, and division with non-powers of 2. These operations have no corresponding CPU instructions and are implemented as software functions, so they are costly in time.
    • SDCC has some optimizations for:
      • Division by powers of 2. For example n /= 4u will be optimized to n >>= 2.
      • Modulo by powers of 2. For example: (n % 8) will be optimized to (n & 0x7).
    • If you need decimal numbers to count or display a score, you can use the GBDK BCD (binary coded decimal) number functions. See: bcd.h and the BCD example project included with GBDK.
  • Avoid long lists of function parameters. Passing many parameters can add overhead, especially if the function is called often. Globals and local static vars can be used instead when applicable.
  • Use inline functions if the function is short (with the inline keyword, such as inline uint8_t myFunction() { ... }).
  • Do not use recursive functions.
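
The frame counter approach mentioned above could look like the following sketch. It assumes update_animation() is called once per frame from the main loop (for example right after vsync()). ANIM_FRAME_COUNT, frame_counter, anim_frame and update_animation() are hypothetical names.

```c
#include <stdint.h>

#define ANIM_FRAME_COUNT 4u  /* hypothetical; must be a power of 2 here */

static uint8_t frame_counter = 0u;
static uint8_t anim_frame = 0u;

/* Advances the animation on every 8th call without blocking the game loop.
   Returns the current animation frame index. */
uint8_t update_animation(void) {
    frame_counter++;
    /* Power-of-2 mask, equivalent to (frame_counter % 8u) == 0u */
    if ((frame_counter & 0x07u) == 0u) {
        /* Wrap around with a mask instead of a modulo */
        anim_frame = (anim_frame + 1u) & (ANIM_FRAME_COUNT - 1u);
    }
    return anim_frame;
}
```

Choosing 8 frames and 4 animation steps keeps both checks as bit masks, which avoids the software division routines described above.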

GBDK API/Library

  • stdio.h: If you have other ways of printing text, avoid including stdio.h and using functions such as printf(). Including it will use a large number of the background tiles for font characters. If stdio.h is not included then that space will be available for use with other tiles instead.
  • drawing.h: The Game Boy graphics hardware is not well suited to frame-buffer style graphics such as the kind provided in drawing.h. Due to that, most drawing functions (rectangles, circles, etc) will be slow. When possible it's much faster and more efficient to work with the tiles and tile maps that the Game Boy hardware is built around.
  • waitpad() and waitpadup() check for input in a loop that doesn't HALT at all, so the CPU will be maxed out until the function returns. One alternative is to write a function with a loop that checks input with joypad() and then waits a frame using vsync() (which idles the CPU while waiting) before checking input again.
  • joypad(): When testing for multiple different buttons, it's best to read the joypad state once into a variable and then test using that variable (instead of making multiple calls).
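
A minimal sketch of the joypad() advice: the caller reads the state once per frame and passes it in. The BTN_* masks are stand-ins so the sketch compiles without GBDK headers (with GBDK, the J_* constants from gb/gb.h would be used instead), and player_x, jump_requested and handle_input() are hypothetical names.

```c
#include <stdint.h>

/* Stand-in button masks for illustration only. With GBDK, use J_LEFT,
   J_RIGHT and J_A from <gb/gb.h> instead of these. */
#define BTN_RIGHT 0x01u
#define BTN_LEFT  0x02u
#define BTN_A     0x10u

static uint8_t player_x = 80u;
static uint8_t jump_requested = 0u;

/* keys: the joypad state, read once per frame by the caller,
   for example: handle_input(joypad()); */
void handle_input(uint8_t keys) {
    if (keys & BTN_LEFT)  player_x--;
    if (keys & BTN_RIGHT) player_x++;
    if (keys & BTN_A)     jump_requested = 1u;
}
```

All three tests use the same snapshot of the buttons, so the hardware is polled once and the checks cannot disagree with each other within a frame.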

Toolchain

  • See SDCC optimizations: http://sdcc.sourceforge.net/doc/sdccman.pdf#section.8.1
  • For details about default Compiler data types, see the SDCC Manual (follow links and scroll down 1 page)
  • Use profiling. Look at the ASM generated by the compiler, write several versions of a function, compare them and choose the faster one.
  • Use the SDCC --max-allocs-per-node flag with large values, such as 50000. --opt-code-speed has a much smaller effect.
    • GBDK-2020 (after v4.0.1) compiles the library with --max-allocs-per-node 50000, but it must be turned on for your own code.
      (example: lcc ... -Wf--max-allocs-per-node50000 or sdcc ... --max-allocs-per-node 50000).
    • The other code/speed flags are --opt-code-speed or --opt-code-size.
  • Use current SDCC builds from http://sdcc.sourceforge.net/snap.php
    The minimum required version of SDCC will depend on the GBDK-2020 release. See GBDK Release Notes
  • Learn some ASM and inspect the compiler output to understand what the compiler is doing and how your code gets translated. This can help with writing better C code and with debugging.

Constants, Signed-ness and Overflows

There are some scenarios where the compiler will warn about overflows with constants. They often have to do with mixed signedness between constants and variables. To avoid problems, take care about whether or not constants are explicitly defined as unsigned and what type of variables they are used with.

WARNING: overflow in implicit constant conversion

  • A constant can be used where the value is too high (or low) for the type it is stored in, causing a value overflow.
    • For example this constant value is too high since the max value for a signed 8 bit char is 127.
       #define TOO_LARGE_CONST 255
       int8_t signed_var = TOO_LARGE_CONST;
      
  • This can also happen when constants are not explicitly declared as unsigned (and so may get treated by the compiler as signed) and then added such that the resulting value exceeds the signed maximum.
    • For example, this results in a warning even though the sum is 254, which is less than 255, the max value for an unsigned 8 bit char variable.
        #define CONST_UNSIGNED 127u
        #define CONST_SIGNED 127
        uint8_t unsigned_var = (CONST_SIGNED + CONST_UNSIGNED);
      
    • It can be avoided by always using the unsigned u when the constant is intended for unsigned operations.
        #define CONST_UNSIGNED 127u
        #define CONST_ALSO_UNSIGNED 127u  // <-- Added "u", now no warning
        uint8_t unsigned_var = (CONST_UNSIGNED + CONST_ALSO_UNSIGNED);
      

Chars and vararg functions

Parameters (chars, ints, etc) to printf / sprintf should always be explicitly cast to avoid type related parameter passing issues.

For example, below will result in the likely unintended output:

printf("%u, %d, %x\n", UINT16_MAX, INT16_MIN, UINT16_MAX);
// Will output: "65535, 0, 8000"

Instead this will give the intended output:

printf("%u, %d, %x\n", (uint16_t)UINT16_MAX, (int16_t)INT16_MIN, (uint16_t)UINT16_MAX);
// Will output: "65535, -32768, FFFF"

Chars

In standard C when chars are passed to a function with variadic arguments (varargs, those declared with ... as a parameter), such as printf(), those chars get automatically promoted to ints. For an 8 bit CPU such as the Game Boy's, this is not as efficient or desirable in most cases. So the default SDCC behavior, which GBDK-2020 expects, is that chars will remain chars and not get promoted to ints when explicitly cast as chars while calling a varargs function.

For example:

unsigned char i = 0x5A;
// NO:
// The char will get promoted to an int, producing incorrect printf output
// The output will be: 5A 00
printf("%hx %hx", i, i);
// YES:
// The char will remain a char and printf output will be as expected
// The output will be: 5A 5A
printf("%hx %hx", (unsigned char)i, (unsigned char)i);

Some functions that accept varargs: printf(), sprintf()

Also See:

When C isn't fast enough

For many applications C is fast enough, but intensive functions are sometimes better written in assembly. This section deals with interfacing your core C program with fast assembly subroutines.

Reusable Local Labels and Inline ASM

When functions are written in assembly it's generally better to not mix inline ASM with C code and instead write the whole function in assembly.

If they are mixed then descriptive named labels should not be used for inline ASM. This is due to descriptive labels interfering with the expected scope of the reusable local labels generated from the compiled C code. The compiler will not detect this problem and the resulting code may fail to execute correctly without warning.

Instead use reusable local symbols/labels (for example 1$:). To learn more about them, check the SDAS manual section "1.3.3 Reusable Symbols".

Variables and registers

Getting at C variables is slightly tricky due to how local variables are allocated on the stack. However, you shouldn't be using the local variables of a calling function in any case. Global variables can be accessed by name by prefixing the name with an underscore.

Segments / Areas

The use of segments/areas for code, data and variables is more noticeable in assembler. GBDK and SDCC define a number of default ones. The order they are linked is determined by crt0.s and is currently as follows for the Game Boy and related clones.

  • ROM (in this order)
    • _HEADER: For the Game Boy header
    • _CODE: CODE is specified as after BASE, but is placed before it due to how the linker works.
    • _HOME
    • _BASE
    • _CODE_0
    • _INITIALIZER: Constant data used to init RAM data
    • _LIT
    • _GSINIT: Code used to init RAM data
    • _GSFINAL
  • Banked ROM
    • _CODE_x Places code in ROM other than Bank 0, where x is the 16kB bank number.
  • WRAM (in this order)
    • _DATA: Uninitialized RAM data
    • _BSS
    • _INITIALIZED: Initialized RAM data
    • _HEAP: placed after _INITIALIZED so that all spare memory is available for the malloc routines.
    • STACK: at the end of WRAM

Calling convention

The following is primarily oriented toward the Game Boy and related clones (sm83 devices); other targets such as sms/gg may vary.

SDCC, in common with almost all C compilers, prepends a _ to function names. For example the function printf(...) begins at the label _printf::. Note that all functions are declared global.

Functions can be marked with OLDCALL which will cause them to use the __sdcccall(0) calling convention (the format used prior to SDCC 4.2 & GBDK-2020 4.1.0).

Starting with SDCC 4.2 and GBDK-2020 4.1.0 the new default calling convention is __sdcccall(1).

For additional details about the calling conventions, see sections SM83 calling conventions and Z80, Z180 and Z80N calling conventions in the SDCC manual.

Banked Calling Convention

The following is primarily oriented toward the Game Boy and related clones (sm83 devices); other targets such as sms/gg may vary.

Key Points:

  • Function arguments (if present) are always placed on the stack, right to left without particular alignment
  • A fixed stack offset (sm83:+4, z80:+3) is added by the Callee (to skip the pushed Caller Bank and additional Trampoline Return Address)
  • Return values follow the calling convention (__sdcccall(1), or __sdcccall(0) for OLDCALL)

Terminology:

  • Caller: the code which is calling the requested function
  • Callee: the function to be called (declared as BANKED or __banked)
  • Trampoline: The intermediary which performs the bank switching and does hand-off between Caller and Callee during the call and then return.

Banked Call Trampoline

  • Banked calls are performed via a trampoline in the non-banked region 0000-3fff
  • The __sdcc_bcall_ehl trampoline is used by default
    • With it both calling conventions are supported: __sdcccall(1) (default) or __sdcccall(0) for OLDCALL
  • If --legacy-banking is specified to SDCC the __sdcc_bcall trampoline is used.
    • This may only be used with __sdcccall(0)

Process for a banked call (using __sdcc_bcall_ehl, the default)

  1. The Caller
    • Function arguments (if present) are always placed on the stack, right to left without particular alignment
    • The Bank of Callee function is placed into register E
    • The Address of Callee function is placed in HL
    • Calls the bank switch Trampoline (which adds Caller return address to the stack)
  2. The Trampoline
    • Saves the Current Bank onto the stack (pushed as AF, so 16 bits)
    • Switches to the Bank of Callee function (in register E)
    • Calls the Callee function address in HL (which adds Trampoline return address to the stack)
  3. The Callee Function
    • SDCC will use an offset to skip the first N bytes of the stack
      • For sm83 (GB/AP/DUCK): skip first 4 bytes
      • For z80 (GG/SMS/etc): skip first 3 bytes
    • Return values follow the calling convention (__sdcccall(1), or __sdcccall(0) for OLDCALL)
    • Executes a return to Trampoline
  4. The Trampoline
    • Switches to the Bank of the Caller saved on the stack (and moves Stack Pointer past it)
    • Executes a return to Caller
  5. The Caller
    • Cleans up the stack and uses return value (if present)