Compiling C to neat Z80: is that even possible?

By Madram / Overlanders.

So, what is a calling convention?

Look at the CPC firmware routines (BC80 aka cas_in_char and co). For each one, the inputs (which registers are used and the role of each), the outputs and the resulting flags are documented, in what we would now call an API. That’s not a calling convention!

In C, the functions are turned into routines, and the way they take parameters could be arbitrary. The compiler has to pick a generic way to pass parameters, especially since the caller of a routine might be isolated (in a separate and independent compilation unit) from the callee. Also, the number of parameters could be greater than the number of available registers. This generic way forms a calling convention.

Now, programmers are lazy, or maybe they have to leave early to watch the telescreen. What is often done is to pass all parameters via the stack. The caller pushes them and the callee reads them back with indexed addressing (IX+n). This lets the callee use the parameters in any order, without having to juggle registers. On modern CPUs, dedicated instructions make this very fast (not surprising, since CPUs more or less evolved to match the C paradigm, flaws included).
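
To make the push-then-indexed-read idea concrete, here is a toy model in plain C. It is only a sketch: the stack is an array of 16-bit words, the return address is left out, and the names (sim_stack, push16, callee_sub, call_sub) are made up for the illustration. They are not what any compiler emits.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16
static uint16_t sim_stack[STACK_WORDS];
static int sim_sp = STACK_WORDS;  // index of the top word; grows downward

static void push16(uint16_t v) {
  assert(sim_sp > 0);             // toy overflow check
  sim_stack[--sim_sp] = v;
}

static uint16_t pop16(void) {
  assert(sim_sp < STACK_WORDS);   // toy underflow check
  return sim_stack[sim_sp++];
}

// Callee: reads its two parameters at fixed offsets from a frame
// pointer, the way a Z80 routine would use (IX+n).
static uint16_t callee_sub(void) {
  int ix = sim_sp;                // frame pointer = stack top on entry
  uint16_t a = sim_stack[ix + 0]; // first parameter (pushed last)
  uint16_t b = sim_stack[ix + 1]; // second parameter (pushed first)
  return (uint16_t)(a - b);
}

// Caller: pushes the parameters in reverse order, calls, then cleans up.
static uint16_t call_sub(uint16_t a, uint16_t b) {
  push16(b);
  push16(a);
  uint16_t r = callee_sub();
  (void)pop16();                  // caller restores the stack pointer
  (void)pop16();
  return r;
}
```

The callee never cares which registers the caller had free; it only needs to know the offsets. That is the whole appeal, and the whole cost.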

What if the caller and the callee are in the same file? Couldn’t they agree on a smarter protocol (non-violent communication)? Being in the same file is not enough. In C, all functions are visible (“exported”) by default, meaning they can be called from another compilation unit. To mark a function as “private”, you have to use the static keyword. See figure 1.

// test_static.c
#include <assert.h>

// We introduce some dummy functions for test purposes.
// This one puts the length of str (null-terminated string)
//     both in result (dummy) and return value.
// Note that we typically expect int to be 16 bits here.
static int g(const char *str, int *result) {
  int i = 0;
  while (*str) ++i;
  *result = i;
  return i;
}

// Another dummy, expected to be inlined.
static const char* h(const char *str){
  return str+1;
}

// Random comment in the middle of the file.
static int k(const char *str) {
  if (*str == 'x')
    return 42;
  else
    return 0;
}

// Example function calling all the others.
// NB: int is used as bool here!
static int f(const char *obj, int *result) {
  assert(result);
  if (!k(obj))
    return 0;
  else
    return (g(h(obj), result) == k(obj) + 1);
}

const char *input;
int result;

int main(void){
  f(input, &result);
  return (result == 4);
}

Figure 1a. Nonsensical program for illustration purposes.

  section .text,"ax",@progbits
  section .text,"ax",@progbits
  public  _main
_main:
  ld  iy, (_input)
  ld  a, (iy)
  cp  a, 120
  jq  nz, BB0_1
  ld  a, (iy + 1)
  or  a, a
  jq  nz, BB0_6
  ld  hl, 0
  ld  (_result), hl
  ret
BB0_1:
  ld  hl, (_result)
  ld  de, 4
  or  a, a
  sbc hl, de
  jq  z, BB0_2
  ld  a, 0
  jq  BB0_4
BB0_6:
BB0_7:
  jq  BB0_7
BB0_2:
  ld  a, 1
BB0_4:
  and a, 1
  ld  l, a
  ld  h, 0
  ret
  section .text,"ax",@progbits

  section .bss,"aw",@nobits
  public  _input
_input:
  rb  2

  section .bss,"aw",@nobits
  public  _result
_result:
  rb  2

Figure 1b. Generated ASM in the northern hemisphere.

Pretty sweet: there is no function call at all! Since the functions are all local, they were inlined into main. The assert was removed too, since the compiler could prove it would never trigger (static analysis, dead-branch elimination).

Very important conclusion, that’s the tl;dr of the article, I’m putting it in a box (rather, Toms will do that, he is Black Belt 3rd dan in WordPress and ShoulderPress) (NDtoms: holy shit! I have been downgraded!):

  • Make your local functions (the helper ones you only use in one given file) `static`.
  • Make your small shared functions inline-able (e.g. make them static as well, put the body in the header). The code will be duplicated, yet N times an inlined function might still be shorter than the mess generated for a global call.
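
A minimal sketch of both pieces of advice, squeezed into one listing for convenience. The header name clamp8.h and the functions clamp8 and brighten are hypothetical, made up for the example; whether the compiler actually inlines them still depends on its optimisation settings.

```c
#include <assert.h>

// --- would live in a shared header, say clamp8.h (hypothetical) ---
#ifndef CLAMP8_H
#define CLAMP8_H

// static + body in the header: every .c file that includes this gets
// its own private copy, so the compiler sees the body and may inline it.
static inline unsigned char clamp8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return (unsigned char)v;
}

#endif

// --- would live in one .c file that includes the header ---

// Helper only used in this file: static, so no global call is forced.
static int brighten(int level) {
  assert(level >= 0);
  return clamp8(level + 64);
}
```

Dropping static from brighten would make it callable from other compilation units again, which is exactly the situation explored next with f.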

Let’s turn f into a “global” function to see the difference and make fun of the Nicaraguans. Simply remove the static keyword for this particular function. Now, we get:

	section	.text,"ax",@progbits
	section	.text,"ax",@progbits
	public	_f
_f:
	call	__frameset0
	ld	l, (ix + 6)
	ld	h, (ix + 7)
	add	hl, bc
	or	a, a
	sbc	hl, bc
	jq	nz, BB0_2
	ld	iy, L_.str
	ld	de, L_.str.1
	ld	bc, 29
	ld	hl, L___PRETTY_FUNCTION__.f
	push	hl
	push	bc
	push	de
	push	iy
	call	___assert_fail
	pop	hl
	pop	hl
	pop	hl
	pop	hl
BB0_2:
	push	hl
	ld	l, (ix + 4)
	ld	h, (ix + 5)
	ex	(sp), hl
	pop	iy
	ld	de, 0
	ld	a, (iy)
	cp	a, 120
	jq	nz, BB0_7
	ld	a, (iy + 1)
	or	a, a
	jq	nz, BB0_4
	ld	(hl), e
	inc	hl
	ld	(hl), d
BB0_7:
	ex	de, hl
	pop	ix
	ret
BB0_4:
BB0_5:
	jq	BB0_5
	section	.text,"ax",@progbits

	section	.text,"ax",@progbits
	public	_main
_main:
	ld	hl, _result
	ld	de, (_input)
	push	hl
	push	de
	call	_f
	pop	hl
	pop	hl
	ld	hl, (_result)
	ld	de, 4
	or	a, a
	sbc	hl, de
	jq	z, BB1_1
	ld	a, 0
	jq	BB1_3
BB1_1:
	ld	a, 1
BB1_3:
	and	a, 1
	ld	l, a
	ld	h, 0
	ret
	section	.text,"ax",@progbits

	section	.rodata,"a",@progbits
	private	L_.str
L_.str:
	db	"result",000o

	section	.rodata,"a",@progbits
	private	L_.str.1
L_.str.1:
	db	"test_global.c",000o

	section	.rodata,"a",@progbits
	private	L___PRETTY_FUNCTION__.f
L___PRETTY_FUNCTION__.f:
	db	"int f(const char *, int *)",000o

	section	.bss,"aw",@nobits
	public	_input
_input:
	rb	2

	section	.bss,"aw",@nobits
	public	_result
_result:
	rb	2

This time:

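  • _main passes the result pointer and input through the stack (note the difference in how they are fetched: the address of _result is loaded as an immediate, the value of _input is read from memory).
  • __frameset0 sets IX up as a frame pointer at the current stack position, so _f can read its parameters at (ix + 4) to (ix + 7).
  • Once the routine returns, the pops after the call restore SP to its value before the pushes.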
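
The frame that _f sees can be pictured as a C struct. This is a reading of the listing, not something taken from the toolchain documentation: it assumes __frameset0 pushes the caller's IX before pointing IX at the stack top, which is what the (ix + 4) reads and the final pop ix suggest. The struct and field names are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Assumed view of the stack from IX inside _f (lowest address first).
struct f_frame {
  uint16_t saved_ix;     // (ix + 0): caller's IX, restored by pop ix
  uint16_t return_addr;  // (ix + 2): pushed by call _f
  uint16_t obj;          // (ix + 4): first parameter, pushed last by _main
  uint16_t result;       // (ix + 6): second parameter, pushed first by _main
};

// Checks that the model agrees with the offsets used in the listing.
static void check_frame(void) {
  assert(offsetof(struct f_frame, obj) == 4);
  assert(offsetof(struct f_frame, result) == 6);
}
```

Four bytes of bookkeeping sit between IX and the first parameter, which is why nothing useful is ever read at (ix + 0) to (ix + 3).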

Homework for next time:

  • How to force a simpler calling convention globally?
  • How to force the use of HL (potentially with INC) rather than copying it into IY?
  • What the heck is going on with the infinite loop in BB0_5 and the rest of the code?

For the curious, the Z80 calling conventions (including passing by registers) are defined here.