Hello World

First Program

Our first program prints the string Hello World! to the terminal. Let's write it in RISC-V assembly:

asm
.data
hello: .asciz "Hello World!\n"

.text
.global _start
_start:
    # invoke write(1, hello, 13) system call
    li a0, 1
    la a1, hello
    li a2, 13
    li a7, 64
    ecall

    # invoke exit(0) system call
    li a0, 0
    li a7, 93
    ecall

At first glance, this code may look cryptic, but we'll demystify it step by step!

Now, let's assemble and link it with the cross toolchain.

console
$ riscv64-linux-gnu-as hello.S -o hello.o
$ riscv64-linux-gnu-ld hello.o -o hello.out

Finally, we can execute our first program.

console
$ ./hello.out
Hello World!

System Calls

In short, all our program does is invoke two system calls.

The main task of our program is printing a string. To print a string, we have to somehow interact with I/O devices. However, in this example, we used the write system call instead of accessing the devices directly.

asm
    # invoke write(1, hello, 13) system call
    li a0, 1
    la a1, hello
    li a2, 13
    li a7, 64
    ecall

A system call is an interface that the operating system provides between user code and hardware devices. In other words, we delegated the printing task to functionality already implemented by the operating system.

A system call behaves much like an ordinary function call. In fact, the write system call has the following function signature.

c
extern ssize_t write (int fd, const void *buf, size_t n);

In our assembly code, we passed the arguments through the a0, a1, and a2 registers. Registers are the small storage locations inside the CPU that instructions operate on. Here, a0 holds the value 1, which is the file descriptor for stdout; a1 holds the address of hello; and a2 holds the length 13. The call therefore writes the 13 bytes of the string at hello to our terminal.
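To make the mapping between registers and arguments concrete, here is a minimal C sketch of the same call. It goes through the C library's write wrapper rather than issuing ecall by hand, and the function name print_hello is a hypothetical helper introduced only for this illustration.

```c
#include <unistd.h>  /* write(), STDOUT_FILENO, ssize_t */

/* Hypothetical helper that mirrors the assembly: write(1, hello, 13). */
ssize_t print_hello(void)
{
    static const char hello[] = "Hello World!\n";

    /* fd = 1 (stdout) goes in a0, buf = hello in a1, n = 13 bytes in a2. */
    return write(STDOUT_FILENO, hello, 13);
}
```

Calling print_hello() from main prints the greeting. On success, write returns the number of bytes written, which should be 13 here.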

Finally, the ecall instruction invokes the system call. The a7 register determines which system call is invoked, and we stored the value 64 there to select write. The ecall instruction transfers control to the operating system, and control returns to our code once the system call is done.

Similarly, the following code invokes the exit system call, with the exit status in a0 and the system call number 93 in a7.

asm
    # invoke exit(0) system call
    li a0, 0
    li a7, 93
    ecall

Entry Point

The _start label in the following lines has a special meaning.

asm
.text
.global _start
_start:

It represents the entry point where the program begins.

When linking, we must tell the linker where the entry point is. The GNU linker looks for a symbol named _start by default, and the .global directive is what makes the _start label visible to it.

Label

_start is called a label, and each label represents an address in the program. We can confirm this by disassembling the compiled binary.

console
$ riscv64-linux-gnu-objdump --disassemble hello.out

In the middle of the output, we can see the following.

console
Disassembly of section .text:

00000000000100e8 <_start>:

As shown here, the _start label corresponds to a concrete address, which was determined during the linking process.

Sections

Now that we've disassembled the compiled binary, let's take a closer look at the rest of the output.

The .text section contains the program's instructions.

text
00000000000100e8 <_start>:
   100e8:	00100513          	li	a0,1
   100ec:	00001597          	auipc	a1,0x1
   100f0:	02058593          	addi	a1,a1,32 # 1110c <__DATA_BEGIN__>
   100f4:	00d00613          	li	a2,13
   100f8:	04000893          	li	a7,64
   100fc:	00000073          	ecall

This code corresponds to the part of the assembly code that invokes the write system call. The la pseudo-instruction was expanded into auipc and addi, but we won't discuss the details yet. Note, however, that it refers to the address 1110c, labeled __DATA_BEGIN__.

In the .data section, we can see this label.

text
Disassembly of section .data:

000000000001110c <__DATA_BEGIN__>:
   1110c:	6548                	.insn	2, 0x6548
   1110e:	6c6c                	.insn	2, 0x6c6c
   11110:	6f57206f          	j	84004 <__global_pointer$+0x726f8>
   11114:	6c72                	.insn	2, 0x6c72
   11116:	2164                	.insn	2, 0x2164
   11118:	000a                	.insn	2, 0x000a

You may notice that this data comes from the Hello World! string. The disassembled output looks strange because the disassembler tried to interpret the data as instructions.

asm
.data
hello: .asciz "Hello World!\n"
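To check that the .data listing really holds our string, the following C sketch unpacks the values objdump printed back into bytes. It assumes a little-endian target, which is the usual case for RISC-V Linux, so the least significant byte of each value comes first in memory. The names chunk, chunks, and decode_data are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The values objdump printed for .data, with the width of each in bytes. */
struct chunk { uint32_t value; size_t width; };

static const struct chunk chunks[] = {
    {0x6548, 2}, {0x6c6c, 2}, {0x6f57206f, 4},
    {0x6c72, 2}, {0x2164, 2}, {0x000a, 2},
};

/* Hypothetical helper: unpack the chunks into out, least significant byte
 * first (little-endian), and return the number of bytes written.
 * out must have room for at least 14 bytes. */
size_t decode_data(unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof chunks / sizeof chunks[0]; i++)
        for (size_t b = 0; b < chunks[i].width; b++)
            out[n++] = (unsigned char)(chunks[i].value >> (8 * b));
    return n;
}
```

For example, 0x6548 unpacks to the bytes 0x48 and 0x65, which are the ASCII codes for H and e. The final value 0x000a unpacks to the newline followed by a zero byte.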

The .asciz directive declares a null-terminated string.
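The terminating zero explains why the string takes one more byte in memory than the length we passed to write. A C string literal is null-terminated in the same way, so a small C sketch can illustrate the difference. The helper names hello_size and hello_length are hypothetical.

```c
#include <string.h>  /* strlen(), size_t */

/* A C string literal is null-terminated, like a string declared with .asciz. */
static const char hello[] = "Hello World!\n";

/* Bytes the string occupies in memory, including the terminating zero. */
size_t hello_size(void) { return sizeof hello; }

/* Bytes before the terminating zero: the length passed to write. */
size_t hello_length(void) { return strlen(hello); }
```

Here hello_size() is 14 and hello_length() is 13, which matches the value we loaded into a2.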