Introduction
Welcome to the mlibc User Guide.
mlibc is a fully featured C standard library designed with portability in mind. It provides a clean syscall abstraction layer for new operating system ports to plug into. This guide will explain how to integrate a new OS / syscall backend and get your first program up and running.
Adding a new OS port
As mlibc has been designed with portability in mind, adding a new port is a straightforward process:
- Make sure your kernel has the prerequisites needed.
- Begin implementing the sysdeps to get access to mlibc headers.
- Build the toolchain for your OS.
- Finish implementing the sysdeps.
- Take a look at what to do next.
Make sure you have one directory for your operating system's sysroot and another for your toolchain. These will be referenced in code when needed as $SYSROOT_DIR and $TOOLCHAIN_DIR respectively.
The sysroot dir is where libraries and includes compiled for your OS will go. It is basically a copy of your OS's root filesystem, and your compiler will look into it for libraries and header files when cross-compiling.
The toolchain dir will be where the cross-compilers will be installed.
Kernel prerequisites
Before you attempt to port mlibc, ensure your kernel supports these things:
- Writing text to the screen or a serial port
- Basic paging and virtual memory operations
- Loading an ELF program, mapping a stack and jumping to the entry point
- Handling syscalls from userspace
For this port, we'll use mlibc-demo-os, a small RISC-V kernel which supports the minimum functionality required for an mlibc 'hello world' program. Refer to it if you get stuck!
Note that a filesystem and storage drivers are not strictly necessary. In mlibc-demo-os, we include_bytes! the contents of the user-space program into the kernel's executable. To support multiple files, you could include a trivial read-only filesystem like tar instead.
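The ELF loading prerequisite can be sketched as follows. This is a minimal illustration, not the loader used by mlibc-demo-os: the struct and function names (DemoPhdr, demo_load_segment) are hypothetical, the field layout follows the ELF64 program header, and a flat byte buffer stands in for the user address space. It shows the two steps a loader must perform per PT_LOAD segment: copy p_filesz bytes from the file, then zero the remainder up to p_memsz.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Field layout follows the ELF64 program header; the struct name is hypothetical.
struct DemoPhdr {
    uint32_t p_type;
    uint32_t p_flags;
    uint64_t p_offset;
    uint64_t p_vaddr;
    uint64_t p_paddr;
    uint64_t p_filesz;
    uint64_t p_memsz;
    uint64_t p_align;
};

constexpr uint32_t kPtLoad = 1; // PT_LOAD in the ELF specification

// Copies one PT_LOAD segment from `file` into `memory` and zeroes the part
// that has no file contents (for example .bss). `memory` is a stand-in for the
// mapped user address space; its first byte corresponds to address `base`.
// Returns false if the segment does not fit or is malformed.
bool demo_load_segment(const DemoPhdr &ph, const uint8_t *file, size_t file_size,
                       uint8_t *memory, size_t memory_size, uint64_t base) {
    assert(file != nullptr && memory != nullptr);
    if (ph.p_type != kPtLoad || ph.p_filesz > ph.p_memsz)
        return false;
    if (ph.p_offset > file_size || ph.p_filesz > file_size - ph.p_offset)
        return false;
    if (ph.p_vaddr < base)
        return false;
    uint64_t dest = ph.p_vaddr - base;
    if (dest > memory_size || ph.p_memsz > memory_size - dest)
        return false;

    // Bytes backed by the file.
    std::memcpy(memory + dest, file + ph.p_offset, ph.p_filesz);
    // Bytes between p_filesz and p_memsz must read as zero.
    std::memset(memory + dest + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    return true;
}
```

Forgetting the zeroing step is a common cause of early crashes, since uninitialised globals are expected to start out as zero.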
Implementing your sysdeps (part 1)
As we need libc headers to build the toolchain, we will first implement enough of the sysdeps to install the headers.
Creating a Meson cross file
mlibc uses a build system called Meson. When cross-compiling, you must tell meson about your compiler and target via a cross file.
For mlibc-demo-os, our cross file looks like this:
[host_machine]
system = 'demo'
cpu_family = 'riscv64'
cpu = 'riscv64'
endian = 'little'
[binaries]
c = 'riscv64-demo-gcc'
cpp = 'riscv64-demo-g++'
ar = 'riscv64-demo-ar'
strip = 'riscv64-demo-strip'
ld = 'riscv64-demo-ld'
[built-in options]
c_args = ['-march=rv64gc', '-mabi=lp64d']
cpp_args = ['-march=rv64gc', '-mabi=lp64d']
Telling mlibc about your sysdeps
In mlibc's top level meson.build file, you'll see a large if/else chain that selects the right sysdeps subdirectory based on the system name. We'll add ourselves here:
elif host_machine.system() == 'demo'
subdir('sysdeps/demo')
# ...
Now, create the sysdeps/demo folder. At minimum it should contain a meson.build file. Here, all the source and include files are declared for the sysdep. For now, we will only add this:
sysdep_supported_options = {
'posix': true,
'linux': false,
'glibc': false,
'bsd': false,
}
The sysdep_supported_options tells mlibc which 'options' your sysdeps support. For example, the unistd.h header is only available when the POSIX option is enabled.
While sysdep_supported_options only states which options are supported to be used by the sysdeps, you are free to disable options with meson options at meson setup time. The default behavior is to enable supported options.
Selecting ABI headers
Your userland needs an ABI to communicate with your kernel. For example, we must define the layout of struct stat.
Re-using an existing ABI is strongly recommended. Many programs assume a particular ABI and will fail to compile or subtly break if you choose a different one (This is the program's fault and not mlibc's or your sysdeps').
Note that using an existing ABI does not imply you have to implement OS specific features. For example, even if you use the Linux ABI you do not have to implement Linux specific features.
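To make the idea of an ABI concrete, here is a small sketch of what "agreeing on a layout" means. The demo_stat structure below is a hypothetical, heavily shortened stand-in and not the real struct stat of any ABI. The static_assert lines pin down the size and field offsets that both the kernel and userland would have to share.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, shortened stand-in for a stat-like structure. The real fields
// and their order come from the ABI you choose, not from this sketch.
struct demo_stat {
    uint64_t st_dev;
    uint64_t st_ino;
    uint32_t st_mode;
    uint32_t st_nlink;
    int64_t st_size;
};

// If the kernel was built with a different layout, these are the numbers
// that would silently disagree.
static_assert(sizeof(demo_stat) == 32, "size is part of the ABI");
static_assert(offsetof(demo_stat, st_mode) == 16, "offset is part of the ABI");
static_assert(offsetof(demo_stat, st_size) == 24, "offset is part of the ABI");

// Reads st_size out of a raw buffer that a kernel is assumed to have filled in
// using the same layout.
int64_t demo_size_from_kernel_buffer(const unsigned char *buffer, size_t length) {
    assert(buffer != nullptr && length >= sizeof(demo_stat));
    demo_stat st;
    std::memcpy(&st, buffer, sizeof(st));
    return st.st_size;
}
```

A program compiled against one layout but run on a kernel that uses another would read the wrong bytes for st_size, which is why re-using an established ABI saves trouble.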
For the demo OS, we'll symlink each ABI header to Linux's definition in abis/linux. This is so we don't have to define a whole new ABI for a brief demo. Then, we'll add the following to our meson.build:
if not no_headers
install_headers(
'include/abi-bits/auxv.h',
'include/abi-bits/blkcnt_t.h',
'include/abi-bits/blksize_t.h',
'include/abi-bits/clockid_t.h',
'include/abi-bits/dev_t.h',
'include/abi-bits/errno.h',
'include/abi-bits/fcntl.h',
'include/abi-bits/gid_t.h',
'include/abi-bits/ino_t.h',
'include/abi-bits/limits.h',
'include/abi-bits/mode_t.h',
'include/abi-bits/nlink_t.h',
'include/abi-bits/pid_t.h',
'include/abi-bits/seek-whence.h',
'include/abi-bits/signal.h',
'include/abi-bits/stat.h',
'include/abi-bits/uid_t.h',
'include/abi-bits/vm-flags.h',
'include/abi-bits/wait.h',
'include/abi-bits/riscv-hwprobe.h',
'include/abi-bits/sigevent.h',
'include/abi-bits/sigval.h',
'include/abi-bits/sa_family_t.h',
'include/abi-bits/sockaddr_storage.h',
'include/abi-bits/sig-limits.h',
'include/abi-bits/suseconds_t.h',
'include/abi-bits/access.h',
'include/abi-bits/socklen_t.h',
'include/abi-bits/socket.h',
'include/abi-bits/poll.h',
'include/abi-bits/resource.h',
'include/abi-bits/in.h',
'include/abi-bits/rlim_t.h',
'include/abi-bits/utsname.h',
'include/abi-bits/fd_set.h',
'include/abi-bits/sem.h',
'include/abi-bits/time.h',
'include/abi-bits/ipc.h',
'include/abi-bits/statvfs.h',
'include/abi-bits/fsblkcnt_t.h',
'include/abi-bits/fsfilcnt_t.h',
'include/abi-bits/shm.h',
'include/abi-bits/termios.h',
'include/abi-bits/msg.h',
'include/abi-bits/mqueue.h',
'include/abi-bits/utmp-defines.h',
'include/abi-bits/utmpx.h',
subdir: 'abi-bits',
follow_symlinks: true
)
endif
This is not an exhaustive list of ABI headers but should be enough to get started.
Note that mlibc's Linux ABI definitions will be corrected without notice if they turn out to be wrong. By symlinking to them, you subscribe to what amounts to an ABI break whenever that (hopefully rare) correction happens.
For this reason, it is also strongly recommended to pin the commit or version of mlibc which will be used, as otherwise there could be changes which silently break your compiled software.
If subscribing to the Linux ABI is not desired, you can still fork it by creating a copy under abis/$os/ and symlinking to it instead of the Linux ABI.
Configuring mlibc for headers only
Now, you can run:
$ meson \
setup \
--cross-file=path/to/your.cross-file \
--prefix=/usr \
-Dheaders_only=true \
headers-build
The -Dheaders_only=true option tells meson that we only want the headers and not a full libc.
To install the headers into the sysroot, run:
DESTDIR=${SYSROOT_DIR} ninja -C headers-build install
We are now ready to start building the toolchain!
Building the toolchain
To compile mlibc and any userspace programs which link against it, you'll need a suitable compiler. For that, you will need to build a full OS Specific Toolchain. Using your host's compiler or a generic toolchain can lead to strange build failures and is neither recommended nor supported.
For mlibc-demo-os, the riscv64-demo-gcc toolchain is used. The patches (with comments) to add support for it in Binutils and GCC are available here.
The first step is downloading the sources for Binutils and GCC, then applying your patches. The versions used in this guide are Binutils 2.45.1 and GCC 15.2.0.
Building Binutils
As GCC depends on binutils for the target, we will build binutils first:
cd binutils-2.45.1
mkdir build
cd build
# --with-sysroot will tell the linker where to search for libraries.
# --enable-default-execstack=no will tell the linker to
# not use an executable stack by default.
../configure \
--target=riscv64-demo \
--prefix=/usr \
--with-sysroot="${SYSROOT_DIR}" \
--disable-werror \
--enable-default-execstack=no
make -j$(nproc)
DESTDIR="${TOOLCHAIN_DIR}" make install
Make sure you have ${TOOLCHAIN_DIR}/usr/bin in your $PATH. If you don't, you WILL run into issues when building GCC.
Building GCC
Now, build GCC:
cd gcc-15.2.0
mkdir build
cd build
# --with-sysroot will tell the compiler where to search for the libc headers
# during GCC compilation, as well as include dirs and libraries when compiling.
# --enable-languages=c,c++ will tell it to only build the C and C++ compilers
# --enable-threads=posix enables pthread support
# --disable-multilib disables building a multilib gcc
CFLAGS_FOR_TARGET="-march=rv64gc -mabi=lp64d" \
CXXFLAGS_FOR_TARGET="-march=rv64gc -mabi=lp64d" \
../configure \
--target=riscv64-demo \
--prefix=/usr \
--with-sysroot="${SYSROOT_DIR}" \
--enable-languages=c,c++ \
--enable-threads=posix \
--disable-multilib \
--enable-shared \
--enable-host-shared
make -j$(nproc) all-gcc all-target-libgcc
DESTDIR="${TOOLCHAIN_DIR}" make install-gcc install-target-libgcc
After that, you'll have a compiler that targets your OS and implicitly links against mlibc! However, if you try to compile a simple program using it, you will run into errors:
$ riscv64-demo-gcc -march=rv64gc -mabi=lp64d helloworld.c
ld: cannot find crt1.o: No such file or directory
ld: cannot find -lc: No such file or directory
This is because we still have not built mlibc, and so it cannot find needed libc files.
Implementing your sysdeps (part 2)
Let's try building mlibc:
$ meson \
setup \
--cross-file=path/to/your.cross-file \
--prefix=/usr \
-Ddefault_library=static \
-Dno_headers=true \
build
The -Ddefault_library=static option tells meson to only produce a statically linked library (libc.a). We suggest getting statically linked binaries to work before dynamically linked ones.
The -Dno_headers=true option tells meson to not install any headers, as we already did so earlier.
Now run ninja -C build to start the build. You'll likely get a large number of compilation errors at this point, and we'll work on fixing these in the next sections.
Implementing (S)crt1
We must provide a definition of the ELF entry point (traditionally named _start). Each ISA tends to require its own definition written in assembly; for example, RISC-V targets must initialise the gp register before jumping to C++.
Traditionally the file that defines _start is called crt1.S (or Scrt1.S for position independent executables). This produces an object file which has to be linked into every application.
Part of configuring an OS Specific Toolchain is specifying the location of (S)crt1.o so that it is linked automatically.
We recommend copying (S)crt1.S from an existing target like Linux. Then, we'll compile it by adding the following to our meson.build:
if not headers_only
crt = custom_target('crt1',
build_by_default: true,
command: c_compiler.cmd_array() + ['-c', '-o', '@OUTPUT@', '@INPUT@'],
input: 'crt1.S',
output: 'crt1.o',
install: true,
install_dir: get_option('libdir')
)
endif
Implementing a C++ entry point
Now, we'll implement the C++ entry point that we call from crt1.S.
#include <stdint.h>
#include <stdlib.h>
#include <mlibc/elf/startup.h>
extern "C" void __dlapi_enter(uintptr_t *);
extern char **environ;
extern "C" void __mlibc_entry(uintptr_t *entry_stack, int (*main_fn)(int argc, char *argv[], char *env[])) {
__dlapi_enter(entry_stack);
auto result = main_fn(mlibc::entry_stack.argc, mlibc::entry_stack.argv, environ);
exit(result);
}
The call to __dlapi_enter is used to perform initialisation in statically linked executables (but is a no-op in dynamically linked ones). For example, any global constructors in the program will be called from here.
Make sure to take a look at the ABI specification for your architecture for details like proper stack layout. People often forget to implement things like auxiliary vectors which are required by the spec.
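As an illustration of the stack layout mentioned above, the following sketch walks an entry stack laid out in the common System V style: argc, the argv pointers, a null word, the environment pointers, a null word, then (type, value) auxiliary vector pairs ending with AT_NULL. All names (DemoEntryStack, demo_parse_entry_stack, demo_find_aux) are hypothetical and this is not mlibc's own parser; check your architecture's ABI document for the authoritative layout.

```cpp
#include <cassert>
#include <cstdint>

constexpr uintptr_t kAtNull = 0;   // AT_NULL terminates the auxiliary vector
constexpr uintptr_t kAtPagesz = 6; // AT_PAGESZ carries the page size

// Pointers into the initial stack, kept as machine words.
struct DemoEntryStack {
    int argc;
    const uintptr_t *argv; // argc pointers followed by a null word
    const uintptr_t *envp; // pointers followed by a null word
    const uintptr_t *auxv; // (type, value) pairs ending with AT_NULL
};

// `sp` is the stack pointer value the kernel hands to the entry point.
DemoEntryStack demo_parse_entry_stack(const uintptr_t *sp) {
    assert(sp != nullptr);
    DemoEntryStack out{};
    out.argc = static_cast<int>(sp[0]);
    out.argv = sp + 1;
    out.envp = out.argv + out.argc + 1; // skip argv's null terminator
    const uintptr_t *cursor = out.envp;
    while (*cursor != 0)
        ++cursor;
    out.auxv = cursor + 1; // skip envp's null terminator
    return out;
}

const char *demo_arg(const DemoEntryStack &stack, int index) {
    return reinterpret_cast<const char *>(stack.argv[index]);
}

// Looks up one auxiliary vector entry; returns false if it is absent.
bool demo_find_aux(const DemoEntryStack &stack, uintptr_t type, uintptr_t *value) {
    for (const uintptr_t *entry = stack.auxv; entry[0] != kAtNull; entry += 2) {
        if (entry[0] == type) {
            *value = entry[1];
            return true;
        }
    }
    return false;
}
```

If the kernel omits the AT_NULL terminator or either null word, a walk like this runs off the end of the stack, which matches the kind of failure a missing auxiliary vector produces.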
Performing system calls
The final piece of infrastructure we require is the ability to invoke system calls.
For example, on RISC-V a system call is invoked via the ecall instruction and requires putting arguments in specific registers, which requires a bit of (inline) assembly. We recommend copying this glue from an existing target.
For the demo OS, this is provided by syscall.cpp and include/bits/syscall.h.
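For orientation, syscall glue of this kind might look roughly like the sketch below. It is not the demo OS's syscall.cpp. It assumes the register convention Linux uses on RISC-V (syscall number in a7, arguments in a0 to a2, result in a0) and assumes that negative results encode an error code; the names demo_syscall3, demo_write and kDemoSysWrite are hypothetical. On non-RISC-V hosts a recording stand-in replaces the ecall so the sketch can still be compiled and exercised.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__riscv)
// Assumption: number in a7, arguments in a0..a2, result in a0.
// Your kernel may use a different convention.
inline long demo_syscall3(long number, long arg0, long arg1, long arg2) {
    register long a7 asm("a7") = number;
    register long a0 asm("a0") = arg0;
    register long a1 asm("a1") = arg1;
    register long a2 asm("a2") = arg2;
    // "+r"(a0): a0 carries the first argument in and the result out.
    asm volatile("ecall" : "+r"(a0) : "r"(a7), "r"(a1), "r"(a2) : "memory");
    return a0;
}
#else
// Host-only stand-in: records the call instead of trapping into a kernel.
struct DemoRecordedCall {
    long number, arg0, arg1, arg2;
};
static DemoRecordedCall demo_last_call;
inline long demo_syscall3(long number, long arg0, long arg1, long arg2) {
    demo_last_call = DemoRecordedCall{number, arg0, arg1, arg2};
    return 0;
}
#endif

constexpr long kDemoSysWrite = 1; // hypothetical syscall number

// Hypothetical typed wrapper. Assumes the kernel returns -errno on failure
// and a byte count on success.
int demo_write(int fd, const void *buf, size_t len, long *written) {
    assert(written != nullptr);
    long ret = demo_syscall3(kDemoSysWrite, fd, reinterpret_cast<long>(buf),
                             static_cast<long>(len));
    if (ret < 0)
        return static_cast<int>(-ret);
    *written = ret;
    return 0;
}
```

Whatever convention you choose, the kernel's trap handler has to read the same registers in the same order.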
Implementing sysdeps
Finally we're ready to implement the actual sysdep functions. For a basic statically-linked hello world program, you'll need to provide definitions of the following sysdep functions:
- mlibc::sys_libc_panic
- mlibc::sys_libc_log
- mlibc::sys_isatty
- mlibc::sys_write
- mlibc::sys_tcb_set
- mlibc::sys_anon_allocate
- mlibc::sys_anon_free
- mlibc::sys_seek
- mlibc::sys_exit
- mlibc::sys_close
- mlibc::sys_futex_wake
- mlibc::sys_futex_wait
- mlibc::sys_read
- mlibc::sys_open
- mlibc::sys_vm_map
- mlibc::sys_vm_unmap
- mlibc::sys_clock_get
Note that many of these functions are declared as weak symbols. You must include the relevant headers (e.g. <mlibc/all-sysdeps.hpp>) before providing definitions.
The sysdeps are not declared extern "C" and therefore have C++ linkage, so diverging from the declared signature changes the mangled name of your function. mlibc then cannot find your definition, which makes it look like the sysdep is missing. This is a silent breakage that is unlikely to produce compiler errors or even warnings.
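The mangling pitfall can be reproduced outside mlibc with a small sketch. The hooks sys_frob and sys_twiddle are hypothetical and exist only for this illustration, and the example relies on the GNU weak attribute as supported by GCC and Clang on ELF targets. One hook is defined with the declared signature, the other with a slightly different parameter type.

```cpp
#include <cassert>
#include <cerrno>

namespace demo {

// Hypothetical optional hooks, declared weak the way a libc header might.
[[gnu::weak]] int sys_frob(int fd);
[[gnu::weak]] int sys_twiddle(int fd);

// Matches the declaration exactly, so this defines the declared symbol.
int sys_frob(int fd) { return fd >= 0 ? 0 : EBADF; }

// `long` instead of `int`: a separate overload with a different mangled name.
// It compiles without diagnostics, yet sys_twiddle(int) stays undefined.
int sys_twiddle(long fd) { return fd >= 0 ? 0 : EBADF; }

// Calls the hook if the linker found a definition, else reports ENOSYS.
int call_or_enosys(int (*hook)(int), int fd) {
    assert(fd >= 0);
    if (!hook)
        return ENOSYS;
    return hook(fd);
}

int frob(int fd) { return call_or_enosys(static_cast<int (*)(int)>(&sys_frob), fd); }
int twiddle(int fd) { return call_or_enosys(static_cast<int (*)(int)>(&sys_twiddle), fd); }

} // namespace demo
```

Here demo::frob reaches the definition, while demo::twiddle behaves as if nothing had been implemented, even though a function with that name exists in the source.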
Most sysdep functions return an integer error code (0 for success, or a value that matches the sysdep's abi-bits errno definitions on failure) and return data via out parameters. Note that sysdeps shouldn't set errno directly - mlibc will set it from the error code you return.
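The calling convention described above can be sketched like this. Both functions are hypothetical stand-ins written for illustration: sys_write_sketch plays the role of a sysdep, and write_sketch plays the role of the libc wrapper that turns the returned error code into errno. Consult the mlibc headers for the real declarations.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>

namespace demo {

// Sysdep side: returns 0 or an error code and passes data back through an
// out parameter. It never touches errno itself.
int sys_write_sketch(int fd, const void *buf, size_t count, long *bytes_written) {
    assert(bytes_written != nullptr);
    (void)buf; // a real implementation would hand the buffer to the kernel
    if (fd != 1 && fd != 2)
        return EBADF; // pretend only stdout and stderr exist
    *bytes_written = static_cast<long>(count);
    return 0;
}

// libc side: translates the error code into errno and the usual -1 result.
long write_sketch(int fd, const void *buf, size_t count) {
    long written = 0;
    int error = sys_write_sketch(fd, buf, count, &written);
    if (error != 0) {
        errno = error;
        return -1;
    }
    return written;
}

} // namespace demo
```

Keeping errno handling on the libc side means a sysdep only has to map its kernel's failures onto the values in its abi-bits errno header.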
As a general strategy, it's a good idea to stub whatever's required to make things compile, and then add proper implementations later. For example:
#define STUB() \
({ \
__ensure(!"STUB function was called"); \
__builtin_unreachable(); \
})
namespace mlibc {
int sys_close(int fd) { STUB(); }
}
Adding source files and includes to the build system
Finally, tell meson about your sources and includes:
rtld_sources += files(
'sysdeps.cpp',
'syscall.cpp',
)
rtld_include_dirs += include_directories('include')
libc_sources += files(
'entry.cpp',
'sysdeps.cpp',
'syscall.cpp',
)
libc_include_dirs += include_directories('include')
Compiling a test program
At this point, you should be able to compile and link mlibc itself. Congratulations!
Install mlibc into the sysroot:
DESTDIR=${SYSROOT_DIR} ninja -C build install
Now we'll compile a simple hello world program that we can run on our kernel:
$ riscv64-demo-gcc -march=rv64gc -mabi=lp64d helloworld.c -o helloworld
Troubleshooting
Something not working? Here are some common issues to look out for:
- Your kernel isn't loading the ELF file correctly. Double check that you're loading the contents from file to the right addresses and zeroing the uninitialised portions.
- Your sys_anon_allocate function is broken. Try commenting out sys_anon_free and see if that helps. Try printing the addresses to see if you're getting unique allocations.
- The memory returned by sys_vm_map or sys_anon_allocate is not zeroed. mlibc expects the memory returned by these to be zeroed.
- You're pushing the arguments/environment/auxiliary vectors to the stack wrong. Double check you're pushing in the right order and everything is properly aligned.
- You're not saving and restoring the userspace state properly when a syscall happens.
- Your syscall glue is passing arguments in the wrong registers. Double check that the kernel and userspace agree on the order of arguments.
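The allocation requirements from the list above (unique addresses, zeroed memory) can be illustrated with a deliberately simple bump allocator. This is a sketch under assumptions and not how a real port would obtain memory: the function anon_allocate_sketch and the static arena are hypothetical, and a real sysdep would ask the kernel for anonymous pages instead. The (size, out pointer) shape and the 4096-byte page size are assumptions as well.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace demo {

constexpr size_t kPageSize = 4096; // assumed page size
constexpr size_t kArenaSize = 16 * kPageSize;

// Static arena standing in for pages a kernel would map.
alignas(kPageSize) static unsigned char arena[kArenaSize];
static size_t arena_used = 0; // always a multiple of kPageSize

// Hands out page-aligned, never-overlapping, zeroed blocks.
int anon_allocate_sketch(size_t size, void **pointer) {
    assert(pointer != nullptr);
    if (size == 0)
        return EINVAL;
    // Both sides are page multiples, so a request that fits still fits
    // after rounding up to whole pages.
    if (size > kArenaSize - arena_used)
        return ENOMEM;
    size_t rounded = (size + kPageSize - 1) & ~(kPageSize - 1);

    unsigned char *block = arena + arena_used;
    arena_used += rounded;       // never hand the same range out twice
    std::memset(block, 0, rounded); // callers rely on zeroed memory
    *pointer = block;
    return 0;
}

} // namespace demo
```

Printing the returned addresses from your real implementation and comparing them against properties like these is a quick way to rule the allocator out as the culprit.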
When stuck, we recommend using GDB to step through the program.
If all else fails, feel free to hop in the Managarm Discord server and ask for help in #mlibc-dev.
Next steps
Dynamic linking
So far you have compiled and run a statically linked executable. Dynamic linking does not take much more work than static linking did.
To get it to work, reconfigure mlibc without the -Ddefault_library=static option. Your kernel must then load both the executable and its interpreter (whose path is stored in the executable's PT_INTERP segment) and jump to the interpreter's entry point instead of the executable's.
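Finding the interpreter is the only genuinely new step for the kernel's loader. The sketch below shows one way to locate the PT_INTERP path in a 64-bit ELF image held in memory. The struct and function names (DemoElfHeader, DemoProgramHeader, demo_find_interp) are hypothetical; the field layouts follow the ELF64 specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// ELF64 file header layout; the struct name is hypothetical.
struct DemoElfHeader {
    unsigned char e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

// ELF64 program header layout; the struct name is hypothetical.
struct DemoProgramHeader {
    uint32_t p_type;
    uint32_t p_flags;
    uint64_t p_offset;
    uint64_t p_vaddr;
    uint64_t p_paddr;
    uint64_t p_filesz;
    uint64_t p_memsz;
    uint64_t p_align;
};

constexpr uint32_t kPtInterp = 3; // PT_INTERP in the ELF specification

// Returns the NUL-terminated interpreter path inside `image`, or nullptr if
// the image is malformed or statically linked (no PT_INTERP).
const char *demo_find_interp(const unsigned char *image, size_t size) {
    assert(image != nullptr);
    DemoElfHeader eh;
    if (size < sizeof(eh))
        return nullptr;
    std::memcpy(&eh, image, sizeof(eh));
    if (std::memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0)
        return nullptr;
    if (eh.e_phentsize != sizeof(DemoProgramHeader) || eh.e_phoff > size)
        return nullptr;

    for (uint16_t i = 0; i < eh.e_phnum; ++i) {
        uint64_t off = eh.e_phoff + uint64_t(i) * sizeof(DemoProgramHeader);
        if (off > size || sizeof(DemoProgramHeader) > size - off)
            return nullptr;
        DemoProgramHeader ph;
        std::memcpy(&ph, image + off, sizeof(ph));
        if (ph.p_type != kPtInterp)
            continue;
        if (ph.p_filesz == 0 || ph.p_offset > size || ph.p_filesz > size - ph.p_offset)
            return nullptr;
        if (image[ph.p_offset + ph.p_filesz - 1] != '\0')
            return nullptr; // path must be NUL-terminated
        return reinterpret_cast<const char *>(image + ph.p_offset);
    }
    return nullptr;
}
```

Once the path is known, the interpreter is loaded like any other ELF file and control is transferred to its entry point, with the auxiliary vector describing the main executable.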
Implementing more sysdeps
Most sysdeps in mlibc are defined as weak symbols, and so do not need to be defined right away. Whenever an unimplemented sysdep is hit, mlibc will log about it and return an error to the user application.
The list of sysdeps for every option can be found under its include directory. For example, the sysdeps for the POSIX option are declared in options/posix/include/mlibc/posix-sysdeps.hpp. As mentioned earlier, make sure your definitions match the ones in the header, as mlibc won't be able to find them otherwise.
Enabling more options
For the demo sysdeps, only the POSIX option is enabled. However, there are ports that will need more options enabled. The other toggleable options in mlibc are:
- Linux option, for Linux-specific system calls like epoll_create and statx. Note that this option requires you to provide Linux kernel headers.
- glibc option, for glibc-specific extensions like backtrace and getopt_long. Make sure your gcc port has gnu-user.h in its tm_file if you enable this option.
- BSD option, for BSD-specific extensions like openpty and getloadavg.
Enabling more mlibc features
There are some configuration options which change how mlibc handles things like loading libraries:
- MLIBC_MAP_DSO_SEGMENTS - By default, mlibc does not map files directly when loading libraries. This option enables the mapping of libraries in rtld, which is highly recommended if your OS supports mapping files to memory.
- MLIBC_MMAP_ALLOCATE_DSO - By default, mlibc will load libraries starting from a preset base memory address. This option makes it so that it will use the addresses returned by the sys_vm_map sysdep instead.
- MLIBC_MAP_FILE_WINDOWS - Enables the memory mapping of files like /etc/localtime when used by mlibc.
To enable them, add them to the root meson.build under your sysdep like so:
elif host_machine.system() == 'demo'
internal_conf.set10('CONFIG_NAME', true)
# ...
Porting software
With mlibc working, you are now ready to start porting software. For your sanity, use a compiler which targets your OS and has a sysroot directory.
If you run into build failures, make sure that you have enabled the options needed by the application as well as defined the ABI correctly. If you truly believe something is an mlibc bug, please reach out to us on Discord in #mlibc-dev or open an issue.
Upstreaming your port
Once your port is reasonably stable, feel free to submit it to us upstream. This ensures that your sysdeps won't be broken by any internal refactorings.
Though we won't be able to test your kernel on our CI, we require that you add your port to the 'Compile sysdeps' GitHub action, which checks that compilation succeeds.
It's a good idea to include a .clang-format file so that any changes we make to your code will be formatted to your liking.
See the pull request adding Astral sysdeps for an example to follow.