Increased testing has identified a number of problems and is starting to lead to solutions. This post will go over the status of some recent changes as well as test results.
Filesystem
The filesystem update is now almost complete, with only around 400 lines of legacy code remaining to be replaced in filesystem syscall functions.
Testing revealed one major stability issue that resulted in inode table exhaustion. This turned out to be a simple reference counting error and has been fixed.
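To illustrate how a reference counting slip can exhaust an inode table, here is a minimal sketch. It is not the actual kernel code: the names `iget`, `iput`, `ifree_count` and the tiny `NINODE` limit are assumptions made for the example. The idea is that a slot can only be reused once its count drops back to zero, so any code path that takes a reference and forgets to release it permanently pins a slot.

```c
#include <assert.h>
#include <stddef.h>

#define NINODE 4  /* hypothetical size, deliberately tiny so exhaustion is easy to see */

struct inode {
    unsigned int inum;  /* on-disk inode number */
    int ref;            /* in-memory reference count; 0 means the slot is free */
};

static struct inode itable[NINODE];

/* Return the in-memory inode for inum, taking one reference.
 * Returns NULL when every slot is still referenced (table exhausted). */
struct inode *iget(unsigned int inum)
{
    struct inode *empty = NULL;

    for (int i = 0; i < NINODE; i++) {
        struct inode *ip = &itable[i];
        if (ip->ref > 0 && ip->inum == inum) {
            ip->ref++;          /* already cached: share the slot */
            return ip;
        }
        if (empty == NULL && ip->ref == 0)
            empty = ip;         /* remember the first free slot */
    }
    if (empty == NULL)
        return NULL;            /* nothing free: this is the exhaustion case */
    empty->inum = inum;
    empty->ref = 1;
    return empty;
}

/* Drop one reference. Every successful iget() needs exactly one matching iput(). */
void iput(struct inode *ip)
{
    assert(ip != NULL && ip->ref > 0);
    ip->ref--;
}

/* Number of slots that can currently be reused. */
int ifree_count(void)
{
    int n = 0;
    for (int i = 0; i < NINODE; i++)
        if (itable[i].ref == 0)
            n++;
    return n;
}
```

With balanced `iget`/`iput` calls the free count returns to `NINODE`. If a syscall path skips the `iput` (for example on an early error return), each call leaks one slot until `iget` starts returning NULL.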
Overall the filesystem code is mostly stable for simple deployments but extensions like support for large disks are still in progress. The filesystem code supports using filesystems from earlier versions, allowing for more stable & more full featured options.
Some VFS-like support also exists by way of a “drives” structure for multiplexing filesystems, which is designed to work with the container system, but this will be discussed in another post.
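As a rough idea of what multiplexing filesystems through a “drives” table could look like, here is a small sketch. Everything in it is hypothetical (the `struct drives` layout, `drives_mount`, `drives_open`, and the two stub backends) and the real structure may differ considerably. It only shows the general pattern of routing a request to a per-drive table of filesystem operations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NDRIVE 4  /* hypothetical limit on mounted drives */

/* Operations a filesystem backend provides (only one shown here). */
struct fs_ops {
    const char *fstype;
    int (*open)(const char *path);
};

struct drive {
    const char *name;          /* name used to select the drive */
    const struct fs_ops *ops;  /* NULL when the slot is unused */
};

/* One table of drives; a container could plausibly hold its own copy. */
struct drives {
    struct drive slot[NDRIVE];
};

/* Stub backends standing in for real filesystem drivers. */
static int oldfs_open(const char *path) { (void)path; return 1; }
static int newfs_open(const char *path) { (void)path; return 2; }

const struct fs_ops oldfs_ops = { "oldfs", oldfs_open };
const struct fs_ops newfs_ops = { "newfs", newfs_open };

/* Attach a backend under a name. Returns the slot index, or -1 if the table is full. */
int drives_mount(struct drives *d, const char *name, const struct fs_ops *ops)
{
    assert(d != NULL && name != NULL && ops != NULL);
    for (int i = 0; i < NDRIVE; i++) {
        if (d->slot[i].ops == NULL) {
            d->slot[i].name = name;
            d->slot[i].ops = ops;
            return i;
        }
    }
    return -1;
}

/* Route an open request to whichever backend owns the named drive.
 * Returns the backend's result, or -1 if no such drive is mounted. */
int drives_open(const struct drives *d, const char *name, const char *path)
{
    assert(d != NULL && name != NULL);
    for (int i = 0; i < NDRIVE; i++) {
        const struct drive *dr = &d->slot[i];
        if (dr->ops != NULL && strcmp(dr->name, name) == 0)
            return dr->ops->open(path);
    }
    return -1;
}
```

A table like this would let an older-format filesystem and a newer one be mounted at the same time, with callers only needing to know the drive name.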
Compiler
Testing also revealed some problems when using the filesystem code with the new compiler backend. These have partly been worked around with rewrites, but some compiler bugs are still affecting the filesystem code.
The compiler is likely to still have some known bugs at release, so it will initially ship as a demo alongside the new system (building the system with GCC will be more stable for the near future). Once features are finished, more compiler testing will be enabled, hopefully resulting in a very stable platform down the track.
Scheduler
A few tests have been crashing the scheduler, which sounds bad, but I have already written a new fallback scheduler implementation which should address any such stability issues while the main scheduler implementation is improved.
So as the kernel code stabilises there should actually be separate “stable” and “performance” scheduler options, and hopefully these will be usable side-by-side on the same system.
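One way separate scheduler options could coexist is a small table of function pointers, selected per CPU. The sketch below assumes exactly that and is not the kernel's real design: `struct sched_ops`, `cpu_schedule`, the round-robin “stable” policy and the priority-based “performance” policy are all stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum pstate { P_UNUSED, P_RUNNABLE };

struct task {
    enum pstate state;
    int priority;  /* higher value means more urgent (an assumption for this sketch) */
};

/* A scheduler policy: given the task table and the last task run,
 * return the index of the next task to run, or -1 if none is runnable. */
struct sched_ops {
    const char *name;
    int (*pick_next)(const struct task *tasks, int n, int last);
};

/* "Stable" stand-in: plain round robin starting after the last task run. */
static int rr_pick(const struct task *t, int n, int last)
{
    for (int i = 1; i <= n; i++) {
        int idx = (last + i) % n;
        if (t[idx].state == P_RUNNABLE)
            return idx;
    }
    return -1;
}

/* "Performance" stand-in: always pick the highest-priority runnable task. */
static int prio_pick(const struct task *t, int n, int last)
{
    int best = -1;
    (void)last;
    for (int i = 0; i < n; i++)
        if (t[i].state == P_RUNNABLE &&
            (best < 0 || t[i].priority > t[best].priority))
            best = i;
    return best;
}

const struct sched_ops sched_stable = { "stable", rr_pick };
const struct sched_ops sched_performance = { "performance", prio_pick };

/* Each CPU carries its own policy pointer, so both policies can run side by side. */
struct cpu {
    const struct sched_ops *sched;
    int last;  /* index of the last task run on this CPU, -1 initially */
};

int cpu_schedule(struct cpu *c, const struct task *tasks, int n)
{
    assert(c != NULL && c->sched != NULL);
    int next = c->sched->pick_next(tasks, n, c->last);
    if (next >= 0)
        c->last = next;
    return next;
}
```

Keeping the fallback policy this simple is the point: it has very little state to get wrong, so it can stay available while the more complex policy is still being debugged.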
Container Security
I’ve avoided adding legacy security features like user/group support or fine-grained capabilities in favour of native support for containers. This should make for easier real-world security as the container model can easily be adapted for multi-user and multi-app security barriers, and this model fits the design of the rest of the system.
Container support half-exists in the kernel already so some things already go through the container layer, but proper management of containers & other security features won’t be fully enabled until later. Due to the system being designed around this model it should be fairly robust relative to early container implementations, with a different set of core kernel structures for each “container” and possibly eventually some address space isolation between them too (not yet implemented).
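To make the “different set of core kernel structures for each container” idea concrete, here is a minimal sketch in which each container owns its own process table and PID counter. The names (`struct container`, `container_spawn`, `container_find`) and sizes are hypothetical, and the real kernel structures are certainly more involved. The point is only that a lookup takes the container as an argument, so it cannot see entries belonging to another container.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 8  /* hypothetical per-container process limit */

struct proc {
    int used;
    int pid;
};

/* Each container carries its own copy of the core tables. */
struct container {
    int id;
    int next_pid;
    struct proc procs[NPROC];
};

void container_init(struct container *c, int id)
{
    assert(c != NULL);
    c->id = id;
    c->next_pid = 1;  /* PIDs restart at 1 inside every container */
    for (int i = 0; i < NPROC; i++) {
        c->procs[i].used = 0;
        c->procs[i].pid = 0;
    }
}

/* Allocate a process slot inside one container. Returns the new PID, or -1 if full. */
int container_spawn(struct container *c)
{
    assert(c != NULL);
    for (int i = 0; i < NPROC; i++) {
        if (!c->procs[i].used) {
            c->procs[i].used = 1;
            c->procs[i].pid = c->next_pid++;
            return c->procs[i].pid;
        }
    }
    return -1;
}

/* Look up a PID, searching only the given container's table. */
struct proc *container_find(struct container *c, int pid)
{
    assert(c != NULL);
    for (int i = 0; i < NPROC; i++)
        if (c->procs[i].used && c->procs[i].pid == pid)
            return &c->procs[i];
    return NULL;
}
```

Because the tables are separate rather than one shared table with an owner field, a missed permission check in a lookup path has less to leak. The sketch does not provide address space isolation between containers.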
Hardware Ports
Testing on real hardware has finally commenced but has been slightly delayed by the need for more testing of core features (which I’ve been doing in an emulator), so I hope to get back to that shortly.
Initial testing has been done on the Ky X1 CPU, with memory & multicore support fully or almost fully working as of the last tests, but there are no hardware drivers yet for testing anything beyond a serial console & RAM disk. This work is expected to also be applicable for booting on other RISC-V 64 devices, but I don’t know yet how similar the driver needs will be across devices.
Once I can get back to hardware work I hope to at least get GPIO drivers working on one or two boards and add more interrupt handling support so that drivers for simple robotics devices can start being implemented.
Networking & Graphics
Some work has been done on these subsystems, at least in planning, but they probably won’t be fully enabled at release. Some demos may be possible soon, but don’t hold your breath.
Kernel Audit
The kernel is currently around 64% new code. Old code continues to be replaced, but early development also swelled the original codebase a little, so just under 5,000 lines of “old” code are still left to replace.
The filesystem code is now almost entirely new and supports some extended format versions to start testing more heavy duty workloads. The few bits of old filesystem-related code are being steadily replaced now.
The scheduler code is partly new but leans heavily on old functions in proc.c. Further scheduler improvements will probably be synchronised with the container support.
The rest of the kernel is about half new/half old, with things like support for device tree blobs added recently but some other startup code & utility functions being old. Some unused or redundant code is still lying around (such as currently having two copies of some page table functions), so the code statistics are a little skewed by such things but represent the in-development codebase.
Userland & Tests
Some work has been done on new “userland” programs & ports, so testing has also been done with some almost-real-world programs, but this will be covered in another post. The “libc” implementation is almost entirely my own code. It’s missing many functions but is otherwise working quite well.
The old test suite itself is still not completely working so around half the tests are still disabled, but support is improving. An additional test also exists for multithreading, which is partly stable and will probably be enabled at release but with limited functionality.
Summary
With the filesystem code relatively stable and stability of the scheduler improving I seem to be on track to deliver a stable and high performance core OS, but of course am behind schedule and need to finish many things.