We ported a flash file system to Linux running on an ARM board and benchmarked it. The port was done via FUSE, a user space file system mechanism in which the file system module itself runs as a process inside Linux. File I/O calls from other processes are routed to the FUSE process via inter-process communication (IPC). This IPC is enabled by a low-level FUSE driver running in the kernel.
http://en.wikipedia.org/wiki/Filesystem_in_Userspace
The above diagram provides an overview of the FUSE architecture. The ported file system was proprietary and was not meant to be open sourced; from this perspective, a file system as a user space library made a lot of sense.
The primary drawback of FUSE is its performance. The control path timing for a 2 KB file read is broken down below. Note that 2 KB corresponds to the NAND page size.
1. User space app to kernel FUSE driver switch. – 15 µs
2. Kernel space FUSE to user space FUSE library process context switch. – 1 to 15 ms
3. Switch back into kernel mode for flash device driver access. – NAND MTD driver overhead, excluding device delay, is in the µs range.
4. Kernel to FUSE with the data read from flash. – 350 µs (NAND dependent) + 15 µs + 15 µs (kernel to user mode switch and back)
5. From FUSE library back to FUSE kernel driver process context switch. – 1 to 15 ms
6. Finally, from FUSE kernel driver to the application with the data. – 15 µs
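To see how the numbers above add up, here is a small sketch that totals the six steps for the best case (1 ms per process context switch) and the worst case (15 ms). The names `STEPS`, `total_us` and `context_switch_share` are made up for this illustration. Step 3 is only described as being "in µs", so it is counted as 0 here, which is an assumption.

```python
# Per-step control-path costs in microseconds, taken from the list above.
# Each entry is (description, best_case_us, worst_case_us).
# Assumption: step 3 (MTD driver overhead) has no exact figure in the text,
# so it is counted as 0 here.
STEPS = [
    ("app -> kernel FUSE driver", 15, 15),
    ("kernel FUSE -> user space FUSE process (context switch)", 1_000, 15_000),
    ("MTD driver overhead (unspecified, assumed 0)", 0, 0),
    ("NAND page read + two mode switches", 350 + 15 + 15, 350 + 15 + 15),
    ("FUSE process -> kernel FUSE driver (context switch)", 1_000, 15_000),
    ("kernel FUSE driver -> app", 15, 15),
]


def total_us(steps, column):
    """Sum one column (1 = best case, 2 = worst case) over all steps."""
    return sum(step[column] for step in steps)


def context_switch_share(steps, column):
    """Fraction of the total spent in the two process context switches."""
    switches = sum(s[column] for s in steps if "context switch" in s[0])
    return switches / total_us(steps, column)


if __name__ == "__main__":
    for label, col in (("best", 1), ("worst", 2)):
        total = total_us(STEPS, col)
        share = context_switch_share(STEPS, col)
        print(f"{label}: {total} us total, {share:.0%} in context switches")
```

With these figures a single 2 KB read costs roughly 2.4 ms in the best case and about 30.4 ms in the worst case, and the two process context switches account for about 83% to 99% of that time.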
As you can see, the two process context switches each take milliseconds, which kills the whole idea. If performance is crucial, profile the context switch overhead of the operating system before attempting a FUSE port. A loadable kernel module seems to be the best alternative.
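One rough way to profile that overhead is a pipe ping-pong between two processes. The sketch below is not the measurement we used; it is a minimal illustration, and the function name `pipe_round_trip_us` is made up. It bounces one byte between a parent and a forked child and reports the average round trip in microseconds. The result includes system call and Python interpreter overhead, so treat it as an upper bound; a C version of the same loop would give tighter numbers on a small ARM board.

```python
import os
import time


def pipe_round_trip_us(iterations=1000):
    """Average round-trip time, in microseconds, of a 1-byte ping-pong
    between a parent and a forked child over two pipes."""
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back until the parent closes its write end.
        os.close(p2c_w)
        os.close(c2p_r)
        while True:
            data = os.read(p2c_r, 1)
            if not data:  # EOF: parent is done
                break
            os.write(c2p_w, data)
        os._exit(0)

    # Parent: send a byte, block until the child echoes it back.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start

    os.close(p2c_w)  # signals EOF to the child
    os.close(c2p_r)
    os.waitpid(pid, 0)
    return elapsed / iterations * 1e6


if __name__ == "__main__":
    print(f"average round trip: {pipe_round_trip_us():.1f} us")
```

On a single core, each round trip forces at least two process switches, so half the reported value is a rough per-switch estimate. On a multi-core system the two processes may run on different cores, in which case the figure reflects wake-up latency more than switch cost; pinning both to one core gives a closer comparison with the FUSE path.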