A simplified model of Fil-C

(corsix.org)

137 points | by aw1621107 7 hours ago

68 comments

  • whatsakandr 7 hours ago

    Fil-C is one of the most underrated projects I've ever seen. All this "rewrite it in rust for safety" just sounds stupid when you can compile your C program completely memory safe.

    • tialaramex 6 hours ago

      So, a few things, some of which others have touched on:

      1. Fil-C is slower and bigger. Noticeably so. If you were OK with slower and bigger then the rewrite you should have considered wasn't to Rust in the last ten years but to Java or C# much earlier. That doesn't invalidate Fil-C's existence, but I want to point that out.

      2. You're still writing C. If the program is finished, or you're just occasionally doing a little bit of maintenance, that's fine. I wrote C for most of my career, it's not a miserable language, and you are avoiding a rewrite. But if you're writing much new code Rust is just so much nicer. I stopped writing any C when I learned Rust.

      3. This is runtime safety and you might need more. Rust gives you a bit more, often you can express at compile time things Fil-C would only have checked at runtime, but you might need everything and languages like WUFFS deliver that. WUFFS doesn't have runtime checks. It has proved to its satisfaction during compilation that your code is safe, so it can be executed at runtime in absolute safety. Your code might be wrong. Maybe your WUFFS GIF flipper actually makes frog GIFs purple instead of flipping them. But it can't crash, or execute x86 machine code hidden in the GIF, or whatever, that's the whole point.

      • whatsakandr 4 hours ago

        Yes it's slower, but it works. It's being built by one single dad who focused on compatibility before speed.

        I'm not convinced that tying the lifetimes into the type system is the correct way to do memory management. I've read too many articles of people being forced into refactoring the entire codebase to implement a feature.

        • brucehoult 3 hours ago

          > built by one single dad

          Not some random dad, but a GC expert and former leader of the JavaScript VM team at Apple.

        • achierius 4 hours ago

          I can tell you that it's not that he's setting aside speed -- the fact that it's as fast as it is is an achievement. But there is a degree of unavoidable overhead -- IIRC his goal is to get it down to 20-30% for most workloads, but beyond that you're running into the realities of runtime bounds checks, materializing the flight ptrs, etc.

          • gmueckl 3 hours ago

            20% to 30% slower would be amazing for all the extra runtime work that is required in my limited understanding. This would be good enough for a whole lot of serious applications.

      • cxr 3 hours ago

        > Fil-C is slower and bigger

        It's not any slower or (proportionally) bigger compared to the experience you would have had 20 years ago running all sorts of utilities that happen to be the best candidates for Fil-C, and people got along just fine. How fast do ls and mkdir need to be?

      • up2isomorphism 2 hours ago

        The fundamental issue is that whether an address is “safe” is a runtime property: in some cases you can decide it at compile time, but not always. Forcing that decision while coding just handicaps you in the name of being “safe”, which you can equally do in C (or almost any language, if you want to).

    • gnabgib 6 hours ago

      Not here, lots of discussion:

      Fil-Qt: A Qt Base build with Fil-C experience (143 points, 3 months ago, 134 comments) https://news.ycombinator.com/item?id=46646080

      Linux Sandboxes and Fil-C (343 points, 4 months ago, 156 comments) https://news.ycombinator.com/item?id=46259064

      Ported freetype, fontconfig, harfbuzz, and graphite to Fil-C (67 points, 5 months ago, 56 comments) https://news.ycombinator.com/item?id=46090009

      A Note on Fil-C (241 points, 5 months ago, 210 comments) https://news.ycombinator.com/item?id=45842494

      Notes by djb on using Fil-C (365 points, 6 months ago, 246 comments) https://news.ycombinator.com/item?id=45788040

      Fil-C: A memory-safe C implementation (283 points, 6 months ago, 135 comments) https://news.ycombinator.com/item?id=45735877

      Fil's Unbelievable Garbage Collector (603 points, 7 months ago, 281 comments) https://news.ycombinator.com/item?id=45133938

    • omcnoe 6 hours ago

      The issue with Fil-C is that it's runtime memory safety. You can still write memory-unsafe code, just now it is guaranteed to crash rather than being a potential vulnerability.

      Guaranteed memory safety at compile time is clearly the better approach when you care about programs that are both functionally correct and memory safe. If I'm writing something that takes untrusted user input, like a web API, memory safety issues still end up as denial-of-service vulns. That's better, but it's still not great.

      Not to disparage the Fil-C work, but the runtime approach has limitations.

      • pizlonator 6 hours ago

        > write memory-unsafe code, just now it is guaranteed to crash

        If it's guaranteed to crash, then it's memory-safe.

        If you dislike that definition, then no mainstream language is memory-safe, since they all use crashes to handle out-of-bounds array accesses.

        • omcnoe 6 hours ago

          I don't think that's a useful way of thinking about memory-safety - a C compiler that compiles any C program to `main { exit(-1); }` is completely memory-safe. It's easy to design a memory-safe language/compiler, the question is what compromises are being made to achieve it.

          Other languages have runtime exceptions on out-of-bounds access, Fil-C has unrecoverable crashes. This makes it pretty unsuitable to a lot of use cases. In Go or Java (arbitrary examples) I can write a web service full of unsafe out-of-bounds array reads, any exception/panic raised is scoped to the specific malformed request and doesn't affect the overall process. A design that's impossible in Fil-C.

          • wren6991 2 hours ago

            You need to distinguish safety properties from liveness properties.

          • dzaima 5 hours ago

            Then you run into the problem of infinite loops, which nothing can prevent (sans `main { exit(-1); }` or other forms of losing Turing-completeness), and are worse than crashes - at least on crashes you can quickly restart the program (something something erlang).

            try-catch isn't a particularly complete solution either if you have any code outside of it (at the very least, the catch arm) or if data can get preserved across iterations that can easily get messed up if left half-updated (say, caches, poisoned mutexes, stuck-borrowed refcells) so you'll likely want a full restart to work well too, and might even prefer it sometimes.

          • wakawaka28 5 hours ago

            I don't think runtime error handling is impossible in Fil-C, at least in theory. But the use cases for that are fairly limited. Most errors like this are not anticipated, and if you did encounter them then there's little or nothing you can do useful in response. Furthermore, runtime handling to continue means code changes, thus coupling to the runtime environment. All of these things are bad. It is usually acceptable to fail fast and restart, or at least report the error.

            • pizlonator 5 hours ago

              I could have made Fil-C’s panic be a C++ exception if I had thought that it was a good idea. And then you could catch it if that’s what tickled your pickle

              I just don’t like that design. It’s a matter of taste

              • wakawaka28 an hour ago

                That's actually not a bad idea, since apparently it can be used for whole operating systems. But certainly, if you started doing that, any code using the exception would need to be exclusive to Fil-C to benefit from that.

      • ori_b 6 hours ago

        By that token, Rust is also memory unsafe: array bounds checks and stack overflow are runtime checks.

        • p1necone 5 hours ago

          Why are you talking like this is black and white? Many things being compile time checkable is better than no things being compile time checkable. The existence of some thing in rust that can only be checked at runtime does not somehow make all the compile time checks that are possible irrelevant.

          (Also I think the commenter you're replying to just worded their comment inaccurately, code that crashes instead of violating memory safety is memory safe, a compilation error would just have been more useful than a runtime crash in most cases)

        • DetroitThrow 5 hours ago

          There are several ways to safely provide array bounds check hints to the Rust compiler, in fact there's a whole cookbook. But for many cases, yep, runtime check.

      • 100ms 6 hours ago

        • zamadatix 6 hours ago

          .get() will bounds check and the compiler will optimize that away if it can prove safety at compile time. That leaves you 3 options made available in Rust:

          - Explicitly unsafe

          - Runtime crash

          - Runtime crash w/ compile time avoidance when possible

        • omcnoe 5 hours ago

          https://play.rust-lang.org/?version=stable&mode=debug&editio...

          Catch the panic & unwind, safe program execution continues. Fundamentally impossible in Fil-C.

          • uecker 5 hours ago

            I do not know how Fil-C handles this, but it could raise a signal that one can then catch.
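
            A minimal sketch of that idea in plain POSIX C, not a description of how Fil-C behaves. It assumes a runtime that reports a violation by raising a signal; `checked_read`, `handle_request`, `on_violation` and the choice of SIGUSR1 are all made up for illustration. The handler jumps back to a recovery point so that only the current request is abandoned:

            ```c
            #define _POSIX_C_SOURCE 200809L
            #include <setjmp.h>
            #include <signal.h>
            #include <stddef.h>

            static sigjmp_buf recover_point;  /* where a failed request resumes */

            static void on_violation(int sig) {
                (void)sig;
                siglongjmp(recover_point, 1);  /* abandon the current request */
            }

            /* Hypothetical stand-in for a runtime check: raises SIGUSR1 on a bad index. */
            static int checked_read(const int *buf, size_t len, size_t i) {
                if (i >= len) raise(SIGUSR1);
                return buf[i];
            }

            /* Returns 0 on success, -1 if the request triggered a violation. */
            int handle_request(const int *buf, size_t len, size_t i, int *out) {
                struct sigaction sa = {0};
                sa.sa_handler = on_violation;
                sigemptyset(&sa.sa_mask);
                sigaction(SIGUSR1, &sa, NULL);
                /* Saving the signal mask (second argument 1) re-enables SIGUSR1 after a jump. */
                if (sigsetjmp(recover_point, 1) != 0) return -1;  /* arrived via the handler */
                *out = checked_read(buf, len, i);
                return 0;
            }
            ```

            Anything the failed request left half-updated stays half-updated after the jump, so this only scopes the failure; it does not clean up after it.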

            • jz391 31 minutes ago

              Reminds me of a commercial project I did for my old University department around 1994. The GUI was ambitious and written in Motif, which was a little buggy and leaked memory. So... I ended up catching any SEGVs, saving state, and restarting the process, with a short message in a popup telling the user to wait. Obviously not guaranteed, but surprisingly it mostly worked. With benefit of experience & hindsight, I should have just (considerably) simplified it: I had user-configurable dialogs creating widgets on the fly etc, none of which was really required.

          • wakawaka28 5 hours ago

            Seems like a niche use case. If it needs code to handle, it's also not apples to apples...

            • omcnoe 5 hours ago

              It's an apple to non-existent-apple comparison. Fil-C can't handle it even with extra code because Fil-C provides no recovery mechanism.

              I also don't think it's that niche a use case. It's one encountered by every web server or web client (scope exception to single connection/request). Or anything involving batch processing, something like "extract the text from these 10k PDFs on disk".

              • wakawaka28 5 hours ago

                Sure, it's not implemented in Fil-C because it is very new and the point of it is to improve things without extensive rewrites.

                Generally, I think one could want to recover from errors. But error recovery is something that needs to be designed in. You probably don't want to catch all errors, even in a loop handling requests for an application. If your application isn't designed to handle the same kinds of memory access issues as we're talking about here, the whole thing turns into non-existent-apples to non-existent-apples lol.

      • boredatoms 6 hours ago

        For some things the just-crash is ok, like cli usage of curl

      • forrestthewoods 6 hours ago

        Rust also has run-time crash checks in the form of run-time array bounds checks that panic. So let us not pretend that Rust strictly catches everything at compile-time.

        It’s true that, all else being equal, compile-time checks are better than run-time. I love Rust. But Rust is only practical for a subset of correct programs. Rust is terrible for things like games where Rust simply cannot prove at compile-time that usage is correct. And inability to prove correctness does NOT imply incorrectness.

        I love Rust. I use it as much as I can. But it’s not the one true solution to all things.

        • omcnoe 5 hours ago

          Not trying to be a Rust advocate and I actually don't work in it personally.

          But Rust provides both checked alternatives to indexed reads/writes (compile time safe returning Option<_>), and an exception recovery mechanism for out-of-bounds unsafe read/write. Fil-C only has one choice which is "crash immediately".

          • uecker 5 hours ago

            What makes you think that one can not add an explicit bound check in C?
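
            For illustration, an explicit check in ordinary C could look roughly like this. `checked_get` is a hypothetical helper, not part of Fil-C or any library; it reports failure to the caller instead of crashing, much like an Option-returning accessor:

            ```c
            #include <stdbool.h>
            #include <stddef.h>

            /* Writes buf[i] to *out and returns true only when i is below len.
               On failure *out is left untouched and the caller decides what to do. */
            bool checked_get(const int *buf, size_t len, size_t i, int *out) {
                if (i >= len) return false;
                *out = buf[i];
                return true;
            }
            ```

            The check is only as good as the `len` the caller passes in, which is the part the language does not verify for you.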

            • tialaramex 4 hours ago

              It's trickier than it looks because C has mutable aliases. So, in C our bounds check might itself be a data race! Make sure you cope
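
              A sketch of one way to cope, under narrow assumptions: the storage has a fixed capacity, and the length and elements are C11 atomics so that concurrent access is not a data race. `SharedBuf` and `shared_get` are hypothetical names. The point is that the length is read exactly once and every later decision uses that local copy:

              ```c
              #include <stdatomic.h>
              #include <stdbool.h>
              #include <stddef.h>

              #define CAPACITY 8

              typedef struct {
                  _Atomic size_t len;          /* may be changed by another thread */
                  _Atomic int data[CAPACITY];  /* fixed storage, never reallocated */
              } SharedBuf;

              bool shared_get(SharedBuf *b, size_t i, int *out) {
                  size_t len = atomic_load(&b->len);   /* single read of the shared length */
                  if (len > CAPACITY) len = CAPACITY;  /* never trust it beyond the storage */
                  if (i >= len) return false;          /* check against the local copy */
                  *out = atomic_load(&b->data[i]);
                  return true;
              }
              ```

              This does not help if the buffer itself can be freed or resized by another alias; that case needs a lock or some other ownership discipline.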

              • uecker 3 hours ago

                Depending on what you are doing, yes. But the statement I responded to "your only choice is crash" is certainly wrong.

            • omcnoe 4 hours ago

              If you can correctly add all the required explicit bounds checks in C what do you need Fil-C for?

              • kimixa 3 hours ago

                Same reason any Turing-complete language needs any constructs - to help the programmer and identify/block "unsafe" constructs.

                Programming languages have always been more about what they don't let you do rather than what they do - and where that lies on the spectrum of blocking "Possibly Valid" constructs vs "Possibly Invalid".

              • uecker 4 hours ago

                For temporal memory safety.

        • wakawaka28 5 hours ago

          >And inability to prove correctness does NOT imply incorrectness.

          And inability to prove incorrectness does NOT imply correctness. I think most Rust users don't understand either, because of the hype.

    • pizlonator 6 hours ago

      Thanks for the love man!

      > "rewrite it in rust for safety" just sounds stupid

      To be fair, Fil-C is quite a bit slower than Rust, and uses more memory.

      On the other hand, Fil-C supports safe dynamic linking and is strictly safer than Rust.

      It's a trade off, so do what you feel

      • masfuerte 6 hours ago

        Minor nitpick. Or confusion on my part. In the filc_malloc function the call to calloc doesn't seem to allocate enough memory to store an AllocationRecord for each location in visible_bytes. Should it be:

            ar->invisible_bytes = calloc(length, sizeof(AllocationRecord));

        • pizlonator 6 hours ago

          Note, I'm not the author of the OP.

          I am the author of Fil-C

          If you want to see my write-ups of how it works, start here: https://fil-c.org/how

          • masfuerte 6 hours ago

            Thanks, I did confuse you for the author of the article. Your InvisiCaps explanation is clearer than this "simplified" one.

    • dataflow 6 hours ago

      > Fil-C is one of the most underrated projects I've ever seen

      When's the last time you told a C/C++ programmer you could add a garbage collector to their program, and saw their eyes light up?

      • fulafel 2 hours ago

        A lot of the frameworks do it. There's RC in GNOME/GTK, C++ stdlib, and built into Objective-C.

        And of course it's easy to think of lots of apps that heavily use those or another form of GC.

        • dataflow 2 hours ago

          Regardless of what you consider a GC (let's not have that debate for the millionth time on the internet...), for the point I was trying to make, I was not including RC as a form of GC. And I don't think Fil-C relies solely on RC either.

      • FuckButtons 6 hours ago

        Exactly, the Venn diagram of programmers using c/c++ and programmers who can use a garbage collector for their workload is two circles.

        • fweimer an hour ago

          A lot of C++ programmers use C++ and garbage collection daily because their C++ compiler uses a tracing garbage collector.

          https://gcc.gnu.org/onlinedocs/gccint/Type-Information.html

        • brucehoult 3 hours ago

          Definitely not true. I've been using Boehm GC with my C/C++ programs for decades — since the 90s, at least.

          • dataflow 2 hours ago

            Does this also hold true when you look at codebases that others also worked on, rather than just you?

        • pizlonator 6 hours ago

          Except for:

          - Me. I'm a C++ programmer.

          - Any C++ programmer who has added a GC to their C++ program. (Like the programmers who used the web browser you're using right now.)

          - Folks who are already using Fil-C.

          • FuckButtons 5 hours ago

            I’m also a C++ programmer, I can’t even use half of the C++ stdlib for real time thread work, I certainly can’t use a GC.

            • pizlonator 5 hours ago

              There are many C++ programmers and we are not the same!

              My original foray into GCs was making real time ones, and the Fil-C GC is based on that work. I haven’t fully made it real time friendly (the few locks it has aren’t RT-friendly) but if I had more time I could make it give you hard guarantees.

              It’s already fully concurrent and on the fly, so it won’t pause you.

            • cxr 3 hours ago

              ∃ ≠ ∀

    • kbolino 6 hours ago

      Fil-C has two major downsides: it slows programs down and it doesn't interoperate with non-Fil-C code, not even libc. That second problem complicates using it on systems other than Linux (even BSDs and macOS) and integrating it with other safe languages.

      • pizlonator 6 hours ago

        You’re not wrong but both problems could be alleviated by sending patches :-)

        • kbolino 6 hours ago

          I would never say it's impossible, and you've done some amazing work, but I do wonder if the second problem is feasibly surmountable. Setting aside cross-language interop, BYOlibc is not really tolerated on most systems. Linux is fairly unique here with its strongly compatible syscall ABI.

          • pizlonator 6 hours ago

            You're right that it's challenging. I don't think it's infeasible.

            Here's why:

            1. For the first year of Fil-C development, I was doing it on a Mac, and it worked fine. I had lots of stuff running. No GUI in that version, though.

            2. You could give Fil-C an FFI to Yolo-C. It would look sort of like the FFIs that Java, Python, or Ruby do. So, it would be a bit annoying to bridge to native APIs, but not infeasible. I've chosen not to give Fil-C such an FFI (except a very limited FFI to assembly for constant time crypto) because I wanted to force myself to port the underlying libraries to Fil-C.

            3. Apple could do a Fil-C build of their userland, and MS could do a Fil-C build of their userland. Not saying they will do it. But the feasibility of this is "just" a matter of certain humans making choices, not anything technical.

      • kvemkon 6 hours ago

        > it slows programs down

        Interesting. How costly would hardware acceleration support for Fil-C code be?

        • kbolino 6 hours ago

          I think there are two main avenues for hardware acceleration: pointer provenance and garbage collection. The first dovetails with things like CHERI [1] but the second doesn't seem to be getting much hardware attention lately. It has been decades since Lisp Machines were made, and I'm not aware of too many other architectures with hardware-level GC support. There are more efficient ways to use the existing hardware for GC though, as e.g. Go has experimented with recently [2].

          [1]: https://en.wikipedia.org/wiki/Capability_Hardware_Enhanced_R...

          [2]: https://go.dev/blog/greenteagc

          • Findecanor 4 hours ago

            There are algorithms to align allocations and use metadata in unused pointer bits to encode object start addresses. That would allow Fil-C's shadow memory to be reduced to a tag bit per 8-byte word (like 32-bit CHERI), at the expense of more bit shuffling. But that shuffling could certainly be a candidate for hardware acceleration.
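
            A rough sketch of what such a scheme could look like, assuming 64-bit pointers whose top 6 bits are free and allocations that are sized and aligned to powers of two. This is not Fil-C's actual representation; `tag_pointer`, `object_base` and `access_in_bounds` are made-up names, and addresses are handled as plain integers:

            ```c
            #include <stdbool.h>
            #include <stdint.h>

            #define TAG_SHIFT 58  /* assumption: the top 6 pointer bits are unused */
            #define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

            /* Pack log2(object size) into the top bits of the address. */
            uint64_t tag_pointer(uint64_t addr, unsigned log2_size) {
                return ((uint64_t)log2_size << TAG_SHIFT) | (addr & ADDR_MASK);
            }

            /* Recover the object's start address by clearing the low offset bits. */
            uint64_t object_base(uint64_t tagged) {
                unsigned log2_size = (unsigned)(tagged >> TAG_SHIFT);
                uint64_t addr = tagged & ADDR_MASK;
                return addr & ~((UINT64_C(1) << log2_size) - 1);
            }

            /* True if an access of n bytes at the tagged address stays inside the object. */
            bool access_in_bounds(uint64_t tagged, uint64_t n) {
                unsigned log2_size = (unsigned)(tagged >> TAG_SHIFT);
                uint64_t size = UINT64_C(1) << log2_size;
                uint64_t offset = (tagged & ADDR_MASK) - object_base(tagged);
                return n <= size - offset;
            }
            ```

            The shifts and masks here are the "bit shuffling" cost: every bounds check pays for them in software, which is why they look like a candidate for hardware help.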

            There is a startup working on "Object Memory Addressing" (OMA) with tracing GC in hardware [1], and its model seems to map quite well to Fil-C's. I have also seen a discussion on RISC-V's "sig-j" mailing list about possible hardware support for ZGC's pointer colours in upper pointer bits, so that it wouldn't have to occupy virtual memory bits — and space — for those.

            However, I think that tagged pointers with reference counting GC could be a better choice for hardware acceleration than tracing GC. The biggest performance bottleneck with RC in software are the many atomic counter updates, and I think those could instead be done transparently in parallel by a dedicated hardware unit. Cycles would still have to be reclaimed by tracing but modern RC algorithms typically need to trace only small subsets of the object graph.

            [1]: "Two Paths to Memory Safety: CHERI and OMA" https://news.ycombinator.com/item?id=45566660

    • GaggiX 6 hours ago

      Fil-C is much slower, no free lunch, if you want the language to be fast and memory safe you need to add restrictions to allow proper static analysis of the code.

    • rvz 6 hours ago

      It makes more sense for new software to be written in Rust, rather than a full rewrite of existing C/C++ software to Rust in the same codebase.

      Fil-C just does the job with existing software in C or C++ without an expensive and bug-riddled rewrite and serves as a quick protection layer against the common memory corruption bugs found in those languages.

      • uecker 4 hours ago

        Even without Fil-C I do not think it is even clear that new software should be written in Rust. It seems to have a lot of fans, but IMHO it is overrated. If you need perfect memory safety Rust has an advantage at the moment, but if you are not careful you trade this for much higher supply chain risks. And I believe the advantage in memory safety will disappear in the next few years due to improved tooling in C, simply by adding formal verification that proves safety and works automatically.

    • DetroitThrow 5 hours ago

      I write C++ for my job everyday, and claiming Fil-C does the same thing as Rust (and that people who do rewrites in Rust are stupid) sounds braindead.

      I love Fil-C. It's underrated. Not the same niche as Rust or Ada.

      • andai 4 hours ago

        Tell me more about Ada!

    • adamnemecek 5 hours ago

      It really doesn't, Rust is a better language.

  • hsaliak 4 hours ago

    I made https://github.com/hsaliak/filc-bazel-template, a Bazel template for people who may want to use Fil-C and Bazel together to make hermetic builds.

  • vzaliva 6 hours ago

    This is yet another variant of the "fat pointers" technique, which has been implemented and rejected many times due to either insufficient security guarantees, inability to cross non-fat ABI boundaries, or the overhead it introduces.