APFS in Detail: Conclusions

This series of posts covers APFS, Apple’s new filesystem announced at WWDC 2016. See the first post for the table of contents.

Summing Up

I’m not sure Apple absolutely had to replace HFS+, but likely they had passed an inflection point where continuing to maintain and evolve the 30+ year old software was more expensive than building something new. APFS is a product born of that assessment.

Based on what Apple has shown I’d surmise that its core design goals were:

Those are great goals that will benefit all Apple users, and based on the WWDC demos APFS seems to be on track (though the macOS Sierra beta isn’t quite as far along).

In the process of implementing a new file system the APFS team has added some expected features. HFS was built when 400KB floppies ruled the Earth (recognized now only as the ubiquitous and anachronistic save icon). Any file system started in 2014 should of course consider huge devices, and SSDs–check and check. Copy-on-write (COW) snapshots are the norm; making the Duplicate command in the Finder faster wasn’t much of a detour. The use case for fast duplication is unclear; it’s a classic garbage can theory solution, a solution in search of a problem, but it doesn’t hurt and it makes for a fun demo. The beach ball of doom earned its nickname; APFS was naturally built to avoid it.

There are some seemingly absent or ancillary design goals: performance, openness, and data integrity. Squeezing the most IOPS or throughput out of a device probably isn’t critical on watchOS, and it’s relevant only to a small percentage of macOS users. It will be interesting to see how APFS performs once it ships (measuring any earlier would only misinform the public and insult the APFS team). The APFS development docs have a bullet on open source: “An open source implementation is not available at this time.” I don’t expect APFS to be open source at this time or any other, but prove me wrong, Apple. If APFS becomes world-class I’d love to see it in Linux and FreeBSD–maybe Microsoft would even jettison their ReFS experiment. My experience with OpenZFS has shown that open source accelerates that path to excellence. It’s a shame that APFS lacks checksums for user data and doesn’t provide for data redundancy. Data integrity should be job one for a file system, and I believe that that’s true for a watch or phone as much as it is for a server.

At stability, APFS will be an improvement, for Apple users of all kinds, on every device. There are some clear wins and some missed opportunities. Now that APFS has been shared with the world the development team is probably listening. While Apple is clearly years past the decision to build from scratch rather than adopting existing modern technology, there’s time to raise the priority of data integrity and openness. I’m impressed by Apple’s goal of using APFS by default within 18 months. Regardless of how it goes, it will be an exciting transition.

Posted on June 19, 2016 at 7:37 pm by ahl · Permalink
In: Software

37 Responses

  1. Written by Leon
    on June 20, 2016 at 9:34 am
    Permalink

    I strongly agree with you regarding checksumming; it’s a poor decision on Apple’s part to omit that, as it’s excellent at exposing both software/firmware bugs and hardware problems.

    On the other hand, I disagree regarding your assessment of the new copy operations. You mention making it harder to free up space, but snapshots lead to some of the same problems and are less convenient to use (and may well capture more context than you really want). Hardlinks have the exact same problems but have insane semantics. You could think of this as end-user-friendly sub-snapshots.

    And, if you make a physical copy, then you run out of disk space quicker, and still have to delete both copies in order to get back to the same disk usage as it would take to free up space under logical copy. To me, it seems difficult to argue that physical copies are not pareto-dominated by logical copies.

    As for applications, one thing that this technique might enable is a nicer approach to dynamic linking: if a dynamic linker can detect, at almost no cost, when two filenames refer to the same file, then you can get one of the major benefits of dynamic linking (sharing code objects in physical memory).

    I’m sure more applications will come along. Honestly, I’ve given a bit of thought to this feature over the years, and have long wondered why filesystems don’t do this.

    • Written by Leon
      on June 20, 2016 at 9:49 am
      Permalink

      I didn’t quite finish my thought regarding dynamic linking: with logical copies, you could also avoid some of the problems that lead to DLL Hell.

      In the study of programming languages, it’s well known that dynamic scope is problematic and should be used consciously and minimally, whereas static scope is far more robust and is the correct default.

      Today’s approach to dynamic linking is analogous to dynamic scope, whereas static linking is analogous to static scope. And these analogies are actually rather robust; they are not shallow analogies that fall apart on slightly deeper inspection.

      However, with logical copying, you can have dynamic linking (and the benefits it brings, assuming the linker can detect logical copies for “free”) that behaves much closer to static scope.

      • Written by ahl
        on June 20, 2016 at 2:13 pm
        Permalink

        That’s a clever thought. What if every application were a Docker container that had storage-efficient copies of all of its dependencies!

        • Written by Andy Lawrence
          on June 24, 2016 at 8:19 pm
          Permalink

          This is similar to one of the features that I have designed into a new system I am developing. The idea is to have a simple ‘list’ object for every application. The list would contain the object IDs for every dependent object (executable, shared library, configuration setting, etc.) that could be checked independently by the system. So you wouldn’t have to run program X before you found out that DLL Y was missing that it needed.

          If multiple applications share the same library, they would just have the same library’s object ID in their list. You wouldn’t have to worry about someone installing a different DLL with the same name breaking your application because it would have a different Object ID.

          Every data object in the system can have multiple versions. So you can have 20 versions (#1 through #20) of the same DLL, each with its own unique object ID. An application could be tied to a specific version so that program X could use version 10 of the DLL while program Y uses version 15.

          Using this idea, it would be possible for example to have 10 different versions of an operating system installed on the same box (without requiring each to be in its own VM). Only one could run at a time of course, but they would all share the same logical container on disk. So if Windows 7, 8, and 10 all used the same DLL, you only need one copy of that DLL on disk to run all three versions of the OS. The same approach could be used to install 5 different versions of an application.
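
The dependency-list idea in this comment can be sketched in a few lines. This is only an illustration, not the commenter's design: the `ObjectStore` class, the `object_id` helper, and the choice of a content hash as the object ID are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


def object_id(content: bytes) -> str:
    """Hypothetical object ID: a truncated SHA-256 of the content.

    A different DLL with the same file name gets a different ID.
    """
    return hashlib.sha256(content).hexdigest()[:16]


@dataclass
class ObjectStore:
    """Toy store mapping object IDs to content (names are illustrative)."""
    objects: dict = field(default_factory=dict)

    def add(self, content: bytes) -> str:
        oid = object_id(content)
        self.objects[oid] = content  # identical content is stored only once
        return oid

    def missing(self, manifest: list) -> list:
        """Return the IDs in an application's list that are not installed."""
        return [oid for oid in manifest if oid not in self.objects]


store = ObjectStore()
lib_v10 = store.add(b"libfoo version 10")
lib_v15 = store.add(b"libfoo version 15")

program_x = [lib_v10]                               # pinned to version 10
program_y = [lib_v15, object_id(b"not installed")]  # one dependency absent

print(store.missing(program_x))  # nothing missing
print(store.missing(program_y))  # the absent ID, found without running Y
```

Because the check only consults the list, a missing dependency shows up before the program is ever launched, which is the point the comment makes about program X and DLL Y.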

  2. Written by Tom Limoncelli
    on June 20, 2016 at 12:17 pm
    Permalink

    This paragraph contradicts itself:

    “I’m not sure Apple absolutely had to replace HFS+, but likely they had passed an inflection point where continuing to maintain and evolve the 30+ year old software was untenable. APFS is a product born of that necessity.”

    I think what you mean to say is that:

    “Why a new file system and not just more bandaids to HFS+? Apple passed an inflection point where continuing to maintain and evolve the 30+ year old software was untenable. APFS is a product born of that necessity.”

    • Written by ahl
      on June 20, 2016 at 2:15 pm
      Permalink

      Good point; fatigue was setting in; rephrased.

  3. Written by Karl
    on June 20, 2016 at 2:39 pm
    Permalink

    For Apple’s ecosystem, there’s a strong argument to make for checksumming even in the absence of redundancy. Many (most?) of the files on Apple devices are installed by the system (OS, App Store apps and iTunes-purchased media) and many others are regularly backed up to iCloud. When a local file goes bad, it’s very likely that it can be restored from Apple’s servers. Checksums would provide the means to know when a restore is needed.

    • Written by Abhi Beckert
      on June 21, 2016 at 12:06 am
      Permalink

      If Apple has the file stored in the cloud somewhere, then they can do checksums without saving them to disk.

      As part of checking if local data has changed and needs to be uploaded to the cloud, Apple can do a checksum and compare it to the cloud data. If the checksum fails but metadata tells you the local file hasn’t been changed since the last backup, then you download from the cloud and overwrite the local file.

      Data integrity is critical, but I’m not convinced it needs to be done at the filesystem level. Far better to integrate it somehow into whatever systems you have in place to recover from a device being destroyed in a fire.
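
A minimal sketch of the comparison described in this comment, assuming the backup service records a digest and a modification time for each file. The function name `backup_action`, its parameters, and the use of SHA-256 are hypothetical; nothing here reflects how iCloud actually works.

```python
import hashlib


def backup_action(local_bytes: bytes, local_mtime: float,
                  backup_digest: str, backup_mtime: float) -> str:
    """Decide what to do with one file: 'ok', 'upload', or 'restore'."""
    local_digest = hashlib.sha256(local_bytes).hexdigest()
    if local_digest == backup_digest:
        return "ok"       # contents match the backed-up copy
    if local_mtime > backup_mtime:
        return "upload"   # the file was legitimately changed since the backup
    # Contents differ but metadata says the file was not modified:
    # treat the local copy as corrupt and fetch the backup.
    return "restore"


good = b"holiday photo"
digest = hashlib.sha256(good).hexdigest()

print(backup_action(good, 100.0, digest, 100.0))              # ok
print(backup_action(b"edited photo", 200.0, digest, 100.0))   # upload
print(backup_action(b"holiday phoro", 100.0, digest, 100.0))  # restore
```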

  4. Written by Alberts
    on June 20, 2016 at 3:41 pm
    Permalink

    “When a local file goes bad, it’s very likely that it can be restored from Apple’s servers”

    This misses the point about data integrity. In a process called bit rot, you may end up with bad data, undetected for months and years, both locally and in your backups.

    • Written by Karl
      on June 20, 2016 at 4:21 pm
      Permalink

      It doesn’t miss the point. Checksums on data blocks (which I was advocating for) would address exactly that. My point was that there’s a lot to gain even without something akin to mirroring or raidz.
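
To make the point concrete, here is a small sketch of per-block checksums detecting corruption with no redundancy at all. It is an illustration only: the 4096-byte block size, the helper names, and the use of CRC32 are assumptions, and a real file system would likely use a stronger checksum stored in its metadata.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size for the example


def write_blocks(data: bytes) -> list:
    """Split data into blocks and keep a CRC32 next to each block."""
    blocks = []
    for start in range(0, len(data), BLOCK_SIZE):
        block = data[start:start + BLOCK_SIZE]
        blocks.append((block, zlib.crc32(block)))
    return blocks


def find_bad_blocks(blocks: list) -> list:
    """Return the indexes of blocks whose contents no longer match their CRC."""
    return [i for i, (block, crc) in enumerate(blocks)
            if zlib.crc32(block) != crc]


stored = write_blocks(bytes(10000))  # three blocks of zeros

# Simulate bit rot: flip one bit in block 1 without updating its checksum.
block, crc = stored[1]
stored[1] = (bytes([block[0] ^ 0x01]) + block[1:], crc)

bad = find_bad_blocks(stored)
print(bad)  # [1]
```

The checksum cannot repair block 1, but it tells the system exactly which data to restore from a backup or from Apple's servers.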

      • Written by Jim HashBackup
        on June 20, 2016 at 8:20 pm
        Permalink

        Disk checksums only make sense on ECC systems. On non-ECC systems:

        1. read correctly from disk into memory buffer
        2. bit gets flipped in memory buffer
        3. compute strongest checksum in the world (too late!)
        4. write correctly to disk
        5. go back and read it: no checksum error detected
        6. file copy does not match original

        First priority is to ship all systems with ECC memory. Then worry about disk checksums if you like.
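
The six steps above can be walked through in code. This is a sketch of the scenario only: the in-memory bit flip is simulated by hand, and CRC32 stands in for whatever checksum a file system might use.

```python
import zlib


def flip_bit(buf: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of buf with one bit inverted (simulated memory error)."""
    out = bytearray(buf)
    out[byte_index] ^= 1 << bit
    return bytes(out)


original = b"important user data"

buffer = original                  # 1. read correctly from disk into memory
buffer = flip_bit(buffer, 0, 0)    # 2. bit gets flipped in the memory buffer
stored_crc = zlib.crc32(buffer)    # 3. checksum computed over corrupt data
on_disk = buffer                   # 4. written correctly to disk

checksum_ok = zlib.crc32(on_disk) == stored_crc   # 5. read back and verify
matches_original = on_disk == original            # 6. compare with original

print(checksum_ok)        # True: no checksum error is detected
print(matches_original)   # False: the copy does not match the original
```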

        • Written by Rich Teer
          on June 21, 2016 at 11:30 pm
          Permalink

          I respectfully disagree (although I do advocate for ECC as much as possible) with your assertion that disk checksums only make sense on ECC systems. That is akin to stating that locks are only of use in buildings without windows!

          It is entirely possible that the disk subsystem will introduce detectable errors, even in the absence of ECC memory.

  5. Written by Fabien Besnard
    on June 20, 2016 at 4:47 pm
    Permalink

    So that’s it? Years of hope for ZFS shipping natively for OS X, and still nothing on track for APFS to ensure data integrity at all times?
    Having talked to the guys, do you have hope this could change before the product ships? What’s the point of a new filesystem if it doesn’t address the major problems of the previous one…

    • Written by ahl
      on June 21, 2016 at 3:59 pm
      Permalink

      From Apple’s perspective APFS certainly addresses their problems with HFS+. I think the APFS team would be receptive to community feedback about checksums; we shall see!

  6. Written by SteveP
    on June 20, 2016 at 5:32 pm
    Permalink

    So very, very sad that Apple doesn’t provide user data integrity. I mean, come on, if you are starting fresh, DO IT RIGHT. Of course they want you to house all your data in the cloud, so integrity on the device isn’t a big deal, is it? {sarcasm liberally applied}.

    There is still time, let’s get something rolling to let Apple know this *is* important to users, and something I’ve hated about HFS for, well, forever. But at that time, nobody was doing it so it was OK. Now it is not.

  7. Written by Peter
    on June 20, 2016 at 6:26 pm
    Permalink

    This made me chuckle:

    HFS was built when 400KB floppies ruled the Earth (recognized now only as the ubiquitous and anachronistic save icon).

    Actually, for us old timers, MFS was built when 400KB floppies ruled the earth. HFS was built when the 20MB hard drive ruled the Earth.

    And, yes, I remember using a Bernoulli 5MB Cartridge drive formatted with MFS attached to my 128K Mac. Now get off my lawn!

  8. Written by Evan Rowley
    on June 20, 2016 at 6:35 pm
    Permalink

    Amazing analysis. Thanks for writing all of this.

  9. Written by Isaac Rozenfeld
    on June 20, 2016 at 7:18 pm
    Permalink

    +1 on the analysis bits being far from rotten

  10. Written by Gino Cerullo
    on June 20, 2016 at 9:33 pm
    Permalink

    Remember that APFS is meant to scale from watch to desktop. A good case could be made against the value of checksumming data on watchOS, tvOS or even iOS.

    On Apple Watch the user data is usually pulled or pushed from another device or the cloud so checksumming is of little to no value there since user data does not live on the device long enough.

    On Apple TV we’re talking about a device that is primarily used for streaming data from the internet. About the only persistent user data that might be stored on it is user’s game level data and even there I’m not sure if that isn’t also backed up to the user’s iCloud account.

    On iOS devices (iPad, iPhone, iPod Touch) most, if not all, user data is redundantly stored in the cloud either through syncing of data or by virtue of being backed up using iCloud backup. If they aren’t using iCloud Backup they should at least be using local backup to a desktop using iTunes.

    Now, on macOS, a better case could be made for the use of checksumming. But even there Apple may be expecting more people to be pushing more data to the cloud especially with the introductions of Desktop and Document syncing to iCloud Drive and the new Optimized Storage feature that pushes old and little used files to the cloud. The result being data redundancy in the cloud and few files living on the local drive long enough to worry about bit rot.

    The one area where checksumming would be invaluable would be persistent local external storage and Time Machine backups. For example, one might want to store a large collection of images on a large external drive. Chances are that data will be expected to live on that drive for the life of the drive which could be many years. And when it comes to Time Machine backups, that’s a no brainer. Backups are expected to live a very long time as well so data integrity is paramount.

    I guess what I’m getting at is that Apple probably doesn’t see the value of checksumming user data because, in most cases, the data doesn’t live on the device long enough and the data that does is usually synced elsewhere or stored redundantly on another device or in the cloud.

    So maybe the best thing to do is include checksumming for macOS only and make it an optional switch when setting up the drive in Disk Utility or make it a default for platter drives and optional for internal SSDs.

    • Written by Jussi Hagman
      on June 21, 2016 at 9:12 am
      Permalink

      > Remember that APFS is meant to scale from watch to desktop

      There is no reason to build the FS feature set statically against the least common denominator. The FS can and should be flexible enough to have features (like checksumming, compression, etc.) that are turned off on products where they do not make sense for power or other reasons.

      Apple’s FS team seems very capable, and in the WWDC talk they mentioned dynamic FS structures, so I am quite confident they have not painted themselves into a corner regarding future features. Whether a feature will be implemented, and when and where it is turned on, is a complex problem involving not only technical but also business needs. There is only so much one can do in the first version, but I’m sure there is room to grow during the next 30 years.

    • Written by Graham Perrin
      on June 21, 2016 at 8:57 pm
      Permalink

      > … persistent local external storage …

      Precisely.

      Whilst it’s good to read of the confidence in Apple-procured internal storage devices – http://dtrace.org/blogs/ahl/2016/06/19/apfs-part5/ – let’s be realistic about customers’ use of storage that’s not procured by Apple.

      http://pastebin.com/EZuL6s37 for starters; recent errors on a USB flash drive from a very well-respected vendor.

  11. Written by Michael Tsai - Blog - Apple File System (APFS)
    on June 21, 2016 at 12:56 am
    Permalink

    [...] Conclusions: Those are great goals that will benefit all Apple users, and based on the WWDC demos APFS seems to be on track (though the macOS Sierra beta isn’t quite as far along). [...]

  12. Written by Tyrone C. Miles
    on June 21, 2016 at 2:15 am
    Permalink

    I wonder how in an upgrade you would move from HFS+ to APFS on a production system?

    • Written by ahl
      on June 21, 2016 at 4:29 am
      Permalink

      Apple will have a built in upgrade utility. I probably should have mentioned that because it’s pretty cool/terrifying.

      • Written by Andy Norman
        on June 21, 2016 at 10:07 am
        Permalink

        I wonder if anyone else remembers that Windows can do FAT->NTFS upgrades in place (and has done since maybe Windows 2000, can’t remember exactly when they added it)?

        http://windows.microsoft.com/en-gb/windows/convert-hard-disk-partition-ntfs-format#1TC=windows-7

        • Written by derek
          on June 22, 2016 at 5:04 am
          Permalink

          I remember doing this when I supported Windows NT. I don’t remember why now, but we did a lot of installs onto FAT and converted to NTFS after.

        • Written by Andy Lawrence
          on June 24, 2016 at 8:32 pm
          Permalink

          They were doing this back in about 1997 or 1998 if my memory serves. I was working on a few disk utilities (PartitionMagic and Drive Image) at the time and we had to do some special things in order to make that conversion better.

  13. Written by zach
    on June 21, 2016 at 5:56 am
    Permalink

    thanks for the great posts! super interesting, even to someone not very well versed in file systems.

  14. Written by Ed
    on June 21, 2016 at 8:31 am
    Permalink

    Basically Apple is arguing,
    1. Hardware chosen by Apple is good enough
    2. Hardware does Error Checking already

    This is enough for consumers.

    Not sure if I agree; the other point is about ECC Memory.

    Basically I am not knowledgeable enough to add anything useful.

  15. Written by John Lockwood
    on June 21, 2016 at 9:51 am
    Permalink

    I have been watching the progress (or lack of) for BTRFS for quite some time precisely because of a concern over Bit Rot. I had hoped by now that the various NAS makers would have been using this.

    (Some NAS makers half use BTRFS in that they use it for the file system layer but not the RAID layer and therefore only give some of the benefits, this is because the RAID 5/6 support in BTRFS is still a work in progress.)

    The implication that APFS has nothing to protect against Bit Rot is very worrying since APFS is likely to be used for many years if not decades by Apple.

    While SSDs may have checksumming type protection themselves the world will still also be using hard disks for many, many years to come. Is this a case of Apple forgetting the professional market again? Video editing and storage is not going to be possible in a pure SSD environment for the foreseeable future.

  16. Written by Graham Perrin
    on June 21, 2016 at 7:58 pm
    Permalink

    ahl, you were (of course) right about presenting this as a series of posts. And, so nice to see the “even handed” comment. Thanks again!

  17. [...] This blog series has six parts; the original posts are: APFS in Detail: Overview; APFS in Detail: Encryption, Snapshots, and Backup; APFS in Detail: Space Efficiency and Clones; APFS in Detail: Performance; APFS in Detail: Data Integrity; APFS in Detail: Conclusions [...]

  18. Written by Tanj Bennett
    on June 26, 2016 at 10:27 pm
    Permalink

    I’m shocked Apple still treats Flash as a block device with HDD semantics and layers abstractions like snapshots, metadata, transactional changes, RAID, and checksums on top. Since they’ve been the largest buyer of raw flash chips for 5 years or more and acquihired at least one company with flash controller expertise, I would have expected them to be using the flash devices directly. More like an Open Channel approach.

    Fascinating to see such a retro approach in 2016.

  19. Written by Karl Ivar Dahl
    on June 29, 2016 at 6:33 am
    Permalink

    I believe that bit rot in combination with encryption is a particularly devastating combination. Would not a single bitflip carry over to all the subsequent data when decrypted?

    • Written by Andy Lawrence
      on July 1, 2016 at 4:58 pm
      Permalink

      It would depend on the encryption algorithm. If you used a simple XOR operation (just obfuscation, not encryption in my book) then only the current byte would be affected.

      But if you used any of the encryption algorithms where the contents of the previous byte affect the encryption of the current byte, then a single bit flip of an encrypted byte stream could affect every byte after it.

      Often, large data streams are encrypted in ‘sections’ meaning you don’t have to go back to the start of the stream to decrypt some data at offset 1 billion. You just have to go back to the start of the current ‘encryption section’. In this case, the effects of a bit flip would be limited to a single section.
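
The three cases in this comment can be compared with toy ciphers. Everything below is a deliberately simplistic sketch for illustration: neither function is real encryption, the 16-byte section size is an assumption, and real ciphers and modes propagate errors differently from this toy chain.

```python
def xor_crypt(data: bytes, key: int) -> bytes:
    """Plain XOR obfuscation; the same call encrypts and decrypts."""
    return bytes(b ^ key for b in data)


def chained_encrypt(plain: bytes, key: int) -> bytes:
    """Toy chain: each byte is mixed with the previous plaintext byte."""
    out, prev = bytearray(), 0
    for p in plain:
        out.append(p ^ prev ^ key)
        prev = p
    return bytes(out)


def chained_decrypt(cipher: bytes, key: int) -> bytes:
    out, prev = bytearray(), 0
    for c in cipher:
        p = c ^ prev ^ key
        out.append(p)
        prev = p  # a wrong byte here poisons every later byte in the chain
    return bytes(out)


def sectioned(fn, data: bytes, key: int, section: int = 16) -> bytes:
    """Apply fn independently to each section, restarting the chain."""
    return b"".join(fn(data[i:i + section], key)
                    for i in range(0, len(data), section))


def flip(buf: bytes, index: int) -> bytes:
    out = bytearray(buf)
    out[index] ^= 0x01
    return bytes(out)


def damaged(original: bytes, recovered: bytes) -> int:
    return sum(a != b for a, b in zip(original, recovered))


plain, KEY, FLIP_AT = bytes(range(64)), 0x5A, 20

xor_damage = damaged(plain, xor_crypt(flip(xor_crypt(plain, KEY), FLIP_AT), KEY))
chain_damage = damaged(
    plain, chained_decrypt(flip(chained_encrypt(plain, KEY), FLIP_AT), KEY))
section_damage = damaged(
    plain,
    sectioned(chained_decrypt,
              flip(sectioned(chained_encrypt, plain, KEY), FLIP_AT), KEY))

print(xor_damage, chain_damage, section_damage)  # 1 44 12
```

With a flip at offset 20 of a 64-byte stream, the XOR case loses one byte, the unbroken chain loses everything from the flip to the end, and the sectioned chain loses only the rest of the 16-byte section containing the flip.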

  20. Written by Sam
    on July 10, 2016 at 6:00 pm
    Permalink

    You forget that Apple does indeed have checksumming, and monitoring thereof, of code on mobile devices. iOS only executes cryptographically signed code. This is likely why they are confident in their flash integrity.

