APFS in Detail: Space Efficiency and Clones

June 19, 2016

This series of posts covers APFS, Apple’s new filesystem announced at WWDC 2016. See the first post for the table of contents.

Space Efficiency

A modern trend in file systems has been to store data more efficiently to effectively increase the size of your device. Common approaches include compression (which, as noted above, is very very likely coming) and deduplication. Dedup finds common blocks and avoids storing them more than once. This is potentially highly beneficial for file servers where many users or many virtual machines might have copies of the same file; it’s probably not useful for the single-user or few-user environments that Apple cares about. (Yes, they have server-ish offerings but their heart clearly isn’t in it.) It’s also furiously hard to do well, as I learned painfully while supporting ZFS.
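To make the dedup idea concrete, here is a minimal sketch of block-level deduplication, assuming fixed-size blocks keyed by a SHA-256 digest. The `BlockStore` class and its 4-byte block size are hypothetical and chosen purely for illustration; this is not how ZFS or any other real file system implements dedup.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # tiny on purpose; real blocks are KBs
        self.blocks = {}              # digest -> block bytes
        self.files = {}               # file name -> list of digests

    def write_file(self, name, data):
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if no identical block exists already.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def read_file(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def physical_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = BlockStore()
store.write_file("alice/report", b"AAAABBBBCCCC")
store.write_file("bob/report", b"AAAABBBBDDDD")
```

The two files hold 24 logical bytes, but the store keeps only four distinct blocks (16 bytes). Even in this sketch every write pays for a hash and a table lookup, which hints at why doing this well at scale is hard.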

Apple’s sort-of-unique contribution to space efficiency is constant-time cloning of files and directories. As a quick aside, “files” in macOS are often really directories; it’s a convenient lie they tell to allow logically related collections of files to be treated as an indivisible unit. Right-click an application and select “Show Package Contents” to see what I mean. Accordingly, I’m going to use the term “file” rather than “file or directory” in sympathy for the patient readers who have made it this far.

With APFS, if you copy a file within the same file system (or possibly the same container; more on this later), no data is actually duplicated. Instead a constant amount of metadata is updated and the on-disk data is shared. Changes to either copy cause new space to be allocated (so-called “copy on write” or COW).
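The mechanics can be sketched with a toy model. This is a minimal sketch, assuming a volume that tracks a reference count per block; `ToyVolume` and its methods are hypothetical names and say nothing about APFS’s actual on-disk structures. Note that the toy `clone` copies a block list, so unlike the real thing it is not constant time.

```python
class ToyVolume:
    """Toy model of clone + copy-on-write; not APFS's on-disk format."""

    def __init__(self):
        self.blocks = {}    # block id -> bytes
        self.refs = {}      # block id -> number of files referencing it
        self.files = {}     # file name -> list of block ids
        self._next_id = 0

    def _alloc(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.refs[block_id] = 1
        return block_id

    def create(self, name, chunks):
        self.files[name] = [self._alloc(c) for c in chunks]

    def clone(self, src, dst):
        # Only metadata changes: the new file points at the same blocks.
        self.files[dst] = list(self.files[src])
        for block_id in self.files[dst]:
            self.refs[block_id] += 1

    def write(self, name, index, data):
        old = self.files[name][index]
        if self.refs[old] > 1:
            # Shared block: allocate a new one instead of overwriting.
            self.refs[old] -= 1
            self.files[name][index] = self._alloc(data)
        else:
            # Exclusively owned block: safe to overwrite in place.
            self.blocks[old] = data

    def read(self, name):
        return b"".join(self.blocks[b] for b in self.files[name])


vol = ToyVolume()
vol.create("thesis", [b"intro", b"body"])
vol.clone("thesis", "thesis-backup")
blocks_after_clone = len(vol.blocks)      # cloning allocated nothing new
vol.write("thesis-backup", 1, b"BODY")    # first write to a shared block
blocks_after_write = len(vol.blocks)      # exactly one new block appeared
```

Only the block that was written gets a private copy; the untouched `intro` block stays shared between the two files.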

I haven’t seen this offered in other file systems, and it clearly makes for a good demo, but it got me wondering about the use case (UPDATE: btrfs supports this and calls the feature “reflinks”, for link by reference). Copying files between devices (e.g. to a USB stick for sharing) still takes time proportional to the amount of data copied, of course. Why would I want to copy a file locally? The common case I could think of is the layman’s version control: “thesis”, “thesis-backup”, “thesis-old”, “thesis-saving because I’m making edits while drunk”.

There are basically three categories of files:

Files that are fully overwritten each time: images, MS Office docs, videos, etc.
Files that are appended to, mostly log files.
Files with a record-based structure, such as database files.

For the average user, most files fall into that first category. So with APFS I can make a copy of my document and get the benefits of space sharing, but those benefits will be eradicated as soon as I save the new revision. Perhaps users of larger files have a greater need for this and have a better idea of how it might be used.

Personally, my only use case is taking a file, say time-shifted Game of Thrones episodes falling into the “fair use” section of copyright law, and sticking it in Dropbox. Currently I need to choose to make a copy or permanently move the file to my Dropbox folder. Clones would let me do this more easily. But then so would hard links (a nearly ubiquitous file system feature that lets a file appear in multiple directories).

Clones open the door for potential confusion. While copying a file may take up no space, so too deleting a file may free no space. Imagine trying to free space on your system, and needing to hunt down the last clone of a large file to actually get your space back.
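One way to reason about this is to ask how much space deleting a given file would actually return. The sketch below assumes a hypothetical mapping from file names to the block ids they reference (nothing APFS exposes in this form, as far as I know) and an assumed 4096-byte block size; it counts only the blocks that no other file shares.

```python
from collections import Counter

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def reclaimable_bytes(files, victim):
    """Bytes freed by deleting `victim`: blocks no other file references."""
    # Count how many files reference each block (once per file).
    refcounts = Counter(b for blocks in files.values() for b in set(blocks))
    exclusive = [b for b in set(files[victim]) if refcounts[b] == 1]
    return len(exclusive) * BLOCK_SIZE


files = {
    "movie.mkv": [1, 2, 3],
    "movie copy.mkv": [1, 2, 3],   # a clone: same blocks as the original
    "notes.txt": [4],
}

freed_by_movie = reclaimable_bytes(files, "movie.mkv")
freed_by_notes = reclaimable_bytes(files, "notes.txt")
```

Deleting `movie.mkv` frees nothing while its clone exists, even though it looks like the biggest file in the directory; the small `notes.txt` is the only file whose deletion returns space.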

APFS engineers don’t seem to have many use cases in mind; at WWDC they asked for suggestions from the assembled developers (the best I’ve heard is for copied VMs; not exactly a mass-market problem). If the focus is generic revision control, I’m surprised that Apple didn’t shoot for a more elegant solution. One could imagine functionality with APFS that allows a user to enable a per-file Time Machine: change tracking for any file. This would create a new type of file where each version is recorded transparently and automatically. You could navigate to previous versions, prune the history, or delete the whole pile of versions at once (with no stray clones to hunt down). In fact, Apple introduced something related 5 years ago, but I’d literally never seen or heard of it until researching this post (show of hands if you’ve clicked “Browse All Versions…”). APFS could clean up its implementation, simplify its use, and bring generic support for all applications. None of this solves my Game of Thrones storage problem, but I’m not even sure it’s much of a problem…

Side note: Finder copy creates space-efficient clones, but cp from the command line does not.
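For programs that want the Finder’s behavior, macOS exposes a `clonefile(2)` system call (from macOS 10.12 on, as far as I know; I have not checked it against the build discussed here). Below is a hedged sketch that calls it through Python’s `ctypes` and falls back to an ordinary copy when the symbol is missing or the call fails. `clone_or_copy` is a hypothetical helper, not part of any library.

```python
import ctypes
import os
import shutil


def clone_or_copy(src, dst):
    """Try macOS clonefile(2); fall back to a regular copy elsewhere.

    Returns "clone" or "copy" depending on which path was taken.
    """
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        clonefile = libc.clonefile  # AttributeError where the symbol is absent
    except (OSError, AttributeError, TypeError):
        clonefile = None

    if clonefile is not None:
        clonefile.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int]
        clonefile.restype = ctypes.c_int
        # flags=0; the call fails if dst exists or the volume can't clone.
        if clonefile(os.fsencode(src), os.fsencode(dst), 0) == 0:
            return "clone"

    shutil.copy2(src, dst)  # ordinary copy that duplicates the data
    return "copy"
```

On a system without `clonefile` the helper simply reports `"copy"`. One caveat of this sketch: the fallback overwrites an existing destination, whereas the clone call refuses to.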

Next in this series: Performance

24 Responses

Brandon says:

June 20, 2016 at 10:28 am

One use case of multiple local copies could be a lame sort of redundancy. Multiple copies of the file in case something goes wrong. I’ve done this as a sort of ritual.

CoW kind of blows that out of the water then.
Harald Striepe says:

June 20, 2016 at 7:46 pm

Cloning will be very handy for VMs. It would be clearer and likely less risky than the built-in VMware feature for this.

You write that cp does not clone; does ditto?
1. ahl says:
  
  June 21, 2016 at 4:18 am
  
  Agreed: good for VMs. ditto doesn’t use clone in the build I’m running.
Marc Haisenko says:

June 20, 2016 at 8:36 pm

Cloning will also be very handy for bundle-based (directory-based) file types. When saving a new document revision, the resources in the document like images may need to be copied. Cloning will help here. An app I’m maintaining in my company falls into this category: it needs to copy several (sometimes large) images on save and will likely benefit from cloning with slightly faster save times.
1. ahl says:
  
  June 21, 2016 at 4:19 am
  
  But is that still basically poor-man’s revision control?
D Hutchinson says:

June 20, 2016 at 9:05 pm

I’m thinking the file copy behavior you mentioned – no data is actually copied unless changes are made – is simply done because it is essentially free on a copy-on-write filesystem.

Beyond that, though, is FS replication across a network. Currently, if you change the metadata on a file and back up over a network, oftentimes the file will be re-copied. Or, if you’re using something like rsync, the server will generate a checksum (requiring reading the data, which is a lot of work for a lot of files or for large files), the client generates a checksum (same amount of work), and the checksums are compared before the data is sent over the network.

ZFS, however, tracks changes at the block level, so no matter how the metadata is changed, if the file is unchanged, it doesn’t need to be copied over the network, and it doesn’t even need to be read to make a comparison, since changes are already tracked at the FS level.

This comes with copy-on-write, and I’d expect to see this show up on the first Time Capsule that uses APFS, and would be surprised if it wasn’t there. After all, the underlying tech is mostly already there.
Aaron Meurer says:

June 20, 2016 at 11:28 pm

Cloning fixes an important problem with hard links. The issue with hard links is that some applications will write “into” hard links, effectively changing both “copies” of the file, and other applications will “break” the link (copy on write). You can see this even with text editors. Try making a simple file with a hard link and editing it with stock vim and with stock emacs. Vim writes into the hard link, and emacs breaks it.

So you really have to be in complete control of every application that uses each hard linked file for it to be effective. The most common instance of this is read-only files. For instance, hard links are often employed for making duplicate instances of shared libraries with different names, like libwhatever.dylib and libwhatever.1.2.3.dylib. Shared libraries are binary executables, and thus generally read only. Hard links are also obviously used for backups, which are also read only.

Clones are like hard links, except they explicitly *always* break (emacs behavior). Traditional hard links are great for the “never break” behavior (except many applications do their own copy-on-write since filesystems don’t support it natively).

My personal use case for hard links is multiple install prefixes via a package manager (conda environments). These generally work, since install prefixes are generally read only, but this isn’t enforced via permissions, and it does happen that one prefix can corrupt another if something writes into it. It would be awesome to have the same space sharing + extremely fast copying capabilities of hard links, with the ability to completely avoid cross corruption.

It’s also worth pointing out that hard links typically share the same metadata (such as file permissions). APFS clones presumably would be less strict about this (e.g., allow two clones of the same data to have different file permissions).
Christopher Smith says:

June 21, 2016 at 1:10 am

Yeah, cloning looks to me exactly like “hard links with simpler semantics”. It’s super useful for all kinds of transactional updates to the filesystem without risking data loss, like when installing software. You write everything into a temp space, and after that is ready, you just atomically copy it all into the final destination. This is traditionally done using “mv”, but you run into various bits of odd trouble with that. Being able to do it with a copy will, I suspect, make things a bit simpler.
Michael Martz says:

June 21, 2016 at 1:29 am

Apple may not even need to rework this much to get per-file Time Machine. They could just create a new “special” folder that uses clones internally for versioning but still looks to the user like a file.
Stephan says:

June 21, 2016 at 1:51 am

Dropbox supports symlinks…
1. ahl says:
  
  June 21, 2016 at 4:20 am
  
  Well that’s nifty; scratch one more use case for APFS reflinks.
J Osborne says:

June 21, 2016 at 3:33 am

If you look at “productivity” apps that use packages (the directories that pretend to be files) they frequently have some sort of main file (XML or a plist) that may or may not encode the plain text, but images, sound clips, movies and things like that all live in their own files.

Doing a save on that will typically write the xml file anew, but leave unchanged image files and the like alone.

So it is extremely likely that saving a Pages file via a clone will involve making the clone, writing the smallish XML file, and leaving a bunch of JPGs and PNGs “as is”, then removing the old clone (or keeping it around for version control).

Apparently some (most?) of this is done automatically for you if you use NSDocument, but I’m more of an iOS & CoreData user, so I’m not sure how magically transparent it really is.
J03 says:

June 21, 2016 at 12:36 pm

Is it the case that some of the APFS features have greater applicability on devices over the desktop consideration?
Jake says:

June 21, 2016 at 1:40 pm

If you download an install image, will cloning eliminate the time to recopy the files that have already been downloaded to the same volume? The install should be almost instantaneous after download…
1. ahl says:
  
  June 21, 2016 at 3:57 pm
  
  Could be, but I doubt it. If the installer is on a dmg then that’s a different file system / volume, so doing an efficient copy would require extensive (probably prohibitive) magic.
Chuck says:

June 22, 2016 at 2:06 pm

Pardon my newbieness with file systems, but wouldn’t cloning help on the user end with doing something like copying pictures into Photos? As it stands now, if I copy a picture from the Desktop into Photos, it copies the photo into the Photos file (directory), which is then taking up twice the memory. Wouldn’t cloning allow Photos to just instantly copy that data without completely rewriting it, so it would save time and storage, or am I understanding cloning incorrectly?
Anon says:

June 23, 2016 at 5:32 am

Re constant time cloning (btrfs reflink like) – perhaps it could be used to do poor man’s after the fact dedupe for identical files?
GaranceD says:

June 24, 2016 at 3:34 am

I think that Finder doing the automatic low-level duplication and COW semantics will be more useful than you expect. Even if you think of it as just a “poor man’s revision control system”, think of how much easier it will be for the average user than it is to use (say) ‘git’.

I share the concern that this low-level duplication means that after you “duplicate” a file/directory, you still have only one instance of the data for that file on the disk.

I assume that the low-level duplication is done by Apple’s programs & utilities, such as the Finder. Given that there isn’t any after-the-fact de-duplication going on, then I assume a generic utility such as ‘rsync’ would in fact create a second copy of the data (for those people who do want that duplicate copy).
GaranceD says:

June 24, 2016 at 3:41 am

I also very much wish that the filesystem included checksums on all the data.

I had one *extremely* frustrating case where I put some disks in a Drobo, which meant that data written to the drobo was written to two physical disks. It turned out that the disks I bought had silent I/O errors. Data would be written to the disk without any indication of error, and you could read it back without any indication of error, but the data you read back was not the same as what you had written.

This problem was with the disks, not the drobo. But the fact that the disks were in the drobo made it harder to figure out WTF was going on. I’d write data to the disk, and then when I’d read data the drobo would respond with the data from whichever copy was most convenient (according to the algorithms in the drobo). This meant that I would write data, and then when I’d read it I would *sometimes* get the correct data back, and sometimes get incorrect data back.

I ended up removing the disks from the drobo, and the same problem of silent I/O errors happened even when the disks were installed internal to a Mac Pro.

It took me over a month of working on it almost every day before I was certain I understood what was going on. Checksums at the file-system level would have saved me a whole lot of time and frustration!
GaranceD says:

June 24, 2016 at 3:47 am

Added note: On my previous comment, this issue happened when I bought two brand new hard disks and put them in a brand new drobo. My previous comment makes it sound like they were just some random disks that I had lying around. 🙂

So I had just spent what (for me) was a considerable amount of money *precisely* because I wanted to be sure I could trust the integrity of the data, and I ended up with something significantly worse than the 5-year-old disk that I intended to replace. I wouldn’t have even known about the problem if it wasn’t for the fact that I started out by filling the disks with random data, kept sha1-digests of the data as I wrote it, and then read back all that data & compared sha1-digests.