Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

darwin: double-check that fsync should flush disk write cache using F_FULLFSYNC #9439

Closed
jorangreef opened this issue Nov 3, 2016 · 8 comments
Labels
fs Issues and PRs related to the fs subsystem / file system. libuv Issues and PRs related to the libuv dependency or the uv binding. macos Issues and PRs related to the macOS platform / OSX.

Comments

@jorangreef
Copy link
Contributor

I tried looking in the Node and libuv source but was unable to confirm whether fs.fsync() does in fact do F_FULLFSYNC on darwin?

According to http://xiayubin.com/blog/2014/06/20/does-fsync-ensure-data-persistency-when-disk-cache-is-enabled/ as of 14.04 Ubuntu has moved to a stronger fsync that includes writing through or flushing a disk cache if present.

To do this on darwin requires F_FULLFSYNC. Otherwise a Node app may call fsync with nothing actually being flushed to disk on darwin systems.

Obviously, fsync will always be broken on older operating systems, but at least the tendency is to move towards making stronger guarantees, not the other way round.

@jorangreef
Copy link
Contributor Author

It seems that at least the implementation for fdatasync() on darwin was updated to NOT use F_FULLFSYNC as of libuv/libuv@a4213b7. This is not good.

Node's documentation for fdatasync() references the Linux man page which makes the same guarantee as Ubuntu: This includes writing through or flushing a disk cache if present.

This is not in fact true for Node's fdatasync() on darwin.

@jorangreef
Copy link
Contributor Author

Or perhaps F_FULLFSYNC is no longer needed to flush the disk write cache on darwin?

@bnoordhuis
Copy link
Member

Libuv just forwards to the libc function or system call of the same name. If Apple has a broken implementation, that's their problem. You could check what Apple's Libc does.

@jorangreef
Copy link
Contributor Author

jorangreef commented Nov 3, 2016

Libuv just forwards to the libc function or system call of the same name

I thought libuv exists to wrap these kinds of cross-platform differences?

If Apple has a broken implementation, that's their problem.

I doubt Apple would consider their implementation broken. They provide fsync() (to flush kernel buffers) and F_FULLFSYNC (to flush kernel buffers and disk buffers) and leave it up to the user to decide what they need.

Most users think of fsync() to mean F_FULLFSYNC so Node should go with that. This makes more sense than requiring Node or a third-party binding to expose F_FULLFSYNC on darwin.

Would you or @saghul be open to a pull request from me to fix this in libuv? libuv used to call F_FULLFSYNC on darwin so this would just be fixing a regression rather than setting any precedent.

Otherwise we need to do a PR to make it clear in the docs that Node's fsync() and fdatasync() are no-ops on darwin.

@mscdex mscdex added fs Issues and PRs related to the fs subsystem / file system. macos Issues and PRs related to the macOS platform / OSX. libuv Issues and PRs related to the libuv dependency or the uv binding. labels Nov 3, 2016
@bnoordhuis
Copy link
Member

I think I misunderstood your point. I have no issue with uv_fs_fsync() having stricter guarantees on OS X but see below.

They provide fsync() (to flush kernel buffers) and F_FULLFSYNC (to flush kernel buffers and disk buffers)

That's not so different from other operating systems. fysnc() on Linux doesn't guarantee that data is on disk either (Ubuntu man pages notwithstanding), only that it's been handed off to the disk controller by the time the system call returns. Its exact semantics vary from file system to file system and is usually configurable by the system administrator.

libuv used to call F_FULLFSYNC on darwin so this would just be fixing a regression rather than setting any precedent.

That was for uv_fs_fdatasync() though, not uv_fs_fsync(), and only because there was no fdatasync system call in old XNU kernels.

@jorangreef
Copy link
Contributor Author

jorangreef commented Nov 3, 2016

I have no issue with uv_fs_fsync() having stricter guarantees on OS X

Thanks. Should fdatasync() be as safe as fsync() on OS X with the usual requirement that unnecessary metadata such as mtime etc. is not flushed? Or if that's not possible should the docs for fdatasync() highlight that it's not as safe as fsync() on OS X or fsync() and fdatasync() on Linux (see below)?

That's not so different from other operating systems. fysnc() on Linux doesn't guarantee that data is on disk either (Ubuntu man pages notwithstanding)

It's not just Ubuntu, but also Linux. There was an effort starting a few years back as far as I understand to make fsync do the right thing on Linux (http://www.mail-archive.com/linux-fsdevel@vger.kernel.org/msg12025.html). See the Linux man pages (http://man7.org/linux/man-pages/man2/fsync.2.html):

fsync() transfers ("flushes") all modified in-core data of (i.e., modified
buffer cache pages for) the file referred to by the file descriptor fd to the
disk device (or other permanent storage device) so that all changed information
can be retrieved even after the system crashed or was rebooted. This includes
writing through or flushing a disk cache if present.

In the same way, Apple's man pages encourage F_FULLFSYNC if the intention is to flush to disk:

Note that while fsync() will flush all data from the host to the drive (i.e. the
"permanent storage device"), the drive itself may not physically write the data
to the platters for quite some time and it may be written in an out-of-order
sequence.

Specifically, if the drive loses power or the OS crashes, the application may
find that only some or none of their data was written. The disk drive may also
re-order the data so that later writes may be present, while earlier writes are
not.

This is not a theoretical edge case. This scenario is easily reproduced with
real world workloads and drive power failures.

For applications that require tighter guarantees about the integrity of their
data, Mac OS X provides the F_FULLFSYNC fcntl. The F_FULLFSYNC fcntl asks the
drive to flush all buffered data to permanent storage. Applications, such as
databases, that require a strict ordering of writes should use F_FULLFSYNC to
ensure that their data is written in the order they expect. Please see fcntl(2)
for more detail.

Apple's fsync() syscall is not the same as Linux's fsync() syscall. It leaves fine-grained control for people who know they have battery backed write caches and who don't want to force a write to disk but just get the data to the write cache, and it provides F_FULLFSYNC for people who need to get to disk.

Thus what most people think of as Node's fs.fsync() should ultimately be implemented as:

Linux: fsync
Darwin: F_FULLFSYNC
Windows: FlushFileBuffers

That was for uv_fs_fdatasync() though, not uv_fs_fsync(), and only because there was no fdatasync system call in old XNU kernels.

Yes, thanks. I saw the change to uv_fs_fdatasync and that got me worrying about Node's fsync(). I couldn't find any reference to F_FULLFSYNC for fsync() which was the reason for the issue.

@bnoordhuis
Copy link
Member

Should fdatasync() be as safe as fsync() on OS X with the usual requirement that unnecessary metadata such as mtime etc. is not flushed?

I'm okay with (effectively) making uv_fs_fdatasync an alias for uv_fs_fsync on OS X.

It's not just Ubuntu, but also Linux. There was an effort starting a few years back as far as I understand to make fsync do the right thing on Linux

Yes, I'm aware but the current state of affairs is still a mixed bag. If you use one of the "big five" file systems you're probably safe but it's hit and miss with less popular ones. The HFS+ driver somewhat amusingly seems to be one of the file systems that doesn't get it right.

santigimeno pushed a commit to libuv/libuv that referenced this issue Nov 17, 2016
Apple's fsync and fdatasync explicitly do NOT flush the drive write
cache to the drive platters. This is in contrast to Linux's fsync and
fdatasync which do, according to recent man pages. F_FULLFSYNC is
Apple's equivalent for flushing buffered data to permanent storage.

PR-URL: #1124
Refs: nodejs/node#9439
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
@jorangreef
Copy link
Contributor Author

Thanks, done.

cjihrig added a commit to cjihrig/node that referenced this issue Jan 12, 2017
Refs: nodejs#9439
Fixes: nodejs#9464
Fixes: nodejs#9690
PR-URL: nodejs#10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
italoacasas pushed a commit to italoacasas/node that referenced this issue Jan 18, 2017
Refs: nodejs#9439
Fixes: nodejs#9464
Fixes: nodejs#9690
PR-URL: nodejs#10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
italoacasas pushed a commit to italoacasas/node that referenced this issue Jan 23, 2017
Refs: nodejs#9439
Fixes: nodejs#9464
Fixes: nodejs#9690
PR-URL: nodejs#10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
italoacasas pushed a commit to italoacasas/node that referenced this issue Jan 25, 2017
Refs: nodejs#9439
Fixes: nodejs#9464
Fixes: nodejs#9690
PR-URL: nodejs#10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
italoacasas pushed a commit to italoacasas/node that referenced this issue Jan 27, 2017
Refs: nodejs#9439
Fixes: nodejs#9464
Fixes: nodejs#9690
PR-URL: nodejs#10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
MylesBorins pushed a commit that referenced this issue May 16, 2017
Refs: #9439
Fixes: #9464
Fixes: #9690
PR-URL: #10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
MylesBorins pushed a commit that referenced this issue May 18, 2017
Refs: #9439
Fixes: #9464
Fixes: #9690
PR-URL: #10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
andrew749 pushed a commit to michielbaird/node that referenced this issue Jul 19, 2017
Refs: nodejs/node#9439
Fixes: nodejs/node#9464
Fixes: nodejs/node#9690
PR-URL: nodejs/node#10717
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system. libuv Issues and PRs related to the libuv dependency or the uv binding. macos Issues and PRs related to the macOS platform / OSX.
Projects
None yet
Development

No branches or pull requests

3 participants