Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Reimplement btrfs polling with benefits #20893

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mvollmer
Copy link
Member

@mvollmer mvollmer commented Aug 16, 2024

This reduces the roundtrips and allows us to maintain "temporary" mount points for btrfs filesystems that are not otherwise mounted.

The process will be shutdown when the cockpit session is closed by cockpit-ws, so we get reliable cleanup.

Fixes #20855

Demo: https://youtu.be/aNsdXtHnUUM
Anaconda demo: https://youtu.be/EM-qqogOt6g

  • make btrfs-tool more robust so that the monitor never crashes because of problems with individual filesystems. E.g., manually unmounting /var/lib/cockpit/btrfs/$uuid should just stop reports for that filesystem, and not crash the monitor.
  • demo
  • move subvol creation and deletion to the tool?
  • unmount /var mount during tear down, so that we can erase etc.
  • only in Anaconda mode: don't mount secretly for listing, but if listing works because of non-secret mounts, allow creation everywhere via a temporary secret mount. Otherwise excuse is: "At least one subvolume needs to be mounted."
  • figure out editing the label. if we can find a rw mount, use that plus udevadm trigger, if not, but it is mounted ro, use that as the excuse, if not, use udisks method.
  • don't allow deleting the last mounted subvolume, that would kill the polling. damn.
  • there is a race where cockpit wants to tear down the tmp mount, but it is gone by the time it does it. happens in a Anaconda test, check that.
  • tests
  • coverage

pkg/storaged/btrfs/btrfs-tool.py Fixed Show fixed Hide fixed
pkg/storaged/btrfs/btrfs-tool.py Fixed Show fixed Hide fixed
pkg/storaged/btrfs/btrfs-tool.py Fixed Show fixed Hide fixed
pkg/storaged/btrfs/btrfs-tool.py Fixed Show fixed Hide fixed
@mvollmer mvollmer force-pushed the storage-btrfs-local-polling branch 2 times, most recently from 03b3bbf to f4fd732 Compare August 16, 2024 11:44
pkg/storaged/btrfs/btrfs-tool.py Fixed Show fixed Hide fixed
pkg/storaged/btrfs/btrfs-tool.py Fixed Show fixed Hide fixed
@mvollmer
Copy link
Member Author

mvollmer commented Aug 19, 2024

Some notes:

  • This will mount the default subvolume of a btrfs filesystem on /var/lib/cockpit/btrfs/$uuid if that is necessary to list its subvolumes.

  • The expectation is that this only happens as little as necessary. There is a lot of machinery in btrfs-tool.py to make this happen. Btrfs subvolumes need to be polled periodically (Cockpit does it every 5 seconds), and mounting a btrfs filesystem every five seconds is not acceptable. So we only mount it once when it becomes necessary and leave it mounted until the Cockpit session ends or the filesystem has been mounted by the user for real.

  • The tool can run concurrently without getting confused. This is necessary when more than one Cockpit session is active on the same machine.

  • We also use this to occasionally run a synchronous poll, for example right after creating a new subvolume in Cockpit. This synchronous poll runs concurrently with the monitor mode of the tool. We can't just trigger the monitor since we want to synchronously get the poll results.

  • The problem with all this is of course that this is a giant race condition for users working on the command line (and also for Cockpit itself): If you unmount a filesystem while someone is logged into Cockpit on the same machine, your filesystem will silently get mounted again and you can't do whatever you wanted to do with the unmounted filesystem. This is a strong argument to only offer the automounting in Anaconda mode.

  • But to counter the last point a bit, if the user discovers the mount in /var/lib/cockpit/btrfs/ and unmounts it manually, then Cockpit will not mount it back. It will just fall out of date with external changes. The synchronous polling will still work, so Cockpit will not fall out of sync with its own actions.

  • We need to mount the filesystem read-write, unfortunately. If we mount it read-only, nobody can mount any subvolume of it read-write. (The first mount of a btrfs filesystem determines the readonlyness of all of it. This also affects other mounts, of course, but they are not our problem here. E.g. if you mount subvol "root" on "/" read-only, you can't mount subvol "var" on "/var" read-write anymore.)

  • But because we mount read-write, we can use the mount point in /var/lib/cockpit/ also to create and delete subvolumes on otherwise unmounted btrfs filesystems.

  • BUT: since subvolumes can only be created opr deleted while the filesystem is mounted somewhere, we could just poll it once Cockpit starts, and then only while it is mounted, and stop polling when it is not mounted, with the assumption that nothing will change when it isn't actually mounted. But this is likely unreliable.

  • Another option is to put a explicit "refresh" button into the Cockpit UI. This would need to be clicked for Cockpit to pick up external changes.

@mvollmer mvollmer force-pushed the storage-btrfs-local-polling branch 3 times, most recently from 2062f54 to 37ba9ce Compare August 28, 2024 09:32
@mvollmer mvollmer added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Aug 28, 2024
@mvollmer mvollmer removed the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Aug 28, 2024
@mvollmer mvollmer force-pushed the storage-btrfs-local-polling branch 2 times, most recently from 0d8374b to 58c912f Compare August 29, 2024 06:56
@mvollmer mvollmer changed the title WIP - poll btrfs with python storage: Reimplement btrfs polling with benefits Aug 29, 2024
if b['fstype'] == "btrfs":
uuid = b['uuid']
mps = list(filter(lambda x: x is not None and not x.startswith(TMP_MP_DIR), b['mountpoints']))
has_tmp_mp = len(list(filter(lambda x: x is not None and x.startswith(TMP_MP_DIR), b['mountpoints']))) > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit, could be len(mps)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, `len(mps) < len(b['mountpoints']), yeah this code is sh*t.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still want to rewrite this? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

return filesystems


tmp_mountpoints = set()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not looking too deep, we have this global state and the db why do we not save the tmp_mountpoints in the DB?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tmp_mountpoints is per process state, just so that we remember which ones this very process has added, so that they can all be removed when it exits. The db is global state for all processes.

@mvollmer mvollmer marked this pull request as ready for review August 29, 2024 11:11
@mvollmer mvollmer requested a review from jelly August 29, 2024 11:11
if b['fstype'] == "btrfs":
uuid = b['uuid']
mps = list(filter(lambda x: x is not None and not x.startswith(TMP_MP_DIR), b['mountpoints']))
has_tmp_mp = len(list(filter(lambda x: x is not None and x.startswith(TMP_MP_DIR), b['mountpoints']))) > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still want to rewrite this? :)

pkg/storaged/btrfs/btrfs-tool.py Show resolved Hide resolved
pkg/storaged/btrfs/btrfs-tool.py Show resolved Hide resolved


@contextlib.contextmanager
def context_chdir(path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, this now exists in 3.11 https://docs.python.org/3/library/contextlib.html#contextlib.chdir

We can't use it yet, but adding a note might be nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I originally had contextlib.chdir here but had to stop using it bc not all our test images have it. I add a comment.

os.makedirs(TMP_MP_DIR, mode=0o700, exist_ok=True)
fd = os.open(path, os.O_RDWR | os.O_CREAT)
fcntl.flock(fd, fcntl.LOCK_EX)
data = os.read(fd, 100000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the most robust :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, forgot about that... thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually... this is pretty ok, no? Files will never have a short read, and the 100000 gives us a safety limit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a read_all function anyway.

pkg/storaged/btrfs/btrfs-tool.py Show resolved Hide resolved
pkg/storaged/btrfs/volume.jsx Show resolved Hide resolved
@@ -1245,6 +1245,8 @@ export const TeardownMessage = (usage, expect_single_unmount) => {
if (use.block) {
const name = teardown_block_name(use);
let location = use.location;
if (location && location.startsWith("/var/lib/cockpit/btrfs/"))
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, so we hide it here! Feels a bit random :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about introducing a new mount option x-cockpit-tmp-mount and then filtering based on that mount option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is a bit random and also not totally trivial. We do want Cockpit to teardown this mount point because otherwise we couldn't format a block device with btrfs on it. The btrfs-tool can tolerate that. We just don't want to show it in the UI.

As to how to identify this mountpoint... I think just the path is enough and we don't need to identify more machinery. Also, we don't really have access to run-time mount options in Cockpit, only to the ones in fstab. Except for btrfs filesystems from the findmnt poll....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, having mount options available... if that is btrfs specific then a general x-cockpit-tmp-mount which only works in btrfs case because of findmnt is not ideal. So we scratch the idea but might suggest adding a constant for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but might suggest adding a constant for it.

Yes, good idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

pkg/storaged/btrfs/btrfs-tool.py Outdated Show resolved Hide resolved
path = TMP_MP_DIR + "/db"
os.makedirs(TMP_MP_DIR, mode=0o700, exist_ok=True)
fd = os.open(path, os.O_RDWR | os.O_CREAT)
fcntl.flock(fd, fcntl.LOCK_EX)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I open two browsers with the same user this can race, and flock() would throw so do we need a retry mechanism here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flock would block, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read the man page... https://manpages.debian.org/bookworm/manpages-dev/flock.2.en.html#DESCRIPTION

It's only non-blocking if you include LOCK_NB

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want it to block. I am afraid I don't understand what failure scenario you are after here.

We used to ignore all other mounts when the first one is for the root
filesystem. But filesystems can have more than one mount and we should
look at each one individually.

This also prepares the code for more cases.
@mvollmer mvollmer force-pushed the storage-btrfs-local-polling branch 4 times, most recently from 4e8fc61 to b1b434b Compare September 11, 2024 11:57
The subvolume polling is now done by a long-running monitor program
that only reports changes back to Cockpit. This reduces traffic in
general and allows us to do additional clever things.

The monitor program can optionally keep all btrfs filesystems mounted
in a "secret" location, so that we can list the subvolumes of all
btrfs filesystems, even those that are not officially mounted
otherwise.

This mode is only enabled in Anaconda mode: It is important there
since filesystems are normally not mounted when preparing storage for
Anaconda, and outside of Anaconda mode, keeping long standing secret
mounts is deemed to invasive.

The program can also carry out btrfs operations (such as creating
subvolumes) while temporarily mounting it. This allows Cockpit to
always create and delete subvolumes. The UI still only does it when
monitoring subvolumes works as well, however.
@@ -397,15 +290,91 @@ function btrfs_findmnt_poll() {
});
}

function btrfs_update(data) {
if (!client.uuids_btrfs_subvols)
client.uuids_btrfs_subvols = { };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.

if (!client.uuids_btrfs_subvols)
client.uuids_btrfs_subvols = { };
if (!client.uuids_btrfs_usage)
client.uuids_btrfs_usage = { };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.

if (!client.uuids_btrfs_usage)
client.uuids_btrfs_usage = { };
if (!client.uuids_btrfs_default_subvol)
client.uuids_btrfs_default_subvol = { };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.

Comment on lines +306 to +307
if (data[uuid].error) {
console.warn("Error polling btrfs", uuid, data[uuid].error);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 added lines are not executed by any test.

Comment on lines +365 to +366
channel.catch(err => {
throw new Error(err.toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 added lines are not executed by any test.

Comment on lines +69 to +72
if (usage.Blocking) {
dialog_open({
Title: cockpit.format(_("$0 is in use"), name),
Body: BlockingMessage(usage)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 4 added lines are not executed by any test.

Title: cockpit.format(_("$0 is in use"), name),
Body: BlockingMessage(usage)
});
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.

// We don't complain about the rootfs, it's probably
// configured somewhere else, like in the bootloader.
if (m == "/")
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.


// This is the mount point used for monitoring btrfs filesystems.
if (m.startsWith(BTRFS_TOOL_MOUNT_PATH))
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: btrfs volumes that require mounting to perform an action should auto-mount
3 participants