Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement shared memory BSD locks #85

Closed
ncruces opened this issue Apr 28, 2024 · 5 comments · Fixed by #90
Closed

Implement shared memory BSD locks #85

ncruces opened this issue Apr 28, 2024 · 5 comments · Fixed by #90
Labels
enhancement New feature or request

Comments

@ncruces
Copy link
Owner

ncruces commented Apr 28, 2024

Both the BSDs and illumos support shared memory, but not OFD locks.

Some form of in-memory locks would allow them to (safely) support WAL mode, as long we lock the database file exclusively to prevent other processes from accessing it.

@ncruces ncruces added enhancement New feature or request help wanted Extra attention is needed labels Apr 28, 2024
@ncruces ncruces changed the title Implement in-memory WAL locks Implement shared memory WAL locks May 20, 2024
@ncruces ncruces changed the title Implement shared memory WAL locks Implement shared memory BSD locks May 20, 2024
@ncruces

This comment was marked as outdated.

@daenney
Copy link
Contributor

daenney commented Jun 3, 2024

So far switching to this library has allowed users on Illumos to now run GoToSocial and since in general this library has a much easier time supporting all platforms the Go compiler supports we'd like to complete that transition.

Since this issue affects Illumos too I'm actually a bit worried that user might now have a problem as our default is db-sqlite-journal-mode: "WAL" in settings. Do we need to tell them to switch to TRUNCATE for the time being?

However, as noted in our previous discussion, we would like to keep the FreeBSD and OpenBSD ports working and we generally prefer to use WAL mode as it tends to be more performant in our case.

Could you elaborate a bit on what it would take to implement this? I wouldn't mind taking a stab at it, but I'm not entirely sure where to begin. An implementation built on Go's synchronisation primitives is probably the easiest on me personally, but I can look at other things too if you've got some pointers for me.

@ncruces
Copy link
Owner Author

ncruces commented Jun 3, 2024

There are a few sides to this.

It should be noted that, even if this is fixed, the performance, for most OSes/architechtures, won't be amazing. The wazero compiler is amd64/arm64 only, and Linux/macOS/FreeBSD/Windows only. Improving on that is out-of-scope here.

Switching to TRUNCATE is definitely the easiest solution today. Performance under load won't be great, though, as BSD locks have worse concurrency than OFD locks.


As for the actual issue, I'd like to solve this, but it's definitely challenging.

The major issue is how badly broken POSIX locks are. BSD locks have much better semantics, but don't support multiple locks per file. OFD locks are the best of both worlds.

For rollback journal, you can get away with a single lock per file, if you accept: (1) reduced concurrency, (2) writer starvation.
For WAL mode, you really need multiple locks per file. And on most unixes, that's POSIX locks, period.

So, to synchronize with other processes, we need some kind of file locks, and to synchronize internally we need something in-memory. That's very close to what SQLite did, and it's a PITA to implement.

I've spent some time trying to find an angle that's easier but, TBH, haven't found any. Both my suggestions above quickly break down.

@ncruces
Copy link
Owner Author

ncruces commented Jun 4, 2024

Sleeping over it, I may have a solution that I'm not sure doesn't work. Writing this down so I don't forget. I'll take a look, time permitting.

The idea is to make a shm.go for flock.

shmMap must open the file, always O_RDWR. Then it goes through a list (a global variable), and checks if os.SameFile as an already opened file. If one matches, it uses that instead (increasing a ref count). If none matches, it locks the file exclusively, and does the DMS truncate branch. If it can't lock, returns BUSY; otherwise it carries on. The exclusive file lock is never released.

shmLock must "lock" on an object stored in the global list. Locks that may be shared must be ints, locks that are only locked exclusively can be bools; something like that. A mutex around access to these fields is simplest, and not a performance issue.

shmUnmap must release any held locks, decrease the ref count, and close/delete the file.

And vfsShm must also now have an explicit Close that also releases any held locks, and decreases the ref count (this accounts for a connection that's force closed).

This should be possible to test under Linux and macOS with the tag sqlite3_flock. All tests (including mptest) should reliably pass.

On macOS we must also test if the sqlite3 CLI can open a database that's opened like this: it shouldn't (on Linux we can't run this test). This is important to prevent corruption.

@daenney
Copy link
Contributor

daenney commented Jun 4, 2024

It should be noted that, even if this is fixed, the performance, for most OSes/architechtures, won't be amazing. The wazero compiler is amd64/arm64 only, and Linux/macOS/FreeBSD/Windows only. Improving on that is out-of-scope here.

I think that's fine for GoToSocial at least. We don't really expect anyone to be running this on something other than amd|arm64. Maybe risc64 at some point in the future. We only target Linux and the BSDs, anything beyond that is great if it works but not something we make promises about.

@ncruces ncruces mentioned this issue Jun 4, 2024
daenney added a commit to superseriousbusiness/gotosocial that referenced this issue Jun 4, 2024
Adds documentation and a configuration validation check to ensure that
when folks run with the WASM SQLite build on anything other than Linux
we don't allow them to run with journaling set to WAL. That will result
in database corruption until ncruces/go-sqlite3#85 is resolved. It's
otherwise safe to use the WASM build.

Fixes #2962
daenney added a commit to superseriousbusiness/gotosocial that referenced this issue Jun 4, 2024
Adds documentation and a configuration validation check to ensure that
when folks run with the WASM SQLite build on anything other than Linux
we don't allow them to run with journaling set to WAL. That will result
in abysmal performance until ncruces/go-sqlite3#85 is resolved.

Fixes #2962
@ncruces ncruces removed the help wanted Extra attention is needed label Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants