Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persist and serve user preferences #1184

Merged

Conversation

brontolosone
Copy link
Contributor

@brontolosone brontolosone commented Sep 4, 2024

This is a companion to getodk/central-frontend#1024, let's hold high level discussions in there for clarity (but simple on-point code comments etc can go in here)

Related: issue getodk/central#689
Related: PR getodk/central-frontend#1024

@lognaturel lognaturel changed the title Backend for getodk/central#689 Persist and serve user preferences Sep 6, 2024
@lognaturel
Copy link
Member

I took the liberty of renaming the PR so it captures what it does, feel free to edit again!

Copy link
Member

@ktuite ktuite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In retrospect, we should have chatted more about backend on friday! But taking some time to think about it allowed all these thoughts to percolate. It's an opinionated code base, and I wanted to unpack some of that.

Coupling of Backend and Frontend

We talked about some of the reasons backend and frontend are split into separate repos and developed semi-independently some of the time(e.g. some API changes have no bearing on frontend or some frontend redesigns need no or very little backend changes).

But we also talked about how they're tightly coupled, and how there is a lot baked into Backend that is specifically built for our Frontend. Some users seem to use Central without the frontend..... I was going to write that no one uses a custom frontend with backend, but that's not exactly true... some users do use their own custom frontends, but in that case, they mainly use the data API parts of backend, not the frontend API parts of backend. If someone else makes their own frontend, it'll have it's own separate backend/db/whatever in addition to talking to Central. (So many frontends/backends in this last paragraph 😅)

Basically... while the ODK team usually puts a lot of effort into making certain things within the ODK ecosystem generic and interchangeable (e.g. non-Central backends working with Collect, or just getting things to be compatible across combinations of Central/Collect/Android/web-based forms/versions of forms/etc.), the frontend-parts of backend don't really fall under that umbrella and are allowed to be super specific.

We always release Central with Backend and Frontend in sync, and there's generally enough communication along the way to keep them in sync during development in between releases, too. If someone is working on something full stack that touches both parts, they are thoughtful about the timing of getting both separate parts merged in. A frequent pattern is either working on and merging the backend first, or working on both parts in parallel, getting something functional, refining the backend and merging it, then refining the frontend and merging it. If there's a change in backend that breaks something in frontend, we try to get the frontend fix done ahead of time or asap.

What that means for this specific user preferences PR... is that it's okay and likely desirable for backend to have a deeper, structured understanding of the user preferences frontend needs to work with. See below! This is all up for discussion and pushback, we're just trying to have folks on the Central team in agreement about what's getting stored and how.

Structure of User Preferences

In the frontend code, I'm seeing preferences with the keys projectSortMode and formTrashCollapsed. In the backend, could there be a more explicit link between a project and a project-specific preference? We have an Actees table that holds acteeIds for projects, forms, users, the site (*)

The issue talks about site-wide preference and project-specific preferences, and in addition to imagining multiple preferences at each level, we could also possibly have form-specific or dataset-specific user preferences in the future. (I know only the 2 of project-sort-order and form-trash-collapsing are in scope for this PR.)

What if we had a user_preferences table with something like

  • actorId (pointed to the actor record of a user)
  • acteeId (pointed to a project actee ID or * for site-wide)
  • preferences (jsonb)

and the API response returned the combination of multiple rows for a given user?

Or it could go further (taking inspiration from config) and store a actorId, acteeId, key, value, setAt? Splitting every preference for every user into its own row might be overkill, though, or maybe it's not much worse than already splitting things out by actee/project if there are only ever a small number of preferences per actee.

On the other hand, maybe a generic approach is simpler in the long run and quicker to implement.

I think the desire to at least split it by actee (site and per project) is that through inspecting the database and code, we'd have a better idea of what the preferences were about. We don't really want our users to be able to store random stuff in their database through this endpoint. And we'd be able to tidy things up in the future if we needed to. E.g. preferences about a deleted project or form could also be deleted. Not that you can delete projects now (only archive) or have user preferences about forms...

Tests

I was reflecting on the config tests because they are the closest, but they are also not quite the right match. It's an acceptable intro to the kinds of tests we see for a lot of our API endpoints, though!

https://github.com/getodk/central-backend/blob/master/test/integration/api/config.js

Tests in that file should inspire tests in a user-preferences file:

  • should reject if user can't do a thing

    • doesn't apply here because endpoint is about setting one's own preferences?
    • any user can set their own preferences? what if they are using the api and authenticated through an app user token or something, though? i can't remember how the auth is plumbed together in those cases right now.
  • should reject for unknown config / user preferences?

    • maybe we should check that they're setting a known preference key on a known resource (site, project, etc.) so we have a better idea of the shape of what we're storing?
    • possible new test: reject for unknown resource (e.g. project)?
  • should set the config / should patch the config

    • frontend is only going to send one kind of request? it shouldn't care if its setting new configs or not
    • test cases within this scenario (adding totally new user preferences, adding to existing prefernces, updating existing preferennce)
  • should there be a way to delete a preference? (unsetting site-wide configs was something we had to do, but I don't know if it's necessary here for user preferences. could add it later if it becomes a thing.)

lib/model/migrations/20240806-01-add-userpreferences.js Outdated Show resolved Hide resolved
lib/model/migrations/20240806-01-add-userpreferences.js Outdated Show resolved Hide resolved
lib/model/migrations/20240806-01-add-userpreferences.js Outdated Show resolved Hide resolved
lib/model/migrations/20240806-01-add-userpreferences.js Outdated Show resolved Hide resolved
lib/resources/user-preferences.js Outdated Show resolved Hide resolved
lib/resources/user-preferences.js Outdated Show resolved Hide resolved
lib/resources/user-preferences.js Outdated Show resolved Hide resolved
lib/model/query/user-preferences.js Outdated Show resolved Hide resolved
@ktuite
Copy link
Member

ktuite commented Sep 11, 2024

Plan

  1. return existing user preferences with /users/current extended metadata
{ 
    projectSort: 'alphabetical',
    projects: { 1: {formTrashCollapse: true} } 
}
  1. PATCH existing preferences (even empty preferences) with specific preference (property name, property value)

PUT /user-preferences/projectSort body: {propertyValue: 'alphabetical'}

PUT /user-preferences/projects/4/sortOrder {value: 'alphabetical'}

db tables: user_project_preferences, user_site_preferences
with cascading deletes! and relational integrity!
unique on id and property name

  1. DELETE a specific key within user preferences?

DELETE /user-preferences/project/3/formTrashCollapse

@brontolosone brontolosone force-pushed the central_689-userpreferences-rb branch 2 times, most recently from 4d8a013 to 817eaca Compare September 13, 2024 15:01
lib/http/endpoint.js Outdated Show resolved Hide resolved
Copy link
Member

@ktuite ktuite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core of this PR is looking really good. I like where the database organization of the user preferences ended up. I hope its working well for frontend! It seems like it would be easy to extend to include form-specific preferences if we ever needed those.

My main comments are about tests and seeing if there are things we can hold off on including in this PR (unless you think they're really important).

lib/resources/users.js Outdated Show resolved Hide resolved
lib/resources/user-preferences.js Outdated Show resolved Hide resolved
.expect(400); // project 123 doesn't exist
}));

it('PUT/DELETE preference storage operations', testService(async (service) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see more/smaller tests!

In this big test, the main themes i'm getting out of it are:

  • "it should store site-wide and project-specific preferences of varying complexity"
  • "it should not return preferences that do not belong to the user"
  • "it should group project preferences by project"

I think you should keep that test, or maybe split it up into 2 or 3 tests, but maybe there could also be stupidly basic tests like:

  • "it allows a site-wide preference to be set"
  • "it allows a project-specific preference to be set"
  • "it allows a preference to be deleted"
  • "it returns the (non blank) preferences in the extended metadata for the current user"

And one of our original concerns that the design of the api alleviates:

  • "it allows a preference to be modified without overwriting another preference""

I'd also be interested in tests of the failure cases

  • no propertyValue supplied
  • empty string value supplied
  • (anything else that tests the database constraints or other error handling in the resources)

Copy link
Contributor Author

@brontolosone brontolosone Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would've liked smaller tests too! Yet the way the DB transaction control works in this test framework, combined with how the GET-end of the API is an aggregate (and thus to be fully tested, needs some stuff to aggregate over...) just leads down the path of incrementally building up a whole bunch of DB state in one transaction. Unless I'm missing something that's just the nature of this thing. If I want to test the aggregation I need to do a whole bunch of PUTs. While I'm at that, I might as well test those PUTs (and then I may as well test whether DELETEd properties don't appear in the aggregate). Ideally I'd have some transaction control so I can declare DB savepoints to roll back to after an expected failure.

I'll have a look to see if there's something to be done about that!

On to your points:

  • "it should store site-wide and project-specific preferences of varying complexity"
  • "it should not return preferences that do not belong to the user"
  • "it should group project preferences by project"

Yes, it indeed tests for that, plus it tests whether deletion and overwriting work.

And one of our original concerns that the design of the api alleviates:

  • "it allows a preference to be modified without overwriting another preference"

The atomicity boundary in the API is by HTTP resource, and those map 1:1 to the DB primary keys. So we follow HTTP semantics (intuitively understood by API consumers as well), and those are already tested during the incremental buildup of DB state...

I'd also be interested in tests of the failure cases

  • no propertyValue supplied

You mean {"propertyValue": } ? That'd be invalid json, the framework catches that, and I shouldn't be testing the framework here. But I did find that a null propertyValue was not working (for now I'm blaming (our version of) slonik for not doing the right thing for sql.json(null)), fixed in e514b52, with regression test ;-)

I've added some informative error messages (and tests for those), in ed4b92d.

  • empty string value supplied

An empty json-string (thus {"propertyValue": ""} if I understand you correctly) is valid json, nothing special about it, and thus a valid preference propertyvalue per the contract (yet to be documented, but as I mentioned: value and semantics of the propertyValue are completely up to the frontend). At some point we need to trust the DB, DB driver (*cough* e514b52), and framework to pass things through correctly, including 0-length strings.

  • (anything else that tests the database constraints or other error handling in the resources)

There's a test that triggers a foreign key violation (well... until someone creates a fixture for a project with ID 123...). The propertyName nonzero length DB check constraint is impossible to violate via the HTTP API because you can't supply a zero-length pathname component. I suppose I could test a PUT to /v1/user-preferences/site//, but what would I actually be testing at that point? The Express framework's URL path normalization perhaps? Theoretically they could change it and it would be nice if when they do (would they? It'd cause mayhem) it'd break our tests, but then I'd say that if that is a reasonable expectation then it's worth testing for their URL path normalization explicitly rather than testing for it as a side effect of testing for zero-length propertyName URLs.

I'll have a look at chopping up the tests into more it()s, but as I mentioned, I have to build up state one way or another (either with many PUTs, or by writing to the DB directly, but that has no place in an API test, yet I still want to test the /users/current API endpoint with that DB content...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is pretty much the situation for all Central tests.

...just leads down the path of incrementally building up a whole bunch of DB state in one transaction. Unless I'm missing something that's just the nature of this thing. If I want to test the aggregation I need to do a whole bunch of PUTs....

Send a bunch of requests about forms/submissions to build up the state. If it gets super repetitive, we make special setup functions for specific batches of tests, but it's still making the requests for every test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding some of the smaller tests! e.g.

  • validates the request body stringently
  • can store a JS null propertyValue
  • should not allow storing preferences for nonexistent projects

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can still see an advantage of smaller tests so i can just look at an individual test and grok the 1 thing its trying to do instead of having to follow the whole journey, but eh, it's fine for now!

lib/model/migrations/20240910-01-add-user_preferences.js Outdated Show resolved Hide resolved
lib/model/migrations/20240910-01-add-user_preferences.js Outdated Show resolved Hide resolved
lib/resources/users.js Outdated Show resolved Hide resolved
lib/model/query/users.js Outdated Show resolved Hide resolved
// Yet the string 'null' (as distinct from the *jsonb* string '"null"' one would get with sql.json('null') !)
// gets properly casted by PostgreSQL to a jsonb null (as distinct from an SQL NULL), so we use that in this case.
const preparedPropertyValue = (propertyValue === null) ? 'null': sql.json(propertyValue);
const values = [userId, propertyName, preparedPropertyValue]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if you wanted to trim the propertyName. I noticed i'm able to make properties named ' ' and ' '.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 No, in the end I think that would be bad. I don't think I should treat the property name as anything else but an opaque token, as it's just an identifier. Spaces in the URL path component for the property name (encoded though of course) are valid.

Tangentially, Github markdown squished (trimmed) your ' ' and ' ' whitespace property name examples. That's sort of illustrative: it's because Markdown is for presentation, and it is documented to do that. Yet for the property names in the DB and API I don't see a benefit in constraining the property name (beyond refusing a zero-length string). They're chosen by developers, and of course a developer can choose a bad property name (including ' ') just like they can choose bad variable names, but presumably we'd catch that in review, just like we catch bad variable names.

Comment on lines +127 to +129
// { expected: "list of expected properties", actual: "list of provided properties" }
unexpectedProperties: problem(400.33, ({ expected, actual }) => `Expected properties: (${expected.join(', ')}). Got (${actual.join(', ')}).`),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another existing Problem we could use instead? I see this only comes up if you pass propertyValue AND extra stuff like {'propertyValue': 'meow', 'cats': ' '}. Maybe we just accept propertyValue and ignore the rest.

I'm also ok leaving it as is with this new Problem because there is a test for it :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did look at the existing Problems, and they had their problems. There's unexpectedAttributes but its comments and problem string then proceed to talk about arguments rather than attributes. I didn't feel like fixing that because due to the ambiguity some use sites will use it "because it's about arguments" and some others will use it "because it's about attributes" so that would require unraveling the original intent at each use site.

I could probably have lived with using the existing unexpectedAttributes error even though its name implies we're talking about "attributes" rather than "properties" ("properties" is what we're dealing with in json), I'm not that zealous, but I drew the line at "arguments" as in unexpectedAttributes's error text. Because talking about "arguments" will make for a confusing error message in the context of posting a JSON body, and I value a precise and appropriate error message more than I value saving a line of code in this case ;-)

.expect(400); // project 123 doesn't exist
}));

it('PUT/DELETE preference storage operations', testService(async (service) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is pretty much the situation for all Central tests.

...just leads down the path of incrementally building up a whole bunch of DB state in one transaction. Unless I'm missing something that's just the nature of this thing. If I want to test the aggregation I need to do a whole bunch of PUTs....

Send a bunch of requests about forms/submissions to build up the state. If it gets super repetitive, we make special setup functions for specific batches of tests, but it's still making the requests for every test.

.expect(400); // project 123 doesn't exist
}));

it('PUT/DELETE preference storage operations', testService(async (service) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding some of the smaller tests! e.g.

  • validates the request body stringently
  • can store a JS null propertyValue
  • should not allow storing preferences for nonexistent projects

.expect(400); // project 123 doesn't exist
}));

it('PUT/DELETE preference storage operations', testService(async (service) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can still see an advantage of smaller tests so i can just look at an individual test and grok the 1 thing its trying to do instead of having to follow the whole journey, but eh, it's fine for now!

const { testService } = require('../setup');

describe('api: user-preferences', () => {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was playing around with this new endpoint (with a notebook and pyodk) and saw that i could make properties named and whitespace and so on. I think the property values being pretty unrestricted is fine but having prop be different from prop seems potentially problematic.

Would love to see a test showing propertyNames get trimmed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see here for my opinion on property names. Otherwise, what's next? somepropertyname and sօmepropertyname are also distinct labels (if your font doesn't show a subtle difference, just look at the bytes!) but could easily be confused, shouldn't I prevent that kind of pitfall too then in order to keep everyone safe 🦸? I'm happy to do the work, it sounds like fun, but is it actually a real requirement? Note also that any check should be duplicated and implementations synced in the frontend, because ideally that's where you'd want to get dinged during development when you're inventing a bad 🦹 property name... 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were talking XForm spec (Forms, Submissions, Datasets, Entities), we'd want to (and do) get very clear about what counts as the same or different values, because we want to be as clear and as nice to form designers as possible and not have something not work as expected because the casing or whitespace was subtly different.

But you're right that in this context, we've talked about how this user preferences endpoint is mainly for Frontend to use, so it makes sense to keep it simple and use the property name as is.

@brontolosone
Copy link
Contributor Author

I can still see an advantage of smaller tests so i can just look at an individual test and grok the 1 thing its trying to do instead of having to follow the whole journey, but eh, it's fine for now!

The one thing I could do to skip building up state this way is to mass-create state via the DB, but then the test will fail if the DB layout would change, and these tests are supposed to test the API and not the DB specifics, as that is an implementation detail.

In terms of grokkability, the comments I've added in 4f6f7ba should help?

@ktuite
Copy link
Member

ktuite commented Oct 8, 2024

The one thing I could do to skip building up state this way is to mass-create state via the DB, but then the test will fail if the DB layout would change, and these tests are supposed to test the API and not the DB specifics, as that is an implementation detail.

We definitely want to test with the API and not build up database state directly because of the DB changing. And some of the core logic is so complicated in the database it would be / is not fun to manually build it up.

I have done a bit of that for trying to test certain database migrations, which is tricky because you can't actually use the API if the DB schema is changing mid-test. This file is one of my pain points; I'm glad to have been able to test some of these migrations but the whole situation feels a little off.

Copy link
Member

@ktuite ktuite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohoo, let's merge it!

@matthew-white matthew-white mentioned this pull request Oct 8, 2024
2 tasks
@brontolosone brontolosone merged commit 9351bf2 into getodk:master Oct 9, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants