Variable tags #1122

Open
nikhilwoodruff opened this issue Apr 29, 2022 · 4 comments

Comments

@nikhilwoodruff
Contributor

It would be really useful to be able to assign a set of tags to variables. For example, a benefit variable might have the tags `["benefit", "means-tested"]`. This would enable applications which use APIs of country models (for example, PolicyEngine's API explorer) to filter by tags, and I think it would be very low-maintenance to implement (I'd be happy to). I would welcome any feedback (cc @benjello, @sandcha). I'm planning to add this as a monkey-patch to the UK and US systems (as we have done with other metadata fields), but it would probably be nicer to have a standard interface shared with other country models.
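As a rough illustration of the proposal, tags could be declared as a plain class attribute on each variable. This is only a sketch: the `Variable` class below is a hypothetical stand-in, not the real base class of any country model, and the `tags` attribute, the example variables, and the `get_tags` helper are all assumptions made for illustration.

```python
class Variable:
    """Hypothetical stand-in for a country model's variable base class."""

    label = ""


class housing_benefit(Variable):
    label = "Housing benefit"
    # Assumed new metadata field: a plain list of strings.
    tags = ["benefit", "means-tested"]


class income_tax(Variable):
    label = "Income tax"
    # No tags declared on this variable.


def get_tags(variable_class):
    """Return a copy of the tags declared on a variable class, or an empty list."""
    return list(getattr(variable_class, "tags", []))


print(get_tags(housing_benefit))
print(get_tags(income_tax))
```

Reading the attribute with a default means variables that declare no tags keep working unchanged, which is what keeps the field low-maintenance.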

@benjello
Member

Cc @MattiSG

@MattiSG
Member

MattiSG commented May 5, 2022

I understand this as a specific case of #1071. @nikhilwoodruff please correct me if I got it wrong!

Specifying this does not solve anything (even less so because I'm very conscious of how #1071 is currently stuck), but referencing this here aims at helping with centralising the discussion 🙂 It is useful to have this issue as a way to distinguish between use cases.

In order to understand the value for the wider community, could you please elaborate a bit more on the end user use case you have? What is the aim of consuming these tags? Does it help with sorting in a UI? What does your example means-tested tag mean? 🙂

@nikhilwoodruff
Contributor Author

Thanks @MattiSG - yes, I can see why that issue got a bit more complex: perhaps it might be easier to separate the question of metadata which informs some additional computation (e.g. the min attribute implies some extra validation), versus metadata which only aims to better organise data.

Yes, I agree it's a specific case of #1071 - but I think this issue falls in the second case, in that it wouldn't need any extra validation or computation (except maybe to test that tags is of type List[str]?).
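The type check mentioned above could be as small as the following sketch. The `validate_tags` function is a hypothetical helper, not part of any existing library; it only checks that the value is a list whose items are all strings.

```python
def validate_tags(tags):
    """Raise TypeError unless `tags` is a list of strings; return it unchanged."""
    if not isinstance(tags, list):
        raise TypeError(f"tags must be a list, got {type(tags).__name__}")
    for tag in tags:
        if not isinstance(tag, str):
            raise TypeError(f"each tag must be a str, got {type(tag).__name__}")
    return tags


validate_tags(["benefit", "means-tested"])  # passes silently

try:
    validate_tags("benefit")  # a bare string is rejected, not treated as a list
except TypeError as error:
    print(error)
```

Rejecting a bare string explicitly matters because a string is itself iterable, so a looser check would silently accept `"benefit"` as a sequence of one-character tags.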

The main use case here is to enable users of the API to more easily find or explore the set of variables, without knowing their names or having to look through the source code. For example, let's say I want to find the set of variables which are involved in computing a particular benefit (e.g. Universal Credit in the UK). If I could filter variables to those including the Universal Credit tag, that'd speed up the process - much like how on GitHub, I can filter all issues/PRs to those including the kind:solution tag. means-tested is one tag we might use - it refers to any transfer to households or individuals which is reduced with income - certain benefits and certain tax allowances.
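The discovery use case described above could look something like this on the consumer side. The mapping of variable names to tags is made-up example data, and `variables_with_tag` is a hypothetical helper; a real API would presumably return the tags alongside the other variable metadata.

```python
# Hypothetical example data: variable name -> list of tags.
VARIABLE_TAGS = {
    "universal_credit": ["benefit", "means-tested", "universal-credit"],
    "uc_standard_allowance": ["universal-credit"],
    "personal_allowance": ["tax-allowance", "means-tested"],
    "income_tax": ["tax"],
}


def variables_with_tag(variable_tags, tag):
    """Return the sorted names of all variables carrying the given tag."""
    return sorted(name for name, tags in variable_tags.items() if tag in tags)


print(variables_with_tag(VARIABLE_TAGS, "universal-credit"))
print(variables_with_tag(VARIABLE_TAGS, "means-tested"))
```

Because a variable can carry several tags, the same variable can show up under both a programme-specific filter and a cross-cutting one such as `means-tested`.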

@MattiSG
Member

MattiSG commented May 5, 2022

> enable users of the API to more easily find or explore the set of variables

Gotcha! 👌

I believe @openfisca/france-contrib implement this by using long variable names, akin to namespacing (except that it is not really since there are no “namespaces”). In your case, it would work with prefixing your sets of variables with universal_credit_ or uc_.
I am not saying this solution is the best; I'm just sharing it to illustrate how things are currently done. This approach does have some value, because tagging would not prevent collisions: if, for example, you have an eligible variable tagged with universal-credit, that would not prevent its name from colliding with an eligible variable tagged with child-care-benefit. You will always end up having to distinguish them somehow anyway, and I assume the additional information needed for that is likely to be comparable to what you would tag the variable with. Please correct me if that last assumption is wrong!
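To make the collision point concrete, here is a small sketch of a registry keyed by variable name. The `register` and `variables_with_prefix` helpers and the variable names are hypothetical; the sketch only assumes that variable names must be unique within a system.

```python
def register(registry, name, tags=()):
    """Add a variable to the registry; names must be unique whatever the tags."""
    if name in registry:
        raise ValueError(f"variable name already registered: {name}")
    registry[name] = list(tags)


def variables_with_prefix(registry, prefix):
    """Prefix-based discovery, as with long 'namespaced' variable names."""
    return sorted(name for name in registry if name.startswith(prefix))


registry = {}
register(registry, "eligible", tags=["universal-credit"])

try:
    # Different tags do not help: the name itself still collides.
    register(registry, "eligible", tags=["child-care-benefit"])
except ValueError as error:
    print(error)

# Prefixed names avoid the collision and already support discovery.
register(registry, "uc_eligible", tags=["universal-credit"])
register(registry, "child_care_benefit_eligible", tags=["child-care-benefit"])
print(variables_with_prefix(registry, "uc_"))
```

A prefix gives each variable exactly one grouping, whereas tags allow several at once, which is where the two approaches would differ in practice.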

I do see the added value of exposing more information to reusers for discovery. Would variable namespacing also fit the bill for you? Do you see many cases where you would like to add several tags? Please keep this issue updated with the results of your monkey-patched experimentation, it's a great way to accumulate examples and decide to go towards normalisation based on them! 😃
