Let FSRS control short term schedule #3375

Draft
wants to merge 9 commits into main

Conversation

L-M-Sherlock
Contributor

No description provided.

@user1823
Contributor

user1823 commented Aug 22, 2024

While I appreciate the efforts you put into this, I don't think that FSRS would be able to assign reasonable short-term intervals unless significant changes are made to the FSRS algorithm.

One important problem with the current algorithm:

I used v5.0.3 of the python optimizer to generate the v5 parameters for my collection. I got:
[1.1729, 5.1092, 32.3766, 77.043, 7.6279, 0.779, 2.3255, 0.0, 1.9067, 0.2494, 1.4454, 1.5644, 0.2035, 0.5184, 1.2658, 0.0008, 6.0, 1.0076, 0.7047]

Here, S0(1) > 1d. So, I won't see a new card again on the same day even if I press Again when I see it for the first time.

I saw that you mentioned "According some cases I researched, the stability of again could be larger than 1d." on the Forums. However, I think that you might have misinterpreted the result (assuming that the calculation of S0 in FSRS 5 is essentially the same as FSRS 4.5).

stability_for_pretrain.tsv contains:

first_rating    delta_t    y_mean    y_count
1               1          0.9365    4473
3               1          0.9832    3631

If FSRS is using this to calculate S0(1), it is not considering that these cards with Again as the first rating also have one (or more) additional same-day reviews that can also affect the stability and, thus, the recall rate on the next day.

In short, FSRS is confusing the stability after the last learning step with the initial stability.

Without the changes in this PR, this is not a significant problem because the learning steps force Anki to show the card again on the same day. However, if the scheduling is going to be based on the FSRS stability only, we need to make changes that allow FSRS to better estimate stabilities in the short-term.
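
To make this concrete, here is a rough sketch (not the optimizer's actual code) of why a day-1 retention of ~0.94 already implies S0 > 1d, assuming the FSRS power forgetting curve with DECAY = -0.5 and FACTOR = 19/81 (the FSRS-4.5 curve; FSRS-5 is assumed to use the same one). The real pretraining fits S0 across all delta_t bins weighted by count, so this is only an approximation:

```python
DECAY = -0.5
FACTOR = 19 / 81

def stability_from_retention(delta_t: float, retention: float) -> float:
    """Solve R(delta_t, S) = (1 + FACTOR * delta_t / S) ** DECAY = retention for S."""
    return FACTOR * delta_t / (retention ** (1 / DECAY) - 1)

# first_rating = 1 (Again), delta_t = 1, y_mean = 0.9365 from the table above
print(stability_from_retention(1, 0.9365))  # ≈ 1.67 days, i.e. already > 1d
```

And that retention was measured after the same-day learning steps had already been done, which is exactly the contamination described above.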

@L-M-Sherlock
Contributor Author

You would be right. But the feature only takes effect when the user has 0 learning steps (the learning steps field is empty).

Besides, to model short-term memory more accurately, we need more short-term review data with long steps. However, the default settings make this kind of data very rare.

As for the confusion between the stability after the last learning step and the initial stability, I don't think so, because FSRS-5 has considered the short-term reviews. The stability after the last learning step is the initial stability multiplied by SInc, which is based on the ratings of the same-day reviews.

assuming that the calculation of S0 in FSRS 5 is essentially the same as FSRS 4.5

They are different. Admittedly, FSRS-5 and FSRS-4.5 both use the same pretraining function. But in FSRS-5, I removed the initial stability freezer to allow the optimizer to tune the initial stability during training.

@user1823
Contributor

user1823 commented Aug 22, 2024

But the feature only takes effect when the user has 0 learning step (the learning step field is empty).

Ok, good to know. In that case, this problem won't be as important as I thought previously.

For the confusion between the stability after the last learning step and the initial stability, I don't think so.

Sorry if I wasn't clear enough. I was talking specifically about pretraining. In pretraining, the optimizer seems to calculate the stability after the last intraday learning step but then assume that this is equal to the initial stability.

Please let me know if I am interpreting it wrongly.

But in FSRS-5, I removed the initial stability freezer to allow the optimizer to tune initial stability during training.

This seems to be a good decision, but it likely won't solve the problem I mentioned above.


Regarding the scarcity of data for modelling short-term memory, I sympathize with you. However, I don't have any solution.

And this implies that Anki can't remove (re)learning steps altogether, at least not in the near future.

@brishtibheja
Contributor

In pretraining, the optimizer seems to calculate the stability after the last intraday learning step

Then I'd assume, from your parameters, that you almost always get a 1d interval for cards with Again as the first rating, if SInc is taken into account?

@Expertium
Contributor

Expertium commented Aug 22, 2024

The stability after the last learning step is the initial stability multiplied by SInc, which is based on the ratings of the same-day reviews.

Then maybe it's better to use S0*SInc(average number of same-day reviews, average grade of those reviews) as the first interval?
EDIT: for Easy it would be exactly the same as S0, since there are no learning steps after Easy. For Good it could be different, depending on how the user has configured learning steps in the past. For Again and Hard the result will definitely differ from S0.
EDIT 2: there is a problem. These two averages would have to be stored somewhere (in the memory state, I guess), and both the scheduler's code and the optimizer's code would have to change. And something like this would have to be done for the relearning steps as well.
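
To illustrate, a minimal sketch of the idea, assuming the same-day stability increase has the form SInc = e^(w17·(G − 3 + w18)) (FSRS-5's short-term update as I understand it), and reading "SInc(average number of same-day reviews, average grade)" as applying that increase avg_n times. The names avg_n and avg_grade are placeholders for the values that would have to be stored somewhere:

```python
import math

def first_interval_estimate(s0: float, avg_n: float, avg_grade: float,
                            w17: float, w18: float) -> float:
    """Hypothetical first interval in days: apply the same-day stability increase
    avg_n times with the user's average same-day grade. avg_n and avg_grade are
    placeholder names for values the optimizer would have to compute and store."""
    sinc = math.exp(w17 * (avg_grade - 3 + w18))
    return s0 * sinc ** avg_n

# With the parameters quoted earlier (S0(Again) = 1.1729, w17 = 1.0076, w18 = 0.7047)
# and made-up averages of 1.5 same-day reviews rated Good on average:
print(first_interval_estimate(1.1729, avg_n=1.5, avg_grade=3.0,
                              w17=1.0076, w18=0.7047))  # ≈ 3.4 days
```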

@user1823

This comment was marked as off-topic.

@Expertium
Contributor

If FSRS is using this to calculate S0(1), it is not considering that these cards with Again as the first rating also have one (or more) additional same-day reviews that can also affect the stability and, thus, the recall rate on the next day.

I thought about this more, and I think @user1823 is right. S0 for Again isn't actually "stability after the user pressed Again", it's "stability after the user pressed Again and went through all of the <1d learning steps". @L-M-Sherlock thoughts? You talked about this before, but maybe you misunderstood the problem.
Imagine that S0(Again) as calculated by FSRS is 1 day. But the user has several learning steps, and as he goes through them, stability increases. So the true S0 is much lower. So an interval scheduled by FSRS would be too long.

@L-M-Sherlock
Contributor Author

Imagine that S0(Again) as calculated by FSRS is 1 day. But the user has several learning steps, and as he goes through them, stability increases. So the true S0 is much lower. So an interval scheduled by FSRS would be too long.

If the user has several learning steps, FSRS-5 will optimize S0(Again), W[17], and W[18] at the same time. And S0(Again) is less than f(S0(Again)), where f is the short-term memory function.
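
For concreteness, a sketch of that short-term memory function, assuming FSRS-5's same-day update is S' = S · e^(w17·(G − 3 + w18)). With the parameters quoted earlier in this thread, a single same-day Good review roughly doubles the stability, so f(S0(Again)) > S0(Again):

```python
import math

def short_term_stability(s: float, grade: int, w17: float, w18: float) -> float:
    """Sketch of the FSRS-5 same-day (short-term) stability update."""
    return s * math.exp(w17 * (grade - 3 + w18))

# Parameters quoted above: S0(Again) = 1.1729, w17 = 1.0076, w18 = 0.7047
print(short_term_stability(1.1729, grade=3, w17=1.0076, w18=0.7047))  # ≈ 2.39 days
```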

@Expertium
Contributor

Expertium commented Sep 4, 2024

I'll just leave some thoughts here:
I came to agree with Dae about Again < Hard < Good < Easy. Having two identical intervals may be more optimal mathematically, but it's also confusing. So we should ensure that >=1d intervals differ by at least one day.
The next part may or may not be more difficult to implement, I don't know. For <1d intervals, we need to do the same thing, but with hours. For example, if Again = Hard = 6 hours, make Hard 7 hours.
The really tricky part is dealing with day rollovers. The Again < Hard < Good < Easy inequality should hold for any combination of intervals, after accounting for day rollovers.
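
A minimal sketch of what enforcing that ordering could look like (a hypothetical helper, not code from this PR); day rollovers, the tricky part, are deliberately left out:

```python
def enforce_ordering(intervals: list[float]) -> list[float]:
    """Make the [Again, Hard, Good, Easy] intervals (in days) strictly increasing:
    sub-day values are separated by at least one hour, day-or-longer values by
    at least one day. Day-rollover handling is omitted from this sketch."""
    ONE_HOUR = 1 / 24
    out: list[float] = []
    for ivl in intervals:
        if out:
            min_gap = ONE_HOUR if out[-1] < 1 else 1.0
            ivl = max(ivl, out[-1] + min_gap)
        out.append(ivl)
    return out

print(enforce_ordering([0.25, 0.25, 1.0, 1.0]))  # [0.25, ~0.29, 1.0, 2.0]
```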

@brishtibheja
Contributor

For reference, I'll paste my comment from the previous PR:


  1. No subday intervals and intervals are in proper order.

blank

  2. No subday intervals but at least two intervals match.

Is there any reason subtracting days hasn't been considered? If this is hard to do in code, sure, it's a bad idea. But ideally, if you are dealing with a person who keeps DR at the CMRR recommendations, you're better off subtracting a few days than adding a few. Also, consider the opposite situation of a person with 0.99 DR.

I'm not necessarily suggesting peeking at CMRR. In a simple situation, we can simply look at DR and use 0.86 or 0.9 as a threshold.

  3. a. At least one subday interval, but it doesn't cross the day boundary. Intervals are in order.

blank

  3. b. At least one subday interval that crosses the day boundary. Intervals would still be in order.

blank

  4. At least one subday interval that crosses the day boundary. At least two intervals will match.

In this situation, we end up with two intervals that match. In your examples, an easier solution seems to be subtracting a few hours from the lower interval so that it doesn't cross the day boundary. Sure, this solves the optics problem we don't want and still allows the user to study at intervals closer to what they want.

But consider another situation: Again = 12h and Hard = 23h at 3:00 AM. In this situation, converting Hard to 2d is better than trying to make Again not cross the day boundary, because Hard is closer to a day boundary than Again is.

I am not completely sure about this situation though. For example, the level of precision I'm seeking would require answering whether 1d means a >24h interval or not.
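
The deciding factor in these cases is whether a sub-day answer lands before or after the next day rollover. A sketch of that check (assuming a configurable rollover hour, 4 AM by default in Anki):

```python
from datetime import datetime, timedelta

def crosses_day_boundary(now: datetime, interval_hours: float,
                         rollover_hour: int = 4) -> bool:
    """True if answering now with this sub-day interval puts the card past the
    next day rollover, i.e. the 'crosses the day boundary' test used above."""
    due = now + timedelta(hours=interval_hours)
    next_rollover = now.replace(hour=rollover_hour, minute=0, second=0, microsecond=0)
    if now >= next_rollover:
        next_rollover += timedelta(days=1)
    return due >= next_rollover

# The example above: at 3:00 AM, both Again = 12h and Hard = 23h cross the 4 AM rollover.
now = datetime(2024, 9, 10, 3, 0)
print(crosses_day_boundary(now, 12), crosses_day_boundary(now, 23))  # True True
```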

@Expertium
Contributor

@dae sorry for pinging you all the time, but I have a question. Well, two.

  1. What's your opinion on FSRS handling short-term (<1d) intervals? In general.
  2. What's your opinion on postponing the next release until this PR is merged?

@dae
Member

dae commented Sep 10, 2024

What's your opinion on FSRS handling short-term (<1d) intervals? In general.

I'm skeptical at the moment. I think it has the potential to cause a lot of user confusion if not done carefully, and it looks like the details are still being figured out. I'd be pushing back strongly if this were a forced change, but since it will only affect users without learning steps, it hopefully won't be too disruptive (but I'm sure it will confuse some users who have accidentally or deliberately blanked out their learning steps, and have grown used to the default 1 minute step).

Before we get to the merging-in stage though, I'd like someone to briefly explain what the goal is with this code. Is it just to generate more data for future analysis, or is there some degree of confidence that the intervals FSRS calculates will be close enough to optimal that users might actually wish to opt in?

What's your opinion on postponing the next release until this PR is merged?

That depends on how long it's expected to take. I can hold off on a beta for a week or two if it's close and you guys feel it is important to land it in the first beta, or it could be added in a future beta - I suspect we'll need a number of them this time around.

@Expertium
Contributor

Expertium commented Sep 10, 2024

Is it just to generate more data for future analysis, or is there some degree of confidence that the intervals FSRS calculates will be close enough to optimal that users might actually wish to opt in?

More of the latter, I'd say. @L-M-Sherlock, your input would be valuable here.
Btw, Dae, you still haven't said when exactly the beta is expected to come out. And we still need Jake to work on Easy Days.

@dae
Member

dae commented Sep 10, 2024

My main blocker is I'm hoping to land #3292 before the first beta, but I need to give it some more testing first. Easy days will likely need to wait for a future beta or release, if it's not about to drop already.

@Expertium
Contributor

@jakeprobst what are the chances that you will finish Easy Days within a week or two weeks at most?

@dae
Member

dae commented Sep 10, 2024

Is landing it in the first beta actually important, or just a nice-to-have? If it's the latter, I don't think Jake should feel under any pressure.

@Expertium
Contributor

I'd like all algorithmic improvements - FSRS-5, disabling learning steps, new fuzz, Easy Days - to come within the same release, like Anki 24.10 or Anki 24.12. They can be added in between several stages of beta-testing, I guess, but that feels weird to me.

@dae
Member

dae commented Sep 10, 2024

It's not uncommon for features to land in later betas. Waiting for everything might be "cleaner" (and it's certainly less work for me!), but it means either less time for the existing changes to be tested, or a longer beta-testing period.

@jakeprobst
Contributor

jakeprobst commented Sep 10, 2024

@jakeprobst what are the chances that you will finish Easy Days within a week or two weeks at most?

unlikely!

@L-M-Sherlock
Contributor Author

More of the latter, I'd say. @L-M-Sherlock your input would be valuable here

To be honest, I'm not confident, having analyzed the short-term recall from real data recently. Here is an initial result:

If a user clicks "again" when first learning a new card, the probability of recalling it the second time that day is 74.6%. If they click "hard," the probability is 91.3%. If they click "good," it's 95.7%. For reviewing old cards, if they click "again," the probability of recalling it the second time that day is 85.1%.

It seems that short-term memory behaves very differently from long-term memory. If our goal is to achieve 90% retrievability in the short term, the current learning steps seem too long for cards rated Again and too short for cards rated Good. And FSRS is unlikely to provide intervals as short as the default learning steps module does now.

Here is the code: https://github.com/open-spaced-repetition/Anki-button-usage
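
Not the code from that repository, just a sketch of the kind of query behind those numbers, assuming a review log frame with card_id, day, and rating columns sorted by review time (the column names and the "success = not rated Again" criterion are assumptions here):

```python
import pandas as pd

def same_day_second_review_recall(revlog: pd.DataFrame, first_rating: int = 1) -> float:
    """P(second same-day review is not rated Again | first review that day was `first_rating`)."""
    g = revlog.groupby(["card_id", "day"])["rating"]
    first = g.transform("first")   # rating of the first review of that card on that day
    nth = g.cumcount()             # 0 for the first review of the day, 1 for the second, ...
    second = revlog.loc[(nth == 1) & (first == first_rating), "rating"]
    return float((second > 1).mean())

# e.g. same_day_second_review_recall(revlog, first_rating=1) for the "again" figure above
```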

@Expertium
Contributor

@L-M-Sherlock question: I assume FSRS will not schedule <1d intervals if the user has any learning steps?
Example: suppose that the optimal first intervals for Again, Hard and Good are all <1d. The user has a 1m learning step. What will the intervals look like?

@brishtibheja
Contributor

Seems the short-term memory has very differently behaviours than the long-term memory.

I'm not so sure pressing good on a card the first time you see it entails learning. If someone is pressing good or easy, it's a lot more likely that they have already learned the card outside Anki and they're retrieving from LTM.

@mlidbom

mlidbom commented Sep 20, 2024

is there some degree of confidence that the intervals FSRS calculates will be close enough to optimal that users might actually wish to opt in?

To be honest, I‘m not confident when I have analyzed the short-term recall from the real data recently. Here is an initial result

Let's remember that what FSRS is competing with here is static learning steps, not some near-optimal scheduling. I very much doubt my learning steps perform anywhere near as well as FSRS does in these statistics. I'm quite eager to try it out. I really hope it gets merged.

Edit: Also, given emptied-out learning steps and the recently merged change, what FSRS short-term scheduling is competing with here is no sub-day scheduling at all. Again, I strongly suspect this code is an improvement on that.
