Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use TimerTask in the 3scale batcher policy #786

Merged
merged 8 commits into from
Jul 2, 2018

Conversation

davidor
Copy link
Contributor

@davidor davidor commented Jun 22, 2018

This PR replaces the usages of ngx.timer.every with TimerTask instances.

This solves the issue of not being able to cancel timers. Recurrent tasks created using TimerTask can be canceled, but we cannot cancel timers created with ngx.timer.every. This was causing a bug, Timer would continue running even after a config reload.

@davidor davidor requested a review from a team as a code owner June 22, 2018 16:26
@davidor davidor changed the title Use TimerTask in the 3scale batcher policy [WIP] Use TimerTask in the 3scale batcher policy Jun 22, 2018
local function ensure_report_timer_on(self, service_id, backend)
local check_timer = self.semaphore_report_timer:wait(0)
local function ensure_timer_task_created(self, service_id, backend)
local check_timer_task = self.semaphore_report_timer:wait(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to .new ? This policy should not be initialized in init phase, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't because we don't have access to the service ID in .new()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Maybe better to leave it for later, butI think we should think about changing the data structure to remove the service_id dependency. It should be as simple as create timer in new that gets reports from reports batcher. There should be no need for service_id because reports batcher instance is specific to this policy and timer, so it will ever contain only reports from this service. I see the reports batcher actually uses that service_id to do a lock on shmem. Hopefully we can solve that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. That part can be simplified a lot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we generate some UUID instead of using service id ? Then every policy would take care just of its own data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to an idea that we discussed some time ago.

Right now, pending reports are stored in a shared dictionary and every instance of the policy creates a timer (per worker) to report all the pending reports for its service ID.

After introducing TimerTask I see it more clearly that it might be better to adopt a different strategy. We could store the pending reports in a table of the instance, as we know that there'll only be one instance of the policy for a given service ID per Apicast worker. Pros: no locks, simpler code. Cons: possibility of losing reports if a worker dies, which in practice I don't think it'll be an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this for a future PR.
In this one, I'd like to focus on switching to using TimerTask to be able to cancel timers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, definitely 👍

@davidor davidor force-pushed the use-timertasks-in-batcher-policy branch 6 times, most recently from 11ccd3d to 511cce3 Compare June 28, 2018 11:11
@davidor davidor force-pushed the use-timertasks-in-batcher-policy branch from 511cce3 to bc4c74a Compare June 29, 2018 14:52
@davidor davidor changed the title [WIP] Use TimerTask in the 3scale batcher policy Use TimerTask in the 3scale batcher policy Jun 29, 2018
@davidor davidor changed the title Use TimerTask in the 3scale batcher policy [WIP] Use TimerTask in the 3scale batcher policy Jun 29, 2018
@davidor davidor force-pushed the use-timertasks-in-batcher-policy branch 2 times, most recently from 0b1cddd to 74ba6b1 Compare July 2, 2018 13:03
@davidor davidor changed the title [WIP] Use TimerTask in the 3scale batcher policy Use TimerTask in the 3scale batcher policy Jul 2, 2018
local usage = context.usage
local service = context.service
local service_id = service.id
self.service_id = service_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it is being assigned to self ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this is a leftover from some test. Same with self.backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

function _M:schedule_one(delay)
-- Need to wrap the task in a function that discards the first param (the
-- "premature" one sent by ngx.timer.at)
ngx.timer.at(delay or 0, function(_, ...) self.task(...) end, unpack(self.args))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to allocate a function for each call. Is that necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does no longer apply. This method has been replaced.


-- Can't check all the arguments of ngx.timer.at because it calls an
-- private function but at least we can check the interval (first arg)
assert.equals(interval, ngx_timer_stub.calls[1].vals[1])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can change this to use matchers and verify it was called with some function (matcher.function) and exact second parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the function is private, so I don't think we can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matcher can match the type like:

assert.spy(ngx_timer_stub).was_called_with(matcher.is_function(), 1)

http://olivinelabs.com/busted/#matchers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm familiar with matchers, but I misunderstood you.

I thought that you wanted to check that ngx_timer_stub was called with a specific function and I was saying that it was not possible because it's a private one. Now I see that you suggested just to check that it's a function no matter which one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check for this 👍


if self.timer_task then
self.timer_task:schedule_one(1)
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we will rely on cancelling the timer by __gc callbacks ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works, but let's cancel the timer_task here to be explicit about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

-- run before that so we do not leave any pending reports.

if self.timer_task then
self.timer_task:schedule_one(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why delay 1 and not 0 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

From the point of view of the policy, I think it does not matter. However, busted crashes with a segfault when this is 0. Not sure why exactly. It might be related to how busted collects the garbage after all the tests. It does not crash with a delay of 0.1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then this is definitely worth a comment :-D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end I changed this and implemented it as we discussed off-line: ba883c1

@davidor davidor force-pushed the use-timertasks-in-batcher-policy branch 2 times, most recently from f3d949e to 5398758 Compare July 2, 2018 16:41
…er to real usage

Before, the tests set a high reporting interval so a report was not
triggered in the middle of the test. The tests relied on calling a
specific endpoint to force a report to backend. Now, instead of doing
that, we let the policy to run report jobs as it would normally do, we
aggregate the data we receive in the reporting endpoint, and at the end
of the test, we verify it. This approach is closer to a real usage of
the policy.
@davidor davidor force-pushed the use-timertasks-in-batcher-policy branch from 5398758 to b04d254 Compare July 2, 2018 16:43
Copy link
Contributor

@mikz mikz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@davidor davidor merged commit a95ebdd into master Jul 2, 2018
@davidor davidor deleted the use-timertasks-in-batcher-policy branch July 2, 2018 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants