Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a custom session handler with Cloud Datastore. #213

Merged
merged 21 commits into from
Nov 1, 2016

Conversation

tmatsuo
Copy link
Contributor

@tmatsuo tmatsuo commented Oct 21, 2016

It's great if you give early feedback!

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Oct 21, 2016

public function open($savePath, $sessionName)
{
$this->datastore = new DatastoreClient(

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 21, 2016

Added a constructor :)

I will also add test. Maybe I can add unit test with mocking. Do you want to have a system test as well?

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 22, 2016

Added the unit test.

@tmatsuo tmatsuo changed the title [WIP] Adding a custom session handler with Cloud Datastore. Adding a custom session handler with Cloud Datastore. Oct 22, 2016
@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 25, 2016

Maybe we should not ignore the $savePath and the $sessionName.

How about to use $savePath for the Datastore namespace and $sessionName for the Datastore kind?

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 25, 2016

I started to use $sessionName for the kind.

However, utilizing $savePath while allowing the constructor injection of the DatastoreClient is tricky because there's no way changing the namespace afterwards. Do you have any recommendations?

@dwsupplee
Copy link
Contributor

Looking really nice @tmatsuo, thank you for this.

While it is not documented , it is possible to pass a namespaceId through when constructing a key. ( @jdpedrie do you think it would be worthwhile to add documentation for that?)

$key = $datastore->key('key', 1234, [
    'namespaceId' => 'Namespace'
]);

The idea of using the $savePath to set that namespace sounds pretty slick to me.

A few comments:

Have you considered a form of locking at all?

I'm a little concerned about the garbage collection. According to the docs gc is called randomly. That could lead to some unexpected charges for users if they aren't familiar with the way this is configured to work. Do you think it would make sense to document that, or provide an alternative way for users to perform the garbage collection on their own terms?

sprintf('Datastore lookup failed: %s', $e->getMessage()),
E_USER_NOTICE
);
return '';

This comment was marked as spam.

This comment was marked as spam.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 25, 2016

The idea of using the $savePath to set that namespace sounds pretty slick to me.

A few comments:

Have you considered a form of locking at all?

Good point. Maybe we can use the Datastore transaction? But it is a bit difficult with the current interface design. I think there are the same issues with any other session implementations.

I'm a little concerned about the garbage collection. According to the docs gc is called randomly. That could lead to some unexpected charges for users if they aren't familiar with the way this is configured to work. Do you think it would make sense to document that, or provide an alternative way for users to perform the garbage collection on their own terms?

Good point. Just for reducing the deletion cost (I think it's cheaper leaving entities than deleting them), maybe we can stop querying when gcLimit == 0 and document that. How does it sound?

Also, when I deployed a sample app with this handler, App Engine health check is constantly creating a new session entities in the datastore. We should also recommend that users use this handler only for needed handlers.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 25, 2016

While it is not documented , it is possible to pass a namespaceId through when constructing a key.

Also, is it same for query (which is needed in gc)?

@jdpedrie
Copy link
Contributor

@tmatsuo yep, I added the ability to set a namespaceId on Datastore::runQuery() and Transaction::runQuery() in #214. It's not in a release yet, but you should have no trouble against dev-master.

@jdpedrie
Copy link
Contributor

@dwsupplee I left it out of the documentation purposely because the right way to do it really is in the config. Most applications would want to use the same namespace across an entire project. If you think we should document it with a caveat I'm cool with that too.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 26, 2016

@jdpedrie Ah OK, great, I'll try to use that.

BTW, another viable solution for the session in cloud is to use Google Cloud Storage with gcs stream wrapper. I will also try to implement the stream wrapper.

Thanks,

@dwsupplee
Copy link
Contributor

But it is a bit difficult with the current interface design. I think there are the same issues with any other session implementations.

Actually, I believe the default session handler locks.

Just for reducing the deletion cost (I think it's cheaper leaving entities than deleting them), maybe we can stop querying when gcLimit == 0 and document that. How does it sound?

That sounds like a solid idea. I wonder if it would be best to default the limit to 0 so that a user has to explicitly ask for the query/deletion to occur. What do you think?

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 26, 2016

Actually, I believe the default session handler locks.

I see. There's no exclusive lock in datastore. Maybe we can create a transaction on open and commit in write?

That sounds like a solid idea. I wonder if it would be best to default the limit to 0 so that a user has to explicitly ask for the query/deletion to occur. What do you think?

Yeah I will do that.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 26, 2016

Now it uses the session.save_path for the namespaceId, and also uses transaction for data consistency. I still have a little concern on what happens when collision happens. Presumably it fails on write so a PHP_NOTICE error will be triggered, and usually you can see them in your logs.

Does it sound reasonable?

public function open($savePath, $sessionName)
{
$this->kind = $sessionName;
$this->namespaceId = str_replace('/', '_', $savePath);

This comment was marked as spam.

This comment was marked as spam.

{
$this->kind = $sessionName;
$this->namespaceId = str_replace('/', '_', $savePath);
$this->transaction = $this->datastore->transaction();

This comment was marked as spam.

This comment was marked as spam.

@dwsupplee
Copy link
Contributor

Hmm, I don't think we would just want the write to fail if we run into concurrency issues. I will do a little more research after having pulled down and played with the code a little bit.

@jdpedrie
Copy link
Contributor

A session handler that requires more than register, call session_start() is more than we should ask of users. It should be a drop in replacement for the default handler. Of course error handling is always a good thing, but expecting people to implement it on a part of their code that's expected to "just work" is another matter.

Yes, because write can also fail with other reasons. Only way to make sure whether the write succeeds is to have the error handler (or try-catch if we change it to throw exceptions).

Everything can fail, and obviously people should handle issues that can arise. That said, errors in the session handler are exceedingly rare. I can't recall ever running into one in the default implementation.

I'd rather not accept an implementation which we know can fail in certain (not uncommon) circumstances. Documenting how to work around those errors doesn't strike me as a compromise that is in the best interests of our users.

I think a Datastore session handler is a great idea, but as always the devil is in the details. As a user of the library, I would be worried about a session handler which was expected to fail in certain cases and which could very well cause problems for my users.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 27, 2016

@dwsupplee
Fair point, but as long as we use Datastore as the session store, it is still fair to say keep the write operation as few as possible in terms of reducing the cost :) Which also helps to avoid contention.

@jdpedrie
The default file implementation can fail too. Any other network based session implementations will fail more often of course due to network errors. Ignoring those errors is OK for some use cases, but I still think handling errors for session's write is a good practice with the current design of PHP session, especially if you don't want to lose the data.

But yeah, If you are not comfortable with this handler, I will just put it somewhere else :)

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 27, 2016

In other words, all the peculiarity you feel from this handler, comes from the characteristic of the Datastore API itself. It provides the data consistency feature via it's transaction. It is optimistic lock which is not very well suited with the current PHP session's interface.

The reason why this handler can not be a complete drop-in replacement for every use-case (like heavily complex data in session for complex ajax application) is, well, in the most part, it's due to the limit of the current PHP session interface and the default behavior.

However, it's perfectly fine if you use this library just for sign-in and sign-out, even without setting the error handler, because awesome google-cloud-php will retry upon failure, so actually failures other than conflict are really rare.

Also it's perfectly fine if you use this library with the error handler installed, if you have complex session data, and you don't want to lose them.

I can't recall ever running into one in the default implementation.

Hmm, I think you're lucky! Even filesystem based simple session handler can fail. In my opinion, if you don't want to lose any session data, only reliable way is to handle errors like I do in the sample, it applies to any other session implementations.

@dwsupplee
Copy link
Contributor

The reason why this handler can not be a complete drop-in replacement for every use-case (like heavily complex data in session for complex ajax application) is, well, in the most part, it's due to the limit of the current PHP session interface and the default behavior.

However, it's perfectly fine if you use this library just for sign-in and sign-out, even without setting the error handler, because awesome google-cloud-php will retry upon failure, so actually failures other than conflict are really rare.

I think this right here is what I'm looking for in the docs :). My only real concern at this point is someone thinking they can use it as a drop in replacement and being surprised by issues. If we make it clear what the best use cases are then I feel a lot better about putting it into people's hands.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 27, 2016

@dwsupplee
Good call. I will try to come up with a good documentation.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 27, 2016

It is my first attempt for the docblock update! PTAL @dwsupplee @jdpedrie

Copy link
Contributor

@dwsupplee dwsupplee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added documentation looks really helpful. Thanks for adding it!

@@ -26,6 +26,31 @@
*
* {@see http://php.net/manual/en/class.sessionhandlerinterface.php}
*
* Instead of storing the session data in a local file, it stores the data to
* Cloud Datastore. Biggest benefit of doing this is the data can be shared by

This comment was marked as spam.

This comment was marked as spam.

* The downside of using Cloud Datastore is the write operations will cost you
* some money, so it is highly recommended to minimize the write operations
* with your session data with this handler. In order to do so, keep the data
* in the session as few as possible; for example, it is ok to put only

This comment was marked as spam.

This comment was marked as spam.

* operations will cause the Datastore write and then it will cost you lot of
* money.
*
* This handler doesn't provide pesimistic lock for session data. Instead, it

This comment was marked as spam.

This comment was marked as spam.

*
* If you are building an ajax application which may issue multiple requests
* to the server, please design the session data carefully, in order to avoid
* possible data contentions. Also please see the 2nd exmaple in this docbock

This comment was marked as spam.

This comment was marked as spam.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 28, 2016

@jdpedrie Thanks for the review, all done.

*
* If you are building an ajax application which may issue multiple requests
* to the server, please design the session data carefully, in order to avoid
* possible data contentions. Also please see the 2nd exmaple in this docbock
* for how to properly handle errors on `write` operations.
* possible data contentions. Also please see the 2nd exmaple below for how to

This comment was marked as spam.

This comment was marked as spam.

*
* Example without error handling:
* ```
* use Google\Cloud\Datastore\DatastoreClient;

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

* session_save_path('sessions');
* session_start();
*
* # Then read and write the $_SESSION array.

This comment was marked as spam.

This comment was marked as spam.


/**
* @param DatastoreClient $datastore Datastore client.
* @param int [optional] $gcLimit A number of entities to delete in the

This comment was marked as spam.

This comment was marked as spam.

}

/**
* @param string $savePath The value of `session.save_path` setting will be

This comment was marked as spam.

return true;
}

public function close()

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@jdpedrie
Copy link
Contributor

code looks good to me, but there are some docs issues still to be addressed. :)

Thanks for working through these issues with us, @tmatsuo.

@jdpedrie
Copy link
Contributor

You can test whether the parser fails by running this command:

$ composer google-cloud docs

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 31, 2016

@jdpedrie Alright, now it seems like it compiles, PTAL

*
* $cloud = new ServiceBuilder();
* $datastore = $cloud->datastore();
* // or just $datastore = new DatastoreClient();

This comment was marked as spam.

This comment was marked as spam.

* used here. It will use this value as the Datastore kind.
* @return bool
*/
public function open($savePath, $sessionName)

This comment was marked as spam.

This comment was marked as spam.

@tmatsuo
Copy link
Contributor Author

tmatsuo commented Oct 31, 2016

@jdpedrie @bshaffer PTAL

/**
* Custom session handler backed by Cloud Datastore.
*
* {@see http://php.net/manual/en/class.sessionhandlerinterface.php}

This comment was marked as spam.

This comment was marked as spam.

@jdpedrie
Copy link
Contributor

Looks good! Thanks @tmatsuo

@dwsupplee dwsupplee merged commit c71c796 into googleapis:master Nov 1, 2016
@jdpedrie jdpedrie mentioned this pull request Nov 10, 2016
@tmatsuo tmatsuo deleted the datastore-session branch April 24, 2017 22:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: datastore Issues related to the Datastore API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants