Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

propose some changes to the OCM API to move forward to a version 1.0 #37

Merged
merged 2 commits into from
Jul 1, 2020

Conversation

schiessle
Copy link
Collaborator

@schiessle schiessle commented May 29, 2018

some of you might remember my talk at the last CS3 conference where I presented Nextcloud's plans to implement the OCM API. As I already mentioned there and discussed with multiple stakeholders we can't implement the current version 0.3.0 directly but need to add some small improvements and clarification to meet our ambitious goal of a general cloud federation platform. Other cloud solutions seems to have similar challenges. Over the last weeks and months we worked at a first implementation at Nextcloud which lead to some (mostly small) changes and improvements which I would like to discuss with you. Many of this changes could be optional, I will mention it in the details below. But I would highly suggest to make most parameters mandatory as otherwise all implementation will have to distinguish between many different combination of parameters and handle them correctly. A complexity I would like to avoid in favor of a user-friendly API.

End-Point discovery:

OCM should be used by a wide range of cloud solutions, as such we can't assume that everyone can implement a tld/share end-point. Sometimes this might not be possible for technical reasons, some projects might have dedicated paths for their public APIs and would like to keep it for consistency, some might already use this end-point for something different, etc. Therefore I think we need some kind of end-point discovery. My suggesting:

Sending a GET request to /ocm-provider/ should return:

{
  "enabled":true,
  "apiVersion":"1.0-proposal1",
  "endPoint":"http://localhost/server/index.php/ocm",
  "resourceTypes": [
    {
      "name": "file",
      "shareTypes": ["user", "group"],
      "protocols": {
        "webdav":"http://localhost/server/public.php/webdav/"
      }
    }
  ],
}

In this example http://localhost/server/index.php/ocm would be the root for all the end-points defined in the API specification. You can easily find out which API version and which share types are supported because in the future at least some implementations don't want to limit themselves to files. Additional we use it to tell the server which protocol is supported to exchange the shared data and the end-point of the protocol.

Sending a share:

I modified the POST request to send a share to look like this:

{
  "shareWith": "peter.szegedi@geant.org",
  "name": "spec.yaml",
  "description": "This is the Open API Specification file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "owner": "dimitri@apiwise.nl",
  "sender": "john@apiwise.nl",
  "ownerDisplayName": "Dimitri",
  "senderDisplayName": "John Doe",
  "shareType": "user",
  "resourceType": "file",
  "protocol": {
    "name": "webdav",
    "options": {
      "sharedSecret": "hfiuhworzwnur98d3wjiwhr"
      "permissions": "{http://open-cloud-mesh.org/ns}share-permissions",
    }
  }
}

Distinguish between sender and owner

In case of a re-share i think it is often beneficial to distinguish between the person who created the share and the person who owns the resource. Therefore I added a "sender", this could be optional and we just fall-back to sender=owner if no explicit sender is given.

Allow display names

While the technical user ID's "user@server" are perfect to identify the user and the location, they are not really user-friendly. In the user interface you most likely want to show a real nane. Therefore I
added "senderDisplayName" and "ownerDisplayName"

shareType

We should be able to distinguish between different share types, the most obvious are "user" and "group" shares. But maybe in the future there might be more pop up. Therefore I added "shareType", if the server receives a share with a unknown share type, the server can just return 501 (not implemented)

During discussions at the CS3 many stakeholders raised the wish t use something more human readable then integers. That's why I added now the "permissions" filed which contains just a array with the values "read", "write", "share". Of course every share should at least come with "read" permissions.

resourceType

This specifies the resource type. So far we only have file sharing, in case no resource type is given we could fall back to "file", although I would prefer to have this always explicite. In the future some implementations might provide additional resourceTypes, for which they might specify the "protocol" part so that other could implement it as well in a interoperable way. If a server received a share for a unknown resourceType it returns 501.

Protocol

For the webdav protocol we should make sure that we chose a specification which is backward compatible to all the shares that already exists between Pydio, ownCloud, CernBox and Nextcloud. Other protocol can be implemented freely in the future. If a implementation want to provide additional "resourceType", they also need the specify the correspondig protocl block.

sharedSecret

This is used as both username and password when mounting a federated share via webdav. It will be also used for notifications to authenticate against the remote server.

Permissions

For permissions I would suggest to define a webdav property, so that the receiving side can query the permissions with a propfind at any time. We could either let the sender define the webdav property (as defined in this example) or we agree on one property name everybody should use.

During discussion at the CS3 many people raised the wish to move away from the integer based permissions to something more human readable. So I would suggest that the propfind should return a JSON encoded array which can have the parameters "read", "write", and "share".

Return codes

On success we return 201, the body can contain the display name of the recipient for general user experience improvements (this way user facing strings can show the nice human readable display name instead of the technical cloud id). The providerId (in combination with the server URL of the owner) and the shared secret are always the two identifier of a specific share on both sides.

If a resourceType or a shareType is not known we return 501, not implemented together with a message which parameter is unknown.

Send a notification

I modified the post request to send a notification to look like this:

{
  "notificationType": "SHARE_REMOVED",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {"message" : "I don't want to use this share anymore."}
}

notificationType

the type of the notification, same as in version 0.0.3

resourceType

like above we specify to which kind of resource the message belong to, for now only "file" is supported but it might change in the future for some implementation. If a message is received for a unknown resource, the server can just return 501.

providerId

specifies to which share the notification belong to.

notification

depending on the share type and and notification we might send additional information, this goes in the "notification", see some examples below. That's basically the payload of the notification.

message

A human readable message which the receiving side could display to the user.

Return codes

On success we return 201. Optionally the response body can contain a JSON object with some response data.

Defining notifications

While notifications should be optional, I think we should define a set of notifications for the file sharing case. Otherwise everyone will implement their own slightly different notification and we will end up in a incompatible mess of notifications. Same goes for other resourceTypes, as soon as someone introduces a new resource type they should define a set of possible notifications which can be used for communication.

For file sharing we suggest this set of notifications:

{
  "notificationType": "SHARE_ACCEPTED",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {
                    "sharedSecret": "hfiuhworzwnur98d3wjiwhr",
                    "message": "Recipient accepted the share"
                  }
}
{
  "notificationType": "SHARE_DECLINED",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {
                    "sharedSecret": "hfiuhworzwnur98d3wjiwhr",
                    "message": "Recipient declined the share or unshared it from themself"
                  }
}
{
  "notificationType": "SHARE_UNSHARED",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {
                    "sharedSecret": "hfiuhworzwnur98d3wjiwhr",
                    "message": "File was unshared"
                  }
}
{
  "notificationType": "REQUEST_RESHARE",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {
                    "sharedSecret": "hfiuhworzwnur98d3wjiwhr", 
                    "shareWith": "peter.szegedi@geant.org",
                    "senderId": "9r48fh78-8r94f893-89483u89fh3",
                    "message": "Recipient of a share ask the owner to reshare the file with peter.szegedi@geant.org"
                  },
}

If the owner performs the requested reshare it will return the sharedSecret it created for the recipient together with the providerId. The secret is used both, to communicate with the receiver directly and to authenticate any follow-up message with the owner regarding the re-share

{
  "notificationType": "RESHARE_UNDO",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {
                    "sharedSecret": "hfiuhworzwnur98d3wjiwhr",
                    "message": "Tell the owner (or the sender of a reshare) that the reshare was unshared"
                  }
}
{
  "notificationType": "RESHARE_CHANGE_PERMISSION",
  "resourceType": "file",
  "providerId": "7c084226-d9a1-11e6-bf26-cec0c932ce01",
  "notification": {
                    "sharedSecret": "hfiuhworzwnur98d3wjiwhr",
                    "permission": ["read", "write", "share"],
                    "message": "Tell the owner (or the sender of the reshare) that the permmissions changed"
                  },
}

As a acknowledgment that the share was received we return 201. If a notificationType or a resourceType is unknown on the receiving side we return 501 together with a message which kind of parameter is unknown.

GET requests

For the first proposal I removed all the GET requests. During implementation I had no real use case for it and I also can't think about any. Keeping a history of all messages, share requests, etc could even be seen of a privacy issue. So why store stuff and provide an end-point to retrieve them again, if it is not really needed? If I missed something and someone come up with some real use cases, I'm happy to bring them back. Until then, let's keep the API as light as possible.

General

Please see this as a first draft open for discussion. Stuff can and most likely will change in this process. The main goal was to come up with something which works with all the existing implementations out there in a backward compatible way and to clarify some stuff from version 0.0.3. As a outcome it would be great to have a version 1.0 of the specification, implemented (or at least possible to implement) by the key stakeholders. I'm looking forward to your feedback.

Signed-off-by: Bjoern Schiessle <bjoern@schiessle.org>
@schiessle
Copy link
Collaborator Author

I just finished the implementation for Nextcloud, based on this specification. The nice thing is, that it is completely backward compatible how Nextcloud's, CernBox's, Pydio's and ownCloud's work until now. I also didn't touched the internal data representation. If you communicate with a server which speaks the new API, it will be used, if not it will fall back to the old API. Even shares created before (with the old API) can handle notifications with the new API after a server was migrated to Nextcloud 14.

I think this kind of backward compatibility is extremely important, because there is already an existing ecosystem which we have to keep in mind. Feel free to try it out. 🙂 nextcloud/server#9345

@sbuller
Copy link

sbuller commented Jul 13, 2018

End-point discovery sounds like a good match for https://tools.ietf.org/html/rfc5785, which I believe Nextcloud already uses for caldav and carddav. TL;DR—use "/.well-known/"

@tomneedham
Copy link

Hi @schiessle - whilst implementing this at ownCloud we noticed that the permissions management is moved to webdav - can you clarify this was the intention? Is it expected that then each file transfer mech supplies its own permissions model?

@schiessle
Copy link
Collaborator Author

schiessle commented Oct 5, 2018

Hi @tomneedham, great to hear that you are working on a implementation as well. 🙂

Regarding the permission, yes we made this decision by purpose. It keeps the API clean and easy, no additional REST calls to propagate permission changes, which you would have to do even across multiple servers if you think about complex re-share structures. No need to keep the information in sync in different places and no need to deal with errors, e.g. when a REST call to announce a permission change didn't reach all servers because on server was in maintenance mode at this point in time, because of a bug, etc.

The storage is the natural place, that's also the place where the permission will be finally checked when you perform a operation. With a normal propfind to get all possible information you get always the up-to-date permissions without any overhead. Every storage can use it's default way to announce permissions which allows a tight integration in the storage and a natural workflow.

@diocas
Copy link
Collaborator

diocas commented Nov 5, 2018

Hi,
From CERNBox side, the changes are ok. We discussed with @tomneedham during the UP2U GA, and the only pointers we have is: enforce user@domain for the sender and owner (it's not crystal clear in the specification).

We also had some doubts about the shareType, as it looks like we have different conceptions of what those types are. So we should specify here. I.e. for me it was not clear that "file" is used even when a folder is shared. In CERNBox we only have folder sharing and not file share, so maybe we could distinguish? edit: it looks like this was already being discussed here: #33

For future updates to the implementation, we should also discuss how to do "housekeeping". By removing the shares endpoint, we lose the option to check the status of shares shared with us so we need something else. Should we implement a new shared endpoint? Should the server send a heartbeat? ...
We also propose the usage of "alias" for the remote servers #38.

After the changes (the .well-known and the user@domain), can this be merged?

@tomneedham
Copy link

Our current implementation of this PR is here: owncloud/core#33027 - bringing ownCloud up to hopefully what will be the latest version of the spec. We're testing with various partners at the moment, and if we can get this PR merged and a version of the spec released then we'd like to release the ownCloud implementation :)

@schiessle
Copy link
Collaborator Author

schiessle commented Nov 5, 2018

enforce user@domain for the sender and owner (it's not crystal clear in the specification).

Yes, sender and owner should always be a full qualified federated share ID . I thought the example makes this clear, but sure we can make this more obvious in the description.

We also had some doubts about the shareType, as it looks like we have different conceptions of what those types are. So we should specify here. I.e. for me it was not clear that "file" is used even when a folder is shared. [..] it looks like this was already being discussed here: #33

So this is solved, right? I think sticking to the unix philosophy makes sense here, because also from the back-end code it makes absolutely no difference if you mount a file or folder. So let's keep the API as minimal as possible.

After the changes (the .well-known and the user@domain), can this be merged?

regarding the well-known thing I'm undecided. It is not that easy to add it for us (and most likely for ownCloud as well) in a way that it always works. We have some well-know routes in the .htaccess defined but this only works if mod-rewrite is enabled, which again only works on Apache. As we don't design a API on the drawing board but have to keep existing implementations and an existing ecosystem in mind, I would go with the current and easier solution.

@schiessle
Copy link
Collaborator Author

@tomneedham good to hear that you are making good progress. Are you also testing against Nextcloud? I think this would be really useful in order to spot possible problems as early as possible.

@tomneedham
Copy link

Initial testing OC<->NC with this PR is good - however there are some issues when testing the fallback implementations. Needs further analysis of where the problems lie.

@schiessle
Copy link
Collaborator Author

@tomneedham great, don't hesitate to contact me if you need any help from our side.

@killing
Copy link

killing commented Feb 13, 2019

Hi @schiessle

This is Jonathan from Seafile. Thanks for the proposal. That makes things much clearer.

Currently I have one question that may not be related to your proposal. How is the providerId connected to some meaningful information about the provider, e.g. server url, server name?

@schiessle
Copy link
Collaborator Author

Hi @killing,

by default the server name has to be known, because what you enter in your share dialog is user@server. We at Nextcloud have some solutions to make this easier with "trusted servers" and the "lookup server" but that's not part of the OCM API.

About endpoint discovery see the corresponding paragraph at the top.

@killing
Copy link

killing commented Feb 19, 2019

Hi @schiessle

Thanks for the reply. OK, I think I misunderstood something. So the providerId is actually not practically important. The server just uses the information included in the "protocol" section from the share request to interact with the share provider.

Seafile has its own RESTful API for accessing the data. Even though WebDAV is also provided, but it's not the preferred way. Is it possible to declare a "seafile" protocol in endpoint discovery and share sending? Is it possible to declare two protocols at the same time? This way the "seafile protocol" can be used to share between Seafile servers, while NC an OC can still use webdav to access data from Seafile server.

@moscicki
Copy link
Collaborator

@killing: Alternative protocols for data transfer may be added and this may be useful to connect two Seafile sites via OCM. On the other hand it would be nice that we all agree on a minimal common data transfer protocol such that actually we can connect Seafile, Nextcloud, Owncloud sites. For this we should agree on something minimal together.

@schiessle
Copy link
Collaborator Author

@killing yes, the provider ID is "just" used to identify the share so that the involved servers can assign the message to the right share. This can be any unique identifier how you identify the shared file/folder.

@schiessle
Copy link
Collaborator Author

Regarding the protocol for the file exchange. Yes, you could define your own protocol, as said by @moscicki But it would only work from one seafile instance to another one at the moment. For interoperability it would be good if you could implement webdav (as well) as this is the common protocol shared by all implementations at the moment.

schiessle added a commit to nextcloud/server that referenced this pull request Feb 19, 2019
see: cs3org/OCM-API#37

Signed-off-by: Bjoern Schiessle <bjoern@schiessle.org>
@killing
Copy link

killing commented Feb 23, 2019

@schiessle @moscicki WebDAV is already supported in the current version of Seafile. So I think there should be not many problems to work with OC and NC. Some minor changes may be required.

@VicDeo
Copy link

VicDeo commented Mar 1, 2019

@schiessle as I see in the proposal the discovery endpoint returns an absolute URL for WebDav.

"protocols": {
        "webdav":"http://localhost/server/public.php/webdav/"
      }

While your implementation has a relative URL https://github.com/nextcloud/server/blob/master/apps/cloud_federation_api/lib/Capabilities.php#L56

Also it goes against the implementation we both had before your proposal. Did you accidentally make this URL absolute here or it was an intentional change?

@diocas
Copy link
Collaborator

diocas commented Mar 6, 2019

From the tests we did with OC-CBox, there are another clarifications needed:

  • Can ProviderId be int? In the example is a string (with the uid);
  • Can the owner id have the protocol ("http(s)://")? Or it should be a format similar to an email? (although we accept a pathname besides the host, correct?)

@labkode
Copy link
Member

labkode commented Mar 6, 2019

@diocas the providerId MUST be a string. Any implementation that uses an integer or other type is not compliant with the specification.

@diocas
Copy link
Collaborator

diocas commented Mar 6, 2019

For me no problem, because I expect strings.

@moscicki
Copy link
Collaborator

moscicki commented Mar 7, 2019 via email

@labkode
Copy link
Member

labkode commented Mar 7, 2019

It is.

providerId string Required Identifier to identify the resource at the provider side. This is unique per provider.

@hodyroff
Copy link

@schiessle as I see in the proposal the discovery endpoint returns an absolute URL for WebDav.

"protocols": {
        "webdav":"http://localhost/server/public.php/webdav/"
      }

While your implementation has a relative URL https://github.com/nextcloud/server/blob/master/apps/cloud_federation_api/lib/Capabilities.php#L56

Also it goes against the implementation we both had before your proposal. Did you accidentally make this URL absolute here or it was an intentional change?

@schiessle can you clarify this one?

@hodyroff
Copy link

What else is missing to move to a final version and have it implemented across?
@killing are you ready to announce support and did you test it working with oC and NC?

@schiessle
Copy link
Collaborator Author

schiessle commented Nov 11, 2019

@schiessle as I see in the proposal the discovery endpoint returns an absolute URL for WebDav.

"protocols": {
       "webdav":"http://localhost/server/public.php/webdav/"
     }

While your implementation has a relative URL https://github.com/nextcloud/server/blob/master/apps/cloud_federation_api/lib/Capabilities.php#L56

Also it goes against the implementation we both had before your proposal. Did you accidentally make this URL absolute here or it was an intentional change.

You are right. Actually I don't have a strong feeling here. For ease of use I think having the full URL here makes sense so the other side doesn't has to construct the final URL and guess whats the right way to concatenate both parts. For this reasons I tend to say that we should stick to the complete URL and fix it in our implementations.

@labkode
Copy link
Member

labkode commented Dec 2, 2019

@schiessle @killing @hodyroff
Last year in the CS3 Conference we had an awesome presentation from @schiessle about the advancements done on the OCM APIs towards v1.0: https://indico.cern.ch/event/726040/contributions/3248796/

@schiessle do you plan on presenting this year also something along those lines?

We could also do a joint presentation between the different stakeholders about OCM, that would highlight even more the vendor-neutral inter-operability aspect of these APIs.

Anyways, OCM is a very important enabler technology for the future of federated clouds and we should have a presentation in the next CS3 event in January.

Let me know what you think :)

@labkode
Copy link
Member

labkode commented Feb 19, 2020

@schiessle Would be still difficult for you to add support for .wellknown information instead of /ocm-provider?

@labkode
Copy link
Member

labkode commented Jul 1, 2020

@schiessle no one has vetoed the PR, please click the merge button ;)

@schiessle
Copy link
Collaborator Author

🎉 🎆

@glpatcern
Copy link
Member

glpatcern commented Jul 24, 2023

@schiessle as I see in the proposal the discovery endpoint returns an absolute URL for WebDav.

"protocols": {
       "webdav":"http://localhost/server/public.php/webdav/"
     }

[...]

You are right. Actually I don't have a strong feeling here. For ease of use I think having the full URL here makes sense so the other side doesn't has to construct the final URL and guess whats the right way to concatenate both parts. For this reasons I tend to say that we should stick to the complete URL and fix it in our implementations.

For the record: eventually the implementations did not get fixed, that's why in #76 we've "fixed" the specification to match reality, that is protocols' values are just paths and the other side has to build the final URL. Not the nicest way indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants