Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional path to ResourceId #125

Closed
wants to merge 2 commits into from
Closed

Conversation

dragotin
Copy link

@dragotin dragotin commented May 6, 2021

This is needed to implement a storage space concept where we can address resources through storage spaces directly.

This is needed to implement a storage space concept where we can address resources through storage spaces directly.
@dragotin dragotin requested a review from labkode as a code owner May 6, 2021 15:41
@butonic
Copy link
Contributor

butonic commented May 6, 2021

@dragotin
Copy link
Author

dragotin commented May 6, 2021

@refs
Copy link
Member

refs commented May 14, 2021

could you run make to generate docs?

@labkode
Copy link
Member

labkode commented May 17, 2021

@dragotin please generate the docs in this PR as well.

This change allows for Dropbox-like path resolution: https://www.dropbox.com/developers/documentation/http/documentation

Path formats
Paths are relative to an application's root (either an app folder or the root of a user's Dropbox, depending on the app's access type). The empty string ("") represents the root folder. All other paths must start with a slash (e.g. "/hello/world.txt"). Paths may not end with a slash or whitespace. For other path restrictions, refer to the help center.

Every file and folder in Dropbox also has an ID (e.g. "id:abc123xyz") that can be obtained from any endpoint that returns metadata. These IDs are case-sensitive, so they should always be stored with their case preserved, and always compared in a case-sensitive manner. Some endpoints, as noted in the individual endpoint documentation below, can accept IDs in addition to normal paths. A path relative to a folder's ID can be constructed by using a slash (e.g. "id:abc123xyz/hello.txt").

@labkode labkode self-requested a review May 17, 2021 08:12
@butonic
Copy link
Contributor

butonic commented May 17, 2021

@labkode @ishank011 @dragotin @C0rby @refs should we be a little more radical and get rid of the one of in the Referencemessage like so:

// The mechanism to identify a resource in the CS3 namespace.
// It can represent path based references as well as combined references:
// The storage registry uses the storage space id to determine the responsible storage provider. When the storage space id is not available it will use the path.
// In a URL the different components can be represented in a string using the following layout:
// <space_id>!<root_id>:<path>
// the space_id can be composed of a storage provider id and an opaque id which is seperated by $, the full layout then becoming:
// <storage_provider_id>$<opaque_id>!<root_id>:<path>
message Reference {
  // OPTIONAL.
  // The id of the storage space. Used by the storage registry to determine the responsible storage provider.
  // For simple routing purposes the space_id can consist of a storage_provider_id and an opaque_id.
  string space_id = 1;
  // OPTIONAL.
  // The id used by service implementor to
  // uniquely identity the resource in the internal
  // implementation of the service.
  string root_id = 2;
  // OPTIONAL.
  // When starting with '/' represents an absolute path. In this case the space id and the root_id must be empty.
  // When not starting with `/` represents a path relative to the space_id and root_id. When root_id is empty the path is considered relative to the root of the storage space.
  string path = 3;
}

The space_id really should be a uuid, but to remain backwards compatible (also for migration purposes) I suggest using a storage_provider_id and an opaque_id, which would allow using the same routing as we use today: either space id based or path based.

We should not transport the space_id in the root_id, because then the root_id would carry two different meanings:

  1. when space_id is set to a storage_provider_id, the root_id is needed to identify the actual storage space.
  2. when space_id is set to a storage_space_id, the root_id is used to identify a resource where the relative path traversal begins.

In 1. the root_id is used by the storage provider to determine the storage space. root_id cannot be used to reference a node. references would have to use paths relative to the storage space root, leaking path segments.
In 2. the space _id is used by the storage provider to determine the storage space. the root_id can be used to skip path segments before traversing by path.

I tried to clarify this by giving the Reference a space_id and not a storage_id. This new Reference can be used to represent path as well as id based references and will make the code a little easier to read. We can get rid of the Reference_Id message, the Reference_Path and Reference_Id seperation will go away. Yes that means a lot of changes to the code, but IMO it would make the if statements to distinguish the case a lot easier.

Thoughts?

@dragotin
Copy link
Author

Documentation is updated.

@butonic
Copy link
Contributor

butonic commented May 17, 2021

@labkode @dragotin this has become a blocker for us: without a combined reference we cannot implement a CreateContainer function in a spaces context. It currently takes a path as an argument and we need to replace it with a combined id to split the parent part from the new dir name.

@ishank011
Copy link
Contributor

@butonic I'd prefer the approach in #126 to this, but I'm not sure I understand your explanation.

The space_id really should be a uuid, but to remain backwards compatible (also for migration purposes) I suggest using a storage_provider_id and an opaque_id

Do you mean that space_id can also have the syntax storage_id:opaque_id?

root_id cannot be used to reference a node

Why not? So the root ID and opaque ID for the same container are different? In my understanding, if I need to access via the old opaque ID, my space_id should be something like storage_id:opaque_id. But at the same time, with the new registry, the opaque ID of a folder (space) is present in the gateway as the space_id, so we know which provider to go to. What does root_id store in this case? path would store the relative path with respect to the space_id but I'm not sure I understand what root_id is in this case.

The idea of exposing both the old resource ID and path in the reference makes sense to me but I don't fully understand your comments. Can we have a discussion regarding this?

To unblock your development, you can generate the go bindings using make go, push that to your own repo and replace the go-cs3apis version in the reva mod file. But I'd be against merging #126 until all the changes are present in reva to tackle it, as it would block future changes to cs3apis

@butonic
Copy link
Contributor

butonic commented May 17, 2021

@butonic I'd prefer the approach in #126 to this, but I'm not sure I understand your explanation.

Yes, #126 would be awesome!

The space_id really should be a uuid, but to remain backwards compatible (also for migration purposes) I suggest using a storage_provider_id and an opaque_id

Do you mean that space_id can also have the syntax storage_id:opaque_id?

yes, but with $ as a seperator. That way we can still distinguish the components when URL encoding them. For migration purposes, when merging two instances, the storageid could be oc10a and oc10b. The opaque_id could be the storage id from the oc10 oc_filecache table / oc_storages table, leading to a storage space_id of e.g.: oc10a$1234 or oc10b$5678.

root_id cannot be used to reference a node

Why not? So the root ID and opaque ID for the same container are different? In my understanding, if I need to access via the old opaque ID, my space_id should be something like storage_id:opaque_id. But at the same time, with the new registry, the opaque ID of a folder (space) is present in the gateway as the space_id, so we know which provider to go to. What does root_id store in this case? path would store the relative path with respect to the space_id but I'm not sure I understand what root_id is in this case.

Think of root_id as an anchor that is used to start traversing the optional path component of a reference. On tho one hand, we still need to reference shares by an id that does not change when a file or folder is renamed the root_id property of a reference. On the other hand, we need a path, or more general, a hierarchy that is represented by a path to model parent/child relationships.

So why call the id property root_id? In a path based gateway we cannot let a share recipient access a deep folder without exposing the full path to the share. When you share /users/i/ishank/some/deep/path/with/confidential/or/embarrasing/text/in/the/path/like/boring/projects/fizzbuzz to @labkode the current shared with me listing would expose the full path in the global namespace. By using the root_id property of a reference, the listing would contain https://cloud.example.com/dav/spaces/<space_id>!<root_id> as the webdav url that clients can use to access that specific share. Treating shares like a storage space allows jailing users into a namespace, similar to a chroot jail. They can then traverse the path using the optional path component of references.

I hope that clarifies why I think having a <root_id> with a relative path makes sense. It is optional btw, if it is unset (eg. for the owner of a share) traversal happens purely by path, relative to the storage space root.

The idea of exposing both the old resource ID and path in the reference makes sense to me but I don't fully understand your comments. Can we have a discussion regarding this?

I'd love to!

To unblock your development, you can generate the go bindings using make go, push that to your own repo and replace the go-cs3apis version in the reva mod file. But I'd be against merging #126 until all the changes are present in reva to tackle it, as it would block future changes to cs3apis

#126 would require more changes, yes. We can work towards implmenting them using a temporary branch. Will dig into that tomorrow.

@butonic
Copy link
Contributor

butonic commented Jun 14, 2021

superseded by #130

@dragotin
Copy link
Author

Superseded by other PR

@dragotin dragotin closed this Jun 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants