[docs-only] Docs storage backend cephfs #2494
Conversation
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Just a few minor typos, and a question. Other than that thanks for documenting the CephFS future work!
Co-authored-by: Alex Unger <6905948+refs@users.noreply.github.com>
A few comments and wishes for changes, but given the time pressure I am good.
In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs and interfacing with it through the POSIX APIs. This has been changed to call the Ceph API directly using https://github.com/ceph/go-ceph. It allows using the Ceph admin APIs to create subvolumes for user homes and to maintain a file id to path mapping using symlinks.

## Implemented Aspects

The recursive change time built into cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows oCIS to pick up changes that have been made by external tools, bypassing any oCIS APIs.
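As a hedged illustration of the idea (not the actual driver code): cephfs exposes the recursive change time of a directory via the `ceph.dir.rctime` extended attribute as a `seconds.nanoseconds` string, and an etag can be derived from it. The function name and hashing scheme below are assumptions for the sketch.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// etagFromRctime derives a client-visible etag from a directory's inode
// number and its recursive change time as read from the ceph.dir.rctime
// extended attribute ("seconds.nanoseconds"). Illustrative sketch only.
func etagFromRctime(inode uint64, rctime string) string {
	sum := md5.Sum([]byte(fmt.Sprintf("%d:%s", inode, rctime)))
	return fmt.Sprintf("\"%x\"", sum)
}

func main() {
	// Any change below a directory bumps its rctime, so the derived etag
	// changes even when the modification bypassed the oCIS APIs.
	fmt.Println(etagFromRctime(570078, "1634000000.090000000"))
}
```

Because rctime is maintained by cephfs itself, no extra bookkeeping is needed to notice out-of-band changes.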
Hm, this is a bit brief - it requires prior knowledge about Ceph. Is there any high-level documentation from the Ceph side available? Maybe add a link?
```
├── .fileids
│   ├── 50BC39D364A4703A20C58ED50E4EADC3_570078 -> /tmp/cephfs/reva/einstein
│   ├── 571EFB3F0ACAE6762716889478E40156_570081 -> /tmp/cephfs/reva/einstein/Pictures
│   └── C7A1397524D0419B38D04D539EA531F8_588108 -> /tmp/cephfs/reva/einstein/welcome.txt
```
Please do not use `/tmp` here - that confuses people who know the Linux filesystem layout (I assume these files are not temporary). The right location would be `/var/lib/reva/` or another "neutral" dir like `/data`.
Like other filesystems, cephfs uses inodes, and like most other filesystems it reuses them. To get stable file identifiers the current cephfs driver assigns every node a file id and maintains a custom file id to path mapping in a system directory:
Again, I think that is a bit brief.
Questions people might come up with and which can be answered here:
- Why is it a problem if inodes are reused?
- What is an inode and what is ownCloud using it for?
- Where does the Ceph driver get a "file id" from and why is that better?
- Currently we persist using a single json file.
- As it basically provides two lists, *shared with me* and *shared with others*, we could persist them directly on cephfs!
- If needed for redundancy, the share manager can be run multiple times, backed by the same cephfs.
- To save disk IO the data can be cached in memory, and invalidated using stat requests.
This is an implementation detail that I would not mention here.
## Future work

- The spaces concept matches cephfs subvolumes. We can implement the CreateStorageSpace call with that and keep track of the list of storage spaces using symlinks, like for the id based lookup.
- The share manager needs a persistence layer.
- Currently we persist using a single json file.
Maybe rather: "The share manager uses a file system based persistence layer that is scaled accordingly. For single user scope such as user shares, two files `shared_with_me.json` and `shared_with_others.json` can be considered sufficient, as the amount of shares per user does not grow infinitely."
```
│   └── sharedWithOthers.json
└── marie
    └── sharedWithMe.json
```
Same here - do not use `/tmp` at top level.
@dragotin @micbar @C0rby @refs @aduffeck Just a heads up on what I found upstream and where I see room for improvement.