
[docs-only] Docs storage backend cephfs #2494

Merged
merged 6 commits into master from docs-storage-backend-cephfs on Sep 14, 2021

Conversation

@butonic butonic (Member) commented Sep 13, 2021

@dragotin @micbar @C0rby @refs @aduffeck Just a heads up on what I found upstream and where I see room for improvement.

Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
@refs refs (Member) left a comment

Just a few minor typos, and a question. Other than that, thanks for documenting the CephFS future work!

docs/ocis/storage-backends/cephfs.md (4 review threads, outdated and resolved)
Co-authored-by: Alex Unger <6905948+refs@users.noreply.github.com>
@butonic butonic requested a review from refs September 14, 2021 07:52
@refs refs merged commit 5283bf8 into master Sep 14, 2021
@refs refs deleted the docs-storage-backend-cephfs branch September 14, 2021 07:53
ownclouders pushed a commit that referenced this pull request Sep 14, 2021
Merge: 47c2c3a 6bf80da
Author: Alex Unger <6905948+refs@users.noreply.github.com>
Date:   Tue Sep 14 09:53:23 2021 +0200

    Merge pull request #2494 from owncloud/docs-storage-backend-cephfs
@dragotin dragotin (Contributor) left a comment

A few comments and wishes for changes, but given the time pressure I am good.

In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs. It would interface with it using the POSIX APIs. This has been changed to directly call the Ceph API using https://github.com/ceph/go-ceph. It allows using the Ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks.

## Implemented Aspects
The recursive change time built into cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows oCIS to pick up changes that have been made by external tools, bypassing any oCIS APIs.
Hm, this is a bit brief - it requires knowledge about Ceph etc. Is there any high-level documentation from the Ceph side available? Maybe add a link?
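For illustration, a rough sketch of how the rctime-based etag propagation described in the excerpt could look on a locally mounted cephfs. The mount path, the buffer size, and the md5 hashing are assumptions; `ceph.dir.rctime` is the CephFS virtual xattr that exposes the recursive change time:

```go
package main

import (
	"crypto/md5"
	"fmt"

	"golang.org/x/sys/unix"
)

// etagForDir reads the ceph.dir.rctime virtual xattr and turns it into an
// etag-like string. rctime is the recursive change time cephfs maintains for
// a directory tree, so it also reflects changes made by external tools.
func etagForDir(path string) (string, error) {
	buf := make([]byte, 64)
	n, err := unix.Getxattr(path, "ceph.dir.rctime", buf)
	if err != nil {
		return "", fmt.Errorf("getxattr %s: %w", path, err)
	}
	// Hash the raw rctime value; any change below the directory yields a new etag.
	return fmt.Sprintf("%x", md5.Sum(buf[:n])), nil
}

func main() {
	etag, err := etagForDir("/mnt/cephfs/reva/einstein") // hypothetical mount point
	if err != nil {
		panic(err)
	}
	fmt.Println("etag:", etag)
}
```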

```
├── .fileids
│   ├── 50BC39D364A4703A20C58ED50E4EADC3_570078 -> /tmp/cephfs/reva/einstein
│   ├── 571EFB3F0ACAE6762716889478E40156_570081 -> /tmp/cephfs/reva/einstein/Pictures
│   └── C7A1397524D0419B38D04D539EA531F8_588108 -> /tmp/cephfs/reva/einstein/welcome.txt
```

Please do not use /tmp here - that confuses people who know the Linux filesystem layout (I assume these files are not temporary). The right location would be /var/lib/reva/ or another "neutral" dir like /data.
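A rough sketch of the symlink-based fileid-to-path mapping shown in the listing above, using plain filesystem calls on a mounted cephfs rather than go-ceph. The root path (following the /var/lib/reva suggestion) and the helper names are illustrative assumptions:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// root is an illustrative location; the review suggests /var/lib/reva over /tmp.
const root = "/var/lib/reva"

// registerID records a node's fileid by symlinking .fileids/<fileid> to its path.
func registerID(fileid, target string) error {
	dir := filepath.Join(root, ".fileids")
	if err := os.MkdirAll(dir, 0700); err != nil {
		return err
	}
	return os.Symlink(target, filepath.Join(dir, fileid))
}

// pathForID resolves a fileid back to the current path by reading the symlink.
func pathForID(fileid string) (string, error) {
	return os.Readlink(filepath.Join(root, ".fileids", fileid))
}

func main() {
	id := "C7A1397524D0419B38D04D539EA531F8_588108" // id taken from the listing above
	if err := registerID(id, filepath.Join(root, "einstein/welcome.txt")); err != nil {
		panic(err)
	}
	p, _ := pathForID(id)
	fmt.Println(p)
}
```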

## Implemented Aspects
The recursive change time built into cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows oCIS to pick up changes that have been made by external tools, bypassing any oCIS APIs.

Like other filesystems, cephfs uses inodes, and like most other filesystems, inodes are reused. To get stable file identifiers, the current cephfs driver assigns every node a file id and maintains a custom fileid-to-path mapping in a system directory:

Again, I think that is a bit brief.

Questions people might come up with and which can be answered here:

  • Why is it a problem if inodes are reused?
  • What is an inode and what is ownCloud using it for?
  • Where does the Ceph driver get a "file id" from and why is that better?
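A small sketch touching the first two questions: the inode number is the identifier the filesystem itself provides, but it is only unique for a file's lifetime and can be handed out again after a delete, so the driver assigns its own generated fileid (the listing above suggests the generated id is combined with the inode number). The path and helper name here are illustrative:

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// inodeOf returns the inode number of a path (Linux only).
func inodeOf(path string) (uint64, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, fmt.Errorf("no raw stat available for %s", path)
	}
	return st.Ino, nil
}

func main() {
	// A mapping keyed purely on this number could later point at a different
	// file once the inode is reused - hence the additional generated fileid.
	ino, err := inodeOf("/var/lib/reva/einstein/welcome.txt") // illustrative path
	if err != nil {
		panic(err)
	}
	fmt.Printf("inode: %d\n", ino)
}
```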

- Currently we persist using a single JSON file.
- As it basically provides two lists, *shared with me* and *shared with others*, we could persist them directly on cephfs!
- If needed for redundancy, the share manager can be run multiple times, backed by the same cephfs.
- To save disk I/O, the data can be cached in memory and invalidated using stat requests.

This is an implementation detail that I would not mention here.
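A minimal sketch of the stat-based cache invalidation mentioned in the list above: the shares live in a JSON file on cephfs, are kept in memory, and are re-read only when a stat shows a newer mtime. The Share fields and the file layout are illustrative assumptions:

```go
package shares

import (
	"encoding/json"
	"os"
	"sync"
	"time"
)

// Share is an illustrative, minimal share record.
type Share struct {
	ID       string `json:"id"`
	Resource string `json:"resource"`
	Grantee  string `json:"grantee"`
}

// shareCache keeps the parsed JSON in memory and uses the file's mtime
// (obtained via a stat request) to decide whether the cache is still valid.
type shareCache struct {
	mu      sync.Mutex
	path    string // e.g. <root>/shares/einstein/sharedWithMe.json
	modTime time.Time
	shares  []Share
}

// Load re-reads the JSON file only when stat reports a newer mtime.
func (c *shareCache) Load() ([]Share, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	fi, err := os.Stat(c.path)
	if err != nil {
		return nil, err
	}
	if fi.ModTime().Equal(c.modTime) {
		return c.shares, nil // cache hit: no payload read needed
	}
	data, err := os.ReadFile(c.path)
	if err != nil {
		return nil, err
	}
	var shares []Share
	if err := json.Unmarshal(data, &shares); err != nil {
		return nil, err
	}
	c.shares, c.modTime = shares, fi.ModTime()
	return c.shares, nil
}
```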

## Future work
- The spaces concept matches cephfs subvolumes. We can implement the CreateStorageSpace call with that and keep track of the list of storage spaces using symlinks, like for the id-based lookup.
- The share manager needs a persistence layer.
- Currently we persist using a single JSON file.

Maybe rather: "The share manager uses a filesystem-based persistence layer that scales accordingly. For a single-user scope such as user shares, two files, shared_with_me.json and shared_with_others.json, can be considered sufficient, as the number of shares per user does not grow infinitely."
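A rough sketch of the first future-work item above: a storage space backed by a directory (standing in for a cephfs subvolume here) and registered via a symlink so it can be looked up by id, mirroring the fileid lookup. The paths and the function signature are illustrative assumptions, not the actual reva/CS3 call:

```go
package spaces

import (
	"os"
	"path/filepath"
)

// root is illustrative; the actual driver would create cephfs subvolumes here.
const root = "/var/lib/reva"

// CreateStorageSpace provisions a directory for the new space and registers
// it in ".spaces" via a symlink, so spaces can be listed and looked up by id
// without walking the whole tree.
func CreateStorageSpace(owner, spaceID string) (string, error) {
	spacePath := filepath.Join(root, "spaces", owner, spaceID)
	if err := os.MkdirAll(spacePath, 0700); err != nil {
		return "", err
	}
	registry := filepath.Join(root, ".spaces")
	if err := os.MkdirAll(registry, 0700); err != nil {
		return "", err
	}
	if err := os.Symlink(spacePath, filepath.Join(registry, spaceID)); err != nil && !os.IsExist(err) {
		return "", err
	}
	return spacePath, nil
}

// ListStorageSpaces returns the ids of all registered spaces.
func ListStorageSpaces() ([]string, error) {
	entries, err := os.ReadDir(filepath.Join(root, ".spaces"))
	if err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(entries))
	for _, e := range entries {
		ids = append(ids, e.Name())
	}
	return ids, nil
}
```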

```
│   └── sharedWithOthers.json
└── marie
    └── sharedWithMe.json
```

Same here - do not use /tmp at top level.
