proposal.generate sorting #1215

Closed
swiftgist opened this issue Jul 6, 2018 · 9 comments · Fixed by #1302

@swiftgist
Contributor

Description of Issue/Question

The YAML files currently generated by Stage 1 list devices in a pseudo-random order. As a result, regenerating a configuration may result in devices appearing in a different order. For standalone Filestore and Bluestore configurations, this does not matter. For split Filestore and Bluestore configurations, the new configuration is valid, but a migration may be necessary.

One suggestion from a couple of discussions is to sort the devices based on the /dev/disk/by-path entries. That would prevent accidental shuffling for disk replacements.
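
A minimal sketch of what sorting on the by-path entries could look like. The device dictionaries and the `by_path` key are hypothetical and are not taken from the actual cephdisks.list output; the point is only that the order follows the physical path rather than the discovery order.

```python
import re


def by_path_key(path):
    """Split a by-path name into text and integer chunks so that
    'ata-10' sorts after 'ata-2' (natural ordering).

    re.split with a capturing group always alternates text and digit
    chunks, so the same list position always holds the same type.
    Hex fields in PCI addresses are not ordered numerically here, but
    the result is still the same for the same set of paths.
    """
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", path)]


def sort_devices(devices):
    """Return the devices ordered by their (hypothetical) 'by_path' field."""
    return sorted(devices, key=lambda dev: by_path_key(dev["by_path"]))
```

With this key, a disk replaced in the same slot keeps its position in the list, since the by-path name describes the slot and not the disk.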

@denisok

denisok commented Jul 12, 2018

@jan--f @jschmid1 what are your thoughts on this?

@jschmid1
Contributor

> One suggestion from a couple of discussions is to sort the devices based on the /dev/disk/by-path entries. That would prevent accidental shuffling for disk replacements.

This is what the discussions resulted in. The downside of this approach is that a user will probably have to perform initial migrations after changing to the new sorting scheme.

@jan--f
Contributor

jan--f commented Jul 20, 2018

I believe the profile generation should be deterministic, i.e. all else being equal, the code should spit out identical profiles. There is also a sort order... things might get more complex when journal devices are replaced.

What is the situation where profiles would be regenerated? Would that be with the same hardware? Are we talking about changing profiles for OSD replacement?

For the latter it might be better to change the existing profiles where necessary. The YAML files can easily be loaded into Python objects and then altered and dumped again. In my head that should be straightforward, though I'm not sure regarding the use case.

Edit: I think we should avoid imposing a new sorting order for existing profiles at all costs. Users will not understand why we migrate all (or a majority) of their OSDs.
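
A rough sketch of altering an existing profile in place of regenerating it. The nested `ceph` / `storage` / `osds` layout and the `db` setting are assumptions made for illustration, and `replace_device` is a hypothetical helper, not existing DeepSea code. It works on the already loaded Python object.

```python
import copy


def replace_device(profile, old_dev, new_dev):
    """Return a copy of `profile` with `old_dev` swapped for `new_dev`.

    The device is replaced both where it is an OSD key and where it is
    referenced as a value (e.g. wal/db/journal) of another OSD.
    The assumed layout is profile["ceph"]["storage"]["osds"][device] = {...}.
    """
    updated = copy.deepcopy(profile)
    osds = updated["ceph"]["storage"]["osds"]
    renamed = {}
    for dev, settings in osds.items():
        # Swap any setting value that points at the old device.
        settings = {key: (new_dev if value == old_dev else value)
                    for key, value in settings.items()}
        # Keep the original position of every OSD entry.
        renamed[new_dev if dev == old_dev else dev] = settings
    updated["ceph"]["storage"]["osds"] = renamed
    return updated
```

Assuming PyYAML is used, `yaml.safe_load` would produce the input object and `yaml.safe_dump` would write the result back. Only the entry for the replaced disk changes, so no other OSD is touched.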

@jschmid1
Contributor

jschmid1 commented Jul 27, 2018

proposal.populate ( runner )
  -> proposal.generate ( module )
    -> cephdisks.list
      -> hwinfo/lshw

That would mean: if the profile generation is deterministic, we currently rely on the hwinfo/lshw output being sorted.
Looking at a simple hwinfo output ( hwinfo --only disk example ), I assume that the internal sorting is done by looking at the Storage Controller #.

Looking at a more sophisticated cluster, things are different and hwinfo does not seem to sort by Storage Controller # 249

EDIT:

My fault, it does. In this case it says: "Attached to #ID ( Sata controller )"

If we follow our assertion that a user would literally 'replace' a disk (as in remove the old one and put the new one in the same slot), we should get the same sorted list of devices from hwinfo.
That needs to be properly tested though.

If, however, one tries to add a new disk while the faulty disk is still in the machine, we may have a problem. But wouldn't that mess up the profile even if we had sorting? Yes.

Next step is to have a look at the hwinfo implementation and check how / if it's sorted.

@jschmid1
Contributor

jschmid1 commented Jul 31, 2018

Things change if the disks are attached to a RAID bus controller.

@jschmid1
Contributor

jschmid1 commented Aug 2, 2018

I added a test that tries to figure out how the module behaves with unsorted input.

@jschmid1
Contributor

jschmid1 commented Aug 2, 2018

If this does not hold, however, we have to go the other route, as @jan--f already stated: read in the moved 'profile-replaced' file and make sure that the newly generated proposal matches the old layout and wal-db matching.

How this will work out remains to be seen.

@jschmid1
Contributor

jschmid1 commented Aug 9, 2018

The mentioned test seems to prove that proposal.generate might be deterministic, but it depends on the ordering from its input function ( cephdisks.list -> hwinfo )
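
To illustrate the difference between "deterministic" and "independent of input order", here is a toy stand-in. It is not the real proposal.generate; `toy_generate` and `stable_generate` are hypothetical names, and the round-robin pairing is only an assumption about how a split configuration might assign journal devices.

```python
def toy_generate(data_disks, journal_disks):
    """Pair each data disk with a journal disk round-robin, in input order.

    Deterministic: the same input lists always give the same mapping.
    Order-dependent: reordering the inputs changes which journal a disk gets.
    """
    return {disk: journal_disks[index % len(journal_disks)]
            for index, disk in enumerate(data_disks)}


def stable_generate(data_disks, journal_disks):
    """Same pairing, but on sorted inputs, so discovery order no longer matters."""
    return toy_generate(sorted(data_disks), sorted(journal_disks))
```

In this sketch, feeding the same disks in a different order to `toy_generate` changes the disk-to-journal pairing, which is the kind of shuffle that would force a migration. Sorting before generating removes that dependence on the upstream ordering.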
