Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow adapters to handle their own state #495

Merged
merged 3 commits into from
Apr 24, 2023

Conversation

bazile-clyde
Copy link
Collaborator

@bazile-clyde bazile-clyde commented Apr 13, 2023

Description

This PR allows for adapters (video and audio drivers) to handle their own state. This allows us to fix a few issues

  • Calling one of the Initialize functions resets the state of an adapter to StateClosed even if it's in use.
  • Disconnected drivers will return a stale state, possibly StateClosed or StateRunning.
  • Drivers used by other applications will start with a state of StateClosed.

Ideally, all drivers would return the actual state of the driver on queries. This PR returns real states for camera_linux.go fixing the above issues. All other drivers keep their current behavior.

Testing

This was manually tested on a Linux device and works as expected. It should not break any tests or change the current behavior for any drivers other than those on Linux.

Advice for Reviewers

The easiest way to review this code is to start with the changes in driver.go and wrapper.go. Afterwards, ensure none of the logic in any of the files besides camera_linux.go has changed. Lastly, review/test the changes in camera_linux.go and the new states in state.go. Simply running a background goroutine that continuously calls and prints the Status() of a Linux camera and observing what happens when the camera is disconnected and reconnected should be enough.

return nil, recordErr
}

func TestVideoWrapperState(t *testing.T) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and the one removed below can no longer test state errors by calling mock functions since that functionality has been pushed down to the level of the adapter. This means state information can only be obtained by querying real drivers, not mocks.

@codecov
Copy link

codecov bot commented Apr 13, 2023

Codecov Report

Patch coverage: 36.36% and project coverage change: -0.29 ⚠️

Comparison is base (f0f6be7) 59.11% compared to head (d82c93b) 58.83%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #495      +/-   ##
==========================================
- Coverage   59.11%   58.83%   -0.29%     
==========================================
  Files          62       63       +1     
  Lines        3784     3826      +42     
==========================================
+ Hits         2237     2251      +14     
- Misses       1416     1444      +28     
  Partials      131      131              
Impacted Files Coverage Δ
pkg/driver/camera/camera_linux.go 27.54% <0.00%> (-3.56%) ⬇️
pkg/driver/driver.go 80.00% <80.00%> (ø)
pkg/driver/wrapper.go 88.31% <100.00%> (+1.74%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@edaniels edaniels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes! Done with the first pass here. I'd like to see some tests added as well as manual testing performed on windows and macos.

@@ -16,24 +16,34 @@ const (
// StateRunning means that the driver has been sending data. The caller
// who started the driver may start reading data from the hardware.
StateRunning = "running"
// StateGood means the driver is available for use by applications.
StateGood = "good"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good and bad are kind of ambiguous depending on language/context. Based on your comments, how do you feel about: StateAvailable, StateUnavailable, and StateUnknown?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I like StateAvailable and StateUnavailable better. So that's done. StateUnknown is a bit different. It's less about not being able to determine the state of a driver (which may be connected) and more about the fact that we know the device does not exist or has been disconnected. Or in other words, this is us saying "no device by that name exists on this OS". That being said does StateDisconnected seem more clear?

// s will stay unchanged. Otherwise, s will be updated to next
func (s *State) Update(next State, f func() error) error {
// CheckUpdate returns nil if the current state can be updated to next state. Otherwise, it returns an error.
func (s *State) CheckUpdate(next State) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's new states now but they are not checked here. why is that? it seems like you could hit an error by passing one of the new states.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. The new statuses don't cleanly map to the state-machine-like states we've had before. For example, you can open v4l2 drivers twice without issue but the this function will error if you do. I added an error for trying to use the CheckUpdate function with any of the new states. Ideally, this function would go away after we've updated the other drivers but for now it's only used for legacy behavior.

@@ -16,24 +16,34 @@ const (
// StateRunning means that the driver has been sending data. The caller
// who started the driver may start reading data from the hardware.
StateRunning = "running"
// StateGood means the driver is available for use by applications.
StateGood = "good"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need some new tests around these states being set and retrieved

Copy link
Collaborator Author

@bazile-clyde bazile-clyde Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is now dependent on a specific driver it's a lot more difficult to test. For Linux devices this requires an ioctl call. Short of mocking the return values for ioctls or setting up a virtual v4l2 device I don't think there's a good way to test it. This is also a public function that isn't needed much internally so you wouldn't be able to mock out the Status function and see the behavior of internal functions change either. It's tricky.

}

func (d *dummy) Close() error {
d.cancel()
return nil
return d.state.Update(driver.StateClosed, func() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like for each of these closed states now, that if closing fails, the camera is not closed and we are an indeterminate state

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the logic has changed from what it already is. This is the old code.

func (w *adapterWrapper) Close() error {
	return w.state.Update(StateClosed, w.Adapter.Close)
}

func (cam *camera) Close() error {
	if cam.rcClose != nil {
		cam.rcClose()
	}
	return cam.session.Close()
}

And this is the new code.

func (cam *camera) Close() error {
	return cam.state.Update(driver.StateClosed, func() error {
		if cam.rcClose != nil {
			cam.rcClose()
		}
		return cam.session.Close()
	})
}

I'm only replacing w.Adapter.Close with an anonymous function that does the same thing. This was intentional to avoid unnecessarily changing behavior.

@@ -303,6 +304,10 @@ func (c *camera) VideoRecord(p prop.Media) (video.Reader, error) {
}

func (c *camera) Properties() []prop.Media {
if c.cam == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would the camera be nil?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so c.cam is set in the Open method but devices are registered in the discover function. That leaves a window when you can call Properties before you call Open on a device which will crash your program. I've hit this case while testing.

@@ -350,3 +355,28 @@ func (c *camera) Properties() []prop.Media {
}
return properties
}

func (c *camera) Status() driver.State {
var index int32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index can move down into the if along with camera

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

func (d dummy) Properties() []prop.Media {
func (d *dummy) Properties() []prop.Media {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did this change to being a pointer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto fix by IDE for mixed value and pointer receivers. Changed back.

@@ -144,3 +152,7 @@ func (d dummy) Properties() []prop.Media {
},
}
}

func (d *dummy) Status() driver.State {
return d.state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these states be atomic values?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't before so I didn't change it to be. The adapterWrapper class just returns and modifies its values without any mutexes so pushing down this logic won't cause any new issues. It seems like the adapterWrapper class in general should be protected by mutexes. I think that would be out of scope for this change though.

}

type Driver interface {
Adapter
ID() string
Info() Info
Status() State
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did this need to move? if so, why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Adapters are the specific drivers that are passed to the driver.Register function to create Drivers which abstract away the differences between Adapters. We want the Status of a Driver to be implemented by each specific driver just like the Open and Close functions.

@@ -16,24 +16,34 @@ const (
// StateRunning means that the driver has been sending data. The caller
// who started the driver may start reading data from the hardware.
StateRunning = "running"
// StateGood means the driver is available for use by applications.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mentioned that only linux returns this now but this means that end users need to understand all of these states. is there a way to use all of the states or merge the new ones together? I'm concerned an end user may not know how to interpret the state.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the end users would need to know how to make sense of these states. So it would be a breaking change for Linux users. The odd states are StateClosed and StateOpened. Very open to changing this but I'm at a bit of a loss as to how. Ultimately we want to convey four meanings

  1. The driver is reachable and available for use
  2. The driver is reachable and is not available for use (presumably because it's in use)
  3. The driver is reachable but we cannot retrieve a state from it. (the error state, maybe it's corrupt or broken?)
  4. The driver is not reachable (we can not find a device by that name)

StateRunning works for case 2. Now on to states 1,3, and 4.

For Linux, drivers can be opened and closed multiple times but only one application can stream from it at a time. For example, while something is streaming from a device another application can also open the device and set properties like brightness or hue. So what does it mean for a device to be opened or closed? You wouldn't know if a device has been opened if you're still able to open it and stream from it. For v4l2 devices, AFAIK, nothing keeps track of how many file descriptors are attached to the device driver nor would that information be very useful. So that means we can't really use StateOpened. Likewise for StateClosed. Closing a v4l2 device means you closed your file descriptor to it and if you were streaming from it, you stopped. The only useful part of that information is the fact that you stopped streaming which is handled by case 2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Let's see if we can work with this to keep the existing enums.

  1. This would be StateClosed; nothing in the process has opened it yet
  2. This would be StateRunning as you already pointed out
  3. This would be StateClosed if we consider closed to be the default
  4. This would be StateClosed if we consider that if we cannot find something, it is not open

If we did things this way, could the status of camera linux use its current state plus these cases to produce the right value?

Copy link
Collaborator Author

@bazile-clyde bazile-clyde Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so given the ambiguity of open and closed for a v4l2 driver but I think there's a better way. Since the big thing we want to avoid here is breaking code for current users would versioning these changes work?

These changes are to support exposing the experimental Initialize function since calling that function ruins the state logic we have in place. For current users, neither this change nor the Initialize() function is needed. We could put these changes in pion/mediadevices/tree/master/pkg/driver/experimental or a v2 folder and merge them all at once in a future version once they're more stable. I would suggest adding these changes to a different major version but since we don't have one yet, and as far as I've seen there is no current consensus on what the first version of this software would look like, it's kind of a moot point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't that work given the ambiguity? What breaks upstream? I' m not sure about adding a v2/experimental for this if we can still make something work

Copy link
Member

@edaniels edaniels Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do raise a good point that this isn't 1.0 yet but I'm still curious if we can make this work with what we have in addition to @at-wat's feedback

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't that work given the ambiguity?

What exactly does an open and closed state mean or imply for a device driver? In the truest sense it means something has an open file descriptor to it. It implies nothing in particular. For example, if the user has a device at /dev/video0 we can see what processes opened this file

$ lsof -t /dev/video0
46915
...
$ ps -ef | grep myApp  
$USER      46915   46729  7 15:14 pts/4    00:00:18 myApp

If it's closed then lsof will have an empty process list and no output. What does this imply or in other words, what can (or can't) we do based off of this information? To get the status of the driver we need to open the driver, query its status, then close the driver. The driver may or may not have already been opened or closed. If we continue to assume that drivers can only be opened once (see toOpened) we will never be able to get the status of an opened driver because we won't be able to open it.

  1. This would be StateClosed; nothing in the process has opened it yet

We call Initialize and discover what we think is a new driver. Have we opened it yet? We would need to map the camera.path to a boolean indicating whether we called Open. This ignores the fact that another process may have opened the file and I'm not sure who this info is useful for per my comment above.

  1. This would be StateClosed if we consider closed to be the default

There's not a "default" state for a driver that you can't get state information from. See some common error states here. Those errors are returned from the (VIDIOC_LOG_STATUS)[https://www.kernel.org/doc/html/v4.9/media/uapi/v4l/vidioc-log-status.html] ioctl. Device drivers aren't guaranteed to implement that ioctl so I used a different set of ioctls but same deal.

  1. This would be StateClosed if we consider that if we cannot find something, it is not open

How would a user know if they can attempt to open a driver and start streaming? If both condition 4 and 1 are the same it could mean either, "nothing is streaming from this driver so feel free to do so" or "this driver is not connected so do not stream." It's inherently ambiguous.

What breaks upstream?

Anyone using the driver.Status will now have to update their code based on these new states. It doesn't necessarily break the API in the literal sense but users may have to re-write code or this will introduce subtle runtime errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noodling on a thought. Considering windows and macos won't be updated here. Maybe a temporary pre 1.0 thing could be to introduce an adapter specific state that can be optionally used in conjunction with driver state. Then v1 can merge the two or have your way supersede the two

@edaniels edaniels requested review from at-wat and removed request for zjzhang-cn and lherman-cs April 17, 2023 14:11
@edaniels
Copy link
Member

@at-wat could you take a second set of eyes on this?

Copy link
Member

@at-wat at-wat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main changes looks nice!

Comment on lines 375 to 378
case err.(syscall.Errno) == syscall.EBUSY:
return driver.StateRunning
case err.(syscall.Errno) == syscall.ENODEV || err.(syscall.Errno) == syscall.ENOENT:
return driver.StateDisconnected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use errors.As to make it tolerant for wrapped error

Copy link
Collaborator Author

@bazile-clyde bazile-clyde Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Done!

@@ -81,7 +89,7 @@ func (d *dummy) AudioRecord(p prop.Media) (audio.Reader, error) {
}
return a, func() {}, nil
})
return reader, nil
return reader, d.state.Update(driver.StateRunning, func() error { return nil })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to allow changing state between CheckUpdate and Update, then return invalid state error?
Maybe it doesn't happen in typical usages?

Copy link
Collaborator Author

@bazile-clyde bazile-clyde Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah i guess I can just set the state here without doing the check again and risk returning an invalid state error. That be closer to the original logic. Done!

Updated to not do this.

func (s *State) CheckUpdate(next State) error {
// This function should not be used with the states below. This function is kept solely to support the legacy states.
if next == StateAvailable || next == StateUnavailable || next == StateDisconnected {
return fmt.Errorf("invalid state: cannot update to state '%s'", string(next))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to expose base invalid state error and wrap by %w

Copy link
Collaborator Author

@bazile-clyde bazile-clyde Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. What's the "base invalid state error"? If next is equal to any of these newer states, checkFunc may not error so I ensure that we always error by checking for these states explicitly here.

Updated to not do this.

@bazile-clyde
Copy link
Collaborator Author

@edaniels @at-wat Thanks for the reviews. After seeing all of the comments and speaking with @edaniels I updated this PR to make this functionality optional. It's far less changes now and will not break the API.

Copy link
Member

@edaniels edaniels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I think this is actually really clean now. Thanks for discussing this. I have some comments around the error itself but they're fairly nitpicky. If you disagree with them, I have no problem with this merging in.

)

type Error interface {
Error() string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to add a unexported method of some bogus name to this otherwise this will be implemented by any error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you can use a no interface approach for errors like this. Look at the must rebuild error in RDK if you're interested

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Used the no interface approach. Works nicely.

Error() string
}

var ErrUnimplemented Error = errors.New("not implemented")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can put this in a var block

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@bazile-clyde bazile-clyde merged commit cadb155 into pion:master Apr 24, 2023
@bazile-clyde bazile-clyde deleted the RSDK-1677 branch April 24, 2023 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants