FW: wrapper device driver

Mon Feb 2 22:15:13 EST 2015

On Mon, Feb 02, 2015 at 05:50:43PM -0600, riya khanna wrote:
> On Mon, Feb 2, 2015 at 5:00 PM, Greg KH <greg at kroah.com> wrote:
> >
> > On Mon, Feb 02, 2015 at 04:46:24PM -0600, riya khanna wrote:
> >> The goal is to provide multiple instances of a real device, where each
> >> instance could be assigned to a container. This is to enable support
> >> for device multiplexing in user space.
> >
> > Heh, no, don't do it.
> >
> > Seriously, don't, it's been shot down time and time again in person and
> > in emails.  The 2013 Plumbers conference had a whole session on this in
> > which people yelled at me for 45+ minutes, it was fun, I still said no.
> >
> 
> Yes, I'm apprised of the LPC '13 and email discussion on device
> namespaces. In fact the reason I started down this path is because,
> like you said, the discussion outcome ruled out kernel changes.

The discussion also stated that this was impossible without kernel
changes: you need to do this in the kernel on a subsystem-by-subsystem
basis, and there is no "magic fix" to make all devices work at once.
That was my main point in that discussion; people seem not to want to do
the hard work for some reason :(

> >> I did look at CUSE. However, I realized that not all device drivers
> >> allow all operations to be forwarded to a CUSE proxy daemon: some
> >> device drivers do bookkeeping based on process PID, so a CUSE proxy
> >> daemon cannot operate on behalf of processes. Performance is another
> >> reason.
> >
> > Have you benchmarked CUSE?  It's fast, but the real question is what
> > types of devices are you trying to use this for?
> >
> > If a device is to be multiplexed, it needs to be done so in the driver
> > for the device, or the subsystem, you can't do it in a "wrapper" driver,
> > or even in userspace, as state will get confused and messed up and it
> > will not work properly in the end, sorry.
> >
> 
> The purpose of multiplexing is to either block undesired
> events/operations on devices (e.g. input, graphics) or respond to the
> applications based on the in-memory state of device instances.

You didn't answer my question of "which specific devices do you care
about" :(

You can't "filter" device commands (see the aforementioned cdrom mess;
you have learned from history, right?).  And you can't assume you know
what the in-kernel state of the devices really is, as they are getting
commands from the hardware itself that change this state; not all
changes come from userspace.
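
To make the state problem concrete, here is a toy sketch, not a real
driver or any CUSE API. Every name in it (ToyDevice, UserspaceProxy, the
"EJECT"/"CLOSE" commands, the "BUTTON_EJECT" event) is hypothetical and
exists only for illustration. It assumes a proxy that mirrors device
state purely by watching the commands it forwards, and shows how that
mirror goes stale as soon as the hardware changes state on its own.

```python
class ToyDevice:
    """Stand-in for kernel-side driver state (hypothetical, not a real driver)."""

    def __init__(self):
        self.tray_open = False

    def command_from_userspace(self, cmd):
        # State change requested by a process, e.g. through an ioctl.
        if cmd == "EJECT":
            self.tray_open = True
        elif cmd == "CLOSE":
            self.tray_open = False

    def hardware_event(self, event):
        # State change that originates in the hardware itself,
        # e.g. someone presses the physical eject button.
        if event == "BUTTON_EJECT":
            self.tray_open = not self.tray_open


class UserspaceProxy:
    """Hypothetical multiplexer that tracks state by observing commands only."""

    def __init__(self, device):
        self.device = device
        self.shadow_tray_open = False

    def forward(self, cmd):
        # Pass the command through and update the shadow copy to match.
        self.device.command_from_userspace(cmd)
        if cmd == "EJECT":
            self.shadow_tray_open = True
        elif cmd == "CLOSE":
            self.shadow_tray_open = False


def shadow_matches(proxy):
    """True if the proxy's shadow state agrees with the real device state."""
    return proxy.shadow_tray_open == proxy.device.tray_open


dev = ToyDevice()
proxy = UserspaceProxy(dev)

proxy.forward("EJECT")
in_sync_after_command = shadow_matches(proxy)

# The hardware changes state without any command passing through the proxy.
dev.hardware_event("BUTTON_EJECT")
in_sync_after_hw_event = shadow_matches(proxy)
```

The shadow copy agrees with the device after the forwarded command and
disagrees after the hardware event. A real device has far more state
than one flag, which only makes the divergence harder to detect.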

> With CUSE, in-memory states can be maintained and mediated in user
> space. AFAIU, doing device multiplexing in the kernel would also
> entail the same - maintain in-memory state (replicating data structs)
> for each virtual device instance, but that also means changing the
> drivers/subsystem to incorporate this functionality. I may be missing
> something here, but I'm not sure why maintaining the state in
> userspace (as a CUSE device) would be messy and not work. I would
> appreciate more explanation. thanks.

It's impossible to maintain the state in userspace properly.  If you
could do it, then you would just have a userspace device driver, and why
would you need the kernel at all for it?  :)

Think through the specifics of one particular device you wish to
mediate access to, walk through the complexity, and marvel at how
much more work you are now doing than the original kernel driver did.

Again, this has to be done on a per-subsystem basis, in the kernel, for it
to work properly.  And when you try to do that you will get a lot of
pushback, which is correct, as you will be adding complexity for almost
no gain in the end.

Just properly assign different devices to different containers.  If you
want to do more than that, then think about using a "real" virtual
machine, which does properly abstract the hardware away like this (or
really, it just does hardware pass-through and, again, does not share the
hardware fully; look at how USB works in virtual machines for an example).

good luck,

greg k-h