Distributed Process Scheduling Algorithm

Nitin Varyani varyani.nitin1 at gmail.com
Tue Feb 16 05:43:25 EST 2016


The essence of the discussion is this:

We can run each process in a container and migrate the container itself.
Migration can be done based on work stealing. As far as communication
between processes in different containers is concerned, can't we use
sockets?
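
A minimal sketch of the socket idea, assuming the peer container is
reachable under a stable name (say, a DNS entry or virtual IP that follows
the container when it migrates); the helper and the name "peer-container"
below are made up for illustration:

    /* Cross-container IPC over TCP: resolve the peer by name on every
     * (re)connect, so a container that keeps its name after migrating
     * can be found again at its new node. */
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/socket.h>

    static int connect_to_peer(const char *host, const char *port)
    {
        struct addrinfo hints, *res, *rp;
        int fd = -1;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo(host, port, &hints, &res) != 0)
            return -1;
        for (rp = res; rp; rp = rp->ai_next) {
            fd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
            if (fd < 0)
                continue;
            if (connect(fd, rp->ai_addr, rp->ai_addrlen) == 0)
                break;                    /* connected */
            close(fd);
            fd = -1;
        }
        freeaddrinfo(res);
        return fd;  /* e.g. connect_to_peer("peer-container", "9000") */
    }

The hard part, as the replies below point out, is not opening fresh
connections but keeping already-established ones alive across a migration.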

On Tue, Feb 16, 2016 at 3:16 PM, Nitin Varyani <varyani.nitin1 at gmail.com>
wrote:

> According to my project's requirements, I need a distributed algorithm,
> so Mesos will not work. But work stealing is the best bargain: it will
> save communication costs. Thank you. Can you please elaborate on the
> last part of your reply?
>
> On Tue, Feb 16, 2016 at 2:12 PM, Dominik Dingel
> <dingel at linux.vnet.ibm.com> wrote:
>
>> On Tue, 16 Feb 2016 00:13:34 -0500
>> Valdis.Kletnieks at vt.edu wrote:
>>
>> > On Tue, 16 Feb 2016 10:18:26 +0530, Nitin Varyani said:
>> >
>> > > 1) Sending process context via network
>> >
>> > Note that this is a non-trivial issue by itself.  At a *minimum*,
>> > you'll need all the checkpoint-restart code.  Plus, if the process
>> > has any open TCP connections, *those* have to be migrated without
>> > causing a security problem.  Good luck figuring out how to properly
>> > route packets in this case - consider 4 nodes, 10.0.0.1 through
>> > 10.0.0.4, and you migrate a process from 10.0.0.1 to 10.0.0.3.  How
>> > do you make sure *that process*'s packets go to 0.3 while all other
>> > packets still go to 0.1?  Also, consider the impact this may have on
>> > iptables: if there is a state=RELATED,ESTABLISHED rule on 0.1, that
>> > connection-tracking state needs to be relayed to 0.3 as well.
>> >
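
For what it's worth, Linux already exposes a primitive aimed at exactly
this: the TCP_REPAIR socket option (merged in 3.5 for CRIU) lets a process
with CAP_NET_ADMIN freeze an established connection and read out state such
as queue sequence numbers, which the restore side can write back into a
fresh socket. A sketch of the dump side only, with the restore half and
most error handling omitted:

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    #ifndef TCP_REPAIR               /* fallbacks for older libc headers */
    #define TCP_REPAIR       19
    #define TCP_REPAIR_QUEUE 20
    #define TCP_QUEUE_SEQ    21
    #endif
    #ifndef TCP_SEND_QUEUE
    #define TCP_SEND_QUEUE   2
    #endif

    /* Detach an established TCP socket from the network and read the
     * sequence number of its send queue - one piece of the state a
     * checkpoint must carry to the destination node. */
    static int tcp_dump_snd_seq(int fd, unsigned int *snd_seq)
    {
        int on = 1, q = TCP_SEND_QUEUE;
        socklen_t len = sizeof(*snd_seq);

        if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on)) < 0)
            return -1;  /* needs CAP_NET_ADMIN */
        if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_QUEUE, &q,
                       sizeof(q)) < 0)
            return -1;
        return getsockopt(fd, IPPROTO_TCP, TCP_QUEUE_SEQ, snd_seq, &len);
    }

This does not answer the routing question above; CRIU-based setups usually
sidestep it by giving the container its own network namespace and an IP
address that migrates along with it.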
>> > For bonus points, what's the most efficient way to transfer a large
>> > process image (say 500M, or even a bloated Firefox at 3.5G), without
>> > causing timeouts while copying the image?
>> >
>> > I hope your research project is *really* well funded - you're going
>> > to need a *lot* of people.  (Hint: find out how many people work on
>> > VMware - that should give you a rough idea.)
>>
>> I wouldn't paint things quite so dark. Also, this is an interesting puzzle.
>>
>> To migrate processes I would pick an already existing solution, like
>> the ones that exist for containers. So every process should, if
>> possible, run in a container. To migrate them efficiently without
>> building some kind of distributed shared memory, you might want to
>> look at userfaultfd.
>>
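
To make the userfaultfd suggestion concrete: it enables post-copy
migration, where the process resumes on the destination right away and
each missing page is pulled from the source on first access, so even a
multi-gigabyte image causes no long freeze. A rough sketch of the
registration step (Linux >= 4.3); the event loop that serves the faults
and the transport that fetches pages from the source node are left out:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <linux/userfaultfd.h>

    /* Register a page-aligned, mmap'ed region so that missing-page
     * faults are reported to us instead of being resolved by the
     * kernel. Returns the userfaultfd, or -1 on error. */
    static int uffd_register_region(void *addr, unsigned long len)
    {
        struct uffdio_api api = { .api = UFFD_API };
        struct uffdio_register reg = {
            .range = { .start = (unsigned long)addr, .len = len },
            .mode  = UFFDIO_REGISTER_MODE_MISSING,
        };
        int uffd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);

        if (uffd < 0)
            return -1;
        if (ioctl(uffd, UFFDIO_API, &api) < 0 ||
            ioctl(uffd, UFFDIO_REGISTER, &reg) < 0) {
            close(uffd);
            return -1;
        }
        return uffd;  /* read struct uffd_msg events from this fd */
    }

A monitor thread then reads page-fault events from the fd and resolves
each one with UFFDIO_COPY, filling in data fetched from the source node;
this is roughly how post-copy ("lazy") restore is built on top of it.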
>> Now, back to the scheduling: I do not think that every node should
>> keep track of every process on every other node, as this would mean a
>> massive amount of communication and hurt scalability. So you would
>> either implement something like work stealing or go for a central
>> entity like Mesos, which could do process/job/container scheduling
>> for you.
>>
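
A toy sketch of the work-stealing variant (all names here are
illustrative, and the cluster is modelled as threads in one address space
for brevity; on a real cluster the locked pop would be an RPC to the
victim node): only an *idle* node generates traffic, by asking a random
peer for surplus work, so communication scales with idleness rather than
with the number of nodes:

    #include <pthread.h>
    #include <stdlib.h>

    struct task {
        struct task *next;      /* plus whatever state migrates */
    };

    /* One run queue per node; the owner pushes/pops locally, and a
     * thief takes from the same end for simplicity. */
    struct runqueue {
        pthread_mutex_t lock;
        struct task *head;
    };

    /* Called by an idle node: pick a random victim and try to take
     * one task from it. Returning NULL means "stay idle, retry with
     * backoff", so a busy cluster sees almost no steal traffic. */
    static struct task *steal(struct runqueue *nodes, int nnodes, int self)
    {
        int victim = rand() % nnodes;
        struct task *t = NULL;

        if (victim == self)
            return NULL;
        pthread_mutex_lock(&nodes[victim].lock);
        t = nodes[victim].head;
        if (t)
            nodes[victim].head = t->next;
        pthread_mutex_unlock(&nodes[victim].lock);
        return t;   /* then checkpoint and migrate t's container here */
    }

Real work-stealing deques let the owner pop lock-free from one end while
thieves take from the other; the single lock here just keeps the sketch
short.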
>> There are two pitfalls left which are hard enough on their own:
>>
>> - interprocess communication between two processes over something
>>   other than a socket; in such a case you would probably need to
>>   merge the two distinct containers
>>
>> - dedicated hardware
>>
>> Dominik
>>
>>
>