<div dir="ltr"><div>The essence of the discussion is this:<br><br></div>We can run each process in a container and migrate the container itself. Migration can be driven by work stealing. As for communication between processes in different containers, can't we use sockets?<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 16, 2016 at 3:16 PM, Nitin Varyani <span dir="ltr"><<a href="mailto:varyani.nitin1@gmail.com" target="_blank">varyani.nitin1@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">My project requires a distributed algorithm, so Mesos will not work. But work stealing is the best bargain. It will save communication costs. Thank you. Can you please elaborate on the last part of your reply?<br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 16, 2016 at 2:12 PM, Dominik Dingel <span dir="ltr"><<a href="mailto:dingel@linux.vnet.ibm.com" target="_blank">dingel@linux.vnet.ibm.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>On Tue, 16 Feb 2016 00:13:34 -0500<br>
<a href="mailto:Valdis.Kletnieks@vt.edu" target="_blank">Valdis.Kletnieks@vt.edu</a> wrote:<br>
<br>
> On Tue, 16 Feb 2016 10:18:26 +0530, Nitin Varyani said:<br>
><br>
> > 1) Sending process context via network<br>
><br>
> Note that this is a non-trivial issue by itself. At a *minimum*,<br>
> you'll need all the checkpoint-restart code. Plus, if the process<br>
> has any open TCP connections, *those* have to be migrated without<br>
> causing a security problem. Good luck on figuring out how to properly<br>
> route packets in this case - consider 4 nodes 10.0.0.1 through 10.0.0.4,<br>
> you migrate a process from 10.0.0.1 to 10.0.0.3. How do you make sure<br>
> *that process*'s packets go to 0.3 while all other packets still go to<br>
> 0.1? Also, consider the impact this may have on iptables, if there is<br>
> a state=RELATED,ESTABLISHED rule on 0.1 - that info needs to be relayed to 0.3<br>
> as well.<br>
><br>
> For bonus points, what's the most efficient way to transfer a large<br>
> process image (say 500M, or even a bloated Firefox at 3.5G), without<br>
> causing timeouts while copying the image?<br>
><br>
> I hope your research project is *really* well funded - you're going<br>
> to need a *lot* of people (Hint - find out how many people work on<br>
> VMware - that should give you a rough idea).<br>
<br>
</div></div>I wouldn't paint things quite that dark. Besides, this is an interesting puzzle.<br>
<br>
To migrate processes I would pick an existing solution,<br>
such as the ones available for containers. So every process should, if possible, run in a container.<br>
To migrate them efficiently without having some distributed shared memory,<br>
you might want to look at userfaultfd.<br>
<br>
Now back to the scheduling: I do not think that every node should keep track<br>
of every process on every other node, as this would require a massive amount of<br>
communication and hurt scalability. So either you implement something like work stealing, or you go for a central entity like Mesos, which could do process/job/container scheduling for you.<br>
<br>
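The work stealing idea mentioned above could look roughly like the following toy simulation. It is only a sketch under assumptions: the Node class, the run() helper and the container IDs are made up for illustration, and the actual stealing would of course be a network request followed by a container migration rather than a deque operation. The point it shows is that a node contacts its peers only when its own queue is empty.<br>

```python
from collections import deque
import random


class Node:
    """A hypothetical cluster node holding a local queue of container IDs."""

    def __init__(self, name, jobs=()):
        self.name = name
        self.queue = deque(jobs)

    def next_job(self, peers, rng):
        """Return (job, source_name), or (None, None) if no work was found."""
        if self.queue:
            # Local work is taken from the front of our own queue.
            return self.queue.popleft(), self.name
        # Idle: probe peers in random order. Only now would messages be sent.
        for victim in rng.sample(peers, len(peers)):
            if victim is not self and victim.queue:
                # Steal from the back, the opposite end to the owner.
                return victim.queue.pop(), victim.name
        return None, None


def run(nodes, seed=0):
    """Round-robin simulation; returns a list of (worker, job, source)."""
    rng = random.Random(seed)
    log = []
    progress = True
    while progress:
        progress = False
        for node in nodes:
            job, source = node.next_job(nodes, rng)
            if job is not None:
                log.append((node.name, job, source))
                progress = True
    return log


# All containers start on n1; n2 and n3 are idle and have to steal.
nodes = [Node("n1", ["c1", "c2", "c3", "c4"]), Node("n2"), Node("n3")]
log = run(nodes)
stolen = [(worker, job) for worker, job, source in log if worker != source]
print(stolen)
```

No node keeps a global view of the other queues here; a busy node never sends anything, which is where the saving in communication comes from.<br>
<br>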
There are now two pitfalls, each hard enough on its own:<br>
- interprocess communication between two processes using something other than a socket;<br>
in such a case you would probably need to merge the two distinct containers<br>
<br>
- dedicated hardware<br>
<span><font color="#888888"><br>
Dominik<br>
<br>
</font></span></blockquote></div><br></div>
</div></div></blockquote></div><br></div>