Unison sync between Windows/Linux hangs randomly during transfer

We regularly deploy interactive kiosk CPUs to remote physical sites, and I’ve developed a content updater application that performs a nightly sync of media assets between each kiosk (Windows 7 Pro) and a hosted CMS (a virtualized Ubuntu server running on linode.com). The content updater is written in C#/.NET, and it spawns a child Unison process using Process.Start(). Unison is configured to connect to the remote server via SSH using a private key.
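For reference, a spawn of this kind can be sketched roughly as follows. This is a minimal illustration, not the actual ContentUpdater source: the UnisonRunner class, the Run method, the timeout, and the choice to redirect and drain stdout/stderr are all assumptions made for the example. Only Process.Start() itself comes from the description above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only (not the real ContentUpdater code).
static class UnisonRunner
{
    // Runs the given executable and returns its exit code,
    // or -1 if the child had to be killed after the timeout elapsed.
    public static int Run(string exePath, string arguments, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = false,        // required when redirecting streams
            RedirectStandardOutput = true,  // assumption: the parent captures output
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (var process = new Process { StartInfo = startInfo })
        {
            // Read both streams asynchronously so the child cannot block
            // on a full stdout/stderr pipe while the parent is waiting.
            process.OutputDataReceived += (sender, e) =>
            {
                if (e.Data != null) Console.WriteLine(e.Data);
            };
            process.ErrorDataReceived += (sender, e) =>
            {
                if (e.Data != null) Console.Error.WriteLine(e.Data);
            };

            process.Start();
            process.BeginOutputReadLine();
            process.BeginErrorReadLine();

            // Watchdog: give up if the child has not exited within the timeout.
            if (!process.WaitForExit((int)timeout.TotalMilliseconds))
            {
                try { process.Kill(); }
                catch (InvalidOperationException) { /* child exited just before Kill */ }
                process.WaitForExit();
                return -1;
            }

            // The parameterless overload also waits for the async readers to finish.
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A caller would pass the Unison executable path, the argument string, and something like TimeSpan.FromHours(2) as the timeout. The timeout only turns an indefinite hang into a detectable failure; it says nothing about why the transfer stalls.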

The issue we’re hitting is that, when spawned as a child process from ContentUpdater.exe, Unison will often simply stop communicating with the remote server during a transfer and hang indefinitely. There’s no simple repro: sometimes it works, but more often than not it hangs. It seems to be more fragile on larger updates (400MB+), but that’s more conjecture than anything else. When it does hang, the Unison process on the client (Windows 7) still shows 25% CPU utilization, and the server shows its unison process still running as well; there’s just no network activity. I know it’s connecting, because it always starts the transfer and gets partway through, but it never hangs in the same place twice. I’m running a native Windows binary build, Unison-2.40.63.exe, and the same version of Unison on the remote server.

The Unison command line on Windows looks like:

Unison-2.40.63.exe -contactquietly -silent -batch -sshcmd "C:\KioskManagement\Apps\ssh2plink.bat" -sshargs "-p 22 -i C:\cygwin\home\someuser\.ssh\contentupdater-rsync-key.ppk" -ignore "Path {innovations,todaytomorrow,scale,mooreslaw,brilliantminds,askafab}" ssh://cmsuser@server//home/cms/base-preview/webapps/ROOT/applications C:\kioskdir\temp\applications -force ssh://cmsuser@server//home/cms/base-preview/webapps/ROOT/applications

For the record, I had originally authored the content updater to use rsync (via cygwin on Windows), but was hitting the same issues. To see if the ssh transport was part of the problem, I tried using rsync in server mode (rsyncd) but the hanging continued to rear its head.

At this point, I’m thoroughly stumped. The issue repros against other servers too, so I’m thinking it’s on the Windows side of things. I was initially inclined to believe that the problem only happens when calling Unison/rsync via Process.Start() from inside another process, since it didn’t seem to fail when run directly from the command line. (UPDATE: I just got it to repro when running from the command line, so Process.Start() may not be the culprit after all.) Unison and rsync also never error out, so there are no logfiles to check, unless somebody knows of some sort of server-side trace or logfile on the remote server that I can look at. Full disclosure: I’m a FreeBSD geek and know precious little about Ubuntu under the hood.

Thanks in advance for any and all insight/ideas/solutions!

Best