cygwin, rsync, network shares

The data setup here is a bunch of Win2K workstations and a bunch of Win2K fileservers, with data liberally distributed among the lot. I've got ~10 GB of files I'm trying to synch up from a fileserver to my desktop.

Since I've got Cygwin installed, I say to myself: "Self, sounds like a job for rsync".

Well, it more-or-less gets the job done, but it takes its time doing it....

So I think about what I'm doing, and how rsync likely tackles the problem.

Normally, rsync is used in one of six modes, the main ones being local-local, local-remote, remote-local, and file listings (variants include use of rsync server vs. remote shell/ssh sessions). What makes rsync useful is that it only copies updated files, and it only transfers that part of the data which has been modified.

Potential problem here is that rsync may think it's got a local-local situation going on where it's actually got a networked connection, and is being profligate with its data. Anyone here familiar with internals and how / what rsync's tossing over the local network? It does seem to improve on the situation somewhat, but I'm seeing runtimes of several minutes to negotiate what's actually a pretty small bit of effective data transfer.

Sample session summary:

wrote 1247944957 bytes  read 468 bytes  4665216.54 bytes/sec
total size is 10577091343  speedup is 8.48

real    4m27.345s
user    0m27.608s
sys     1m8.358s
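
For anyone checking the arithmetic, here is a small sketch that recomputes the summary figures. It assumes "speedup" is the total size divided by the bytes actually moved (wrote + read), which matches the 8.48 printed above. The variable names are just for illustration.

```python
# Figures copied from the session summary above.
wrote = 1_247_944_957           # bytes written
read = 468                      # bytes read
total_size = 10_577_091_343     # total size of the file set
reported_rate = 4_665_216.54    # bytes/sec, as printed in the summary

transferred = wrote + read

# Assumption: speedup = total size / bytes transferred.
speedup = total_size / transferred

# Transfer time implied by the printed rate (compare with "real 4m27.345s").
implied_seconds = transferred / reported_rate

print(f"speedup: {speedup:.2f}")
print(f"implied transfer time: {implied_seconds:.1f} s")
print(f"share of the data actually sent: {transferred / total_size:.1%}")
```

So roughly an eighth of the 10 GB went over the wire, and the wall-clock time is almost entirely accounted for by that transfer.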

So: is this the best that can be done, or could the situation be improved on further?

--
Karsten M. Self [link|mailto:kmself@ix.netcom.com|kmself@ix.netcom.com]
[link|http://kmself.home.netcom.com/|http://kmself.home.netcom.com/]
What part of "gestalt" don't you understand?
[link|http://twiki.iwethey.org/twiki/bin/view/Main/|TWikIWETHEY] -- an experiment in collective intelligence. Stupidity. Whatever.

   Keep software free.     Oppose the CBDTPA.     Kill S.2048 dead.
[link|http://www.eff.org/alerts/20020322_eff_cbdtpa_alert.html|http://www.eff.org/alerts/20020322_eff_cbdtpa_alert.html]
Wow... considering...
...the abstraction layer added by cygwin.

I average a BIT faster... but not much, and only because I'm dealing with a very small network right now.

Writing 1.2G of data in 4.5 minutes... 'tis not a bad thing. IMO straight Windows could do better, though usually only around 6MB/sec on a 100Mb network with a collision domain (which Ethernet has).

I have seen upwards of 10MB/sec but that is the exception rather than the rule.

b4k4^2
[link|mailto:curley95@attbi.com|greg] - Grand-Master Artist in IT
[link|http://www.iwethey.org/ed_curry/|REMEMBER ED CURRY!]   [link|http://pascal.rockford.com:8888/SSK@kQMsmc74S0Tw3KHQiRQmDem0gAIPAgM/edcurry/1//|ED'S GHOST SPEAKS!]
[link|http://www.eweek.com/article2/0,3959,857673,00.asp|Writing on wall, Microsoft to develop apps for Linux by 2004]
Heimatland Geheime Staatspolizei reminds:
These [link|http://www.whitehouse.gov/pcipb/cyberspace_strategy.pdf|Civilian General Orders], please memorize them.
"Questions" will be asked at safety checkpoints.
More a matter of working intelligently...

...though it's a bit hard to see how this would be done without forcing some work on the part of rsync.

One thought is that I'm actually doing more work than if I'd simply copied the files. If rsync has to compute a hash or checksum from the source files, then I have to read the remote files anyway, and I'd be better off simply copying them, rather than reading them, computing a checksum, then re-reading them for the copy.

OTOH, rsync might look at a file and say "the size and modification timestamps are consistent with the existing copy I'm supposed to be comparing this to, let's punt and say the files are the same". In that case, I'd be seeing some significant benefits.
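
To make that second case concrete, here is a minimal sketch of a size-and-timestamp check. It illustrates the idea only and is not rsync's actual code. The function name `looks_unchanged` is made up, and comparing modification times at whole-second resolution is an assumption.

```python
import os

def looks_unchanged(src_path, dst_path):
    """Guess whether dst is already a copy of src without reading either file.

    Only metadata is consulted: the two files are treated as the same when
    their sizes match and their modification times match to the second.
    """
    try:
        src = os.stat(src_path)
        dst = os.stat(dst_path)
    except FileNotFoundError:
        # A missing file on either side means there is nothing to skip.
        return False
    return (src.st_size == dst.st_size
            and int(src.st_mtime) == int(dst.st_mtime))
```

The point is that only two stat calls cross the network per file, not the file contents. It only works if the earlier copy kept the source's modification time.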

Moving 10 GiB across a network in four minutes isn't bad for a day's work.... Hell, it took ~15 minutes just to get linecounts of the data (granted, gunzipping it to a pipeline), so I guess I'm getting my money's worth.

--
Karsten M. Self
It can do the latter
In fact it is the default. So if you don't get that speedup, you are turning it off with -I or something that implies it (e.g. -a).

Look in the manpage. There are a lot of options, and some of them imply other options. In particular, I never use the -a option because it implies a ton of flags, one or two of which I don't want, and I saw no easy way to override that. So I have scripts that use a gazillion flags instead...
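
A rough sketch of that spell-the-flags-out approach, for illustration. It relies on the rsync man page's description of -a as shorthand for -rlptgoD. The helper names (`explicit_flags`, `rsync_command`) and the example paths are hypothetical, and the snippet only builds the command line, it does not run rsync.

```python
# Per the rsync man page, -a is shorthand for -rlptgoD.
ARCHIVE_FLAGS = {
    "r": "recurse into directories",
    "l": "copy symlinks as symlinks",
    "p": "preserve permissions",
    "t": "preserve modification times",
    "g": "preserve group",
    "o": "preserve owner",
    "D": "preserve device and special files",
}

def explicit_flags(unwanted=""):
    """Spell out -a as single-letter flags, minus any letters in `unwanted`."""
    return "-" + "".join(flag for flag in ARCHIVE_FLAGS if flag not in unwanted)

def rsync_command(src, dst, unwanted=""):
    """Build (but do not run) an rsync argument list with explicit flags."""
    return ["rsync", explicit_flags(unwanted), src, dst]

# Hypothetical paths: everything -a would do, except owner and group.
print(" ".join(rsync_command("/cygdrive/s/data/", "/cygdrive/c/data/", unwanted="go")))
```

Whatever else gets dropped, keeping -t is worth it, since a size-and-time comparison has nothing to match against if the copies don't carry the source's timestamps.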

Cheers,
Ben
"good ideas and bad code build communities, the other three combinations do not"
- [link|http://archives.real-time.com/pipermail/cocoon-devel/2000-October/003023.html|Stefano Mazzocchi]