File synchronizers suck!
Guess what, I’m procrastinating!
Yup, I’ve been ‘slacking off’ in the sense that I haven’t worked on any business projects. Saying I’ve been snowed under with client work sounds like an excuse - I’ll always be snowed under with client work. There’s more of it than I can handle. The market economist in my head says that I should increase my hourly rate, but so far clients have responded very poorly to that.
I went to the Sydney OpenCoffee meet yesterday and had a blast. Finally, entrepreneurial types in Sydney! There was a lot of talk about mentoring, and I quipped that I just needed someone to kick me in the ass and tell me to do some work. Seems I’m not the only one with that problem.
I’ve decided what project 3 will be. A file synchronizer. Yup, another one. The problem that has become painfully obvious in the last couple of days is that all current file synchronizers suck. They either:
- use timestamps to detect changes. This isn’t particularly accurate.
- use checksums to detect changes. This is accurate, but a synchronization takes forever if you have any reasonable amount of data. My typical ‘working set’ is on the order of 4 gigabytes across a few hundred thousand files. What I want is to just sync my entire home directory, and that’s around 40 gigs.
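To make the trade-off concrete, here is a minimal Python sketch of the two approaches. The function names `stat_signature` and `checksum` are made up for illustration, and SHA-256 is just one reasonable choice of hash. The first check only asks the filesystem for metadata, so it is fast but can be fooled. The second reads every byte, so it is accurate but slow on tens of gigabytes.

```python
import hashlib
import os


def stat_signature(path):
    """Cheap check: file size plus modification time in nanoseconds.

    Only metadata is read, so this is fast, but a file rewritten with
    the same size and a restored timestamp looks unchanged.
    """
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def checksum(path, chunk_size=1 << 20):
    """Accurate check: SHA-256 of the file contents.

    The file is read in 1 MiB chunks so large files do not need to fit
    in memory, but every byte still has to come off the disk.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing `stat_signature(path)` between two runs costs one `stat` call per file, while comparing `checksum(path)` costs a full read of the file each time.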
I tried out Unison and jfilesync. Unison seems to work alright but is very slow - slow enough that I haven’t actually bothered completing a sync yet. jfilesync craps itself if you have non-files in your directory (such as named pipes) or if you have directory loops (such as many build environments) or if you breathe on it wrong, so that’s out too.
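For what it’s worth, surviving named pipes and directory loops doesn’t take much. Below is a rough Python sketch, not how any of the tools above actually work: `walk_regular_files` is a hypothetical helper that yields only regular files and remembers each directory’s device and inode pair so a symlink loop is visited once. It assumes a platform where `st_dev` and `st_ino` identify a directory reliably.

```python
import os
import stat


def walk_regular_files(root):
    """Yield paths of regular files under root.

    Pipes, sockets and devices are skipped. Each directory is identified
    by (device, inode) and visited at most once, so symlink loops end.
    """
    seen = set()
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            st = os.stat(directory)  # follows symlinks
        except OSError:
            continue  # vanished or unreadable
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue  # already visited via another path
        seen.add(key)
        try:
            entries = list(os.scandir(directory))
        except OSError:
            continue
        for entry in entries:
            try:
                mode = entry.stat().st_mode  # follows symlinks
            except OSError:
                continue  # broken symlink or vanished entry
            if stat.S_ISDIR(mode):
                stack.append(entry.path)
            elif stat.S_ISREG(mode):
                yield entry.path
            # anything else (FIFO, socket, device) is ignored
```

A real synchronizer would probably want to record the skipped entries and report them rather than drop them silently.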
I’ve been doing the incremental rsync backup thing for a while and like it. I’d like my synchronizer to be able to handle backups too.
I don’t know how I’d achieve both performance and accuracy, yet. Obviously, checksumming every file on the system is not acceptable on a regular basis. That’s the sort of task a filesystem should be doing. Scarily enough, ZFS is the only modern filesystem which performs end-to-end checksumming; I consider that something of an outrage considering the abundance of CPU power we have nowadays. I’m hoping something like FAM can tell me when files change without requiring me to poll them or install magic kernel modules. I’ll do kernel modules, but they’re a massive support burden.
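One possible middle ground, sketched in Python under my own assumptions: keep a cache of each file’s size, modification time and checksum, and only re-read a file when its size or timestamp differs from the cached values. The names `file_digest` and `scan` and the cache layout are invented for this example. It does not solve the accuracy problem, since a file rewritten with identical size and timestamp would be missed, and it still polls every file with `stat` rather than being told about changes.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(paths, cache):
    """Return (changed_paths, new_cache).

    cache maps path -> (size, mtime_ns, digest) from the previous run.
    A file is only re-hashed when its size or mtime differs from the
    cached values; otherwise the cached digest is trusted.
    """
    changed = []
    new_cache = {}
    for path in paths:
        st = os.stat(path)
        old = cache.get(path)
        if old is not None and old[:2] == (st.st_size, st.st_mtime_ns):
            new_cache[path] = old  # metadata unchanged: skip the read
            continue
        digest = file_digest(path)
        if old is None or old[2] != digest:
            changed.append(path)  # new file or contents really differ
        new_cache[path] = (st.st_size, st.st_mtime_ns, digest)
    return changed, new_cache
```

The first run with an empty cache still hashes everything. Later runs cost roughly one `stat` per file plus a read for whatever was touched.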
Stepping back for a moment, what I really want isn’t file synchronization. I want my working environment to be wherever I am. Carrying a laptop and doing all of your work on there solves a good chunk of the problem, but laptops are physically limited in what they’re capable of.
Location awareness is a hassle; I’m forever changing resolutions and font sizes and performance settings depending on where I am. Software such as MarcoPolo solves some of this by helping the computer figure out where it’s located automatically.
I like Philip Greenspun’s idea for using a mobile phone as a computer; you take your phone, plug it in at one location and all of your stuff pops up. Programs run on the larger CPU at your location. When you unplug, everything is somehow still on the phone, you take it to another location, plug it in, and everything pops up again. The technology is part of the way there; what’s still lacking is probably storage space on the phone. It’s certainly solvable if you pay enough money.
An idea I’ve been kicking around for a while is ‘process hibernation’. When you hibernate a process, its memory image and external references (files, sockets, etc) are saved to disk. You can then take that disk image, stick it on another computer and fire it up again exactly as if it never stopped running. That would be convenient, especially if you could do it for an entire X or Windows session.
You can sort of do it with X forwarding, but that’s too slow over a network. You can sort of do it by running out of a VMware session saved to removable storage, but that’s too slow as well. I’d rather just carry a laptop.
I really have to get some work done on the planner, too. My ‘next task’ for a while has been to get the Windows version to work using py2exe, but I might push that back to do a Reminders module and a Scheduler module, given I actually need them now and don’t personally need a Windows port for the foreseeable future.
November 3rd, 2007 at 4:46 am
There used to be a hardware device out there that allowed you to bring your environment with you, and to hibernate the current state back to the device. Unfortunately it seems like it’s no longer available.
News here:
http://www.linuxdevices.com/news/NS8562564746.html
April 4th, 2008 at 1:22 am
You might be interested in this article I wrote. Only tangentially related, but your “process hibernation” reminded me of it. Instead of saving the memory state of the entire program, it is more document-centric and continuously saves the state of the document itself, so that closing a program and minimizing a program are really the same thing.
http://endolith.com/wordpress/2007/12/24/abolish-the-saving-of-documents/
Your post brings up an important point about the state of the program changing when you restart it though. (In a graphics program, for instance, you might have the eraser tool selected. When you minimize it and reopen it, the eraser tool is still selected, but when you close it and re-open it, it goes back to the default brush tool. In other types of programs, this volatility might cause more serious problems that I hadn’t thought of. Even with the graphics program constantly resetting, it would cause a little lost time as you re-select the tool you were using last time you had the document open. And if the graphics program remembered its state from one instance to another, but you open a different document each time, that would cause wasted time, as well. Should the state of the program be saved along with the document?)