Archive for June, 2008

Data storage and performance

Sunday, June 22nd, 2008

I’m finally getting SyncDroid to the point where I can start testing some of my initial assumptions about the project:

  • Scanning while idle will not cause perceptible slowdown to the user
  • There is enough idle time on a typical desktop session to get useful work done

Without these, I’m stuck writing an ordinary file synchronizer (or a slow one) and that’s not really worthwhile to me.

PySQLite Bad. APSW Good.

One of the big question marks has always been ‘how am I going to store the data on disk?’ Most other programs with this requirement use Berkeley DB or SQLite. As I’ve previously mentioned, SQLite can be fast if you’re intelligent about it. That said, I’ve never used a large SQLite database - anything past about 10,000 rows. Performance issues will show up well before that if you’re doing something silly (like starts-with queries on a string column).

I want to use an external database for the data integrity features. People will unplug USB drives mid-transaction and I don’t want that to trash my entire data store.

I started out using SQLite and common ORMs to simplify the database implementation. Skipping over a few frustrating days of development, I came to the following conclusions:

  • Most ORMs will not let you access multiple physical databases at the same time - they implicitly store a reference back to the database. This is no good for me, because I have databases on the host and on any USB drives that get connected. This ruled out Autumn, which was looking beautifully simple.
  • Storm’s SQLite transaction behavior is a bit weird - weird enough to be unusable. They do document it, thankfully, and the codebase is very clean, so it’s likely I’ll use it in future (non-SQLite) projects. These problems are mostly because:
  • PySQLite completely mangles SQLite’s transaction model. Seriously, guys. SQLite is not typical as far as SQL databases go, but its behaviour is very well documented and not difficult to understand. If you’d just butt out and let SQLite do its thing, you’d be fine, but noooo, you gotta be clever about it…

Aaanyway, this rules out anything based on PySQLite as an ORM. Which means everything. So I wrote my own using APSW (Another Python SQLite Wrapper). This turned out to not be a big deal - the operations I perform on the database are quite simple by design. Once I worked out a schema bug, this worked beautifully.
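To make the transaction complaint concrete, here is a minimal sketch of what "letting SQLite do its thing" can look like. It is not the APSW code from this project: it uses the standard-library sqlite3 module (the bundled form of PySQLite) with isolation_level=None so that the wrapper opens no transactions on its own. The files table and the record_files helper are hypothetical names made up for illustration.

```python
import sqlite3

# isolation_level=None stops the module from opening transactions implicitly;
# every BEGIN/COMMIT/ROLLBACK below is issued by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, size INTEGER)")


def record_files(conn, rows):
    """Insert all rows atomically: either every row lands or none do."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.executemany("INSERT INTO files (path, size) VALUES (?, ?)", rows)
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")


record_files(conn, [("/a", 1), ("/b", 2)])

try:
    # The duplicate primary key "/a" makes the whole batch fail and roll back,
    # so "/c" is not left behind.
    record_files(conn, [("/c", 3), ("/a", 99)])
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

With the transaction boundaries spelled out like this, an interrupted batch leaves the table exactly as it was before the batch started.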

Performance

My use case for this application is:

  • My maildirs (170,000 files)
  • A Buildroot repository (240,000 files)
  • A new OpenEmbedded repository (80,000 files)
  • My git repositories (15,000 files)
  • Personal data and records (15,000 files)

I do not intend to be kind to my own software.

Right away, I noticed I was hitting the 120-transactions-per-second issue in SQLite (linked to fsync() and the rotational rate of 7200RPM drives - this is on my laptop). No big deal. It’s still pretty quick.

(I actually have concerns about USB flash drives due to their extremely slow fsync() performance - depending on the filesystem, it might require a few erases/rewrites. Erases are horrendously slow on flash devices, which is why a lot of people are seeing poor write performance on SSDs).
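Because each commit has to wait for the disk, the usual way around a per-transaction ceiling is to put many rows in one transaction. The following is a rough sketch of that idea using the standard-library sqlite3 module; the table layout, the insert_batched helper and the batch size of 1000 are assumptions for illustration, not values from this project.

```python
import os
import sqlite3
import tempfile


def insert_batched(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch rather than once per row."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO files (path, size) VALUES (?, ?)", row)
        pending += 1
        if pending >= batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Flush the final, partially filled batch.
        conn.commit()
        commits += 1
    return commits


db_path = os.path.join(tempfile.mkdtemp(), "scan.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE files (path TEXT, size INTEGER)")
conn.commit()

rows = [("/mail/%d" % i, i) for i in range(2500)]
commits = insert_batched(conn, rows)
total = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
conn.close()
```

The tradeoff is that an unplugged drive can lose the rows of the batch that was still open, so the batch size bounds how much scanning work has to be redone.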

I left it overnight to scan my data. When I got up, we were up to 140,000 files and adding maybe three per second instead of thirty. Grr. (On the upside, Python is not immediately a performance issue. Take that, C++ coders!)

In a past life, I was a performance geek. I heart oprofile. The results:

      samples|      %|
    ------------------
        66522 88.5084 libsqlite3.so.0.8.6
         3963  5.2728 libpthread-2.7.so
         3247  4.3202 python2.5

I took a shower, realized that I hadn’t added appropriate indexes to the tables, and retested:

      samples|      %|
    ------------------
        81119 99.5264 no-vmlinux
          105  0.1288 python2.5
        CPU_CLK_UNHALT...|
          samples|      %|
        ------------------
               40 38.0952 libsqlite3.so.0.8.6
               30 28.5714 libpthread-2.7.so
               18 17.1429 libc-2.7.so
               17 16.1905 python2.5

The vast majority of the time is now spent in the kernel, not in the SQLite library.
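The post does not show the schema or the indexes that were added, so the following is only an illustrative sketch of how a missing index can be spotted without a profiler. It asks SQLite for its query plan before and after creating an index; the files table, its columns and the index name files_by_path are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT, mtime INTEGER)"
)
conn.executemany(
    "INSERT INTO files (path, mtime) VALUES (?, ?)",
    [("/home/user/file%d" % i, i) for i in range(1000)],
)


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return "; ".join(row[-1] for row in rows)


lookup = "SELECT id, mtime FROM files WHERE path = ?"
plan_before = query_plan(conn, lookup, ("/home/user/file500",))

conn.execute("CREATE INDEX files_by_path ON files (path)")
plan_after = query_plan(conn, lookup, ("/home/user/file500",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should describe a scan of the whole table and the second a search that names the new index.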

I’m not sure what conclusion to draw from this, yet. Perhaps ‘showers are good’. So far, it does not appear that either Python or SQLite will cause performance problems.

Reflections on synchronizer algorithms

Wednesday, June 18th, 2008

In my classic inability to actually focus on a single task for any length of time, I’ve been working on SyncDroid.

I’ve been attacking the tricky areas of data storage and what I refer to as the ‘datapath’ - the chain of events that takes place between a change occurring on a computer and it propagating (across physical space and time) to another computer. I can partly explain why nobody has done this before: it’s really tricky.

Unison (and most other synchronizers) make some simplifying assumptions:

  • There is always a master computer and a slave computer
  • We only care about what is happening at this exact moment in time
  • We can synchronize the times on the two computers when the synchronization occurs
  • We can suck up as much CPU and IO time as we like while synchronization takes place

Unfortunately, none of these are true for SyncDroid. They have interesting consequences.

There is always a master and a slave

This makes configuration management really easy: you always look to the master computer. With network-connected hosts, the master is (by definition) contactable, so you can just tell it to update its configuration with any changes made on the slave end.

SyncDroid doesn’t have this luxury. In the case of USB-drive synchronization, the two computers cannot just tell each other about changes. So there’s an interesting sub-synchronization problem: in order to know what data we need to synchronize, we need to synchronize the configuration first.

We only care about a single moment in time

There’s really only one trap if you use this assumption: files might change between the time you detect a change and when you actually synchronize it. This is easy to solve if you take out an exclusive lock on the file-being-synchronized and ensure that it still looks like it did when you scanned it.
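The "still looks like it did when you scanned it" check can be sketched as a comparison of cheap file attributes. This is an assumption about what such a check might compare (size and modification time), not a description of how Unison does it, and the exclusive lock is left out because locking APIs differ between platforms. The snapshot and still_matches names are hypothetical.

```python
import os
import tempfile


def snapshot(path):
    """Capture the cheap-to-read attributes recorded at scan time."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def still_matches(path, scanned):
    """True if the file still looks the way it did when it was scanned."""
    try:
        return snapshot(path) == scanned
    except OSError:
        # The file vanished or became unreadable since the scan.
        return False


fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("v1")

scanned = snapshot(path)
ok_before = still_matches(path, scanned)

# Simulate the user editing the file between the scan and the sync.
with open(path, "w") as f:
    f.write("version two")
ok_after = still_matches(path, scanned)

os.remove(path)
gone = still_matches(path, scanned)
```

A synchronizer would run the check while holding the lock and skip (or rescan) any file for which it fails.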

SyncDroid cares about lots of points in time. Because it syncs constantly, we have to be very careful about what state we think a file is in versus what state it actually is in. If you’re doing syncs to multiple partners, you have to keep track of all relevant metadata for all partners. If a partner goes away - say the user loses the USB drive - we shouldn’t waste time and resources tracking data that will never be used. And we can’t just rescan things constantly or lock files because that would hurt performance (or make it impossible for users to actually do work). I’m a user of this thing, too, and if it doesn’t perform acceptably, I won’t use it!

We can synchronize computer times easily

On a network-connected synchronizer, this is easy. You run some variation of the NTP protocol between the two hosts and calculate an offset so that you don’t disturb the user’s clock. You can then work out relative change timings and the best course of action.

Because this version of SyncDroid works over USB drives, it can’t synchronize times easily. I get around that with a ‘mountcount’ - it’s just a number that is incremented every time the metadata on a drive is loaded. RAID arrays use the same idea to detect drives that were unplugged from an array and are now out-of-sync with the rest of the array. Each computer using a USB drive can then use the mountcount to determine relative change times without being dependent on the computer’s clock, which will probably be wrong.

The consequence of the mountcount is that multiple access to the metadata is strictly forbidden. This is reasonably easy to ensure and shouldn’t be visible to the user.
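A minimal sketch of the mountcount idea follows. Everything about the storage is an assumption made for illustration: the metadata lives in a JSON file, changes are stamped with the mountcount at which they were recorded, and each host remembers the last mountcount it saw. The function names and the syncmeta.json filename are hypothetical.

```python
import json
import os
import tempfile


def load_metadata(path):
    """Load the drive's metadata and bump its mountcount in the same step."""
    if os.path.exists(path):
        with open(path) as f:
            meta = json.load(f)
    else:
        meta = {"mountcount": 0, "changes": {}, "last_seen": {}}
    meta["mountcount"] += 1
    return meta


def save_metadata(path, meta):
    """Write to a temporary file first, then rename it over the old copy."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(meta, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def record_change(meta, filename):
    """Stamp a change with the current mountcount instead of a clock time."""
    meta["changes"][filename] = meta["mountcount"]


def mark_seen(meta, host):
    """Remember that this host is up to date as of the current mountcount."""
    meta["last_seen"][host] = meta["mountcount"]


def changes_new_to(meta, host):
    """Files changed since the given host last loaded this drive."""
    seen = meta["last_seen"].get(host, 0)
    return sorted(f for f, count in meta["changes"].items() if count > seen)


meta_path = os.path.join(tempfile.mkdtemp(), "syncmeta.json")

# Host A plugs the drive in, changes a file, and unplugs.
meta = load_metadata(meta_path)
record_change(meta, "notes.txt")
mark_seen(meta, "host-a")
save_metadata(meta_path, meta)

# Host B plugs it in next and makes its own change.
meta = load_metadata(meta_path)
new_for_b = changes_new_to(meta, "host-b")
record_change(meta, "todo.txt")
mark_seen(meta, "host-b")
save_metadata(meta_path, meta)

# Back on host A: only B's change is new.
meta = load_metadata(meta_path)
new_for_a = changes_new_to(meta, "host-a")
```

No clock is consulted anywhere, which is the point: ordering comes purely from the counter. The sketch also shows why concurrent access must be forbidden, since two hosts loading the same metadata at once would both claim the same mountcount.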

We can suck up as much CPU and IO time as we like

This is a big one, and it’s one of the major reasons I started this project. None of the current synchronizers are sensitive to the user. Perhaps I’m a dreamer, but I would like my files to be synchronized without taking a massive hit in PC performance (or battery life).

Unison (as well as most synchronizers) will do exactly what you tell them to. If you say ‘scan for changes’, they will scan right now. If you say ‘propagate changes’, they will propagate right now. While they are working, the computer is struggling under massive IO load, and if you have large amounts of data (like I do) that could lead to several minutes where the disk is spinning and you can’t use the computer and you have to sync right now because your plane is leaving but it’s still running and argh I’m going to be late.

SyncDroid has a fairly involved set of priorities to determine under what circumstances it should scan and sync and bookkeep. For example, it has two scanner types: a notification scanner (which uses the OS to determine when files have changed) and a comprehensive scanner (in case SyncDroid wasn’t running and you changed a file). The notification scanner runs all of the time, but if you’re on battery or using the computer, it just remembers the changes in RAM and gets out of the way as quickly as possible. The comprehensive scanner only runs when the computer is connected to power and you’re not using it. In this way, you get the effect of non-stop change scanning without any perceptible difference to your computer’s responsiveness.
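The two-scanner rule described above can be written down as a small decision function. This is only a sketch of the rule as stated in the paragraph, not SyncDroid's actual priority logic, which is more involved. The Decision tuple, the scanner_policy name and the mode strings are hypothetical, and "process_now" in particular is an assumed label for what the notification scanner does when the machine is idle and on mains power.

```python
from collections import namedtuple

# One mode per scanner; the mode names are illustrative placeholders.
Decision = namedtuple("Decision", "notification comprehensive")


def scanner_policy(on_battery, user_active):
    """Decide how each scanner should behave for the current machine state."""
    busy = on_battery or user_active
    # The notification scanner always runs, but only buffers while busy.
    notification = "buffer_in_ram" if busy else "process_now"
    # The comprehensive scanner needs mains power and an idle user.
    comprehensive = "paused" if busy else "running"
    return Decision(notification, comprehensive)


idle_on_mains = scanner_policy(on_battery=False, user_active=False)
typing_on_mains = scanner_policy(on_battery=False, user_active=True)
idle_on_battery = scanner_policy(on_battery=True, user_active=False)
```

Keeping the rule in one pure function makes it easy to adjust the tradeoff later without touching the scanners themselves.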

There is a big ‘but’ here, and it’s one of those annoying engineering tradeoffs: if you are not aggressive enough about scanning, you will miss changes (say, the user disconnects their laptop without warning). If you are too aggressive, you’ll slow down the computer. The trick is to find a set of tradeoffs that works well in most circumstances. In those cases that it doesn’t work, you can warn the user and give them an opportunity to fix the problem (by plugging the laptop back into the network for a minute, for example).

Data Storage

And then, there’s the hairy issue of where to put all of this data that we’re collecting. What we have is roughly a parallel filesystem to the one on the disk: for a file, we want to store some metadata. The best way to store this, from a design point of view, would be to store it in the filesystem itself, but this is impractical for a number of reasons (don’t want to change the user-visible view of their data, no filesystem support, differing semantics between systems, and so on).

So we have to create a filesystem within a filesystem. It’s another meta-problem like the sub-synchronization problem in configuration management. I considered doing this in the literal fashion - creating an image on disk with a virtual ext2 filesystem. Instead of files, there would be structs of metadata that I had collected. Licensing issues were, well, issues here, and it would require me to maintain a fairly complicated data access layer. The big technical problem is that contemporary filesystems assume a constant-sized disk, while I wanted to be able to expand and shrink the image size dynamically.

My stopgap solution (while this is all stubbed out in my code) is to use a YAML file. I adore YAML. It is not a high-performance data access layer, however, and it was not designed as such. It’s just very easy to use.

Another option was a custom C data type - or, phrased another way, ‘write my own filesystem’. Lots of effort. Transaction management is a big hairy problem that I don’t want to get into.

Finally, SQLite. I love SQLite - it’s very easy to use and gives you very powerful query functionality. It handles on-disk consistency well and - used sensibly - can be very high-performance.

Many applications, sadly, do not use SQLite in a sensible fashion. (I’m looking at you, Meta-Tracker). Like any SQL database, you can do silly things to it that will absolutely destroy its performance characteristics. A classic in this situation is if you want a directory listing and your rows look like { filename | data }; the database needs to do a ‘starts-with’ check on each row in the database because there’s no easy way to index efficiently by filename and retain simplistic tree-searching operations. This is Really Really Slow.

My current plan is to solve this by implementing a more traditional inode/parent structure within my database schema. I have the big advantage of knowing exactly which operations are necessary (read record by path+name, write record by id, create record by path+name, list children by path) and so can optimise specifically for them.
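One possible shape for such an inode/parent schema is sketched below, covering the four operations listed above. The plan is described here only in outline, so the nodes table, its columns and the helper functions are all assumptions for illustration. The key idea is that a unique index on (parent_id, name) turns both "find this child" and "list these children" into index lookups instead of string scans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES nodes(id),
        name      TEXT NOT NULL,
        mtime     INTEGER,
        UNIQUE (parent_id, name)  -- doubles as the lookup index
    );
    INSERT INTO nodes (id, parent_id, name) VALUES (1, NULL, '');
""")
ROOT_ID = 1
FIND = "SELECT id FROM nodes WHERE parent_id = ? AND name = ?"


def lookup(conn, path):
    """Read by path: one indexed lookup per path component."""
    node_id = ROOT_ID
    for name in [p for p in path.split("/") if p]:
        row = conn.execute(FIND, (node_id, name)).fetchone()
        if row is None:
            return None
        node_id = row[0]
    return node_id


def create(conn, path):
    """Create by path, adding any missing parent directories on the way."""
    node_id = ROOT_ID
    for name in [p for p in path.split("/") if p]:
        row = conn.execute(FIND, (node_id, name)).fetchone()
        if row is None:
            cur = conn.execute(
                "INSERT INTO nodes (parent_id, name) VALUES (?, ?)",
                (node_id, name),
            )
            node_id = cur.lastrowid
        else:
            node_id = row[0]
    return node_id


def write(conn, node_id, mtime):
    """Write by id: no path resolution needed."""
    conn.execute("UPDATE nodes SET mtime = ? WHERE id = ?", (mtime, node_id))


def list_children(conn, path):
    """List children by path: resolve the directory, then one index range."""
    node_id = lookup(conn, path)
    if node_id is None:
        return []
    rows = conn.execute(
        "SELECT name FROM nodes WHERE parent_id = ? ORDER BY name", (node_id,)
    )
    return [row[0] for row in rows]


msg_id = create(conn, "/mail/inbox/1")
write(conn, msg_id, 1214092800)
create(conn, "/mail/inbox/2")
create(conn, "/mail/sent/1")

top = list_children(conn, "/mail")
inbox = list_children(conn, "/mail/inbox")
missing = lookup(conn, "/mail/drafts")
```

The cost is that resolving a deep path takes one query per component, which seems an acceptable trade when directory listings are the common operation.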

Office ergonomics

Saturday, June 7th, 2008

Joel Spolsky has just posted a thing about the desks at Fog Creek’s new office. It uses an image that is often cited as being ‘ergonomic best practice’, but I can’t imagine a more horrendous way to spend a day in a chair:

desk_layout.jpg

(image borrowed from the Nicholas Institute of Sports Medicine and Athletic Trauma)

This seating position completely ignores the balance and weight distribution of a normal human.

What supports the arms and hands? Your muscles do. How do you think that feels after a day of work? Pretty damn awful. Hence, everyone I know rests their shoulders or forearms on an armrest or desk edge, or puts their wrists on a wrist rest. You just can’t hold your arms in that position all day without support.

A significant omission is the mechanism by which you support your hand/arm while using a mouse. Wrist injuries are another whole area which I’ve been lucky to avoid, but solutions include:

  • arm weight goes on your hand (on the mouse), which leads to wrist strain and inefficient mousing
  • arm weight goes on your wrist (on the desk), which leads to some twisting and nice calluses on your wrist
  • arm weight goes on a wrist rest, which puts pressure on the tendons in your forearm (and exacerbates carpal tunnel issues)

I don’t know who they modelled this image on, but that is some damn fine posture they’ve got around the neck. I can’t keep my neck that straight when I’m standing. Maintaining this position requires quite a bit of tension in the rear neck muscles. Unfortunately, most of us also need to work on the desk occasionally and lean forward. This tires the neck muscles. The muscles tire, the head droops forward, the load on the muscles increases, and you go home with a sore upper back.

Again, this image is modelled on some imaginary human with very short upper arms. A significant problem that I have when setting up my own workstations is that to reach a comfortable table height, I have to compromise between squashing my legs (table lower) and having too acute an angle at my elbows (table higher). In fact, a fairly comfortable keyboard location is on my lap (and it solves the support-the-arm-weight problem to some extent).

Points 3 and 4 on the image are where your weight is supported. This just does not work. I cannot find a chair or a position in which this does not lead to numbness or loss of blood flow to some part of my legs. (Keep in mind that I’ve done quite a bit of long-distance cycling purely on my sit bones (3) and not had blood flow issues.) The legs have some pretty serious blood vessels running through them and, in a normal human, they are moving. Most people that I know move around through the day, and this helps a lot with blood flow. Keep in mind that sitting in an office chair all day is awfully like sitting in an airplane seat all day.

‘Feet flat on floor’ is another guideline which I find absurd. You can only reach this condition by adjusting the height of your seat. If you put it too high, the legs will be dangling and exacerbate blood flow problems at 3 and 4. If you put it too low, the legs will either be out in front and unsupported, or in the suggested position, under compression and throwing off the rest of your balance. This is an extremely fine balance to make, and completely incompatible with the idea of moving around in your chair during the day.

Most office chairs won’t go high enough for me to put my feet flat on the floor. I’m about 6ft/180cm, so while tall, I’m not that tall. The legs on most chairs will get in the way of your feet anyway, so this is a very unhelpful guideline.

Finally, the fact that you’re pushing against the chair at 1 and 2 means that your entire body is being shoved forward (Newton’s Laws strike again). The force must be balanced at 3, 4 and 7 (hence compression of the legs). Because you have nothing particularly good to push against, it’s all frictional, and you tend to slide forward in your seat. Then you’re a sloucher. One of us!

Enough complaining. What to do?

First, buy a good chair. You spent good money on a bed to be in for 8 hours/day; now you have to spend good money on a chair that you’ll be in for another 8 or 12 or 16.

I worked on an Aeron for a few months but didn’t like it. The mesh is too stiff for my bony butt and the whole thing is very heavy and difficult to adjust. I eventually bought a Steelcase Leap and am fairly satisfied. (It was also half the price of the Aeron in Sydney. Just watch the 8-week shipping.) No, it doesn’t have mesh, but I find fabric and some padding to be more comfortable. And you can choose colours!

Roughly, my setup is:

  • Elbows rest on the armrests. This works way better than no support, but isn’t great. Some people use a towel on the desk to provide extra padding, and as a bonus it catches food and coffee ‘accidents’.
  • Chair is high enough that I can either stretch my feet out across the floor or tuck them under and rest on the chair legs.
  • Tilt varies through the day. I can either sit up if I’m working on the desk a lot, or recline back if it’s mid-afternoon and I’m cruising.
  • Slouching is a no-no. If I’m starting to slouch, I tilt the chair back. I’m a hardcore sloucher.
  • I use a Kinesis Contoured keyboard and strongly recommend them. They have big wrist-rest areas and come supplied with pads. Admittedly, I rely on this too much (the keyboard is designed to have your hands hovering) but it works fairly well.
  • I use the mouse as little as possible. My geek is showing here, but I use a tiling window manager. This is a massive timesaver; there’s no fussing with window positioning and hence much less mouse use.
  • I use a Standit to hold laptops up at eye level and Synergy so that I’m not reaching across the desk or shuffling keyboards around.
  • I learned to mouse with either hand, since the joints of my right hand are fairly worn out after too much playing Quake. I wish I was joking.

I expect that each person is going to have a slightly different way to solve these problems, so please share how you approach these issues in your own office.