Tuesday, July 22, 2008

One in a Million

I remember discussing multi-threaded programming with a senior engineer the better part of a decade ago. I was reacting rather strongly to his cavalier attitude towards multi-threaded code, deadlocks and race conditions.

"That is such a small hole, it's one in a million that it will happen."

At the time, we were a small company, and I was a young engineer. One in a million, that doesn't sound that big. Of course, one in a million happens more frequently than you would expect. It didn't take very long for our systems to be processing a million transactions a day, and then millions per hour. Every time the holes had to get smaller and smaller.

They still have a fault in there where the entire system crashes when the system goes completely idle for a couple of minutes. It doesn't happen all the time, it's a "one in about 20 million", but it only has to happen once. So, it crashed about once a week. That one took forever to track down. Still haven't been able to fix it entirely, but they know that it's there, and the hole is smaller, probably "1 in a couple billion" now, below the rate that it's restarted for maintenance. Fixed, for now.

It's interesting to see that the DNS engineers are learning the same lesson. Compute power has now shifted in favour of the attackers. As we saw with Captcha, it doesn't matter that you only get through rarely - you only have to get through once in a while. Even just guessing a DNS TXID (2^16), you've got a 1 in 65,536 chance of guessing the right answer.

In pure mathematical terms, the likelihood of getting the TXID wrong 400,000 times in a row is about 0.2%. In other words, a successful guess somewhere in that run is pretty much certain.


Let's look at the 50/50 point, where the likelihood of seeing X number of failures is 0.5.

That turns out to be somewhere around 45,000 attempts, well within the range of a simple for loop. Since I am spoofing packets, I don't care about a response. Even better, filling up the pipe will widen the race window, giving me a larger chance of getting through.
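For anyone who wants to check those figures, here is a rough sketch in Python. It assumes every guess is independent and has a flat 1 in 2^16 chance of matching the TXID, which is a simplification of a real attack. The function names are mine, made up for illustration.

```python
import math

TXID_SPACE = 2 ** 16  # 65,536 possible 16-bit transaction IDs


def p_all_wrong(attempts: int, space: int = TXID_SPACE) -> float:
    """Probability that every one of `attempts` independent guesses misses."""
    return (1 - 1 / space) ** attempts


def attempts_for_miss_probability(p_miss: float, space: int = TXID_SPACE) -> int:
    """Smallest number of guesses whose all-miss probability is <= p_miss."""
    return math.ceil(math.log(p_miss) / math.log(1 - 1 / space))


if __name__ == "__main__":
    # Chance of missing 400,000 times in a row (roughly 0.2%).
    print(f"{p_all_wrong(400_000):.2%}")
    # Number of guesses at the 50/50 point (a little over 45,000).
    print(attempts_for_miss_probability(0.5))
```

Under those assumptions the 50/50 point is just ln(2) times the size of the TXID space, which is why it lands so far below 65,536.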

Still, it sounds like the attacker also has to be requesting lookups to poison the server. Let's say they take 2ms each, and the attacker is being nice by only doing them one at a time. Even using the large 400k count, it will take at most 15 minutes of concerted effort to poison the cache. The 50/50 point? About one minute, 30 seconds.
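The timing is simple multiplication, but it is easy to slip on, so here is a small Python sketch. The 2 ms per lookup and the one-at-a-time pacing are the assumptions from the paragraph above. The helper names are hypothetical.

```python
SECONDS_PER_ATTEMPT = 0.002  # assumed: 2 ms per lookup, done one at a time


def seconds_for(attempts: int, per_attempt: float = SECONDS_PER_ATTEMPT) -> float:
    """Total wall-clock time for `attempts` sequential lookups."""
    return attempts * per_attempt


def format_duration(seconds: float) -> str:
    """Render a duration as whole minutes and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


if __name__ == "__main__":
    print(format_duration(seconds_for(400_000)))  # the large 400k count
    print(format_duration(seconds_for(45_000)))   # roughly the 50/50 point
```

With those numbers, 400,000 attempts come to 13 minutes 20 seconds, and 45,000 attempts come to 1 minute 30 seconds. An attacker who sends lookups in parallel would need less time still.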

Look out for those one in a million situations, they happen more often than you'd like.

Monday, July 21, 2008

Naming Documents

I thought we killed off Hungarian Notation because it doesn't work? Did I miss a memo somewhere? It seems that Hungarian Notation is making a comeback, this time in document names.

There's nothing like having your meta data encoded in the file name, with three letter acronyms for everything. It is impossible to decode, and becomes really fun when the team names change once a year!

"Say where's the SRS?" "Is it in SRS.doc?" "Nope" "How about esgEng_SIT_DR4_SRS.doc?" "Nope" "Oh, wait, Engineering was renamed last year for 6 months.... How about esg_PEN_SIT_DR3_SRS.doc?" "Ah, that's got it."

Encoding meta data in the filename is stupid. How about putting it in the file in the meta data portion where it belongs, and then using a search tool to search it? Or, maybe a directory structure that represents your tag tree.


Then, when you need to change the meta data, you rename the tree:


But Jason, I need to know who produced the document without opening it!

Use the checksum, and search/store that. Anything else can be mistakenly altered or lost. The file's checksum is a much more reliable descriptor of the file than the filename! I have learned from experience to never, ever trust a filename. They lie.
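As a minimal sketch of what "search/store the checksum" could look like, here is some Python that walks a directory and indexes every file by its digest. SHA-256 is my choice for the example, not something any particular repository mandates, and `file_checksum` and `build_index` are made-up names.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root: Path) -> dict[str, list[Path]]:
    """Map each checksum to every path under `root` with that content."""
    index: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            index.setdefault(file_checksum(path), []).append(path)
    return index


if __name__ == "__main__":
    # Print each digest alongside the names it currently goes by.
    for checksum, paths in build_index(Path(".")).items():
        print(checksum[:12], [str(p) for p in paths])
```

Two copies of the same document end up under one key no matter what they have been renamed to, which is the whole point: the name can change every year, the digest only changes when the content does.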

Or better yet, you buy a document repository off the shelf for 100k, and shove the problem at them. Personally, I just use the Google Search Appliance. Now that they've got it looking at the correct bits of data, it's very useful. Much better than any of the searches built into the various corporate portals.

Friday, July 11, 2008

Getting Starcraft working on OSX

I've started playing Starcraft again after ages away, but I found that it wouldn't run on my brand spanking new MacBook Pro!

The new versions of OSX have an updated video driver which does not support 256 colour modes. This means that Starcraft, Diablo II, or basically any older game will no longer work. Not good when the applications were still sold by Apple as OSX compatible games (they've since been removed).

I have now figured out a simple (albeit more expensive - US$79.99 + Windows license) way to solve the problem. I decided to run the game under VMWare Fusion.

However, there were two problems with this. First, when running in full screen mode, the display doesn't stretch to fill the screen. So on my laptop, I end up with a postage stamp display right in the middle of the screen. Second, the mouse is not restricted to that little postage stamp; it moves freely all the way around the display. To scroll the display in the game, you move the mouse to the edge. Since the mouse leaves the bounds of the game instead of stopping there, the game doesn't scroll very well.

The first was fixed with a quick google.

In the file "Preferences/VMWare Fusion/preferences" add

    pref.autoFitFullScreen = "fitHostToGuest"

This will stretch the display. If you have a widescreen display, you will still have black bars on the left and the right, but we're making progress.

Now to mouse capture. There doesn't appear to be an option to prevent the mouse from being taken back by the host OS, so I took a more drastic approach. I uninstalled VMWare Tools. This requires a reboot of the VM, but when you are done, you will no longer be able to move the mouse outside of the VM!

To get back to the host OS, press CTRL-CMD. To re-install VMWare Tools, select "Install VMWare Tools" under "Virtual Machine".

You should now be able to play Diablo II and Starcraft in all their 256 colour beauty.

Now, if I can only control my nerves enough to work my trackball....

Thursday, July 10, 2008

Keeping Efficiency Gains

Air Canada has implemented self check-in at Pearson Airport in Toronto. For all flights, you no longer go to an attendant to obtain your boarding pass. Instead, you enter all the required data into a computer, and then check your bags.

This was supposedly done for efficiency reasons. However, most of the efficiencies in the system have been lost. Amazingly, when you get to the front of the second line (to check in your bags), the check in clerk does exactly the same amount of work as before. The first thing they do is verify all of the details that you provided the computer!

Let's go over that again. After having provided my information to a computer, I have to provide it a second time to a human who verifies the first data. The time taken for verification was exactly the same as it would have taken for them to key it in the first time. Instantly, you have a net loss of efficiency. Even worse, they only have a single set of scales for each pair of desks. That means that there is substantial dead time while you wait for the person at the desk next to you to finish weighing their bags. More lost efficiency.

I can see how it happened. Someone checked in to the wrong flight, or their bags went to the wrong place. Perhaps they got to US customs (you go through US customs in Canada when flying to the US), and didn't have the correct forms or all their data provided, and were sent back, maybe they even missed a flight.

So, the check-in clerks, who are also looking to protect their jobs, add in the task of checking customer provided data.

However, they don't check the data of people who don't have checked baggage. Those people go straight through to customs. This shows the stupidity of the additional check. Is Air Canada saying that people with checked luggage are more likely to enter incorrect data? I doubt it.

If you make efficiency gains in your organisation, make sure you protect them. Guard them jealously. If you don't, you will see them frittered away.