I've ended up using SimpleDB, which has no concept of joins but permits multiple values for attributes. So instead of 4 million rows, I've got 100,000 rows with around 40 values on each. Doing the same search takes a tenth of a second, plus, I get more information back and my algorithm is much more accurate.
Just registered a cool domain name, going to have a prototype within a week and beta by Xmas.
Edit: makes that a million records with 200 values on each. Doesn't seem to make any difference to response time. |