Hmm. I wasn't really thinking of it as anything more than a one-off. If it would be a useful feature generally then, aye, makes it worth it.
It'd still take ages to execute here and that is a problem.
As for complexity - yeah, it could be done as you say but... why bother? That offers no advantage over just making this forum read-only and starting again. Well, little advantage.
If it were to be done properly it would happen in a sort of rolling way. And that would be hard. By which I mean not having to run a script to do the archiving but mechanisms put in place whereby older posts (beyond an arbitrary threshhold) aren't involved in queries. But... I have no idea how that might be done.
But aye, if you're up for making it then that's cool by me, of course.
I'm just looking at the message SQL with my optimizing hat on, and there's quite a lot that could be done to reduce the load on the database there.
The SQL includes a join to the THREAD table from every post, which is redundant. The values from the THREAD table can be retrieved in a single SELECT before the message SQL is run.
There are four joins to the user table, one of which (APPROVED_USER) is only relevant if posts have to be approved by an administrator, and another (EDIT_USER) only if the post has been edited by someone other than the original author. The APPROVED_USER join could be added in conditionally, and the EDIT_USER could be done using singleton SELECTs after the main select is finished.
This one might be a stretch: there are two joins to the USER_PEER table for the relationship and the new peer nickname thing. It might make more sense to do a single select against the USER_PEER table for the logged-in user and cache the result in an array, then get the values from that. It kind of depends on how many peers people have, on average. I've got 23, mostly ignoring sigs. Some testing might be advisable for this one.
Done all that (the joins I mean). It's not in CVS yet, but it's live here. The messages_get function now only uses POST, USER_PEER (x2) and USER (x2). The join to the thread table now uses the previous call to thread_get() in messages.php and the other joins to the USER table have been moved to separate queries to be done only if the post needs approval or the edit message needs displaying.
I've been running a profiler against the code on my own machine (with a copy of the database from here) and noticeable bottle necks are the emoticons code which takes up to 500ms and most noticeably the threads_any_unread() function which is taking at least 1 second to complete(!), compared to the next longest running function being the RSS feed checker which takes 7ms. The threads_any_unread function is usually the cause for the load here. I've tried fiddling with it and adding indexes to the tables and rewriting the query to better use said indexes but it makes diddly squat. It even has a LIMIT clause on it now when it never used to but even that makes no difference, so if it's not the USER_THREAD table that's causing I'm stumped as to what it could be.
for ($i = 0; $i < $e_keys_size; $i++) { for ($j = 0; $j < $e_keys_size; $j++) { if ($i != $j) { if (($pos = strpos(strtolower($e_keys[$j]), strtolower($e_keys[$i]))) !== false) { $a = $e_keys[$j]; $b = $e_keys[$i]; $v = $emoticon[$a]; $a2 = urlencode($a); $a_f = preg_quote(substr($a, 0, $pos), "/"); $a_m = preg_quote(urlencode(substr($a, $pos, strlen($b))), "/"); $a_e = preg_quote(substr($a, $pos +strlen($b)), "/"); $pattern_array[] = "/". $a_f."<span class=[^>]+><span[^>]*>".$a_m."<\/span><\/span>".$a_e ."/"; $replace_array[] = "<span class=\"e_$v\" title=\"$a2\"><span class=\"e__\">$a2</span></span>"; } } } }
Yes it is the modified code.
Emoticons->Emoticons() now executes quicker than the convert function. Hooray.