Teh Lack of Speed

From: Woggy 3 Jul 2006 14:59
To: ALL 13 of 72

It's bloody terrible for me, from home and from work, so it isn't my connection (which is fast on other websites).

 

That's why I haven't posted much, I could make a cup of coffee waiting for threads to load.

 

EDIT: This just opened in another tab while I waited for this to post, 'Dictionary, Initialising'. So far it's been ages. Looks like it could be a spell checking problem?

 

http://www.tehforum.co.uk/forum/dictionary.php?webtag=DEFAULT&obj_id=t_content

EDITED: 3 Jul 2006 15:06 by WOGGY
From: Matt 3 Jul 2006 18:02
To: Woggy 14 of 72
It's not as simple as saying it's the dictionary causing the slowness.

But it is as simple as saying that this server can't cope.

The problem is that when a query is performed, the tables it needs are locked, so no data can be inserted or modified at the same time. If the server is under load, the query is going to take longer to perform due to lack of resources. And because of the nature of PHP and MySQL, if you get annoyed by the slowness and keep clicking links or refreshing pages trying to force the page to load, all you're doing is queuing up more queries for MySQL to execute and increasing the load on the server.
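
A toy model of that pile-up, in Python purely for illustration (it is not Beehive code). It assumes one locked query runs at a time in arrival order, a fixed cost per query, and that every refresh queues another full query. The function name and all the numbers are invented.

```python
def seconds_until_page_loads(query_seconds, queued_ahead, refreshes):
    """Toy model: one table-locking query runs at a time, in arrival order.

    query_seconds -- assumed cost of a single thread-list query
    queued_ahead  -- queries from other visitors already waiting
    refreshes     -- extra times this visitor hit reload while waiting

    Every reload queues another full query, and the browser only shows
    the response to the most recent request, so the visitor ends up
    waiting for all of their own earlier queries as well.
    """
    total_queries = queued_ahead + 1 + refreshes
    return total_queries * query_seconds


# Same server, same queue; the only difference is four impatient reloads.
patient = seconds_until_page_loads(2.0, queued_ahead=5, refreshes=0)
impatient = seconds_until_page_loads(2.0, queued_ahead=5, refreshes=4)
```

In this model the reloads only ever add to the wait, and they add to everyone else's wait too. A real server is messier, but the direction is the same.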

The thread list uses one of the most complex queries in Beehive so you can imagine what happens if you keep trying to refresh the thread list all the time.

It's not easy balancing the server performance between Apache and MySQL, mostly because I haven't a clue how to set up MySQL to best suit our needs, and searching for the right information reveals fuck all because it's all along the lines of "try this" and "do that" without actually telling you what it does or why you should change it.

So yeah. Like it or lump it or we a) prune the database drastically or b) get together some money again to upgrade the hosting. We should probably do that again soon anyway as we'll need to be paying soon I would have thought.
From: Dave!! 3 Jul 2006 18:15
To: Matt 15 of 72

Could you give me details about how much load Beehive normally places on the server, how much bandwidth it uses etc?

 

Only our old server at work is being largely replaced within the next few weeks and the old one will be still hooked up and in use, but doing very little. There's a possibility of being able to use it to host Teh as well. I'd have to look into it though. Incidentally, the machine is a 1.13GHz PIII with 512MB RAM and is currently running Server 2000.

 

Like I say, no promises or anything, but it's a possibility! :)

From: Matt 3 Jul 2006 19:01
To: Dave!! 16 of 72

Bandwidth is about 4GB a month with compression on.
Load averages about 1.2 from what I can tell from 'top'.

 

Thanks for the offer but I don't think that machine would cope very well. Maybe if you quadrupled the RAM and upped the processor speed to about 3GHz we'd stand a chance :)

From: Kriv 3 Jul 2006 19:11
To: Matt 17 of 72
Could there not be a teharchive database / forum set up? Dunno if it's easy or doable.
From: paul 3 Jul 2006 19:19
To: Kriv 18 of 72

I think I see what you're saying there, i.e. dump most of what's already 'archived' here to another place and reference it from here?

 

On the other hand, I would question whether we /really/ need to keep all the old posts?

 

Does anyone ever read them?

 

Are we just keeping them for the sake of keeping them?

 

Personally, I think pruning is the order of the day.

 

If someone cares enough to do a Kenny and dump everything into an 'attic' somewhere, all well and good, otherwise let it go...

 

:O)

From: Matt 3 Jul 2006 19:19
To: Kriv 19 of 72

It's a possibility.

 

Keeping it as a database, yet separate from the live forum, could potentially have the same impact on performance as leaving it as it is now, because we'd still have only the one MySQL server with a database it can't cope with. Unless the archive was moved to a different machine, which means paying for it and setting it all up separately, but making it read-only.

 

We could dump it to static HTML files and that has been talked about before but I would need help implementing that feature in Beehive and we'd have to have substantial downtime while the archive is created.
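
A minimal sketch of what a static dump could look like, written in Python rather than Beehive's PHP. It assumes the threads have already been read out of the database into a plain dict. The function names, the one-file-per-thread layout and the page markup are all made up for illustration.

```python
import html
from pathlib import Path


def render_thread(title, posts):
    """Return one thread as a standalone HTML page.

    posts is a list of (author, body) tuples; everything is escaped so
    old post content cannot inject markup into the archive page.
    """
    items = "\n".join(
        f"<li><b>{html.escape(author)}</b>: {html.escape(body)}</li>"
        for author, body in posts
    )
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body><h1>{safe_title}</h1>\n<ol>\n{items}\n</ol></body></html>\n"
    )


def dump_archive(threads, out_dir):
    """Write each thread to out_dir/<thread id>.html and return the paths.

    threads maps a thread id to a (title, posts) pair. In a real dump
    this would come from the forum database; here it is a plain dict.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for thread_id, (title, posts) in sorted(threads.items()):
        path = out / f"{thread_id}.html"
        path.write_text(render_thread(title, posts), encoding="utf-8")
        written.append(path)
    return written
```

Once the files exist, Apache can serve them without touching MySQL at all. The expensive part is reading every thread out of the database once, which is where the downtime would come from.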

From: Matt 3 Jul 2006 19:28
To: paul 20 of 72

I'd have a problem with dumping it.

 

There is a massive amount of knowledge and wealth here. There are countless solutions to problems that people have had and some very interesting discussions that everyone should read at least once. There is no doubt also a lot of crap, but it would be a shame to lose all the good stuff just for the sake of keeping everything ship shape.

 

Recently we've been getting quite a few hits from search engines for people looking for solutions to their problems which is nice to see. Although I'm not sure how they managed to find us, because I've tried searching for things I know are here and we never show up, but they're definitely finding us. I'd like to think the posts they find are useful as well.

From: milko 3 Jul 2006 20:01
To: Matt 21 of 72
There's a reasonable amount of funds in the Amazon thingy, although not much has been contributed that way of late, lack of publicity and stuff.
From: paul 3 Jul 2006 20:56
To: Matt 22 of 72

I totally understand where you're coming from, but also look at all the shite I've got stashed in my loft/shed/garage!

 

Why not look at getting a static HTML archive site for it all then? Would that be a suitable compromise?

 

:O)

From: Kriv 3 Jul 2006 20:59
To: Matt 23 of 72

HTML would stop old posts being archived, therefore stopping the need for intense queries... I'd imagine.

 

While I couldn't help.. I can donate my PC and bandwidth for conversion purposes if needed in some way.

From: Drew (X3N0PH0N) 4 Jul 2006 05:05
To: Matt 24 of 72

Whenever this happens, I'm all for archiving it.

 

I know you've explained this, but I think I didn't understand the explanation. What's the problem (in terms of performance) in making this forum read-only and then starting afresh?

 

I mean, I realise the archived forum would still have the same performance issues but surely it would be used so little as to not matter? It would be used as much as people use the search function currently, at most. Which for me, is very very seldom. Regardless, the overall load should be much lower shouldn't it?

 

If for some reason that's not practical, how about dumping this database, getting someone to host this forum read-only somewhere and, again, starting afresh here? The archive might fall over a bit but... better that than this place suffering. The slowness is not the fault of the software or whatever, just the size of this place. But if it's keeping people from posting then we need to make sacrifices, I think.

From: Manthorp 4 Jul 2006 08:00
To: ALL 25 of 72
Incidentally, and I assume this serves to illustrate Matt's analysis, it loads better for me - or rather, less slowly - in the ungodly hours of the morning.
From: Dave!! 4 Jul 2006 09:09
To: Manthorp 26 of 72
It's faster for me in the mornings as well, but with only 6 members online compared with some afternoons and evenings when there's 20-30 online and there you go.
From: Peter (BOUGHTONP) 4 Jul 2006 20:41
To: Matt 27 of 72
What about using bind variables?

Bah, looks like it would mean restricting Beehive to PHP5+MySQL 4.1 (or later),
but it should give a significant boost to query performance; maybe it could be added as a branch/mod?
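
For anyone who hasn't met bind variables, a minimal sketch in Python. The built-in sqlite3 module stands in for MySQL only because it needs no server, and the `post` table and its columns are invented for the example. The idea is that the SQL text stays fixed and the values are handed over separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (tid INTEGER, author TEXT, body TEXT)")

# The statement text is fixed; only the bound values change between calls.
INSERT_POST = "INSERT INTO post (tid, author, body) VALUES (?, ?, ?)"
POSTS_BY_AUTHOR = "SELECT tid, body FROM post WHERE author = ? ORDER BY tid"

conn.executemany(
    INSERT_POST,
    [(1, "Woggy", "slow"), (2, "Matt", "locks"), (3, "O'Brien", "quote test")],
)


def posts_by(author):
    """Run the same statement with a different bound value each time."""
    return conn.execute(POSTS_BY_AUTHOR, (author,)).fetchall()
```

Because the values never get pasted into the SQL string, the apostrophe in "O'Brien" needs no escaping. How much speed it would actually buy Beehive on this server is something that would need measuring.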



Also (this may already be done, but I couldn't see anything on a quick glance), what about having shortcut thingies - um, by which I mean something similar to this:
Pseudo-PHP code:
if ($looking_for_unread_messages && $_SERVER['LAST_BH_MESSAGE'] > $_SESSION['USER_LAST_THREADLIST'])
{
    // something has been posted since this user last loaded the thread list:
    // perform query as normal
}
else
{
    // nothing new: do cut-down query or cached query or whatever
}

ie: Store a server-wide variable of the last message post time, and use it to determine if there's any point even running a [complete] query.
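
A slightly fuller sketch of the same shortcut, in Python rather than PHP so it can be run on its own. It assumes the full thread-list query can be passed in as a function and that timestamps are plain numbers. The class and method names are hypothetical, not anything in Beehive.

```python
class ThreadListCache:
    """Skip the expensive thread-list query when nothing new has been posted."""

    def __init__(self, run_full_query):
        self._run_full_query = run_full_query  # the slow query, passed in
        self.last_post_time = 0   # server-wide: bumped on every new post
        self._cached = {}         # user id -> (time cached, thread list)

    def record_post(self, now):
        """Call whenever any message is posted."""
        self.last_post_time = now

    def thread_list(self, user_id, now):
        cached = self._cached.get(user_id)
        # Reuse the cached list only if it was built strictly after the
        # most recent post; otherwise something new may be missing from it.
        if cached is not None and cached[0] > self.last_post_time:
            return cached[1]
        result = self._run_full_query(user_id)
        self._cached[user_id] = (now, result)
        return result
```

This only tracks new posts. Anything else that changes what a user should see, such as threads being marked as read, would need to invalidate the cached list as well.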
From: steve 4 Jul 2006 23:29
To: Peter (BOUGHTONP) 28 of 72
I thought about cached stuff to cut down little queries. Although I guess there's not many like that on here. On my bh101 forum all the user details are cached, so if they are needed in a later post on the page they are there ^_^
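
Roughly what that per-page caching amounts to, sketched in Python. The `fetch_user` function standing in for the real user query is hypothetical, as is the class name.

```python
class UserDetailsCache:
    """Remember user rows for one page so each user is fetched only once."""

    def __init__(self, fetch_user):
        self._fetch_user = fetch_user  # hypothetical function doing the query
        self._seen = {}                # uid -> details already fetched

    def get(self, uid):
        """Return the user's details, querying only on the first request."""
        if uid not in self._seen:
            self._seen[uid] = self._fetch_user(uid)
        return self._seen[uid]
```

A thread page where the same few people post forty times would then cost a handful of user lookups instead of forty.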
From: milko 4 Jul 2006 23:39
To: ALL 29 of 72

Is there any way of compromising somehow? This feels like wishful thinking but I do not know. Anyway - somehow semiarchiving old stuff so it isn't actually affecting anything except on the rare occasion someone is going through old threads. Hmm, no, this doesn't sound likely. Hm.

 

having it all separate and readonly feels wrong somehow.

From: Peter (BOUGHTONP) 5 Jul 2006 10:41
To: milko 30 of 72
We could convert old stuff to static HTML files, and then use a search engine to index both the database and files together.

Lucene is an Apache open-source one of those, which can be integrated with PHP via Zend, although again it appears to be a PHP5 solution.
From: ian 5 Jul 2006 11:06
To: ALL 31 of 72

I'm not entirely sure if I know what I'm talking about but would it not be possible to periodically move posts/threads etc. that are older than a certain amount of time (say a year) to a separate table (or tables) which are then only ever searched when absolutely necessary?

 

So regular thread list 'stuff' would only look at the 'current' table(s) whereas the Search might look at both the 'current' and the 'archive' table(s) using whatever SQL magic that requires.

 

Any time a post is retrieved it would compare its number to the 'last-archived-thread' and based on that decide which table(s) to get it from. For the most part it would only require recent posts so only ever do stuff with the leaner 'current' table(s), thereby making everything faster, finding a cure for AIDS and achieving world peace.
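
A rough sketch of the idea in Python, using the built-in sqlite3 module in place of MySQL. It assumes post numbers only ever go up with time, and it archives by post rather than by thread. The table names, columns and function names are invented and are not Beehive's schema.

```python
import sqlite3

SCHEMA = "(pid INTEGER PRIMARY KEY, tid INTEGER, created INTEGER, body TEXT)"


def make_db():
    """Create an in-memory database with a 'current' and an 'archive' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE post_current {SCHEMA}")
    conn.execute(f"CREATE TABLE post_archive {SCHEMA}")
    return conn


def archive_older_than(conn, cutoff):
    """Move posts created before cutoff into the archive table.

    Returns the highest archived post id (0 if nothing has been archived),
    playing the role of the 'last-archived' marker.
    """
    with conn:  # one transaction: copy, then delete
        conn.execute(
            "INSERT INTO post_archive SELECT * FROM post_current WHERE created < ?",
            (cutoff,),
        )
        conn.execute("DELETE FROM post_current WHERE created < ?", (cutoff,))
    row = conn.execute("SELECT COALESCE(MAX(pid), 0) FROM post_archive").fetchone()
    return row[0]


def get_post(conn, pid, last_archived):
    """Fetch one post, choosing the table from its number."""
    table = "post_archive" if pid <= last_archived else "post_current"
    return conn.execute(
        f"SELECT pid, body FROM {table} WHERE pid = ?", (pid,)
    ).fetchone()


def search(conn, word):
    """Search looks at both tables; everyday reads would not."""
    pattern = f"%{word}%"
    return conn.execute(
        "SELECT pid FROM post_current WHERE body LIKE ? "
        "UNION ALL SELECT pid FROM post_archive WHERE body LIKE ? ORDER BY pid",
        (pattern, pattern),
    ).fetchall()
```

Splitting by post means a reply to a very old thread would leave that thread spread across both tables, so displaying it would have to read from both.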

 

Does that make sense? Am I repeating people? Can I go home yet?!

 

Hmm. But then posting in a thread from 3 years ago might cause problems, or would it? I don't know.

From: Drew (X3N0PH0N) 5 Jul 2006 11:28
To: ian 32 of 72

I think the main problem with that is it would be hard as fuck to make. And would pretty much be a one-off. So no one will make it.

 

And also the script which does the splitting would have to be run as a cron job or something like that. And would be, I reckon, slow as fuck.

 

(That's not to say it's a bad idea, it would be the ideal solution, just I don't see it happening)

 

I think we have three choices:

 

1. something like you suggested
2. carry on as we are
3. start again and archive the rest somewhere

 

I think 1 won't happen and 2 shouldn't happen so I like 3.