Wobbly computer

From: Chris (CHRISSS) 11 Oct 2011 17:34
To: Drew (X3N0PH0N) 20 of 64

Very strange that was, and the only time it's done anything like that. Had no problem in games when it's been using the graphics card, only when it's been sitting on the desktop, but it could be the graphics card.

 

Can the i5 be used with a single stick of RAM? I could try that first, swapping between the two, and then replace the graphics card with my old ATI one, which I assume would just work with the current AMD drivers installed.

From: Drew (X3N0PH0N) 11 Oct 2011 17:47
To: Chris (CHRISSS) 21 of 64
Yeah should work fine with one stick (as far as I know).
From: Ixion 12 Oct 2011 18:15
To: Chris (CHRISSS) 22 of 64

The IRQL_NOT_LESS_OR_EQUAL error is normally graphics card driver related. Start by installing the most up-to-date drivers; if you still get the same problem, try the driver disk that came with the card and wait until the current latest drivers are updated before trying them again!

Also, check the timings on the RAM are set correctly. I have 1333MHz Corsair XMS3 in my big i7 box and that defaults to the wrong timings, so I have to set them manually to get a nice stable configuration. I think from memory they were 9-9-9-24-2T.
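
To put timings like 9-9-9-24 in perspective, the first number (CAS latency, in clock cycles) can be converted to nanoseconds. This is only a rough sketch: it assumes the "1333MHz" figure is the DDR3 data rate in MT/s, so the actual clock runs at half that, and the function name is made up for illustration.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """Convert a CAS latency in clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the clock period in ns is
    2000 / data_rate (data rate in MT/s, the figure usually sold as "MHz").
    """
    clock_period_ns = 2000.0 / data_rate_mts
    return cl_cycles * clock_period_ns


# The CL9 setting mentioned above, at a 1333 MT/s data rate.
print(round(cas_latency_ns(9, 1333.33), 2))
# An assumed tighter setting (CL7 at the same speed), purely for comparison.
print(round(cas_latency_ns(7, 1333.33), 2))
```

If the BIOS auto-detects a lower CL than the modules are rated for, the RAM is being asked to respond faster than it was specified to, which is one way wrong defaults can lead to instability.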

From: Chris (CHRISSS) 16 Oct 2011 21:36
To: Ixion 23 of 64

I haven't had much of a chance to look at things yet, been away to Tenby last weekend and up to London the rest of the week til yesterday. I had hoped I'd come home to a lovely stable computer but had a BSOD while booting, dxgkrnl.sys. Couldn't get it to boot for more than a minute or so, with various BSODs: usbstor.sys, and others I can't remember.

 

I haven't changed anything and it's gone from crashing while booting/every minute to being stable since this afternoon so I have no idea what's going on. The next time it crashes I'll have a fiddle with the RAM/graphics to see if I can find the problem.

From: Serg (NUKKLEAR) 17 Oct 2011 06:57
To: Chris (CHRISSS) 24 of 64
RAM failures can be very difficult to spot, sometimes only repeated Memtest passes might spot 'em. Have you done a long Memtest yet?
From: Chris (CHRISSS) 17 Oct 2011 07:46
To: Serg (NUKKLEAR) 25 of 64
Depends what you call a long test. I did leave it for about 12 hours the first time. Not done one since though.
From: Serg (NUKKLEAR) 17 Oct 2011 09:06
To: Chris (CHRISSS) 26 of 64

Hm ok, not likely then... Try a few of these stress tests, something must give eventually:
http://www.tested.com/news/how-to-stress-test-your-hardware-and-keep-your-pc-stable/762/

 

edit: from what you're saying, I would (for now) blame the power supply or the graphics card. But heck, it could be anything...

EDITED: 17 Oct 2011 09:07 by NUKKLEAR
From: Chris (CHRISSS) 17 Oct 2011 21:16
To: Serg (NUKKLEAR) 27 of 64

Could be the power supply, especially as I made my own cables to get the graphics and P8 connector going from spare molex connectors. Dunno. It's been fine since yesterday afternoon now so sometimes it's rock solid then sometimes it won't boot at all.

 

Actually one thing I did change was moving the main ATX power cable away from the RAM wotsits but I don't know if I did that before or after it started running well yesterday. Could have been causing some interference touching the chips?

From: CHYRON (DSMITHHFX) 18 Oct 2011 10:58
To: Serg (NUKKLEAR) 28 of 64
Could be lots of things, down to dead/dying transistors or even physical damage (hairline cracks, static damage) in the pcb traces.
From: Chris (CHRISSS) 18 Oct 2011 23:06
To: ALL 29 of 64

The computer's been running souper dooper since Sunday afternoon now, not a single crash since, although it has been sleeping some of that time. I think the crashes have buggered up a couple of things slightly though.

 

I'm trying the Turbo memory setting now to see if it really is running properly as that crashed it when I first installed everything.

From: Ixion 19 Oct 2011 07:18
To: Chris (CHRISSS) 30 of 64

If you're worried it's buggered up your OS files slightly try typing the following from the command line (I think you need to be admin)

sfc /scannow

Which will check the integrity of any of your system files and if they're found to be corrupted replace them with shiny new ones!
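
When sfc finds or fails to repair something, the details end up in the CBS log (normally %windir%\Logs\CBS\CBS.log), with System File Checker's own entries tagged [SR]. The snippet below is a minimal sketch of pulling just those entries out; the function name is hypothetical and the sample lines are made up for illustration, not taken from a real log.

```python
def sfc_entries(cbs_log_text: str) -> list[str]:
    """Return only the lines that System File Checker wrote (tagged [SR])."""
    return [line for line in cbs_log_text.splitlines() if "[SR]" in line]


# Made-up sample lines for illustration only; a real CBS.log is far longer.
sample = """\
2011-10-19 20:31:02, Info  CBS  Starting TrustedInstaller initialization.
2011-10-19 20:35:10, Info  CSI  00000012 [SR] Verifying 100 components
2011-10-19 20:35:14, Info  CSI  00000013 [SR] Cannot repair member file example.dll
"""

for entry in sfc_entries(sample):
    print(entry)
```

On a real machine the text would come from reading the log file itself (opened with errors="replace", since the log can contain odd characters) rather than from a sample string.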

From: Chris (CHRISSS) 19 Oct 2011 20:39
To: Ixion 31 of 64

I tried that and it got to 40% then said "Windows Resource Protection could not perform the requested operation."

 

The only problem I seem to be having now is with Windows Live Mail. I might just try reinstalling that to see if it fixes it. The turbo memory setting seems to be working fine now too, the computer's been on since yesterday torrenting with no problems :D Hopefully everything's working properly <crosses fingers>

From: Chris (CHRISSS) 21 Oct 2011 21:41
To: ALL 32 of 64

Problems once again. All was fine til I started to convert my CR2s to JPGs to use with the screensaver on my HTPC. I set Irfanview to some batch conversion and things started going wonky again.

 

Took a while the first time, but on subsequent attempts the computer BSODed pretty quickly after starting the conversion. I had a PFN_LIST_CORRUPT and something_MEMORY_CORRUPT. I've now taken one of the DIMMs out to see if things are stable with just the one. If it is, I'll swap over the other one and hope it crashes so I can be reasonably certain it's the RAM.

From: Chris (CHRISSS) 29 Oct 2011 09:40
To: ALL 33 of 64

I did eventually get around to taking one of my DIMMs out and the computer was running perfectly with just the 2GB, even if it seemed ridiculously slow after using it with 4GB. It had been running for days with no crashing so I swapped the two over last night.

 

Everything was fine, watched some stuff from it on the HTPC, played Trackmania for a bit, got worried that nothing was crashing. Just tried to open Computer Management and it BSODed straight away so I'm sure it's a faulty DIMM.

From: ComtronBob 30 Oct 2011 08:36
To: Chris (CHRISSS) 34 of 64

"Just tried to open Computer Management and it BSODed straight away so I'm sure it's a faulty DIMM".

Don't be so sure.  Opening Computer Management doesn't invoke anything special.  Without some type of trap-and-trace diagnostic running it doesn't really tell you anything.  It could simply have been that one additional thread or process overflowed some register, or pushed the CPU to a slightly higher temperature, where the real instability is hiding.

I realize I'm somewhat late to the party.  (I've only recently registered to Teh Forum).  But let me make a few suggestions/observations that may come in handy.

First, it helps to know the EXACT make and model of your MoBo.  Does your user profile still accurately list the hardware in question, such as a MoBo in the Asus P5B series?  Is that just a "P5B" (no suffix), or is it something more like "P5BP-E_4L", "P5B Deluxe", "P5B Premium", or similar suffix?

Knowing the full model, socket type, and any revision helps nail down problems that are sometimes unique to that MoBo subset.  (Note that Asus classes MoBos by socket type).  For instance, most (but not all) P5B-series MoBos are socket-775 (LGA775).

Knowing the EXACT CPU model also helps.

As does knowing the graphics card and/or any on-MoBo graphics chipset, which you presently list as the ATI X1950XT.

If you're not sure what you have under the hood, the free CPU-Z utility can answer many questions about both the CPU and memory.  (See the SPD tab, which should also tell you the proper, non-overclocked memory voltage).

The free GPU-Z will give you similar info about your graphics card.

With regard to memory, the first thing I always do is clean the edge-connector fingers with 91% Isopropyl alcohol on a lintless cloth.  This clears up most problems for memory sticks that had been working, but seem to develop problems at some later time.  (For that matter, I would similarly clean all card edge connectors, especially the graphics card, before proceeding).

As to Memtest86, while I like it in general, it doesn't always catch some odd problems.  Quoting myself from this PCmag forum post:

Further, off-brand memory module/stick vendors sometimes use a trick, that while yielding a marginally functional product, will often fail to work whatsoever in a high-performance motherboard: They will buy large surplus lots of specialty RAM chips, not originally intended for use in PCs, and use a cheap FPGA to re-map and/or otherwise emulate a standard RAM product.  Unfortunately, this can skew access timing in a way that is not easily detected.  Interestingly, the old MS memory diagnostic will usually flag this type of module as defective, even when other diagnostics won't.  See this old post of mine for a few details, and the download and users' guide links. 

So you may want to get a second opinion from the MS diagnostic.

This may seem obvious, but have you tried substituting a different power supply?  Or have you measured the various supply voltages, right at the MoBo connectors, with a VOM?
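
If you do take meter readings, they can be compared against the usual ATX tolerances.  A rough sketch follows, assuming the commonly quoted figures of +/-5% on the positive rails and +/-10% on -12V; the readings shown are hypothetical, not measurements from the machine in this thread.

```python
# Nominal rail voltages with the tolerances usually quoted for ATX supplies:
# +/-5% on the positive rails, +/-10% on -12V.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def rail_in_spec(rail: str, measured: float) -> bool:
    """Return True if a measured voltage is within tolerance for that rail."""
    nominal, tolerance = ATX_RAILS[rail]
    return abs(measured - nominal) <= abs(nominal) * tolerance


# Hypothetical meter readings, for illustration only.
readings = {"+3.3V": 3.28, "+5V": 4.97, "+12V": 11.30}
for rail, volts in readings.items():
    print(rail, volts, "OK" if rail_in_spec(rail, volts) else "OUT OF SPEC")
```

Readings taken at idle can look fine; the interesting ones are those taken while the machine is under load, since that is when a weak rail sags.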

Continued next post...

EDITED: 30 Oct 2011 08:54 by COMTRONBOB
From: ComtronBob 30 Oct 2011 08:41
To: Chris (CHRISSS) 35 of 64

As "View full message" doesn't appear to be working, post continued...

As supplies age, and the hold-up capacitors become weak, certain critical supply rails may sag under load.  GPUs have a considerable demand for surge current just as soon as they are called upon for any complex rendering.

As to BSODs specifically (revised in Win8 to the FOD — Frown Of Death — see below graphic), while hardware can certainly be at fault, better than 85% of the time it turns out to be a driver issue.  Rather than go through much additional trial-and-error, I'd personally prefer to test, not guess.  To that end, a free diagnostic, called WhoCrashed, may be able to answer the question as to the precise culprit.  Note that on first use WhoCrashed will download either the 32 or 64 bit Debugging Tools for Windows (WinDbg) Package from MS, which it uses to collect and extract data for analysis.

I assume you already looked into Win32K.sys, and got nowhere?  Sometimes the driver indicated on the BSOD is just where the process hung and not the root cause.  If nothing else, take a look at the STOP message troubleshooting list.

Two of the more common drivers known to be problematic, often causing a blue screen, are TCPIP.sys and/or Intel's netw5v32.sys, which is part of many WiFi driver packages.

There is also the possibility of HAL errors.  The HAL sometimes needs rebuilding after replacing certain critical hardware, like the MoBo, CPU, GPU/graphics card, or upgrading certain drivers, like those for the GPU, network card/on-board network chipset, or the MoBo itself.  Rebuilding the HAL requires, at minimum, a repair reinstall, sometimes called a "refresh" reinstall, which should retain most of your settings.

Two questions:

1. Does the machine behave in Safe Mode?  Safe Mode loads only the compatibility drivers, which doesn't run the video in a stressful fashion.  So that can significantly narrow things down.

2. Have you tried running with a "live" Linux CD?  If it behaves running a live Linux distro, that pretty much rules out hardware.  (You might give either Knoppix, or the KDE-based BackTrack a try).


MS "feels your pain" with new Win8 Frown Of Death.
...And so should Nvidia (feel our pain ;-)

One of the more typical hardware-related problems that can cause a blue screen error is overheating of the CPU, GPU, memory sticks, and sometimes even the Northbridge.  For this reason, do you have some type of temp' monitor installed?  Something like the free HWMonitor?  (It's from the same folks as CPU-Z).  Click on the "Version History" link for the available downloads.

If you had an Nvidia GPU, I'd point you to this Inquirer article, detailing the chip substrate overheating problem they had, as things typically looked just like your video, prior to crashing altogether.  There are still a lot of those bad GPUs floating around.

I think that about covers it for the moment. <grin>

EDITED: 30 Oct 2011 08:51 by COMTRONBOB
From: Chris (CHRISSS) 30 Oct 2011 10:47
To: ComtronBob 36 of 64

That's a very thorough and helpful post, thanks. The information in my profile hadn't been updated but is now.

 

It's a (mostly) new system with a fresh install of 7 which started crashing soon after it was built. I've been running the computer with only one of the 2GB RAM modules and it's perfectly stable. As soon as I swapped them over it started crashing again. I had 2 BSODs within 5 minutes (one being the memory_corrupt one), swapped them back and stable again.
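
The reasoning behind the one-stick-at-a-time test can be written down as a small decision table. This is only a sketch of the general troubleshooting logic, assuming each stick is tried on its own in the same slot; the labels "A" and "B" and the function name are placeholders.

```python
def interpret_swap_test(a_alone_stable: bool, b_alone_stable: bool) -> str:
    """Interpret the result of running each of two DIMMs on its own."""
    if a_alone_stable and b_alone_stable:
        # Neither stick fails alone, so the pairing itself is suspect.
        return "both sticks fine alone: suspect slot, dual-channel settings or timings"
    if a_alone_stable != b_alone_stable:
        bad = "B" if a_alone_stable else "A"
        return f"stick {bad} is the likely culprit"
    # Crashes regardless of which stick is fitted.
    return "crashes with either stick: look beyond the RAM (PSU, board, drivers)"


# The situation described above: stable on one stick, crashing on the other.
print(interpret_swap_test(a_alone_stable=True, b_alone_stable=False))
```

Swapping back and seeing stability return, as happened here, is what makes the result convincing rather than a one-off coincidence.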

 

I assume it's the Corsair Value Select RAM I accidentally bought instead of their XMS ones so when I send it back I'll pay the extra £2 for better stuff.

From: ComtronBob 30 Oct 2011 12:17
To: Chris (CHRISSS) 37 of 64

"That's a very thorough and helpful post, thanks".

Glad you found it useful. :-)

"The information in my profile hadn't been updated but is now".

Indeed, I see you've made a considerable step up, going to the Gigabyte GA-Z68XP-UD3 and AMD 6870.

I've not kept up on the details of AMD cards, as you can probably tell. <g>  And I'm still not sufficiently awake to plow through the comparison tables.  But, presuming they exist and are accessible, you may be able to take advantage of unlocking any extra shaders, similar to the way it can be done for the AMD 6950 cards.

This blurb on the 10.12a Hotfix Drivers for the 6950 and 6970 cards may, or may not, also apply to you.  But you should probably have a look-see, just in case.

"As soon as I swapped them over it started crashing again.  I had 2 BSODs within 5 minutes (one being the memory_corrupt one), swapped them back and stable again".

Ah!  That wasn't quite clear from your post #33.

"...when I send it back I'll pay the extra £2 for better stuff".

Sounds like a plan.

Good luck!

From: Chris (CHRISSS) 30 Oct 2011 12:54
To: ComtronBob 38 of 64

Yes indeed, it's a big difference from the old system which wasn't too high spec when it was bought 5 years ago. Having 4GB of RAM certainly makes everything seem smoother, but running with 2GB at the mo isn't nice.

 

The RAM passed all Memtest86+ tests running for 12 hours or so but as soon as Windows was booted things started to become quite unstable. Good thing I have two modules so I can test each separately.

 

Seems the 6870 can only be overclocked, no unlocking of anything to give it special powers. I probably should try doing something with the CPU as that's the main extra the K processors give.

From: ComtronBob 30 Oct 2011 23:35
To: Chris (CHRISSS) 39 of 64

"The RAM passed all Memtest86+ tests running for 12 hours or so but as soon as Windows was booted things started to become quite unstable".

Even though you're likely not saddled with one or more of those "trick" FPGA'd modules, if you're so inclined, you may still care to give the old MS WinDiag memory diagnostic a try.  In spite of it not being as fancy as Memtest86+, it often flags modules as bad that sail right past Memtest86+ as being A-OK.

If you do decide to play around with WinDiag, note that if you hit *P* (for pause), a new option will appear at the top of the screen: The *M* option (for menu).  You can then hit M -> 2 (advanced) -> 1 (change cache settings) -> 2 (turn off caching for all tests).  This allows you to isolate any CPU cache related issues, which are usually non-obvious.

There's also a "change the test suite" option that will let you perform some additional/more rigorous tests.

One other thing I like to do while running memory diagnostics is take a hair dryer and deliberately heat up the DIMMs under test.  (Though you have to be careful to not overheat them to excess).  As they heat up, the CL-timings tend to shift, sometimes to an out-of-bounds value.  This will usually show up on-screen for several test loops before you crash. <grin>  This approach is far more likely to find a marginal DIMM than running a diagnostic by itself for days at a time.

Have fun!