Bug 1771 - [Intel/i865G] on P4P800-VM I830WaitLpRing() lockup
Status: CLOSED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 6.8.1
Hardware: x86 (IA32) Linux (All)
Importance: high blocker
Assignee: Alan Hourihane
Duplicates: 2136
Reported: 2004-11-03 00:03 UTC by Andrej Prsa
Modified: 2005-08-14 15:46 UTC (History)
CC List: 9 users

Attachments
X.org 6.8 log (64.74 KB, text/plain)
2004-11-03 00:05 UTC, Andrej Prsa
Xorg CVS 2004-11-04 version crash log (47.94 KB, text/plain)
2004-11-04 16:19 UTC, Andrej Prsa
X.org 6.8.1 config file (3.00 KB, text/plain)
2004-11-09 05:11 UTC, Andrej Prsa
Output of "lspci; lspci -v; lspci -vv; lspci -vvv" (18.61 KB, text/plain)
2004-11-25 08:32 UTC, Andrej Prsa
Xorg log with the brand new ASUS P4P800-VM MB (48.05 KB, text/plain)
2004-12-10 16:22 UTC, Andrej Prsa
xorg.conf with Option "CacheLines" "512" (11.09 KB, text/plain)
2005-01-29 16:15 UTC, Andrej Prsa
Xorg log with Option "CacheLines" "512" (50.16 KB, text/plain)
2005-01-29 16:16 UTC, Andrej Prsa
Keenan's log file (61.27 KB, text/plain)
2005-02-03 13:15 UTC, Keenan Pepper
Phil's xorg.conf (2.80 KB, text/plain)
2005-02-09 15:10 UTC, Phil Edelbrock
Output of lspci -vvn and DMA on (5.99 KB, text/plain)
2005-03-28 03:46 UTC, Andrej Prsa
Output of lspci -vvn and DMA off (5.81 KB, text/plain)
2005-03-28 03:47 UTC, Andrej Prsa
Gilboa's X.org (54.18 KB, text/plain)
2005-05-02 01:28 UTC, Gilboa Davara
Gilboa's lspci (8.38 KB, text/plain)
2005-05-02 01:30 UTC, Gilboa Davara
Gilboa's dmesg log (402 bytes, text/plain)
2005-05-02 01:32 UTC, Gilboa Davara
lspci.optiplex (1.74 KB, text/plain)
2005-08-15 08:41 UTC, Jon Oberheide
Xorg.0.log.optiplex (59.36 KB, text/plain)
2005-08-15 08:41 UTC, Jon Oberheide
dmesg.optiplex (26.21 KB, text/plain)
2005-08-15 08:41 UTC, Jon Oberheide
kernel.config.optiplex (28.55 KB, text/plain)
2005-08-15 08:42 UTC, Jon Oberheide
xorg.conf.optiplex (1.75 KB, text/plain)
2005-08-15 08:42 UTC, Jon Oberheide


Description Andrej Prsa 2004-11-03 00:03:13 UTC
The problem manifests on the integrated Intel 865G (Extreme Graphics 
2) card on an ASUS P4P800-VM motherboard. It has occurred with every recent X 
version, XFree86 4.1 through 4.4 and X.org 6.7, 6.8.0 and 6.8.1. It is related 
to the i810 module, which causes an X server lockup with the message (for details 
see the attached Xorg.0.log):

Error in I830WaitLpRing(), now is 39399141, start is 39397140

This error occurs frequently (e.g. after 20 minutes of active use), but it isn't 
strictly reproducible. If DMA is on, the lockups occur more frequently. I tried 
using 865patch to no avail. gdm tries to restart X, but the only way to revive 
it is to reboot or to ssh in from another computer and change the video driver 
to vesa in xorg.conf.
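
The two counters in the error message can be compared directly. The sketch below parses the line quoted above and prints the difference. That the values are millisecond timestamps, and that the driver gives up after roughly two seconds of waiting on the ring, are assumptions inferred from the numbers; the report itself does not state either.

```python
import re

# Error line copied from the report above.
LINE = "Error in I830WaitLpRing(), now is 39399141, start is 39397140"

def ring_wait_elapsed(line):
    """Return (now - start) from an I830WaitLpRing() error line, or None if it does not match."""
    m = re.search(r"now is (\d+), start is (\d+)", line)
    if m is None:
        return None
    now, start = int(m.group(1)), int(m.group(2))
    return now - start

print(ring_wait_elapsed(LINE))  # 2001
```

Running the same function over other logs would show whether the gap is always just over 2000, which would point at a fixed timeout rather than a random failure point.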
Comment 1 Andrej Prsa 2004-11-03 00:05:34 UTC
Created attachment 1212 [details]
X.org 6.8 log

This particular log exemplifies a no-DMA X crash.
Comment 2 Matthias Hopf 2004-11-04 06:38:37 UTC
Does this happen with current CVS as well?

This looks like a duplicate of bug 1592 or bug 1594. I had similar problems here
at SuSE, and AFAIK we tracked all problems down and the cards are running stable
again.
Comment 3 Andrej Prsa 2004-11-04 16:17:11 UTC
> This looks like a duplicate of bug 1592 or bug 1594. I had similar problems
> here at SuSE, and AFAIK we tracked all problems down and the cards are running
> stable again.

Unfortunately, it's not. Before I get into details, please note that in my 
original bug report I misused "DMA" for "DRI". It was 2AM and I was really 
sleepy... ;)

So the problem I'm experiencing occurs both when DRI is on and when it's off. 
However, to be sure, I checked out today's CVS version, recompiled it, ran X 
with DRI, played Tuxracer for 5 minutes and voila - the same problem. This time 
I even upgraded the kernel to 2.6.9 to get the i930 module. DRI works while it's 
working, but then it hangs just as before. Please see Xorg.0.log attached.

Thanks for your help,

Andrej
Comment 4 Andrej Prsa 2004-11-04 16:19:49 UTC
Created attachment 1223 [details]
Xorg CVS 2004-11-04 version crash log
Comment 5 Matthias Hopf 2004-11-09 03:16:42 UTC
Ok, I'll try to reproduce here. Though I have to check first that we have an 865G
in the house.

Is there any program that actively triggers the problem frequently (something
that can be run w/o user input)?

You said it happens with and without DRI. With tuxracer, this means it happens
with software rendering as well? Any idea which way it happens faster?

Please attach your xorg.conf.
Comment 6 Andrej Prsa 2004-11-09 05:10:22 UTC
Indeed it happens with and without DRI; however, with DRI the problem 
manifests significantly faster (20 min with DRI vs. 1-2 days without DRI). Any 
OpenGL screensaver triggers the problem; tuxracer triggers it after at most 3 
consecutive races (so 2-3 minutes).

To be sure, I double-checked the BIOS settings and the VGA memory size is set to 
32 MB. Although I don't know if it's relevant, I set the AGP aperture size to 
32 MB as well.

Finally, if it's of any use, the problem exists even if the 865patch program is 
used.

I attach the xorg.conf file below.
Comment 7 Andrej Prsa 2004-11-09 05:11:50 UTC
Created attachment 1248 [details]
X.org 6.8.1 config file
Comment 8 Matthias Hopf 2004-11-09 05:59:54 UTC
Unfortunately, nobody seems to have an I865G here. Still trying, though.
Egbert has one, but is in Boston at least this week.
Comment 9 Matthias Hopf 2004-11-09 07:13:52 UTC
Found an 865G, will run a test overnight.
Comment 10 Matthias Hopf 2004-11-09 10:07:06 UTC
This will take some time - we have some installation issues with that machine...
Stay tuned!
Comment 11 Andrej Prsa 2004-11-09 10:15:49 UTC
Great, thanks! :)

If there's anything I can do to help, please let me know! If helpful, I can even 
provide an ssh-able account on the problematic machine, because somehow I 
suspect it might not be exclusively the graphics card's fault, perhaps it's the 
motherboard...

Thanks again!
Comment 12 Donnie Berkholz 2004-11-15 09:59:05 UTC
Duplicate of #1353 and #1379.
Comment 13 Matthias Hopf 2004-11-25 07:08:57 UTC
Wasn't able to reproduce the bug, still trying.
Can you post your kernel version? And maybe a lspci -v?
Comment 14 Andrej Prsa 2004-11-25 08:29:42 UTC
Of course! The problem is affecting kernels 2.4.18 through 2.6.9. The current 
kernel is 2.6.9. This is the output of uname -a:

Linux regulus 2.6.9n #1 SMP Fri Nov 5 00:34:19 CET 2004 i686 GNU/Linux

I'm running Debian testing on Pentium 4 2.8GHz HT, ASUS P4P800-VM motherboard.

I'm attaching the output of lspci as a file, so it doesn't clutter the space 
here.
Comment 15 Andrej Prsa 2004-11-25 08:32:46 UTC
Created attachment 1372 [details]
Output of "lspci; lspci -v; lspci -vv; lspci -vvv"
Comment 16 Matthias Hopf 2004-11-29 03:17:36 UTC
Tried to reproduce this here on a similar machine (at least graphics, PCI bridge
and DRAM controller are the same as on yours) and it ran over the weekend w/o
the slightest problem (including DRI).

Sorry, but as long as nobody else has a similar problem I guess that your
hardware is broken. Given the number of broken gfx cards I have here this
doesn't seem to occur rarely :-(((

Maybe you want to watch bug 1794 closely, I was able to reproduce a bug on a
i810 system that *might* be related to yours.

Feel free to reopen the bug if you have additional information or can make sure
that it is not faulty hardware.

Sorry for the bad news.
Comment 17 Andrej Prsa 2004-12-01 04:06:29 UTC
OK.

I contacted the company that I got this MB from and they will give me a 
new board if they find one, otherwise they'll replace it with a similar one, but 
not necessarily with the same graphics chipset. If they do find a P4P800-VM, I'll 
try to reproduce the bug and if I manage to do that, I'll open an MB-specific 
(rather than chipset-specific) bug.

In the meantime, if there's anyone else with a P4P800-VM out there who doesn't 
have problems with the 865G, I'd appreciate the feedback about it! :)
Comment 18 Matthias Hopf 2004-12-01 04:09:54 UTC
That sounds great.

I really hope that your probs are gone with a new mainboard. Please post
whatever you find out.
Comment 19 Andrej Prsa 2004-12-10 16:20:06 UTC
Hi, Guys!

New information, thought I'd let you know... I got a brand new P4P800-VM 
motherboard with exactly the same preferences and obviously exactly the same 
graphical card; trying Xorg 6.8.1 CVS with DRI enabled *again* crashes X windows 
in 2-3 minutes with exactly the same error. :(((((((((((

So the way I see this is the following:

a) all ASUS P4P800-VM cards are crappy, with non-functional 865G's,
b) the problem is in i810 driver.

I'm attaching the Xorg.log of the most recent crash. If anyone can suggest what 
to try next, I'd be VERY grateful, since I promised those nice guys from the 
computer store I'll let them know the outcome of the test. Should I reopen a bug 
or should I leave it as invalid? Thanks for listening to my frustrations... ;)
Comment 20 Andrej Prsa 2004-12-10 16:22:10 UTC
Created attachment 1510 [details]
Xorg log with the brand new ASUS P4P800-VM MB
Comment 21 Matthias Hopf 2004-12-20 02:59:54 UTC
*Sigh*

This is sad news.
We'll have to discuss this internally, I'll let you know about the outcome.
Comment 22 charles collin 2004-12-23 01:47:32 UTC
Hi everyone,

I unintentionally posted:
https://bugs.freedesktop.org/show_bug.cgi?id=2136
without seeing this thread!
So, to sum up: I own the same MB and I can't get it to work :-(
If I can help with tests, let me know.

Regards,
CH COLLIN
Comment 23 Jason Hale 2004-12-23 10:46:55 UTC
I have an ASUS P4P800-MX with the i865GV chip running FreeBSD 5.3 which seemed 
to work fine with Xorg 6.7.0 and the i810 driver compiled from source. I just 
upgraded to 6.8.1 from source yesterday and startx crashed immediately using 
the i810 driver. I am convinced this is a driver issue, as the 6.7.0 version 
seemed to work and I was able to use the vesa driver under 6.8.1. Upon looking 
at my Xorg log, I found the probed memory was lower than the default VideoRAM 
and higher than the VESA VBE total mem. Here is the key part of my 
log:

(II) I810(0): VESA BIOS detected 
(II) I810(0): VESA VBE Version 3.0 
(II) I810(0): VESA VBE Total Mem: 32576 kB 
(II) I810(0): VESA VBE OEM: Intel(r)865G Graphics Chip Accelerated VGA BIOS 
(II) I810(0): VESA VBE OEM Software Rev: 1.0 
(II) I810(0): VESA VBE OEM Vendor: Intel Corporation 
(II) I810(0): VESA VBE OEM Product: Intel(r)865G Graphics Controller 
(II) I810(0): VESA VBE OEM Product Rev: Hardware Version 0.0 
(II) I810(0): Integrated Graphics Chipset: Intel(R) 865G 
(--) I810(0): Chipset: "865G" 
(--) I810(0): Linear framebuffer at 0xF0000000 
(--) I810(0): IO registers at addr 0xFE780000 
(==) I810(0): Write-combining range (0xfe780000,0x80000) was already clear 
(II) I810(0): 1 display pipe available. 
(II) I810(0): detected 32636 kB stolen memory. 
(II) I810(0): I830CheckAvailableMemory: 450560 kB available 
(--) I810(0): Pre-allocated VideoRAM: 32636 kByte 
(**) I810(0): VideoRAM: 32768 kByte 
(==) I810(0): video overlay key set to 0x101fe 
(**) I810(0): page flipping disabled 
(==) I810(0): Using gamma correction (1.0, 1.0, 1.0) 
(II) I810(0): BIOS Build: 3062 
(==) I810(0): Device Presence: disabled. 
(==) I810(0): Display Info: enabled. 
(II) I810(0): Broken BIOSes cause the system to hang here.

Originally (and by default), VideoRam is commented out. I have the BIOS set to 
use 32MB of shared RAM for video, but apparently this does not exactly equal 
32*1024 kB (32768 kB). When I set VideoRam to 32576, X started right up with the 
i810 driver. I don't know much about the drivers, but it seems to me like there 
is an error probing the memory and it may be specific to this chipset.  I could 
not get X to start at all before, but now it starts after changing VideoRam. 
If you got X to start at all before, you were probably lucky and it may be the 
way Linux handles memory (I have a slightly different MB too). Then X crashes 
because of the memory leak. I don't know about DRI, I've never tried it even 
on 6.7.0. Sorry for the long comment, but this is what has worked for me so 
far and I wanted to put it on the table. If it crashes on me I'll post again. 
 
Comment 24 Andrej Prsa 2004-12-23 11:06:32 UTC
Thanks for your reports and your effort, guys!

Jason, are you using original 6.8.1 or from the CVS? Matthias pointed out a 
while ago that several related bugs have been hunted down and removed in the CVS 
version, perhaps the one you describe is already squashed. I'm using the CVS 
version and I don't experience anything similar to what you report - X starts 
pretty much regardless of the value of VideoRAM. Vesa is of course ok, for that is 
exactly what I'm running now - no crashes, no problems, yet no DRI and 
unfortunately no DivX's - it's too slow.

Charles, it's great to hear you got your MB working with 2.6.9 kernel (I thought 
that information might be of use to the developers, that's why I'm summarizing 
your private e-mail) - did you try whether DRI-related crashes occur?
Comment 25 Jason Hale 2004-12-23 11:58:51 UTC
(In reply to comment #24)
> Jason, are you using original 6.8.1 or from the CVS? Matthias pointed out a
> while ago that several related bugs have been hunted down and removed in the CVS
> version, perhaps the one you describe is already squashed. I'm using the CVS
> version and I don't experience anything similar to what you report - X starts
> pretty much regardless of the value of VideoRAM. Vesa is of course ok, for that is
> exactly what I'm running now - no crashes, no problems, yet no DRI and
> unfortunately no DivX's - it's too slow.
 
I'm using the original 6.8.1 which just made it into the FreeBSD ports tree 
the other day. The only way I could get the i810 driver to work at all was to 
specify my own VideoRam value, so apparently it has some effect. 
 
From the i810 man page: By default 8 Megabytes of system memory are used for 
graphics. For the 830M  and later, the default is 8 Megabytes when DRI is not 
enabled and 32 Megabytes with DRI is enabled. This amount may be changed with  
the VideoRam entry in the config file Device section. 
 
I don't have dri enabled, but it appears the driver still tries to use 32 
Megabytes. It seems, though that the default value is too high for what video 
memory is actually allocated. 
 
From my dmesg: 
agp0: <Intel 82865G (865G GMCH) SVGA controller> port 0xeff0-0xeff7 mem 
0xfe780000-0xfe7fffff,0xf0000000-0xf7ffffff at device 2.0 on pci0 
agp0: detected 32636k stolen memory 
agp0: aperture size is 128M 
 
So I actually have 32636k of video memory which is just short of 32 Megabytes. 
From the xorg log this appears to be probed by the i810 driver correctly, 
however it also shows it using a default of 32768k, which of course would 
crash the server. If I define VideoRam 32636 it works. The motherboard is not 
allocating a strict 32 Megabytes, so the default fails and apparently the 
driver does not use the probed value so you have to force it. 
 
Perhaps, as you said, this particular issue is resolved in CVS. For anyone not 
using the CVS version--which I sort of hesitate to do, this may solve part of 
the problem for the P4P800-MX motherboard (the VM model is probably similar). 
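
The mismatch described in this comment is simple arithmetic. As a rough sketch, the stolen-memory figure from dmesg can be checked against the 32768 kB value the log shows the driver using. The function name and the idea of comparing the two values up front are illustrative assumptions, not a description of what the i810 driver does internally.

```python
import re

# Line taken from the dmesg output quoted above.
DMESG_LINE = "agp0: detected 32636k stolen memory"
# Value the Xorg log reports for VideoRAM when none is configured (32 * 1024 kB).
DEFAULT_VIDEORAM_KB = 32 * 1024

def stolen_kb(line):
    """Extract the stolen-memory size in kB from a 'detected ... stolen memory' line."""
    m = re.search(r"detected (\d+)\s*k(?:B)? stolen memory", line)
    return int(m.group(1)) if m else None

stolen = stolen_kb(DMESG_LINE)
shortfall = DEFAULT_VIDEORAM_KB - stolen
print(f"stolen: {stolen} kB, default: {DEFAULT_VIDEORAM_KB} kB, shortfall: {shortfall} kB")
```

The same pattern also matches the Xorg log form of the line ("detected 32636 kB stolen memory."), so either source can be used to pick a VideoRam value.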
Comment 26 Matthias Hopf 2004-12-28 05:37:50 UTC
*** Bug 2136 has been marked as a duplicate of this bug. ***
Comment 27 Matthias Hopf 2004-12-28 05:45:19 UTC
I don't think that *this* issue is fixed in CVS. As far as I can see specifying
VideoRam with the same amount the driver found to be already allocated just
skips the reallocation step. Maybe the BIOS is broken WRT videomem allocation,
and the former driver didn't do that?!?
Just wild guessing.
However, Andrej's system crashed with 6.7 and even older servers as well...

Andrej, does specifying the VideoRam amount have any influence on the stability
of your system?
Comment 28 Andrej Prsa 2004-12-28 10:05:13 UTC
Hi, I tried a couple of things to do with the graphic card after Jason and 
Matthias were kind enough to supply feedback, here are the results:

                            --- 1ST TEST ---

I tried playing with VideoRam option and this time the X didn't initialize DRI 
at all. I'm posting an excerpt of the logfile here, only the relevant part, for 
brevity. I'll enumerate the parts here for commenting purposes:

1) The version is 6.8.1 built from CVS:

X Window System Version 6.8.1
Release Date: 17 September 2004
Build Date: 04 November 2004

2) The part that is changed: I set VideoRam to 32636 after dmesg'ing to see
   what's the stolen memory size. If relevant, AGP aperture size is set to
   128MB. The output below strikes me as unusual - how does I830 function
   see 418812kB available memory???

(II) I810(0): detected 32636 kB stolen memory.
(II) I810(0): I830CheckAvailableMemory: 418812 kB available
(--) I810(0): Pre-allocated VideoRAM: 32636 kByte
(**) I810(0): VideoRAM: 32636 kByte

3) Finally, the DRI initialization failure:

(II) I810(0): 4 kBytes additional video memory is required to
	enable tiling mode for DRI.
(II) I810(0): 4 kBytes additional video memory is required to enable DRI.
(II) I810(0): Disabling DRI.

                               --- 2ND TEST ---

I tried commenting out VideoRam option completely - to remind you, originally
I had it set to 32768. This time startx didn't work at all - all I got was a
silly picture and a frozen graphic card that could not be accessed again,
although the system was fully responsive. Since I didn't think of any better
way to capture it, I took my digital camera and took two screenshots, literally! 
;) Please find them at:

     http://www.fiz.uni-lj.si/~prsa/screenshot1.jpg
     http://www.fiz.uni-lj.si/~prsa/screenshot2.jpg

(I didn't attach them to the bug report because of their size). Seemingly it has 
something to do with the i830 module, because if I modprobe it and startx, the 
same thing happens regardless of the VideoRam setting. If on the other hand i830 
isn't modprobe'd, i915 gets loaded automatically.

                               --- 3RD TEST ---

The test a.k.a. "The Weird One!" ;) I set VideoRam to 32768 and I startx'd. The 
logfile shows that everything is perfectly ok, DRI enabled, everything else as 
well, but glxinfo then reports that there is NO direct rendering available!? 
This was sooo weird I had to get to the bottom of this and the result is 
absolutely unbelievable: I set .xinitrc to start gnome; if gnome is used as the 
window manager, then DRI won't work, but if it's TWM or similar, DRI works! How 
weird is that? Anyway, even with TWM and DRI enabled, the original I830 ring 
problem pops up and crashes the card in about 10 seconds of tuxracer 
(unmistakably, it never lasts longer than 10 seconds!).

Finally, if of any help, the 865patch program doesn't affect the results I put 
out here in any way.
Comment 29 Keenan Pepper 2005-01-06 01:34:23 UTC
I am experiencing a bug with very similar symptoms on slightly different
hardware: an Intel 845 Brookdale motherboard and an 82845G video chip. This is
with the X.org server from Ubuntu Hoary which is very recent, if not CVS, and a
vanilla 2.6.10 kernel. I would be happy to provide any other technical details.
Comment 30 Egbert Eich 2005-01-12 06:51:45 UTC
Alan, could you please have a look at this one?
Comment 31 Alan Hourihane 2005-01-12 07:31:56 UTC
Couple of things to try... 

1. Try 32bpp instead of 16bpp

2. If you say it happens even with DRI off, try disabling some of the 2D 
acceleration with these flags... (try them one by one)... And obviously make
sure that DRI is still off.

Option "XaaNoScanlineCPUToScreenColorExpandFill"
Option "XaaNoMono8x8PatternFillRect"
Option "XaaNoScreenToScreenCopy"
Option "XaaNoSolidFillRect"
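
Testing the flags one at a time means each run needs a Device section with exactly one of them enabled. The sketch below only prints such sections. The Identifier value is a placeholder, and the rest of xorg.conf (including keeping DRI off) is assumed to be handled separately.

```python
# The four XAA options listed above, to be tried one per test run.
XAA_OPTIONS = [
    "XaaNoScanlineCPUToScreenColorExpandFill",
    "XaaNoMono8x8PatternFillRect",
    "XaaNoScreenToScreenCopy",
    "XaaNoSolidFillRect",
]

def device_section(option, identifier="Card0"):
    """Return an xorg.conf Device section enabling a single XAA option.

    'Card0' is a hypothetical identifier; use the one from your own config.
    """
    return "\n".join([
        'Section "Device"',
        f'    Identifier "{identifier}"',
        '    Driver     "i810"',
        f'    Option     "{option}"',
        "EndSection",
    ])

for opt in XAA_OPTIONS:
    print(device_section(opt))
    print()
```

Keeping one option per run makes it possible to say which acceleration path, if any, is involved when a crash stops appearing.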
Comment 32 Alan Hourihane 2005-01-27 02:32:36 UTC
You could also try adding

Option "CacheLines" "512"

to see if that helps.
Comment 33 Alan Hourihane 2005-01-27 04:25:21 UTC
Actually,

I've found the cause of this.

Anyone care to contact me and test a fix ?

Alan.
Comment 34 Matthias Hopf 2005-01-27 07:24:40 UTC
I can't because I couldn't reproduce the problem. But I would be very happy to
review a patch and test whether it breaks something else :)
Comment 35 Andrej Prsa 2005-01-27 09:39:10 UTC
Hi, Alan & others!

Yesyesyes, I'd really appreciate the fix and I can test it! Sorry about the 
rather long time without responding, I was away, skiing! :) What was the problem?
Comment 36 Alan Hourihane 2005-01-27 15:47:01 UTC
O.k. So I thought I'd found it: I did catch an alignment problem, but Andrej 
has tested my fix and unfortunately it doesn't help him.

I've tried to reproduce the crash without any luck, so there must be something 
a little more funky going on with the P4P800-VM boards that I've not come 
across before.

I'll keep digging as and when I've got time.
Comment 37 Andrej Prsa 2005-01-29 11:16:53 UTC
Hi again!

I'll just recap here the tests I've done with X.org 6.8.1 (CVS) + Alan's i810 
driver + DRI turned on:

1) 32bpp instead of 16bpp
2) VideoRam option set to 32768 or commented out, both 16bpp and 32bpp
3) Set BIOS memory to 8MB instead of 32MB. I found it curious, though, that
   the logfile contains these lines:

   (II) I810(0): detected 8060 kB stolen memory.
   (II) I810(0): I830CheckAvailableMemory: 441340 kB available
   (II) I810(0): Will attempt to tell the BIOS that there is 12288 kB VideoRAM
   (WW) I810(0): Extended BIOS function 0x5f11 not supported.
     [several lines skipped]
   (II) I810(0): initializing int10
   (WW) I810(0): Bad V_BIOS checksum
   (II) I810(0): Primary V_BIOS segment is: 0xc000

Unfortunately, none of these changes got me any closer to a working DRI.

                                   * * *

I've been trying to crash X.org 6.8.1 without DRI for two days now without 
success; I guess Alan's fix indeed fixed something! :) I'll try to produce a 
crash until the end of the weekend, but if I can't, then I guess only the DRI 
problem remains! I'm open to any new suggestions on what to test.

Just a quick reflection: is there any chance that SMP could influence the way 
i810 is performing?

Andrej
Comment 38 Andrej Prsa 2005-01-29 11:38:32 UTC
Hi again!

Murphy at work: 5 minutes after I said everything's ok, X.org crashed with the 
same error as before! :( I'll test Alan's suggestions from above and report my 
findings...

Andrej
Comment 39 Andrej Prsa 2005-01-29 16:14:47 UTC
Report no. 1: still crashing

X.org 6.8.1 CVS, no DRI.
Option used: "CacheLines" "512"
Outcome: crash after a couple of hours of moderate usage
How to trigger: use the mouse scroll-wheel in browsers

I'm attaching xorg.conf and Xorg.0.log of this test below.
Comment 40 Andrej Prsa 2005-01-29 16:15:47 UTC
Created attachment 1788 [details]
xorg.conf with Option "CacheLines" "512"
Comment 41 Andrej Prsa 2005-01-29 16:16:43 UTC
Created attachment 1789 [details]
Xorg log with Option "CacheLines" "512"
Comment 42 Alan Hourihane 2005-01-30 05:36:57 UTC
As you now have DRI off, Andrej, I'm definitely keen to hear your results from 
the options I suggested using.
Comment 43 Alan Hourihane 2005-02-01 11:47:39 UTC
I'm pretty sure now that Keenan Pepper and Charlie Collin are having a 
different problem than Andrej.

Some people are using the i830 kernel module still, when they should be using 
the i915 kernel module now.

I've committed some fixes to the i830 driver in CVS which should detect this 
condition and refuse to load the DRI and stop this problem happening.

Andrej's problem is different altogether.
Comment 44 Phil Edelbrock 2005-02-01 15:52:40 UTC
I've had a problem with X crashing on me (apps quitting, X restarting, or
complete system hang) daily with my Asus P4P800-VM.  I first thought it
was my mouse because it seemed to be related to scrolling or moving
windows (particularly browser windows or OpenOffice windows), but going from
USB to PS/2 seemed to rule this out.  Then I suspected my RAM, but so far memory
tests show nothing.  Very frustrating.  Using the 'Flame' screensaver seemed to
lock the machine up each night.  I now use 'blank' and at least the machine
survives the night now.

I updated to the latest BIOS, and I'm current with Fedora 3.  Other
version details:

X Window System Version 6.8.1
Release Date: 17 September 2004
X Protocol Version 11, Revision 0, Release 6.8.1
Build Operating System: Linux 2.6.9-1.751_ELsmp i686 [ELF]
Current Operating System: Linux DrTheopolis.netroedge.com
2.6.10-1.741_FC3smp #1 SMP Thu Jan 13 16:53:16 EST 2005 i686
Build Date: 01 December 2004
Build Host: tweety.build.redhat.com

        Before reporting problems, check http://wiki.X.Org
        to make sure that you have the latest version.
Module Loader present
OS Kernel: Linux version 2.6.10-1.741_FC3smp
(bhcompile@porky.build.redhat.com) (gcc version 3.4.2 20041017 (Red Hat
3.4.2-6.fc3)) #1 SMP Thu Jan 13 16:53:16 EST 2005


-Phil
Comment 45 Keenan Pepper 2005-02-03 04:38:33 UTC
(In reply to comment #43)
> I'm pretty sure now that Keenan Pepper and Charlie Collin are having a 
> different problem than Andrej.
> 
> Some people are using the i830 kernel module still, when they should be using 
> the i915 kernel module now.

I've been using the i915 kernel module.

BTW, the only reliable way to trigger this for me is to run xine (with the Xv
output plugin) and glxgears at the same time. Sometimes it crashes when only
xine is running, usually when switching to fullscreen mode or back, but that's
hard to trigger, it only happens on occasion (usually when it's least convenient
=) ).

I tried starting up X without DRI and noticed some weird things. Xine with the
Xv output plugin looked like crap, it was flashing and green with weird lines
and I think I even saw random pieces of other windows. Xine with the Xshm plugin
looked fine. I couldn't trigger the server lockup with either plugin, though,
not even rapidly switching back and forth to fullscreen mode with glxgears
running at the same time.

So since my problem doesn't appear without DRI, maybe it is a different bug.
Comment 46 Alan Hourihane 2005-02-03 04:53:12 UTC
Do you have a logfile Keenan that you could send me ?
Comment 47 Phil Edelbrock 2005-02-03 10:50:34 UTC
(In reply to comment #44)
> Using the 'Flame' screensaver seemed to lock the
> machine up each night.

Edit: That should be "XFlame", if it matters.  It seems to crash on any extended
video activity.  Almost as if some system or app memory gets allocated in the
video card's memory and causes the app to segfault, kernel to panic, or the
machine to simply lock up hard?

I tried a few things: disabled DRI (commenting it out of the xorg.conf... is
that good enough?), bumped the DDR voltage up to 2.65, and set Video RAM to 8MB
in the bios.  Machine locked up hard again over night.

I ran across some interesting notes on high-memory usage on these boards:

http://groups-beta.google.com/group/linux.kernel/browse_thread/thread/fc131bec9125dde8/dccfa5d6825c920e
http://groups-beta.google.com/group/linux.kernel/browse_thread/thread/e93c18a0da785ad6/6e3c4d0eaab34482

I put a copy of my logfile here (pre-tweaks), for those interested:

http://secure.netroedge.com/~phil/Xorg.0.log-withdri

Next things I will try: Disable high-mem support in kernel.  Take out second
DIMM (leaving 1 512MB).


-Phil
Comment 48 Andrej Prsa 2005-02-03 11:31:31 UTC
Alan,

I can't seem to crash X.org with the following option:

    Option      "XaaNoScanlineCPUToScreenColorExpandFill"

I may simply be very lucky, but I thought I'd let you know... I'll keep trying, 
but so far so good. Would it make any sense in your opinion to try and couple 
DRI to this option?
Comment 49 Keenan Pepper 2005-02-03 13:15:09 UTC
Created attachment 1829 [details]
Keenan's log file

I crashed it a few times and all the logs looked pretty much like this.

lspci says my card is a
"0000:00:02.0 VGA compatible controller: Intel Corp. 82845G/GL[Brookdale-G]/GE
Chipset Integrated Graphics Device (rev 03)"
Comment 50 Alan Hourihane 2005-02-03 14:47:02 UTC
Keenan,

Can you try the same option Andrej just mentioned ?
Comment 51 Keenan Pepper 2005-02-03 19:34:58 UTC
(In reply to comment #50)
> Keenan,
> 
> Can you try the same option Andrej just mentioned ?

I tried "XaaNoScanlineCPUToScreenColorExpandFill" and it still crashed. However,
I tried turning off _all_ the Xaa stuff and it didn't crash! So I went through a
process of trial and error and found that "XaaNoOffscreenPixmaps" was the one
that did the trick. Now I'm running xine along with about 5 opengl demos
(obviously it's not going very fast =P) but the X server isn't crashing, so it
looks like (for me) this bug is fixed!

Note that all the other Xaa stuff is still enabled, so part of Xorg.0.log looks
like this:

(II) I810(0): Using XFree86 Acceleration Architecture (XAA)
        Screen to screen bit blits
        Solid filled rectangles
        8x8 mono pattern filled rectangles
        Indirect CPU to Screen color expansion
        Solid Horizontal and Vertical Lines
        Setting up tile and stipple cache:
                16 128x128 slots
                4 256x256 slots

The only thing missing is "Offscreen Pixmaps".

I'll keep stressing it out and trying to crash it, but it sure seems like that
fixed it.
Comment 52 Keenan Pepper 2005-02-03 20:08:43 UTC
... so of course, right after I submit that it crashes again =P. I would attach
the log but it looks identical to my first one except for times of day, memory
addresses, and the XaaNoOffscreenPixmaps option. So apparently
XaaNoOffscreenPixmaps makes the bug much harder to trigger, but doesn't fix it. =(
Comment 53 Phil Edelbrock 2005-02-08 10:30:43 UTC
I'm not sure if I'm experiencing the same problem (see my comment #44), but I've
found a workaround:  I removed the second DIMM, so I now only have a single
512MB DIMM.  Everything is stable now, at least for the past 4 days and now
survives the night with XFlame screen saver.

I tried leaving the second DIMM in but disabling high-mem, but that wasn't enough.

Regarding comment #13, is the test platform using more than 1 DIMM?

For others still experiencing random hangs and crashes, are you using more than
1 DIMM?  What happens if you remove all but 1?


-Phil
Comment 54 Andrej Prsa 2005-02-08 12:55:03 UTC
Reply to comment #53: I opened up the box and took a look: there is only one 
512 MB RAM module in the DIMM A1 slot (out of four slots, DIMM A1, A2, B1, B2), on 
both P4P800-VM motherboards I tested. :( So I guess we're experiencing different 
problems. Phil, could you post your xorg.conf and your BIOS version? I'd like to 
compare it to my own.

As a side note, after more than 15 days of testing X.org without DRI and with 
Option "XaaNoScanlineCPUToScreenColorExpandFill" in xorg.conf, I wasn't able to 
crash it. Still, without DRI it takes a lot of time to crash it, so perhaps I 
was just being lucky.
Comment 55 Phil Edelbrock 2005-02-09 15:10:40 UTC
Created attachment 1883 [details]
Phil's xorg.conf

This is from a Fedora 3 box which crashes repeatedly if there are 2 DIMMs
installed.  Works fine with 1. BIOS 1014. 

X Window System Version 6.8.1
Release Date: 17 September 2004
X Protocol Version 11, Revision 0, Release 6.8.1
Build Operating System: Linux 2.6.9-1.751_ELsmp i686 [ELF] 
Current Operating System: Linux DrTheopolis.netroedge.com 2.6.10-1.760_FC3smp
#1 SMP Wed Feb 2 00:29:03 EST 2005 i686
Build Date: 01 December 2004
Build Host: tweety.build.redhat.com
Comment 56 Phil Edelbrock 2005-02-09 15:24:11 UTC
(In reply to comment #55)
A little more info on my setup: The xorg.conf I posted did have the DRI stuff
commented out at one point (the "load dri" line and the DRI section). It didn't
help, so I uncommented it again.  I also swapped the DIMMs for the heck of it,
just to make sure the problem wasn't solved by my removing a bad DIMM.  RAM is
Asus 'qualified' Transcend DIMMs (2x512MB).

The computer has been very stable since removing the second DIMM.  Unless
something new crops up, it looks like my troubles are related to running two
DIMMs in dual-channel mode.


-Phil
Comment 57 Andrej Prsa 2005-02-17 16:20:30 UTC
Hi,

I just flashed my BIOS to 1.014 (2004-12-31) and tried with 32MB Video RAM and
32MB AGP aperture. X still crashes, but when it crashed, the i915 module reported:

[drm:i915_wait_irq] *ERROR* i915_wait_irq: EBUSY -- rec: 73805 emitted: 73807
i810_audio: drain_dac, dma timeout?

Just for the heck of it, I disabled DMA with hdparm for /dev/hda and
unbelievably, I ran tuxracer for more than 1/2 hour without crashing; then I
logged off, logged in again, ran tuxracer for another 1/2 hour and then it
finally crashed.

Then I turned the DMA back on to see whether the change was induced by the BIOS
flash or by DMA turned off; it's the latter - if DMA is on, tuxracer still
crashes X in a matter of seconds.

No idea how (and if at all) this is relevant, but I thought I'd let you know...
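
A rough way to read the i915_wait_irq line quoted above, offered as a sketch rather than a statement of what the driver does: assuming "rec" is the last sequence number the hardware reported back and "emitted" is the last one the driver queued, their difference is the number of requests still outstanding when the wait gave up. The function name below is hypothetical.

```python
import re

# The kernel message quoted in this comment.
LINE = "[drm:i915_wait_irq] *ERROR* i915_wait_irq: EBUSY -- rec: 73805 emitted: 73807"

def outstanding_requests(line: str) -> int:
    """Return emitted - rec for an i915_wait_irq EBUSY line.

    Assumption: 'rec' is the last sequence number the hardware reported,
    'emitted' the last one the driver queued.
    """
    match = re.search(r"rec:\s*(\d+)\s+emitted:\s*(\d+)", line)
    if match is None:
        raise ValueError("not an i915_wait_irq EBUSY line")
    received, emitted = int(match.group(1)), int(match.group(2))
    return emitted - received

print(outstanding_requests(LINE))
```

Under that reading, the hardware was two requests behind the driver when the wait timed out, which would be consistent with a stalled engine rather than a lost interrupt alone.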
Comment 58 Alan Hourihane 2005-02-17 16:31:24 UTC
Try not loading the i810_audio module, and see if that helps.
Comment 59 Andrej Prsa 2005-02-19 10:20:36 UTC
The OSS's i810_audio was built into the kernel. I recompiled the kernel (2.6.9)
and tried again the following combinations:

1) DMA, i810_audio loaded
2) NO DMA, i810_audio loaded
3) DMA, i810_audio unloaded
4) NO DMA, i810_audio unloaded

I could not establish any correlation of the i810_audio module presence with the
stability of 865. If DMA is on, it crashes within seconds. If it is off, it
crashes within tens of minutes.

Then, as a side project I've been meaning to do for quite some time, I replaced
the obsolete OSS module with ALSA's sound modules; the results are exactly the
same as above, this time without any dmesg output from sound modules about DMA
issues.

Based on this I guess we can rule out sound modules causing the crashes. Still,
the DMA/NO DMA behavior puzzles me; what do i810/i915 have to do with DMA?
Comment 60 Alan Hourihane 2005-03-28 01:20:10 UTC
Andrej,

Can you do an 'lspci -vvn' and attach the output with and without DMA turned on?
Comment 61 Andrej Prsa 2005-03-28 03:45:37 UTC
Hi Alan,

I ran lspci -vvn with and without DMA; the output is exactly the same regardless
of the DMA setting. I'm attaching both outputs for the record anyway.

Andrej
Comment 62 Andrej Prsa 2005-03-28 03:46:52 UTC
Created attachment 2220 [details]
Output of lspci -vvn and DMA on
Comment 63 Andrej Prsa 2005-03-28 03:47:21 UTC
Created attachment 2221 [details]
Output of lspci -vvn and DMA off
Comment 64 Gilboa Davara 2005-05-02 01:22:58 UTC
Hello all,

If I may butt in,  I was about to report the same bug, with some further
information.

A. Test case:
1. Go to init 5.
2. Log into KDE as a normal user.
3. Start kaffeine (KDE multimedia player), play an AVI file.

B. Expected Result:
Play AVI file.

C. Actual Result:
0. (This usually happens after a *long* uptime.)
1. Kaffeine display goes blue.
2. The display stutters and then hangs.
3. The display goes blank, X dies.
4. Fedora Core tries to restart X, over and over, and fails. (Attached X.org log.)
5. Machine is accessible over SSH. I can init 3, and kill the i915 driver, but I
can no longer restart the X process unless I reboot the machine.

D. Machine Configuration:
1. Hardware:
Motherboard: Intel D865GBF. 
(http://www.intel.com/design/motherbd/bf/index.htm)
Chipset: Intel i865.
BIOS: 86A.0069.P21
CPU: Intel P4, 3.06GHz,
	Hyperthreading: On.
VGA: On-board i915.
	Aperture size 32MB, 64MB.
	Frame Buffer Size: 16MB.
Sound: On-board.
RAM: 2 x 512MB DDR400, Slots A0, B0. (Transcend?)

2. OS:
Distro: Fedora Core 3.
Kernel: Stock FC3 kernels: 2.6.9, 2.6.10, 2.6.11. (Tried them all)
X.org: 6.8.1, 6.8.2. (See attached xorg_versions.txt)
KDE: 3.4 (kde-redhat.sourceforge.net)

Attached logs are from 2.6.11, X.org 6.8.2.
Comment 65 Gilboa Davara 2005-05-02 01:28:08 UTC
Created attachment 2603 [details]
Gilboa's X.org

Beware that, sadly, this log is not from the initial crash; it is from one of
the subsequent crashes. (Fedora automatically tries to recover a dead X session
[again and again], overwriting the original X.org log file.)
Comment 66 Gilboa Davara 2005-05-02 01:30:38 UTC
Created attachment 2604 [details]
Gilboa's lspci

Gilboa's lspci -vv
Comment 67 Gilboa Davara 2005-05-02 01:32:30 UTC
Created attachment 2605 [details]
Gilboa's dmesg log 

(X.org related part of it.)
Comment 68 Phil Edelbrock 2005-05-10 16:54:34 UTC
(In reply to comment #64)
> Hello all,
> 
> If I may butt in,  I was about to report the same bug, with some further
> information.
[...]
> RAM: 2 x 512MB DDR400, Slots A0, B0. (Transcend?)

Have you tried removing one of the DIMMs to see if the problem goes away? (It
did for me.)


Phil
Comment 69 Andrej Prsa 2005-06-03 08:23:55 UTC
After a lot of people invested a lot of time trying to resolve this problem, I
have come to the conclusion that this bug is due to faulty hardware. To back up
my claim, I installed Windows on that machine a couple of weeks ago just to test
the graphics, and the very same faulty behavior happens: whenever 3D
acceleration kicks in, the computer crashes. So unless anyone else experiences
the same problems with a different motherboard, I am closing this bug report as
invalid.
Comment 70 Jon Oberheide 2005-08-15 08:40:23 UTC
I can reproduce this relatively easily on 2 identical Dell Optiplex SX280s,
leading me to believe this is not a hardware problem.  While running our
in-house 3D visualization software, it is usually triggered when the software
attempts to render a large number of polygons.  One is running SUSE 9.3 and the
other Gentoo.  All the following information and attachments are from the
Gentoo box.

xorg: 6.8.2
kernel: 2.6.11
gcc: 3.4.4

Error in I830WaitLpRing(), now is 9129, start is 7128
pgetbl_ctl: 0x3ffc0001 pgetbl_err: 0x0
ipeir: 0 iphdr: 6db3ffff
LP ring tail: b0 head: 0 len: 1f001 start 0
eir: 0 esr: 1 emr: ffff
instdone: ffc1 instpm: 0
memmode: 108 instps: 800f0050
hwstam: fffe ier: 2 imr: 8 iir: a0
space: 130888 wanted 131064
(II) I810(0): [drm] removed 1 reserved context for kernel
(II) I810(0): [drm] unmapping 8192 bytes of SAREA 0xf88c6000 at 0xb7d44000
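
The "space" and "wanted" figures in the dump above can be reproduced from the "LP ring" line. The following is a sketch, not the driver's code; it assumes 4 KiB pages, a "len" register that stores (pages - 1) from bit 12 upward with bit 0 as an enable flag, and an 8-byte gap kept between tail and head. None of these assumptions is stated in the log itself, and the function names are hypothetical.

```python
# Assumed constants (not taken from the log).
PAGE_SIZE = 4096      # bytes per ring page
GUARD_BYTES = 8       # gap kept between tail and head

def ring_size_from_len(len_reg: int) -> int:
    """Ring size in bytes, decoded from the 'len' value in the dump."""
    pages = (len_reg >> 12) + 1
    return pages * PAGE_SIZE

def ring_free_space(head: int, tail: int, size: int) -> int:
    """Bytes free for new commands, wrapping around the end of the ring."""
    space = head - (tail + GUARD_BYTES)
    if space < 0:
        space += size
    return space

# Values from "LP ring tail: b0 head: 0 len: 1f001 start 0".
size = ring_size_from_len(0x1F001)
space = ring_free_space(head=0x0, tail=0xB0, size=size)
wanted = size - GUARD_BYTES   # the whole ring minus the guard gap

print(size, space, wanted)
```

With these assumptions the numbers come out as 131072, 130888 and 131064, matching "space: 130888 wanted 131064". Read this way, the wait was for a completely empty ring, while 0xb0 bytes of commands sat between head and tail and the head never moved off 0.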
Comment 71 Jon Oberheide 2005-08-15 08:41:11 UTC
Created attachment 2870 [details]
lspci.optiplex
Comment 72 Jon Oberheide 2005-08-15 08:41:29 UTC
Created attachment 2871 [details]
Xorg.0.log.optiplex
Comment 73 Jon Oberheide 2005-08-15 08:41:54 UTC
Created attachment 2872 [details]
dmesg.optiplex
Comment 74 Jon Oberheide 2005-08-15 08:42:12 UTC
Created attachment 2873 [details]
kernel.config.optiplex
Comment 75 Jon Oberheide 2005-08-15 08:42:34 UTC
Created attachment 2874 [details]
xorg.conf.optiplex
Comment 76 Alan Hourihane 2005-08-15 08:46:33 UTC
Jon,

Please open another bug report: your hardware is i915, not i865. This bug
report applies to the i865 on one specific motherboard, the P4P800-VM.

