Slow glRenderMode ( GL_SELECT ) Performance in Recent Graphics Drivers

I've started noticing a dramatic drop in the performance of OpenGL rendering in SELECT mode, that is, after a call to glRenderMode(GL_SELECT). GL_SELECT is meant to be faster than regular rendering, because it only needs to process the geometry and not shading or textures, but recent drivers (since July 2007) have started slowing this operation down dramatically.

So far, it only appears to affect ATI graphics drivers. I haven't tested any HD cards -- if you have one, please consider contributing some numbers, using the test program below. Also, the numbers below indicate that GL_RENDER performance may very well be affected too, taking at least twice as long. Clearly something has screwed up.

Current Status

Update: 2008-05-25

I've had reports that Catalyst 8.5 has finally fixed the problem! When I finish my thesis I might get around to updating the table below, but there are some early reports on the Blender forums of improvements.

Windows (XP)

In Windows, Catalyst display drivers 7.12, 8.1 and 8.2 are affected. At the time of writing, 8.3 isn't out yet, so that might be affected too. 7.11 and earlier versions don't appear to have the problem. 7.12 is seriously affected -- 3 times worse than in 8.1/8.2 (see below), which leads me to think ATI realised there was a problem and implemented a partial fix.

These versions correspond to internal versions 8.442, 8.451 and 8.453. Versions 8.432 and earlier are not affected.

You can get the 7.11 driver and control center from ATI's Catalyst 7.11 Software Component Downloads for Windows XP page. Other versions are on ATI's Previous Catalyst Drivers for Windows XP page.

If this affects you, you should create a ticket on ATI's support page (needs registration). Let them know that people use this and that it is currently broken!

Good News! I got an email from ATI on 2008-03-13 saying they have recently done some fixes in this area, and will be releasing them over the next couple of months.

Linux

In Linux, versions 8.433 and later appear to be affected. 8.42.3 has serious bugs and I haven't tested it. 8.40.4 has less-serious bugs -- the most annoying being a broken glReadPixels on some cards, which is what you need for backbuffer picking (although glRenderMode(GL_SELECT) works fine). More testing may ensue -- if glReadPixels works for you in 8.40.4, I'd use that. Otherwise, if I have time, I'll test 8.39.4, 8.35.5 and earlier to see if they are affected by the glReadPixels bug.
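Backbuffer picking usually means drawing every object in its own flat colour to the back buffer and reading back the pixel under the cursor. The sketch below covers only the ID-to-colour arithmetic such a scheme needs. It assumes an 8-bit-per-channel RGB framebuffer with lighting, blending and dithering switched off. The names PickColour, encode_pick_id and decode_pick_id are made up for illustration and are not part of picksquare.c.

```c
#include <assert.h>

/* Hypothetical helper type: one 24-bit RGB colour used as an object ID. */
typedef struct { unsigned char r, g, b; } PickColour;

/* Pack a (row, col) grid position into a colour.
 * IDs start at 1 so that black (0) can mean "background, nothing hit". */
PickColour encode_pick_id(unsigned row, unsigned col, unsigned cols)
{
    unsigned long id = (unsigned long)row * cols + col + 1;
    PickColour c;
    assert(cols > 0);
    assert(id <= 0xFFFFFFUL);   /* must fit in 24 bits */
    c.r = (unsigned char)((id >> 16) & 0xFF);
    c.g = (unsigned char)((id >> 8) & 0xFF);
    c.b = (unsigned char)(id & 0xFF);
    return c;
}

/* Unpack a colour read back from the framebuffer.
 * Returns 1 and fills *row / *col if it names a square, 0 for background. */
int decode_pick_id(PickColour c, unsigned cols, unsigned *row, unsigned *col)
{
    unsigned long id = ((unsigned long)c.r << 16) |
                       ((unsigned long)c.g << 8) |
                       (unsigned long)c.b;
    assert(cols > 0);
    if (id == 0)
        return 0;
    id -= 1;
    *row = (unsigned)(id / cols);
    *col = (unsigned)(id % cols);
    return 1;
}
```

In a real program the encoded colour would be set with something like glColor3ub before drawing each square, and the pixel fetched with glReadPixels using GL_RGB and GL_UNSIGNED_BYTE. That is exactly the call reported broken in 8.40.4 on some cards, so treat this as an alternative to GL_SELECT only where glReadPixels works.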

Unfortunately, 8.40.4 is annoying for other reasons -- it has no HD support, and support for Linux kernel 2.6.23+ is possible only via a patch. I need to check on the Xorg support status (it probably doesn't support Xorg 7.3).

It's strange that Linux appears to be hit earlier -- looks like ATI are using us Linux users as a sandbox before moving new driver bugs (er, features) into the Windows drivers.

Examples

These are some examples of performance in my Test Program that you can download below.

Test 1: Windows XP

System:
Windows XP Home Edition Version 2002 Service Pack 2
Graphics Card:
256MB ATI RADEON X600 PRO (0x5B62)
Bios Info:
113-A62903-103

Summary

Driver Date   Catalyst Version   Internal Version   Render Time   Select Time
2005-05-31    ??                 6.14.10.6542       24ms          45ms
2007-07-27    7.8                8.401.0.0          23ms          44ms
2007-08-21    7.9                8.411.0.0          23ms          44ms
2007-09-28    7.10               8.421.0.0          24ms          44ms
2007-11-01    7.11               8.432.0.0          23ms          44ms
2007-12-04    7.12               8.442.0.0          84ms          15345ms
2007-12-20    8.1                8.451.0.0          84ms          4241ms
...           ...                ...                ...           ...
2008-??-??    8.5                8.493.0.0          ??ms          fixed?

Details

The table pretty much covers it. At some point the driver string changed to "ATI Radeon X300/X550/X1050 Series". These were all tested in 800x600 with render/select of 120000 squares (same as Linux below).

Test 2: Linux pc-320-0

System:
Linux pc-320-0 2.6.23-gentoo-r8 #1 SMP PREEMPT Mon Feb 11 12:31:06 EST 2008 i686 Intel(R) Pentium(R) D CPU 3.00GHz GenuineIntel GNU/Linux
XServer:
x11-base/xorg-server-1.3.0.0-r2 (xorg-x11-7.2)
Graphics Card:
01:00.0 VGA compatible controller: ATI Technologies Inc RV380 [Radeon X600 (PCIE)]

Good "Old" driver -- fglrx 8.40.4

OpenGL version string: 2.0.6747 (8.40.4)
[fglrx] Maximum main memory to use for locked dma buffers: 1898 MBytes.
[fglrx] USWC is disabled in module parameters
[fglrx] PAT is disabled!
[fglrx] module loaded - fglrx 8.40.4 [Jul 31 2007] on minor 0

120000 squares render in 45ms
120000 squares select in 70ms

Slow "New" driver -- fglrx 8.45.4

OpenGL version string: 2.1.7276 Release
[fglrx] Maximum main memory to use for locked dma buffers: 1898 MBytes.
[fglrx] ASYNCIO init succeed!
[fglrx] PAT is enabled successfully!
[fglrx] module loaded - fglrx 8.45.4 [Jan 16 2008] on minor 0

120000 squares render in 95ms
120000 squares select in 4803ms  <-- holy crap that's big!
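
To put the figures from both tests on one scale, the small sketch below turns a before/after pair of timings into a slowdown factor and a per-square cost. The helper names slowdown and per_square_us are made up for illustration; the inputs are the numbers reported above.

```c
#include <assert.h>

/* Hypothetical helper: how many times slower after_ms is than before_ms. */
double slowdown(double before_ms, double after_ms)
{
    assert(before_ms > 0.0);
    return after_ms / before_ms;
}

/* Hypothetical helper: average cost of one square, in microseconds. */
double per_square_us(double total_ms, long squares)
{
    assert(squares > 0);
    return total_ms * 1000.0 / (double)squares;
}
```

With the Linux numbers, slowdown(70, 4803) comes out at roughly 69 times for select and slowdown(45, 95) at roughly 2 times for render. On Windows, select went from 44ms to 15345ms in Catalyst 7.12, a factor of about 349, and to 4241ms in 8.1, a factor of about 96. At 4803ms for 120000 squares, each square costs about 40 microseconds in select mode.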

Test Program

You can download the C source, picksquare.c, and a Makefile. The code is reproduced below, if you want to inspect it. You need SDL installed and a working `sdl-config` for the Makefile to compile -- get it from libsdl.org. For the lazy, you can download a picksquare.exe Windows 32-bit binary with SDL.dll in a zipfile. Just extract and run the exe file. If you are running Linux, but you're too lazy to compile, running the Win32 binary from Wine works just as well.

After compiling and running the program, you should see an 800x600 window filled with rendered squares of 2x2 pixels each. Clicking somewhere in the window will "select" the squares you click (within 5-pixel range) and change their colour slightly (incorporating a blue component). Make sure you actually have direct rendering (check `glxinfo`). Compare the times for 'render' and 'select', from standard error output.
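
For readers unfamiliar with selection mode: each hit comes back in the select buffer as a record holding the number of names on the name stack, the minimum and maximum depth, and then the names themselves, bottom of the stack first. The sketch below walks such a buffer without any OpenGL calls, so it can be tried on hand-made data. It uses plain unsigned in place of GLuint (an assumption that both are 32-bit unsigned integers), and the names Hit and parse_hits are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One decoded hit: the first two names of a record, read as (row, col). */
typedef struct { unsigned row, col; } Hit;

/* Walk nhits hit records in buf (buf_len words long).
 * Each record is: name count, min depth, max depth, then the names.
 * Writes at most max_out decoded hits to out and returns how many were
 * written. Records with fewer than two names are skipped. */
size_t parse_hits(const unsigned *buf, size_t buf_len, int nhits,
                  Hit *out, size_t max_out)
{
    size_t pos = 0, written = 0;
    int h;
    assert(buf != NULL && out != NULL);
    if (nhits < 0)
        return 0;   /* a negative hit count signals a buffer overflow */
    for (h = 0; h < nhits; h++) {
        unsigned names;
        if (pos + 3 > buf_len)
            break;  /* truncated record header */
        names = buf[pos];
        if (names > buf_len - (pos + 3))
            break;  /* names would run past the end of the buffer */
        if (names >= 2 && written < max_out) {
            out[written].row = buf[pos + 3];
            out[written].col = buf[pos + 4];
            written++;
        }
        pos += 3 + names;
    }
    return written;
}
```

The bounds checks are a design choice for this sketch: the hit count returned when leaving selection mode is negative if the buffer overflowed, and casting that to unsigned would make a naive loop run far past the end of the buffer.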

If the times are too quick to measure, try increasing R and C in the source. If the window gets too big, change the value of SZ from 2 to 1 -- this will use 1x1 pixels.
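
The relationship between R, C, SZ, the window size and the number of squares follows from the WINWIDTH = SZ*C and WINHEIGHT = SZ*R definitions in the source. A tiny sketch, with the made-up names GridInfo and grid_info, makes it easy to check a new combination before recompiling:

```c
#include <assert.h>

/* Hypothetical helper type: window size in pixels and total square count. */
typedef struct { int width, height; long squares; } GridInfo;

/* Mirror the arithmetic picksquare.c does at compile time:
 * width = SZ*C, height = SZ*R, squares = R*C. */
GridInfo grid_info(int rows, int cols, int sz)
{
    GridInfo g;
    assert(rows > 0 && cols > 0 && sz > 0);
    g.width = sz * cols;
    g.height = sz * rows;
    g.squares = (long)rows * cols;
    return g;
}
```

The defaults (R = 300, C = 400, SZ = 2) give an 800x600 window with 120000 squares. Doubling R and C to 600 and 800 while dropping SZ to 1 keeps the window at 800x600 but draws 480000 squares.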

The program is a modified / updated version of picksquare/s from the OpenGL redbook, so portions are copyright Silicon Graphics, Inc. (see the notice at the end of the source). I've updated it to use SDL (rather than GLUT) and to render many, many more squares with feedback so that we can get some meaningful performance measures.

Explanations

Nvidia drivers seem to be unaffected -- if they were doing it too, I might be led to believe a:

Conspiracy Theory

There's a chance that ATI/AMD (and nvidia) are deliberately disabling this feature of their graphics cards. Games often get by without using it, while high-end graphics workstation applications like Maya depend on it heavily. There's a chance the card manufacturers want you to shell out for a workstation card, like Quadro, if you want to use this feature. But, clearly, the cards can support it just fine!

Test Program Code

Makefile

SDLCFLAGS := $(shell sdl-config --cflags)
SDLLIBS := $(shell sdl-config --libs)

CC = gcc
CFLAGS = -Wall -W -g $(SDLCFLAGS) -ansi -pedantic -Wshadow
# Libraries go in LDLIBS so make's implicit link rule places them after the object files
LDLIBS = $(SDLLIBS) -lGL -lGLU

all : picksquare

picksquare.o : Makefile

picksquare.c

/*
 * $Id: picksquare.c 604 2008-02-22 02:27:50Z tapted $
 * $URL: svn+ssh://pc-g33-9.it.usyd.edu.au/var/svn/pub/picksquare/picksquare.c $
 */
/**
 * picksquare.c
 * Use of multiple names and picking are demonstrated.
 * A RxC grid of squares is drawn.  When the left mouse
 * button is pressed, all squares under the cursor position
 * have the blue component of their color changed.
 */
#include <stdlib.h>
#include <stdio.h>
#include "SDL.h"
#include "SDL_opengl.h"

enum {
    R = 300,   /* number of rows of squares */
    C = 400,   /* number of columns of squares */
    Z = 100,   /* granularity of blue component */
    RANGE=5,   /* range of pick matrix in each direction (screen pixels) */
    STEPS = 3, /* number of steps to cycle blue component */
    SZ=2       /* size of each square in "world" coordinates */
};
static const GLfloat RM = 1.0/R;
static const GLfloat CM = 1.0/C;
static const GLfloat ZM = 1.0/Z;

enum {
    WINWIDTH = SZ*C,  /* initial window width (screen pixels) */
    WINHEIGHT = SZ*R, /* initial window height (screen pixels) */
    WINBPP = 32,      /* bits per pixel to request */
    WINXPOS = 100,      /* requested window x-position (often ignored) */
    WINYPOS = 100,      /* requested window y-position */
    INCR = Z/STEPS*3/2  /* The integer increment of the blue component */
                        /*    each time a square is selected */
};

static int board[R][C];       /* amount of blue color for each square */
static int videoFlags;        /* Flags to pass to SDL_SetVideoMode */
static SDL_Surface *surf = 0; /* The SDL surface (window handle) */

/* Clear color value for every square on the board */
static void init_cols(void)
{
   int i, j;
   for (i = 0; i < R; i++)
      for (j = 0; j < C; j ++)
         board[i][j] = 0;
   glClearColor (1.0, 0.0, 0.0, 0.0);
}

/*  The R x C squares are drawn.  In selection mode, each
 *  square is given two names:  one for the row and the
 *  other for the column on the grid.  The color of each
 *  square is determined by its position on the grid, and
 *  the value in the board[][] array.
 */
static void drawSquares(GLenum mode)
{
   Uint32 start = SDL_GetTicks();
   GLuint i, j;
   for (i = 0; i < R; i++) {
      if (mode == GL_SELECT)
         glLoadName (i);

      for (j = 0; j < C; j ++) {
         if (mode == GL_SELECT)
            glPushName (j);
         glColor3f ((GLfloat) i*RM,
                    (GLfloat) j*CM,
                    (GLfloat) board[i][j]*ZM);
         glRecti (j, i, j+1, i+1);

         if (mode == GL_SELECT)
            glPopName ();
      }
   }
   /* SDL_GetTicks() returns an unsigned 32-bit count, so print it with %u */
   fprintf(stderr, "%d squares %s in %ums\n",
           R*C,
           mode == GL_SELECT ? "select" : "render",
           (unsigned)(SDL_GetTicks() - start));
}

/*  processHits prints out the contents of the
 *  selection array.
 */
static void processHits (GLint hits, GLuint buffer[])

{
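   unsigned int i, j;
   GLuint ii, jj, names, *ptr;
   /* printf ("hits = %d\n", hits); */
   if (hits < 0) {  /* glRenderMode returns -1 if the select buffer overflowed */
      fprintf(stderr, "ERROR: selection buffer overflow\n");
      return;
   }
   ptr = (GLuint *) buffer;
   for (i = 0; i < (unsigned)hits; i++) {	/*  for each hit  */
      names = *ptr;
      ii = R;  /* out-of-range defaults, in case a hit has fewer than two names */
      jj = C;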
      /* printf (" number of names for this hit = %d\n", names); */
      ptr++;
      /* printf("  z1 is %g;", (float) *ptr/0x7fffffff); */
      ptr++;
      /* printf(" z2 is %g\n", (float) *ptr/0x7fffffff); */
      ptr++;
      /* printf ("   names are "); */

      for (j = 0; j < names; j++) { /*  for each name */
         /* printf ("%d ", *ptr); */
         if (j == 0)  /*  set row and column  */
            ii = *ptr;
         else if (j == 1)
            jj = *ptr;
         ptr++;
      }

      /* printf ("\n"); */
      if (ii >= R || jj >= C) {
          fprintf(stderr,
                  "ERROR: name out of range: (%u, %u)\n",
                  (unsigned)ii, (unsigned)jj);
      } else {
          board[ii][jj] = (board[ii][jj] + INCR) % Z;
      }
   }
}

/*  pickSquares() sets up selection mode, name stack,
 *  and projection matrix for picking.  Then the
 *  objects are drawn.
 */
#define BUFSIZE 512
static void pickSquares(int button, int state, int x, int y)
{
   GLuint selectBuf[BUFSIZE];
   GLint hits;
   GLint viewport[4];

   /* just interpret all mouse clicks */
   (void)button;
   (void)state;

   glGetIntegerv (GL_VIEWPORT, viewport);
   glSelectBuffer (BUFSIZE, selectBuf);
   (void) glRenderMode (GL_SELECT);

   glInitNames();
   glPushName(0);
   glMatrixMode (GL_PROJECTION);
   glPushMatrix ();
   glLoadIdentity ();

   /*  create RANGExRANGE pixel picking region near cursor location */
   gluPickMatrix ((GLdouble) x,
                  (GLdouble) (viewport[3] - y),
                  RANGE,
                  RANGE,
                  viewport);
   gluOrtho2D (0.0, C, 0.0, R);
   drawSquares (GL_SELECT);
   glMatrixMode (GL_PROJECTION);
   glPopMatrix ();
   glFlush ();
   hits = glRenderMode (GL_RENDER);
   processHits (hits, selectBuf);
}

static void display(void)
{
   glClear(GL_COLOR_BUFFER_BIT);
   drawSquares (GL_RENDER);
   SDL_GL_SwapBuffers();
   glFlush();
}

static int reshape(int w, int h)
{
    if (w < C) w = C;
    if (h < R) h = R;
    fprintf(stderr, "Reshaping to %d x %d\n", w, h);

    /* get a SDL surface */
    fprintf(stderr, "[N] Calling SDL_SetVideoMode(%u, %u, %u, 0x%x)\n",
            w, h,
            WINBPP, videoFlags);
    surf = SDL_SetVideoMode(w, h, WINBPP, videoFlags );

    if (!surf) {
        fprintf(stderr, "[!CRITICAL!] Video mode set failed: %s\n", SDL_GetError( ));
        fprintf(stderr, "\tVideo flags:\n"
                         "\t\t%s SDL_OPENGL\n"
                         "\t\t%s SDL_HWPALETTE\n"
                         "\t\t%s SDL_RESIZABLE\n"
                         "\t\t%s SDL_FULLSCREEN\n"
                         "\t\t%s SDL_HWSURFACE\n"
                         "\t\t%s SDL_SWSURFACE\n"
                         "\t\t%s SDL_HWACCEL\n"
                         "\t\t%s SDL_GL_DOUBLEBUFFER\n",
                         videoFlags & SDL_OPENGL ? "yes" : "NO ",
                         videoFlags & SDL_HWPALETTE ? "yes" : "NO ",
                         videoFlags & SDL_RESIZABLE ? "yes" : "NO ",
                         videoFlags & SDL_FULLSCREEN ? "yes" : "NO ",
                         videoFlags & SDL_HWSURFACE ? "yes" : "NO ",
                         videoFlags & SDL_SWSURFACE ? "yes" : "NO ",
                         videoFlags & SDL_HWACCEL ? "yes" : "NO ",
                         videoFlags & SDL_GL_DOUBLEBUFFER ? "yes" : "NO ");
        return 1;
    }

    glViewport(0, 0, w, h);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluOrtho2D (0.0, C, 0.0, R);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    return 0;
}

static void makeSurface() {
    /* this holds some info about our display */
    const SDL_VideoInfo *videoInfo;

    /* initialize SDL */
    if ( SDL_Init( SDL_INIT_VIDEO ) < 0 ) {
        fprintf( stderr, "[N] Video initialization failed: %s\n", SDL_GetError( ) );
        return;
    }

    if (!(videoInfo = SDL_GetVideoInfo( ))) {
        fprintf( stderr, "[N] Video query failed: %s\n", SDL_GetError( ) );
        return;
    }
    /* the flags to pass to SDL_SetVideoMode */
    videoFlags  = SDL_OPENGL;          /* Enable OpenGL in SDL */
    videoFlags |= SDL_HWPALETTE;       /* Store the palette in hardware */
    videoFlags |= SDL_RESIZABLE;       /* Enable window resizing */

    /* This checks to see if surfaces can be stored in memory */
    videoFlags |= videoInfo->hw_available ? SDL_HWSURFACE : SDL_SWSURFACE;

    /* This checks if hardware blits can be done */
    if (videoInfo->blit_hw) videoFlags |= SDL_HWACCEL;

    /* Double buffering is a GL attribute rather than a video-mode flag, so
     * request it with SDL_GL_SetAttribute instead of OR-ing it into videoFlags */
    SDL_GL_SetAttribute( SDL_GL_DOUBLEBUFFER, 1 );

    if (videoFlags & SDL_HWSURFACE) {
        fprintf(stderr, "[N] Using hardware acceleration\n");
    }
}

static int mainloop() {
    SDL_Event event;
    reshape(WINWIDTH, WINHEIGHT);
    display();
    while (SDL_WaitEvent(&event)) {
        switch(event.type) {
        case SDL_USEREVENT:
            break;
        case SDL_MOUSEMOTION:
            /* handleMotion(event.motion) */
            continue;
        case SDL_MOUSEBUTTONDOWN:
            pickSquares(event.button.button, event.button.state, event.button.x, event.button.y);
            break;
        case SDL_MOUSEBUTTONUP:
            continue;
        case SDL_KEYUP:
        case SDL_KEYDOWN:
            if (event.key.keysym.sym == SDLK_ESCAPE)
                return 0;
            break;
        case SDL_VIDEOEXPOSE:
            break;
        case SDL_QUIT:
            return 0;
        case SDL_ACTIVEEVENT:
        case SDL_SYSWMEVENT:
            /* fprintf(stderr, "[E] Ignoring WM Events\n"); */
            break;
        case SDL_VIDEORESIZE:
            reshape(event.resize.w, event.resize.h);
            /* fprintf(stderr, "[E] Resize not supported!\n"); */
            break;
        case SDL_JOYAXISMOTION:
        case SDL_JOYBALLMOTION:
        case SDL_JOYHATMOTION:
        case SDL_JOYBUTTONDOWN:
        case SDL_JOYBUTTONUP:
            fprintf(stderr, "[E] Ignoring Joystick Events\n");
            break;
        default:
            fprintf(stderr, "[E] Ignoring unknown SDL Event type\n");
        }

        display();
    }
    return 1;
}


/* Program entry point: set up SDL, clear the board, then run the event loop */
int main(int argc, char** argv)
{
    (void)argc;
    (void)argv;
    makeSurface();
    init_cols();
    return mainloop();
}


/*
 * Copyright (c) 1993-1997, Silicon Graphics, Inc.
 * ALL RIGHTS RESERVED
 * Permission to use, copy, modify, and distribute this software for
 * any purpose and without fee is hereby granted, provided that the above
 * copyright notice appear in all copies and that both the copyright notice
 * and this permission notice appear in supporting documentation, and that
 * the name of Silicon Graphics, Inc. not be used in advertising
 * or publicity pertaining to distribution of the software without specific,
 * written prior permission.
 *
 * THE MATERIAL EMBODIED ON THIS SOFTWARE IS PROVIDED TO YOU "AS-IS"
 * AND WITHOUT WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR OTHERWISE,
 * INCLUDING WITHOUT LIMITATION, ANY WARRANTY OF MERCHANTABILITY OR
 * FITNESS FOR A PARTICULAR PURPOSE.  IN NO EVENT SHALL SILICON
 * GRAPHICS, INC.  BE LIABLE TO YOU OR ANYONE ELSE FOR ANY DIRECT,
 * SPECIAL, INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY
 * KIND, OR ANY DAMAGES WHATSOEVER, INCLUDING WITHOUT LIMITATION,
 * LOSS OF PROFIT, LOSS OF USE, SAVINGS OR REVENUE, OR THE CLAIMS OF
 * THIRD PARTIES, WHETHER OR NOT SILICON GRAPHICS, INC.  HAS BEEN
 * ADVISED OF THE POSSIBILITY OF SUCH LOSS, HOWEVER CAUSED AND ON
 * ANY THEORY OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE
 * POSSESSION, USE OR PERFORMANCE OF THIS SOFTWARE.
 *
 * US Government Users Restricted Rights
 * Use, duplication, or disclosure by the Government is subject to
 * restrictions set forth in FAR 52.227.19(c)(2) or subparagraph
 * (c)(1)(ii) of the Rights in Technical Data and Computer Software
 * clause at DFARS 252.227-7013 and/or in similar or successor
 * clauses in the FAR or the DOD or NASA FAR Supplement.
 * Unpublished-- rights reserved under the copyright laws of the
 * United States.  Contractor/manufacturer is Silicon Graphics,
 * Inc., 2011 N.  Shoreline Blvd., Mountain View, CA 94039-7311.
 *
 * OpenGL(R) is a registered trademark of Silicon Graphics, Inc.
 */

-- Trent Apted - 2008-02-26