Performance and screenSpace culling

Questions about the LÖVE API, installing LÖVE and other support related questions go here.
Forum rules
Before you make a thread asking for help, read this.
User avatar
monsieur_h
Citizen
Posts: 65
Joined: Tue Oct 30, 2012 4:43 pm

Performance and screenSpace culling

Post by monsieur_h »

Hey, I have a question of general performance that I can't solve with my humble knowledge of LOVE inner parts. I recently decided to implement a kind of RenderQueue myself, using Slither OOP library and HUMP, mainly for vectors and camera objects. In essence a RenderQueue is an object taking care of everything that has to be rendered to the screen. It has two main goals : 1: depths management (who is drawn when) and 2:optimization. As you may guess, I stumbled on the later.

Ordering occurs via a object.zindex variable. But table sorting being a heavy operation I told myself: "Hey, why not remove every object that is not in my scene before sorting the queue ?". It's known as frustum culling when it is about 3D games. I call it screenSpace culling for 2d as I have no better word to describe it.

RenderQueue also performs a test to know wherever I have to change love.graphics.setColor() and camera:attach() or camera:detach(). The goal being, minimizing the calls to theses function. (Feel free to tell me if it's a stupid idea :huh: ).

I made a test where I draw a few thousand images randomly positioned in and out of the screen space. It appears that activating the culling makes it slightly SLOWER than a standard call. I'll provide a .love demonstrating it soon if I'm asked to. By the time here is the code I use in my RenderQueue object:

Code: Select all

------------------------------------------------------------------------
-- @class RenderQueue
-- Renders a list of drawable to the screen
------------------------------------------------------------------------

class "RenderQueue"{
}

function RenderQueue:__init__()
    self.queue          =   {}
    self.camera         =   nil
    self.cameraAABB     =   nil
    self.attached       =   false
    self.screenAABB     =   AABB(0, 0, love.graphics.getWidth(), love.graphics.getHeight() )
    self.debug          =   {queueSize=0, renderTime=0, culledItems=0}
end

function RenderQueue:addDrawableObject( p_element )
    assert( p_element )
    assert( isinstance(p_element, DrawableObject) or issubclass(p_element, DrawableObject) )
    table.insert( self.queue, p_element )
end

function RenderQueue:registerCamera( p_camera )
    assert( p_camera )
    self.camera =   p_camera
end

function RenderQueue:sort()
    table.sort( self.queue, function(a,b) return a.zindex<b.zindex end )
end

function RenderQueue:setCurrentColor( p_color )
    assert( p_color )
    love.graphics.setColor( p_color.r, p_color.r, p_color.b, p_color.a )
    self.currentColor   =   p_color
end

function RenderQueue:draw()
    self.debug.queueSize    =   #self.queue
    local startTime =   love.timer.getMicroTime()
    self:screenSpaceCulling()
    self.debug.culledItems  =   (self.debug.queueSize - #self.queue)
    self:sort()
    for i, v in ipairs( self.queue ) do
        if not ( self.currentColor == v.color ) then
            self:setCurrentColor( v.color )
        end
        if not ( self.attached == v.attached ) then
            self:setCameraAttached( v.attached )
        end
        v:draw()
    end
    self:clear()
    local endTime           =   love.timer.getMicroTime()
    self.debug.renderTime   =   (endTime - startTime)
end

function RenderQueue:registerCamera( p_camera )
    self.camera     =   p_camera
end

function RenderQueue:setCameraAttached( p_bool )
    assert( self.camera )
    if p_bool then
        self.camera:attach()
        self.attached   =   true
    else
        self.camera:detach()
        self.attached   =   false
    end
end

function RenderQueue:clear()
    self.queue  =   {}
    if self.attached then
        self:setCameraAttached( false )
    end
end

function RenderQueue:getCameraAABB()
    if self.camera then
        if self.camera.rot == 0 then --already aligned, quick AABB
            local tlx, tly  =    self.camera:worldCoords( 0, 0 )
            local brx, bry  =   self.camera:worldCoords( love.graphics.getWidth(), love.graphics.getHeight() )
            return AABB( tlx, tly, brx, bry )
        else
            local tlx, tly  =    self.camera:worldCoords( 0, 0 )
            local brx, bry  =   self.camera:worldCoords( love.graphics.getWidth(), love.graphics.getHeight() )
            return AABB( math.min(tlx, brx), math.min(tly,bry), math.max(tlx, brx), math.max(tly,bry) )
        end
    end
end

function RenderQueue:screenSpaceCulling()
    --Removes elements that are not in the screen or in its projection
    local camBox    =   self:getCameraAABB()
    --Reverse loop idea from:
    --http://stackoverflow.com/questions/12394841/safely-remove-items-from-an-array-table-while-iterating
    --Consider using the 3 loops answers for more perfs
    local tremove   =   {}
    for i=#self.queue,1,-1 do
        if self.queue[i].attached then
            if not camBox:crosses( self.queue[i].boundingBox ) then
                table.remove( self.queue, i )
            end
        else
            if not v.boundingBox:crosses( screenAABB ) then
                table.remove( self.queue, i )
            end
        end
    end
end

function RenderQueue:getDebugInfo()
    return "Queue size: "..self.debug.queueSize.."\n"
            ..self.debug.culledItems.." items culled".."\n"
            .."in "..self.debug.renderTime
end
My question is: why is it slower? Am I trying to do something stupid, or I am I doing it wrong?
User avatar
daviddoran
Prole
Posts: 30
Joined: Sun Mar 24, 2013 5:35 am

Re: Performance and screenSpace culling

Post by daviddoran »

I'll have a read over it a little later...but a .love would be great!
User avatar
bartbes
Sex machine
Posts: 4946
Joined: Fri Aug 29, 2008 10:35 am
Location: The Netherlands
Contact:

Re: Performance and screenSpace culling

Post by bartbes »

I assume you're talking about the culling function in particular, but you've got to be aware that table.insert and table.remove (the one you're using) are relatively slow on huge tables. Maybe there's an easy and efficient way to nil them, then remove the gaps afterwards. Alternatively, maybe a new table and a counter, so you can do direct insertions, instead of table.insert.
User avatar
markgo
Party member
Posts: 190
Joined: Sat Jan 05, 2013 12:21 am
Location: USA

Re: Performance and screenSpace culling

Post by markgo »

You may be interested in a spatial hash to prevent off screen drawing. You would index the range of cells in a spatial hash with your viewing window, and only draw what's in the cells. It's way more efficient than testing AABB's for every object. As for depth management, I think brute force table sorting is good enough. You can also look at binary trees or skip lists to manage layer drawing order.
User avatar
monsieur_h
Citizen
Posts: 65
Joined: Tue Oct 30, 2012 4:43 pm

Re: Performance and screenSpace culling

Post by monsieur_h »

bartbes wrote:I assume you're talking about the culling function in particular, but you've got to be aware that table.insert and table.remove (the one you're using) are relatively slow on huge tables. Maybe there's an easy and efficient way to nil them, then remove the gaps afterwards. Alternatively, maybe a new table and a counter, so you can do direct insertions, instead of table.insert.
I managed to do so on the remove part. It doesn't seem to improve very much. Maybe I did it wrong. Feel free to try the .love as well. Here is the new code:

Code: Select all

function RenderQueue:screenSpaceCulling()
    --Removes elements that are not in the screen or in its projection
    local camBox    =   self:getCameraAABB()
    --Reverse loop idea/3loops ideas from:
    --http://stackoverflow.com/questions/12394841/safely-remove-items-from-an-array-table-while-iterating
    local tremove   =   {}
    for i=#self.queue,1,-1 do --Loop 1 : identify what to remove
        if self.queue[i].attached then
            if not camBox:crosses( self.queue[i].boundingBox ) then
                --~ table.remove( self.queue, i )
                tremove[self.queue[i]]=true
            end
        else
            if not v.boundingBox:crosses( screenAABB ) then
                --~ table.remove( self.queue, i )
                tremove[self.queue[i]]=true
            end
        end
    end

    local n=#self.queue
    for i=1,n do --Loop 2 : passing values to nil
            if tremove[self.queue[i]] then
                    self.queue[i]=nil
            end
    end

    local j=0
    for i=1,n do --Loop 3 : building the new array from not deleted items
            if self.queue[i]~=nil then
                    j=j+1
                    self.queue[j]=self.queue[i]
            end
    end
    for i=j+1,n do --Freeing memory for the rest of the array
            self.queue[i]=nil
    end
end
daviddoran wrote:I'll have a read over it a little later...but a .love would be great!
Here goes the file ! Press C to toggle culling.

Also, here is a small draft diagram of what I'm doing. I only have the "engine" package running yet. I put it there in case it helps you to follow my idea. Basically, every drawable as an Axis Aligned Bounding Box (AABB) that I test either against camera coordinates(if it's attached) or screen coordinates. Objects that doesn't fit are "culled" from the queue.

Thank you for your advices
Attachments
classDiagramEngine.png
classDiagramEngine.png (30.81 KiB) Viewed 5270 times
culling.love
(397.51 KiB) Downloaded 111 times
User avatar
bartbes
Sex machine
Posts: 4946
Joined: Fri Aug 29, 2008 10:35 am
Location: The Netherlands
Contact:

Re: Performance and screenSpace culling

Post by bartbes »

I meant more along the lines of:

Code: Select all

function RenderQueue:screenSpaceCulling()
    --Removes elements that are not in the screen or in its projection
    local camBox    =   self:getCameraAABB()
    --Reverse loop idea/3loops ideas from:
    --http://stackoverflow.com/questions/12394841/safely-remove-items-from-an-array-table-while-iterating
    local new   =   {}
    local nc = 1
    for i=#self.queue,1,-1 do --Loop 1 : identify what to remove
        if self.queue[i].attached then
            if camBox:crosses( self.queue[i].boundingBox ) then
                --~ table.remove( self.queue, i )
                --tremove[self.queue[i]]=true
                new[nc] = self.queue[i]
                nc = nc + 1
            end
        else
            if v.boundingBox:crosses( screenAABB ) then
                --~ table.remove( self.queue, i )
                --tremove[self.queue[i]]=true
                new[nc] = self.queue[i]
                nc = nc + 1
            end
        end
    end
    self.queue = new
end
[/quote]
User avatar
kikito
Inner party member
Posts: 3153
Joined: Sat Oct 03, 2009 5:22 pm
Location: Madrid, Spain
Contact:

Re: Performance and screenSpace culling

Post by kikito »

I would start by using some sort of measuring, to see where time is being spent. When studying performance issues, evidence is king.

I remember someone had a very useful tool for studying the performance of a Lua script on this forum. Unfortunately I don't remember who it was or where was the script :(
When I write def I mean function.
User avatar
daviddoran
Prole
Posts: 30
Joined: Sun Mar 24, 2013 5:35 am

Re: Performance and screenSpace culling

Post by daviddoran »

kikito you may be thinking of the callgrind profiler posted recently:

http://love2d.org/forums/viewtopic.php? ... ile#p98245
User avatar
monsieur_h
Citizen
Posts: 65
Joined: Tue Oct 30, 2012 4:43 pm

Re: Performance and screenSpace culling

Post by monsieur_h »

bartbes wrote:I meant more along the lines of:
[some code]
...
Oh, it's clearer now. :awesome: I tried it, with a slight improvement over my 3-loop-thing. But still worse than a not-culled version. As kikito adviced, I'll take a moment to investigate where the program spends most time. I strongly suspect the AABB test to be less than optimal. But if you happen to recall that script name, itwould be welcome.
markgo wrote:You may be interested in a spatial hash to prevent off screen drawing. You would index the range of cells in a spatial hash with your viewing window, and only draw what's in the cells.
I'll read on skip lists and b-trees for sorting. About spatial hash, I've heard a lot of things about quad-trees and such, but I can't imagine how they can be faster, as we have to update them as well. Do you have a good documentation to begin on this topic?
User avatar
Kadoba
Party member
Posts: 399
Joined: Mon Jan 10, 2011 8:25 am
Location: Oklahoma

Re: Performance and screenSpace culling

Post by Kadoba »

kikito wrote:I would start by using some sort of measuring, to see where time is being spent. When studying performance issues, evidence is king.

I remember someone had a very useful tool for studying the performance of a Lua script on this forum. Unfortunately I don't remember who it was or where was the script :(
Do you mean ProFi?
Post Reply

Who is online

Users browsing this forum: No registered users and 143 guests