I have always loved programming - its like Lego without gravity.

Basic on my ZX81 graduating to assembler and Turbo Pascal during my teens.

Developed phone OS software - engineer, architect, product manager - but got made irrelevant by the iPhone and redundant by Android.

These days I mostly work with data, big data and fitting big data onto small boxes.

Drawing RTS maps fast

image

Techie bit follows, but first here’s some background:

At high-school I played the defining Dune 2 RTS game, and then at college I played C&C Tiberian Sun.  I poked around (internet was dial-up and slow in those days) and became part of the then-thriving game-modding community.  They had reverse-engined the Tiberian Sun model format, but no editors existed.  So I made one.  Will’s Voxel Editor, as it was called, was hugely popular in the limited world of Tiberian Sun modding.  I worked closely with a modder called Mr War - all modders had strange anonymous handles on-line in those days except me - and we made some really cool mods and games.

And I played Age of Empires II, without modding it, and then I kind of stopped playing games.

Fast forward to a couple of years ago and, in an idle moment, I went looking for modern, open-source RTS games.  And that’s when I found the Glest engine and I got in touch with Mr War again and we went to relive the glory days of our youth.

So I’ve been helping out on the Glest game, which is still under steady and active development.  I added stuff that Mr War wanted for mods such as chained particle effects, and then I moved my focus to the main game engine itself.  Here’s a nuke in our sci-fi mod, so go eradicate the martian invaders!:

I quickly moved on to making my own engines, but as I experiment with stuff that is relevant to Glest I still go back and try and help out getting it in.  I’ve done a few Ludum Dare contest entries with Mr War in the last year or so and its a super fun way to recharge mental batteries over a weekend.

The problem

Glest was written by college students (who later joined an EA studio as I understand) in the OpenGL 1.0 style, which was entirely appropriate in those days.  These days its a serious speed-brake and this has to move to vertex buffers (VBOs) and shaders.

So how do you feed the triangles that make up the landscape to the GPU fast enough?

Imagine you lay out your VBO in a normal sequence:

image

You upload the map to the GPU as a VBO.

In a 3D scene like this the camera is a frustum:

image

and so the part of the map you need to draw is a isoceles trapezoid:

image

So to draw the scene we have to tell the GPU to render the following VBO spans:

37, 53-54, 68-72, 84-89, 100-105, 115-121, 131-137, 147-153

So that’s 8 calls to DrawArrays or DrawElements or whatever.

Recursion

Are there alternative ways of ordering the scene so that we can easily emit less spans?  I came up with an alternative:

Imagine you have just a 2x2 map.  You’d draw it like this:

image

Simple enough.  You visit the top left, then top right, then bottom left then bottom right.  Four quadrants. Now imagine you are drawing a 4x4 map.  Instead of going across in scanlines, instead recurse into each quadrant:

image

This is easy enough to do and work with.  So now lets see how many spans our camera needs us to draw:

image

27, 30, 48-63, 98-99, 104-107, 133, 135, 144-151, 192-195

9 spans, but we can easily merge small gaps between spans and do some harmless overdraw:

27-30, 48-63, 98-107, 133-134, 144-151, 192-195 say.  6 spans.  And so on.

Experimental results were very exciting but ultimately this was all the wrong approach:

It turned out to be massively faster to just make a VBO from the visible tiles each and every time the camera moves!  The upload cost to the GPU was not a problem!

In webGL the expense of the always upload could well be much higher and the approach would win; in webGL the cost of glDrawElements would be prohibitive because the driver has to verify each and every index so glDrawArrays would be better.  And we have found glDrawRangeElements/glDrawRangeArrays to be terribly poor performing too.  You could also imagine a hybrid; use the recursively built quadrant VBO, find the biggest spans and emit them, and build a temporary span from all the small ones and upload only that each time the camera moves.

This wasn’t the last time we dramatically improved things for the low-end graphics card users, and I’ll explain more recent optimisations in another blog post.

jump to ↓



performance
Faster searches with non-prefix fields in composite indices
Compressing MySQL databases
What highscalability.com says about Scaling my Server
Scaling my Server: follow-up
old classics
The kid's computer
Making the History of Worlds Religions map
If you defend those involved in the OpenGL ES specification, you are an idiot
Stackoverflow unwinding?
general
Why Swift?
Python annotations and type checking
pycon 2014 Sweden: the bad bits
Table-based Template Translation in C++
recreation
games programming
Perlin Noise
Perlin Noise
Drawing RTS maps fast
WillCity update
ludum-dare
Ludum Dare #35 Mosaic
LudumDare 33 wallpapers
SSIM vs MSE for Mosaics
Ludum Dare 30 results are in!