I'm new to globulation, I tried it and liked it, but admittedly my main interest is porting it to the 770.
So I did some profiling because currently it doesn't even run smoothly on my 15" powerbook 1.5 Ghz ...
Now I am sitting on a reasonably large patch that helps in different area's:
The main offending problem is drawing, if you use SDL in software the alpha blitting is very slow. So I de-coupled the UI drawing from the rest of the engine, I do have some problems getting the framerate stable if frames need to be dropped. (The internal engine assumes we run at 25 fps, otherwise globs run faster ... however, running without drawing can be done at a speeds of 150+ fps on my laptop.) So I'll need some advise here. Please see somewhat further down.
When using opengl, the main problem becomes the gradient calculation. and then the minimap generation. And then the locate resources from gradient (unsure about how the func is called) ...
So I re-implemented the gradient calculation. I had an idea that would touch each tile only once, turned out the existing algorithm worked exactly the same, except that the code obfuscated everything, and there was no documentation in the code whatsoever. Also it seemed to have been optimized for memory instead of speed.
The re-implementation runs only a little faster, but it did enable me to merge the updateResourceGradient and updateGlobalGradient functions into one.Because the first already calculates some stuff that the second needs, 30% of the updateGlobalGradient was spend in recalculating this (making the first list of x,y coords where resource are located).
When I have some more spare hours, I will move the resource gradients into a container object, such that each gradient can be updated incrementally. And remove the horrible gradient[(y << wDec | x)] which is an example of memory optimization, while what we need is speed optimization gradient[y][x] ... (or gradient[y*256 + x] but lets leave that up to the compiler, shall we).
But as a final remark, the code is not in too good a shape. There is hardly any documentation or comments, which makes it hard to figure out what certain portions of the code do. And some programming practices are making it also difficult to read just the code: who can read this:
Code: Select all
Sint32 ticksToWait=computationAvailableTicks+ticksToDelayInside+missedTicksToWait;
It violates a few rules.
* First it declares a new variable mid function ... makes it hard to read where which data is use.
* There are no spaces between variables and operators, making it hard to read what it actually says. (WhatifIdon'tusespaceshere,canyoustillreadit?)
* the variable names don't are what they say they are, and are obscure
computationAvailableTicks is actually milliseconds, not Ticks, as SDL_GetTicks() gives you a timestamp (milliseconds since init of SDL). And the milliseconds are the time not spend in calculation; they are not available for calculation, as we are going to use them to wait! So ms_left_after_calculation is more accurate and readable.
(It is my personal opinion that camel casing is not useful as it is hard to read.)
Due to these issues I ignored the coding styles there were in the code, because this style actually leads to quality code ...
The new gradient implementataion:
Code: Select all
void Map::updateGlobalGradientSmall(Uint8 *gradient)
{
// this function does it the same way as previous function, except I wrote it :)
Uint8 * x_queue = new Uint8[size]; // todo, could be static!
Uint8 * y_queue = new Uint8[size];
Uint16 write_pos = 0;
Uint16 read_pos = 0;
Uint8 x;
Uint8 y;
Uint8 x_dec, x_inc, y_dec, y_inc; // x_dec = do_wrap(x - 1), x_inc = do_wrap(x + 1) ...
Uint8 v, cur_v; // value, current_value
Uint16 p; //position
// all squares with interesting items go in the queues to be processed
// this list should be given to the function, because the caller already knows about this list!
for (y=0; y < h; y++) {
for (x=0; x < w; x++) {
if (gradient[y << wDec | x] >= 3) {
y_queue[write_pos] = y;
x_queue[write_pos] = x;
write_pos++;
}
}
}
// each square in the queue is read and its neighbours are filled, unless already filled with higher number
// note that every square only gets touched once!
while (read_pos < write_pos) {
x = x_queue[read_pos];
y = y_queue[read_pos];
cur_v = gradient[y << wDec | x] - 1;
read_pos++;
x_dec = (x - 1) & wMask;
x_inc = (x + 1) & wMask;
y_dec = (y - 1) & hMask;
y_inc = (y + 1) & hMask;
// 1
p = y << wDec | x_dec;
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x_dec;
y_queue[write_pos] = y;
write_pos++;
}
// 2
p = (y << wDec | x_inc);
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x_inc;
y_queue[write_pos] = y;
write_pos++;
}
// 2
p = (y_dec << wDec | x);
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x;
y_queue[write_pos] = y_dec;
write_pos++;
}
// 4
p = (y_inc << wDec | x);
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x;
y_queue[write_pos] = y_inc;
write_pos++;
}
}
delete[] x_queue;
delete[] y_queue;
}
// a gradient is a number array that given a certain x,y gives you the closest resource
// this function puts 0 1 or 255 for no-go, can-go, resource then updateGlobalGradient generates the
// wave pattern, every neighbor of a 255 square gets 254, the neighbor of that get 253 etc ...
void Map::updateRessourcesGradient(int teamNumber, Uint8 ressourceType, bool canSwim)
{
Uint8 * x_queue = new Uint8[size]; // todo, could be static!
Uint8 * y_queue = new Uint8[size];
Uint16 write_pos = 0;
Uint16 read_pos = 0;
Uint8 x;
Uint8 y;
Uint8 x_dec, x_inc, y_dec, y_inc; // x_dec = do_wrap(x - 1), x_inc = do_wrap(x + 1) ...
Uint8 v, cur_v; // value, current_value
Uint16 p; //position
Uint8 *gradient=ressourcesGradient[teamNumber][ressourceType][canSwim];
assert(gradient);
Uint32 teamMask=Team::teamNumberToMask(teamNumber);
assert(globalContainer);
Case *c; // the tile we are about to check
for (y=0; y < h; y++) {
for (x=0; x < w; x++) {
p = y << wDec | x;
c = cases + p; // get current tile pointer
// there is something to discuss here:
// this means that going to somewhere, included knowing a route has been blocked somehow
// eventhough globs should know this because of fog of war and discovery ...
if (c->forbidden & teamMask) {
gradient[p] = 0;
} else if (c->ressource.type == NO_RES_TYPE) {
if (c->building != NOGBID) {
gradient[p] = 0;
} else if (!canSwim && (c->terrain >= 256 && c->terrain < 16 + 256)) {//!canSwim && isWater
gradient[p] = 0;
} else {
gradient[p] = 1; // can only walk here
}
} else if (c->ressource.type == ressourceType) {
// there is the longed for resource type on this tile!!
if (globalContainer->ressourcesTypes.get(ressourceType)->visibleToBeCollected && !(fogOfWar[p] & teamMask)) {
gradient[p] = 0;
} else {
// this is the thing we were looking for
gradient[p] = 255;
x_queue[write_pos] = x;
y_queue[write_pos] = y;
write_pos++;
}
} else {
// there is nothing on this tile
gradient[p] = 0;
}
}
}
// each square in the queue is read and its neighbours are filled, unless already filled with higher number
// note that every square only gets touched once!
while (read_pos < write_pos) {
x = x_queue[read_pos];
y = y_queue[read_pos];
cur_v = gradient[y << wDec | x] - 1;
read_pos++;
x_dec = (x - 1) & wMask;
x_inc = (x + 1) & wMask;
y_dec = (y - 1) & hMask;
y_inc = (y + 1) & hMask;
// 1
p = y << wDec | x_dec;
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x_dec;
y_queue[write_pos] = y;
write_pos++;
}
// 2
p = (y << wDec | x_inc);
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x_inc;
y_queue[write_pos] = y;
write_pos++;
}
// 2
p = (y_dec << wDec | x);
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x;
y_queue[write_pos] = y_dec;
write_pos++;
}
// 4
p = (y_inc << wDec | x);
v = gradient[p];
if (v < cur_v && v > 0) {
gradient[p] = cur_v;
x_queue[write_pos] = x;
y_queue[write_pos] = y_inc;
write_pos++;
}
}
delete[] x_queue;
delete[] y_queue;
}
1) in the Engine main loop, we wait for the network, do we wait specifically in the place where we do, or can we wait at any place. The problem is that the waiting code is now spread out through the loop, and it will be much more accurate when we place it completely at the end of the loop, where we know exactly how much time we spend where ... and how much time we need to make up for, or we need to sleep in order to get at 25 fps ...
Code: Select all
net->ticksToDelayInside();
Code: Select all
bool Map::getGlobalGradientDestination(Uint8 *gradient, int x, int y, Sint32 *targetX, Sint32 *targetY)
hope somebody takes the time to respond to this way to long forum message, but thanks for getting this far anyway!
happy hacking!