## PR3 final assignment

### PR3 final assignment

Hi guys,

The description of the final assignment for PR3 is now up on N@TSchool. Have fun!

- Jacco.

The description of the final assignment for PR3 is now up on N@TSchool. Have fun!

- Jacco.

### Re: PR3 final assignment

Just to make sure, pre-calculating still counts as a High Level Optimization right?

### Re: PR3 final assignment

No, in general.

A high-level optimization is one that reduces algorithmic complexity. So you start with a sorting algorithm that is O(N^2), and you replace it with one that is O(N log N); that is a high-level optimization.

Sometimes however, precalculations achieve this. And sometimes, one O(N) algo is cheaper than another, e.g. when you replace sprite scaling (which has a constant cost per pixel, thus O(N)) with precalculated sprites (also O(N), but now the cost per pixel is much lower). This would count as a high-level optimization.

So, you need to replace an algorithm to claim it as a high-level optimization. If this involves precalculations, you're fine. But often, the precalculations replace smaller scale operations, and do not effect the algorithm itself.

Does that help? Please say if it's still unclear; this is has proven to be a tricky concept in previous years.

- Jacco.

A high-level optimization is one that reduces algorithmic complexity. So you start with a sorting algorithm that is O(N^2), and you replace it with one that is O(N log N); that is a high-level optimization.

Sometimes however, precalculations achieve this. And sometimes, one O(N) algo is cheaper than another, e.g. when you replace sprite scaling (which has a constant cost per pixel, thus O(N)) with precalculated sprites (also O(N), but now the cost per pixel is much lower). This would count as a high-level optimization.

So, you need to replace an algorithm to claim it as a high-level optimization. If this involves precalculations, you're fine. But often, the precalculations replace smaller scale operations, and do not effect the algorithm itself.

Does that help? Please say if it's still unclear; this is has proven to be a tricky concept in previous years.

- Jacco.

### Re: PR3 final assignment

So when you have an algorithm that is O(n*k) where k = 100, reducing k to 50 would count as a high-level optimization?

Eric Polman

### Re: PR3 final assignment

Yes, unless 'k' denotes the cost of doing one operation. In 'big O' notation, you normally leave out the constants, so any algorithm that requires N iterations for N items in the input data set is O(N). Obviously, if one operation is a http lookup, the cost is going to be different from a situation where one operation is a dot product on data in L1 cache, but in terms of complexity, these algorithms would have the same complexity (they both scale linearly).

Strictly, an optimization is only high-level if you reduce time complexity, so O(N) becomes O(logN), or O(N^3) becomes O(N^2). In your example, O(N*k) should become O(logN * k) or O(N * log k). However, if reducing the constant factor is caused by a change in algorithm, I still consider it a high-level optimization. So replacing a sprite scale function by displaying pre-scaled sprites would be considered high-level.

More on this in tomorrow's optional work lecture at 13.00.

- Jacco.

Strictly, an optimization is only high-level if you reduce time complexity, so O(N) becomes O(logN), or O(N^3) becomes O(N^2). In your example, O(N*k) should become O(logN * k) or O(N * log k). However, if reducing the constant factor is caused by a change in algorithm, I still consider it a high-level optimization. So replacing a sprite scale function by displaying pre-scaled sprites would be considered high-level.

More on this in tomorrow's optional work lecture at 13.00.

- Jacco.

### Re: PR3 final assignment

Has anyone figured out a way to improve the Sprite::Draw function yet? And the background drawing? I can't seem to find any improvements in those (Without SIMD and multithreading)

### Re: PR3 final assignment

Here's the code I promised, with the SIMD / masking sticky balls:

Code: Select all

```
// Template, major revision 5
// IGAD/NHTV - Jacco Bikker - 2006-2013
#include "string.h"
#include "surface.h"
#include "stdlib.h"
#include "template.h"
#include "game.h"
using namespace Tmpl8;
static union { __m128 bx4[16]; float bx[64]; };
static union { __m128 by4[16]; float by[64]; };
static union { __m128 vx4[16]; float vx[64]; };
static union { __m128 vy4[16]; float vy[64]; };
Sprite* ball = new Sprite( new Surface( "assets/ball.png" ), 1 );
void Game::Init()
{
for ( int i = 0; i < 64; i++ )
{
bx[i] = Rand( SCRWIDTH - 100 ) + 50;
by[i] = Rand( SCRHEIGHT - 100 ) + 50;
vx[i] = Rand( 1 ) - 0.5f;
vy[i] = Rand( 1 ) - 0.5f;
float rl = 1.0f / sqrtf( vx[i] * vx[i] + vy[i] * vy[i] );
vx[i] *= rl * 0.1f;
vy[i] *= rl * 0.1f;
}
}
void Game::Tick( float a_DT )
{
m_Screen->Clear( 0 );
for ( int i = 0; i < 64; i++ )
{
ball->Draw( bx[i], by[i], m_Screen );
bx[i] += vx[i];
by[i] += vy[i];
if (bx[i] < 0) bx[i] = 0, vx[i] = -vx[i];
if (bx[i] > (SCRWIDTH - 50)) bx[i] = SCRWIDTH - 50, vx[i] = -vx[i];
if (by[i] < 0) by[i] = 0, vy[i] = -vy[i];
if (by[i] > (SCRHEIGHT - 50)) by[i] = SCRHEIGHT - 50, vy[i] = -vy[i];
}
// evade
#if 1
for ( int i = 0; i < 64; i++ )
{
__m128 i4 = _mm_set_ps1( float( i ) );
for ( int j = 0; j < 16; j++ ) // if (j != i)
{
// i i i i
// j*4 j*4+1 j*4+2 j*4+3
__m128 j4 = _mm_cvtepi32_ps( _mm_set_epi32( j * 4 + 3, j * 4 + 2, j * 4 + 1, j * 4 + 0 ) );
__m128 mask1 = _mm_cmpneq_ps( i4, j4 );
// float dx = bx[i] - bx[j];
// float dy = by[i] - by[j];
__m128 dx4 = _mm_sub_ps( _mm_set_ps1( bx[i] ), bx4[j] );
__m128 dy4 = _mm_sub_ps( _mm_set_ps1( by[i] ), by4[j] );
// float d = sqrtf( dx * dx + dy * dy );
__m128 d4 = _mm_sqrt_ps( _mm_add_ps( _mm_mul_ps( dx4, dx4 ), _mm_mul_ps( dy4, dy4 ) ) );
__m128 mask2 = _mm_cmplt_ps( d4, _mm_set_ps1( 50.0f ) );
__m128 finalmask = _mm_and_ps( mask1, mask2 );
// if (d < 50)
// float rd = 1.0f / d;
__m128 rd4 = _mm_rcp_ps( d4 );
// float ndx = dx * rd;
union { __m128 ndx4; float ndx[4]; };
ndx4 = _mm_mul_ps( dx4, rd4 );
// float ndy = dy * rd;
union { __m128 ndy4; float ndy[4]; };
ndy4 = _mm_mul_ps( dy4, rd4 );
ndx4 = _mm_and_ps( ndx4, finalmask );
ndy4 = _mm_and_ps( ndy4, finalmask );
// bx[i] += ndx, by[i] += ndy;
// bx[j] -= ndx, by[j] -= ndy;
for ( int q = 0; q < 4; q++ )
{
bx[i] += ndx[q];
by[i] += ndy[q];
}
bx4[j] = _mm_sub_ps( bx4[j], ndx4 );
by4[j] = _mm_sub_ps( by4[j], ndy4 );
}
}
#else
for ( int i = 0; i < 64; i++ )
{
for ( int j = 0; j < 64; j++ ) if (j != i)
{
float dx = bx[i] - bx[j];
float dy = by[i] - by[j];
float d = sqrtf( dx * dx + dy * dy );
if (d < 50)
{
float rd = 1.0f / d;
float ndx = dx * rd;
float ndy = dy * rd;
bx[i] += ndx;
by[i] += ndy;
bx[j] -= ndx;
by[j] -= ndy;
}
}
}
#endif
}
```

### Re: PR3 final assignment

Well for starters, in the Sprite::Draw function is an if condition in a loop that doesn't has to be in that loop.122828 wrote:Has anyone figured out a way to improve the Sprite::Draw function yet? And the background drawing? I can't seem to find any improvements in those (Without SIMD and multithreading)

For the background drawing, have you ever heard of the function called memcpy? It's the fastest way of copying data (that I know of). You can only use it with surfaces that are the same size. Luckily the background and the screen buffer are

EDIT: It might also be helpful to look at how sequential the data is in a sprite.

- Max Oomen 2GA-2PR (Avatar was made by Anthea van Leeuwen)

### Re: PR3 final assignment

Thanks 130802 for your answer. For the background I already use 1 line of code that does memcpy. Still it takes about 2.5 million ticks (with timerRDTSC) which is quite a lot, isn't it? Maybe I'm just wrong and it's supposed to be taking that amount of ticks. Everything is relatively fast ATM sitting on 250 FPS with 750 total tanks in the field. The problem now is, when tanks die they produce this smoke particle. When all of my tanks have died (150) it takes 100 FPS from the game to produce the smoke. That stuff is killing me Anyone found something nice for that yet?

### Re: PR3 final assignment

Or am I the only one experiencing this issue?

### Who is online

Users browsing this forum: No registered users and 0 guests