Into the many-fold

Ignace Saenen 3 days ago

New @kickstarter for @Larian! Join in on the journey to build the sequel to one of the most fun RPG games of late! twitter.com/LarAtLarian/st…

Ignace Saenen 3 days ago

@pimathman @nanexllc looks a lot like zenway.ru/uploads/09_201… imagine all the rhythmic trading blasting through your speakers..

Ignace Saenen 4 days ago

Congrats to @monkube for receiving a Creative Europe grant! eacea.ec.europa.eu/sites/eacea-si…

Ignace Saenen 1 month ago

The Guardian Ed Vaizey - video games are as important to British culture as cinema gu.com/p/4akm8

Ignace Saenen 2 months ago

Neato idea: Hi' = Lo + H264d( ipv6 (H264e(Hi - Lo) ) ) in cs.duke.edu/~lpcox/mobi093…

Ignace Saenen 4 months ago

Examining flower cells in 3D engt.co/1I5gpcS via @engadget

Ignace Saenen 4 months ago

@delijn The new Ghent tram is looking for a name. "Pierewoajer"? #tramzktnaam

Ignace Saenen 5 months ago

@rianflo @paniq hsvo and precomputed cache-bits then?

Ignace Saenen 5 months ago

@LargeMeshes Lucy is famous, and so is this scene. What's your biggest mesh? pic.twitter.com/6B0ANCMghw

Ignace Saenen 5 months ago

Thanks @pixar for #free #nonprofit #renderman renderman.pixar.com/view/non-comme…. Animation sector up for a Game Tech type of float or drown?

Ignace Saenen 5 months ago

3rd VR meetup in #Brussels (BE) - schedule: meetup.com/Virtual-Realit… #VR #virtualreality #oculusrift #hololens #wearables @iMinds

Be.VR Meet-up 3. @ Co-Station (Brussels)

Thursday, Mar 26, 2015, 6:30 PM



Find here the schedule of our upcoming BE.VR Meet-up on March 26. 18.30: Demos by Big Bad Wolf, Toy Plane Heroes (GER), VR Intelligence, VR Pinball Controller (GER), Rewind FX (UK), Andrew Curtis, Ouat!, Vigo Universal, Nozon, Sidema, Projet Arturo - Martin Teller (treatment of phantom pain). 19.15: Presentations (10 min each). Keynote: Andrew Curtis...


Ignace Saenen 7 months ago

@rygorous in soviet russia, bink audio etc..

Ignace Saenen 9 months ago

Fellow #demoscene friend #gongo a.k.a. Dimitri Smits suddenly died age 38 on 17/11. My thoughts are with family and friends. We'll miss you.

Ignace Saenen 9 months ago

@peter_lambert @MMLab_UGent @iMinds @ResearchUGent Thanks! None of the 1000's pathfinding agents were harmed in this research!

Ignace Saenen 10 months ago

If the ocean is too big, you need a bigger boat: youtu.be/nqERLsNTnXk via @YouTube

Ignace Saenen 10 months ago

Sick at home, rain outside: First attempt at SSAO: pic.twitter.com/8kz0trU6z7

Ignace Saenen 10 months ago

Looking forward to #iMindsConf. Come pay us a visit and see our #RTS game where the GTTF team @ #MMLab shows off #nextgen #pathplanning

Ignace Saenen 11 months ago

x 10..

Ignace Saenen 12 months ago

@pointinpolygon FreeImage_OpenMultiBitmap with FIF_GIF, fyi

Ignace Saenen 12 months ago

Thanks @mike_acton for sharing a great cppcon talk! github.com/CppCon/CppCon2…

Subdivision surfaces


Examples of subdivision surfaces on triangle meshes, 3 iterations.

I started to play around a bit with subdivision surface methods.

A distinction is made between mesh smoothing and mesh subdivision: the first aims to make the surface smooth (smoothness being defined as either C1 or C2 continuity), while the latter aims to represent the same surface with more triangles (without changing the geometry).

There are a number of subdivision schemes you can use to subdivide meshes; most of them relate to B-splines and other polynomial curves, but I’m a bit rusty on the math for that, so instead I found some great resources (see below for links). I’ve implemented the famous Loop subdivision (using the Warren scheme) and √3 subdivision, both of which operate on triangle meshes.
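
For reference, the √3 vertex rules are compact enough to sketch. The snippet below is only a minimal illustration of the two rules from the paper linked below, not code from my framework; System.Numerics.Vector3 and the adjacency inputs are stand-ins.

// Minimal sketch of the √3 subdivision vertex rules (Kobbelt).
// System.Numerics.Vector3 stands in for whatever vector type the mesh code uses.
using System;
using System.Collections.Generic;
using System.Numerics;

public static class Sqrt3Rules
{
    // A new vertex is inserted at the centroid of every triangle (a, b, c).
    public static Vector3 Centroid(Vector3 a, Vector3 b, Vector3 c)
        => (a + b + c) / 3f;

    // Every original interior vertex of valence n is relaxed towards the average
    // of its 1-ring neighbours with weight alpha_n = (4 - 2*cos(2*pi/n)) / 9.
    public static Vector3 Relax(Vector3 v, IReadOnlyList<Vector3> neighbours)
    {
        int n = neighbours.Count;
        float alpha = (4f - 2f * (float)Math.Cos(2.0 * Math.PI / n)) / 9f;

        Vector3 sum = Vector3.Zero;
        foreach (var p in neighbours) sum += p;

        return (1f - alpha) * v + alpha * (sum / n);
    }
}

After inserting the centroids and relaxing the original vertices, the old edges are flipped to complete one step; every step triples the triangle count.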

Something I finally grasped is that when they talk about even and odd vertices, they mean the position of the vertex in an ordered list: even means the original vertices from the base mesh (which may have been moved), and odd means the vertices that are created (i.e. computed from the base mesh) and inserted in between. I never could get my head around that terminology, but it starts to make sense once you look at it that way.
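
To make that even/odd distinction concrete, here is a sketch of the two interior Loop rules with the simplified Warren weights. This is just an illustration in C# (not my actual implementation), and boundary vertices are handled separately, as described next.

// Sketch of the interior Loop subdivision rules with Warren's simplified weights.
// "Odd" = newly created edge vertices, "even" = repositioned original vertices.
using System.Collections.Generic;
using System.Numerics;

public static class LoopRules
{
    // Odd vertex: inserted on the edge (a, b) shared by triangles (a, b, c) and (a, b, d).
    public static Vector3 OddVertex(Vector3 a, Vector3 b, Vector3 c, Vector3 d)
        => 3f / 8f * (a + b) + 1f / 8f * (c + d);

    // Even vertex: the original vertex v with its n one-ring neighbours,
    // using Warren's weight beta = 3/16 for n == 3 and 3/(8n) otherwise.
    public static Vector3 EvenVertex(Vector3 v, IReadOnlyList<Vector3> neighbours)
    {
        int n = neighbours.Count;
        float beta = (n == 3) ? 3f / 16f : 3f / (8f * n);

        Vector3 sum = Vector3.Zero;
        foreach (var p in neighbours) sum += p;

        return (1f - n * beta) * v + beta * sum;
    }
}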

Subdivision schemes also need boundary rules that define how vertices are calculated when they lie on a border of the mesh. To handle these cases, a full inspection of the mesh (iterating over all polygons and edges) is first required to determine which edges are shared between two faces; an edge with only one incident polygon is a boundary edge. Once that analysis is done, the boundary cases can be handled correctly. It’s a bit of work, but it pays off in the end. So far I’m recomputing this information with every subdivision step, but if anyone knows how to update it instead of recomputing it, I’m all ears! Of course I could always resort to a nifty allocation strategy, but maybe there’s a more clever way?
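
The edge analysis itself boils down to counting, for every undirected edge, how many faces reference it. A minimal sketch, assuming triangles are given as index triples into the vertex list:

// Sketch: find boundary edges by counting face references per undirected edge.
// An edge referenced by exactly one face lies on the mesh boundary.
using System.Collections.Generic;

public static class BoundaryEdges
{
    public static HashSet<(int, int)> Find(IEnumerable<(int A, int B, int C)> triangles)
    {
        var faceCount = new Dictionary<(int, int), int>();

        void Count(int i, int j)
        {
            var key = i < j ? (i, j) : (j, i);   // undirected edge key
            faceCount[key] = faceCount.TryGetValue(key, out int c) ? c + 1 : 1;
        }

        foreach (var (a, b, c) in triangles)
        {
            Count(a, b);
            Count(b, c);
            Count(c, a);
        }

        var boundary = new HashSet<(int, int)>();
        foreach (var kv in faceCount)
            if (kv.Value == 1) boundary.Add(kv.Key);
        return boundary;
    }
}

For what it’s worth, after a 1-to-4 split every old boundary edge simply becomes two boundary edges and the newly inserted interior edges never lie on the boundary, so in principle the set could be carried forward instead of recomputed.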

There are two cube models in this test set: the left-most is a Cube-8, which consists of 6 planes (a plane is represented by 2 triangles), and every plane is defined by 4 of the 8 vertices. The cube next to it is defined by the same 6 planes, but this time each plane is defined by 4 unique vertices (i.e. vertices are not shared between planes). I make this distinction because in non-smooth mesh cases (where edges and creases need to remain intact) one needs the face normal stored in the vertex for that plane.

As you can see, the √3 subdivision keeps the Cube-24 edges intact. The Cube-8, Icosahedron and Sphere objects are treated as smooth surface objects because they share vertices and vertex normals. On the other hand, Loop subdivision on the Icosahedron generates a very nice sphere, useful for dome-shapes etc.

Next up: Catmull-Clark quad subdivision. I also implemented Doo-Sabin subdivision (interesting since it can be applied to both triangle and quad meshes), but my render framework currently supports either triangle or quad meshes, not both within the same mesh object. A mixed quad/triangle representation is a natural result of Doo-Sabin: edges give rise to quads, while vertices give rise to polygons of the same degree as the valence of the vertex. For example, if a vertex has 3 incident faces, a triangle is generated between the new points. One way around the rendering problem could be to simply fake polygons of degree higher than 3 using more triangles, but this leads to incorrect results in subsequent iterations.

Some useful links:

https://www.graphics.rwth-aachen.de/media/papers/sqrt31.pdf

http://web.eecs.umich.edu/~sugih/courses/eecs487/lectures/38-Subdivision.pdf

2015 – A tribute to Neal Marlens and Carol Black ?

2015, a fresh new year full of new challenges and new ideals.

Continue reading 2015 – A tribute to Neal Marlens and Carol Black ?

In loving memory


Earlier this week, sad news reached us that an old friend, Dimitri Smits (a.k.a. ‘Gongo’ or ‘Discordis’), had died at age 38. He died at home on the morning of 17 November 2014. Today, together with lots of other people, we said goodbye one final time.

Continue reading In loving memory

Politics revisited

 

Analysis

In the last few years, right-wing parties have gathered more and more voters, to the point that much of mainland Europe now has ‘nationalist’ parties in office and in government. After decades of political exclusion and unbridled tenacity, their constantly repackaged and restyled message is gaining more and more traction across most generations. The message itself is often one of opposition: a disdain for certain common themes, situations or problems, and a profound wish to change the system. Almost all of them frame the neo-liberal/socialist tandems that have ruled most European countries (the self-proclaimed ‘democratic’ parties) as the cause of all problems, and specifically target the socialist leg.

The need for change in itself is not the point of discussion. All situations are dynamic; it is only normal to constantly re-evaluate and to take action in case of serious derailment or problems. And the problems are abundant, the challenges huge. No surprises there either.

So why is the public warming up to nationalist ideas today? It is of course partly the ideology, but 3 other factors are also at work:

Continue reading Politics revisited

On binning test results..

 

Binning results

The binning operation comes into play when you have a lot of test results. Say you’ve measured the rendering times of your renderer for different scenes and different parameter settings (e.g. several resolutions). You may also have measured the number of rendered triangles, the number of lit pixels, and the amount of overdraw per pixel.

If you brute-force all those settings and keep log files that contain your results, you end up with quite a bit of data. To make this data meaningful, you have to group it into arbitrary subdivisions again, and then let loose all your statistical magic on each set (means, standard deviations, etc.). Suppose you are only interested in the most expensive frames, or all the data for model A, or all the measurements for resolution XYZ. While the interesting bit is in the statistics, getting those statistics on the right data is actually the hardest part, especially when the data set is large.

Now, you can argue that a few measurements are enough, and that you can extrapolate to other scenarios. This is probably true for simple linear, quadratic and cubic relationships between statistics, but in some cases the performance landscape of your function can have surprising outliers, especially if you are measuring functions of systems that you do not yet fully understand.

You can drop all that data into database tables and rely on SQL to deliver the right answers for you. While that may be a viable approach for a number of good reasons, sometimes dumping massive amounts of data points into a database is just not feasible. The next best option is to bin the data yourself. There are a number of approaches you can take. One is to use commercial software, which often breaks down on large data sets. Another is to use R, the statistical scripting language. And another is to write it yourself, which is what I did.

Ground truth

First you have to set up the bins you are interested in. This means that for every ‘set’ (say: resolution 130 x 251), you aggregate all the samples for every specific feature that is relevant. This simply means: storing them in memory per ‘set-id’. When you’ve gone through all your data for this bin, you process all the relevant stats on it, clear everything, and move to the next bin, until you’ve exhausted all the bins.
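
In code, the grouping is little more than a dictionary from set-id to the collected samples per feature. A stripped-down sketch (the Sample fields and names are made up for illustration):

// Sketch: aggregate raw measurements per bin ("set-id"), then hand each
// completed bin to whatever statistics pass comes next.
using System.Collections.Generic;

public sealed class Sample
{
    public string SetId = "";     // e.g. "sceneA_130x251"
    public string Feature = "";   // e.g. "render_ms", "triangles", "overdraw"
    public double Value;
}

public static class Binner
{
    public static Dictionary<string, Dictionary<string, List<double>>> Group(IEnumerable<Sample> samples)
    {
        var bins = new Dictionary<string, Dictionary<string, List<double>>>();
        foreach (var s in samples)
        {
            if (!bins.TryGetValue(s.SetId, out var features))
                bins[s.SetId] = features = new Dictionary<string, List<double>>();
            if (!features.TryGetValue(s.Feature, out var values))
                features[s.Feature] = values = new List<double>();
            values.Add(s.Value);
        }
        return bins;
    }
}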

The reason you have to store so much data is that we want to compute the sample and population variance in order to compute the standard deviation. To compute the sample variance and population variance, the summed squared error (relative to the mean of the bin) is divided by the total number of measurements for that bin (minus 1 for the sample variance). The bin’s mean and the total number of measurements are only known after all measurements have been added to it, so they all really need to be stored in memory, which is kinda sucky.
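
Spelled out for one bin, with all of its values in memory, the statistics step looks roughly like this:

// Sketch: mean, population/sample variance and standard deviation for one bin.
using System;
using System.Collections.Generic;

public static class BinStats
{
    public static (double Mean, double PopulationVar, double SampleVar, double StdDev)
        Compute(IReadOnlyList<double> values)
    {
        int n = values.Count;

        double mean = 0;
        foreach (var v in values) mean += v;
        mean /= n;

        double sse = 0;   // summed squared error relative to the bin mean
        foreach (var v in values) sse += (v - mean) * (v - mean);

        double populationVar = sse / n;                   // divide by n
        double sampleVar = n > 1 ? sse / (n - 1) : 0;     // divide by n - 1
        return (mean, populationVar, sampleVar, Math.Sqrt(sampleVar));
    }
}

(Strictly speaking, a running sum and sum of squares, or Welford’s online algorithm, can produce the same variances in a single pass without storing the samples; the sketch above just follows the stored-samples approach described here.)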

 

Optimizing in C#

There are 2 factors which can wildly run out of hand: memory and performance. I first concentrated mostly on performance, then gradually also started looking at memory consumption. The more memory that is touched and tracked by the garbage collector, the more time it takes for the application to context-switch and page-fault its way through the relevant memory.

Performance was greatly improved by doing a number of things: profiling quickly indicated which parts were slow, and with a bit of restructuring and taking my specific scenario into account, a number of conditionals could be removed and for loops tightened so that the total amount of work dropped.

Parallel = faster?

Next, I started playing around with Parallel.ForEach, so that all CPU cores were busy doing part of the work. Parallel.ForEach is great, but it can even worsen performance if you keep copying data around. It also brings up the issue of keeping the GUI up to date (using delegate Invokes; if you launch one from every thread, you generate a massive event overhead which ultimately results in UI stalls and a serious performance drain). At first, it was difficult to rewrite the code and get good CPU coverage. All cores would be busy, but actual application processing would consume only about 20% of the CPU time, the rest going to stalling and overhead. The initial changes are surprisingly easy, but getting good performance out of this is quite another ballgame. Once again, profiling helped fine-tune the application so that the cores were maximally loaded and the scheduling overhead was reduced to a minimum. This brought the processing time down to 2 days, from over half a week.
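
The shape that ended up working reasonably well for me is roughly the thread-local overload of Parallel.ForEach: every worker accumulates into its own private result, merging happens exactly once per worker at the end, and the GUI is updated from a single place afterwards. A hedged sketch, with ProcessLine standing in for the real log parsing:

// Sketch: Parallel.ForEach with thread-local accumulation to avoid copying
// shared data around and to avoid per-item locking or per-item UI updates.
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class ParallelBinning
{
    public static Dictionary<string, List<double>> Process(IEnumerable<string> logFiles)
    {
        var merged = new Dictionary<string, List<double>>();
        object mergeLock = new object();

        Parallel.ForEach(
            logFiles,
            // localInit: one private accumulator per worker thread
            () => new Dictionary<string, List<double>>(),
            // body: parse one file into the thread-local accumulator, no shared state touched
            (file, loopState, local) =>
            {
                foreach (var line in File.ReadLines(file))
                    ProcessLine(line, local);   // hypothetical parser
                return local;
            },
            // localFinally: merge each worker's accumulator exactly once
            local =>
            {
                lock (mergeLock)
                {
                    foreach (var kv in local)
                    {
                        if (!merged.TryGetValue(kv.Key, out var list))
                            merged[kv.Key] = list = new List<double>();
                        list.AddRange(kv.Value);
                    }
                }
            });

        return merged;
    }

    static void ProcessLine(string line, Dictionary<string, List<double>> acc)
    {
        // placeholder: real code would extract the set-id and measured value here
    }
}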

Concurrent = better?

After more twiddling I discovered that “yield return x” allows an iterator function to delay the computation necessary to return an element until the iterator is actually queried for it. But this proved to be tricky, since simply counting the enumerated elements would already defeat that laziness. Additionally, it meant that Dispose() methods (file handles and streams) were now only run much later in the application, leading to serious memory consumption. Still, yield return meant 1 day of crunching instead of 2.
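
Reduced to its essence, the pattern looks like this (a sketch with made-up names, not the production code): the iterator only touches the disk when the consumer pulls the next element, counting the elements forces a full pass, and the file handle is only released once enumeration completes or the enumerator is disposed.

// Sketch: a lazy line reader using yield return. Nothing is read until the
// caller enumerates; the StreamReader is disposed only when enumeration ends.
using System.Collections.Generic;
using System.IO;

public static class LazyLogReader
{
    public static IEnumerable<string> ReadMeasurementLines(string path, string marker)
    {
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Contains(marker))
                    yield return line;   // deferred: produced only when the consumer asks for it
            }
        }   // the file handle is released here, i.e. only once enumeration completes
    }
}

// foreach (var l in LazyLogReader.ReadMeasurementLines("run.log", "render_ms")) { ... }
// ...whereas calling LINQ's Count() on the same call forces a complete read of the file.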

Taking a better look at code coverage, I discovered that most of the overhead was due to memory copying and stalling on behalf of the UI, as well as a few conditionals that could possibly be removed. I started using concurrent containers such as ConcurrentBag and once again rewrote the code. A warning though: you have to be very careful when you use such constructs, because contention can be extremely expensive. ConcurrentBag keeps bags per thread, and locks them at least 2 times per operation (Add/Take). In my case I was only adding items, and reading happened after all items were added, and on a single thread. If you have high contention, with many different threads reading and writing at the same time, lock-free containers will probably be a better choice. CPU utilization shot up to the 85-95% range.
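
For the add-only-then-read-once pattern described here, the code is roughly the following (with a made-up parse delegate). With many threads reading and writing concurrently, a different or lock-free design would likely be the better choice.

// Sketch: producers only Add() to a ConcurrentBag from the parallel loop;
// a single-threaded consumer drains it afterwards, so contention stays low.
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ConcurrentCollect
{
    public static List<double> Collect(IEnumerable<string> lines, Func<string, double?> parse)
    {
        var bag = new ConcurrentBag<double>();

        Parallel.ForEach(lines, line =>
        {
            double? value = parse(line);   // hypothetical parser for one log line
            if (value.HasValue)
                bag.Add(value.Value);      // adds go to a per-thread list inside the bag
        });

        // Single-threaded read-out after the parallel phase has completed.
        return new List<double>(bag);
    }
}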

At this point something amazing happened: the profiler started showing String.Contains() as one of the prime contenders for spin-overhead. I was partly surprised to find it so openly up for grabs, and wondered if something could be done here. Of course, you can shorten the strings, but since it was based on actual data on disk, I didn’t want to touch that.

Now String.Contains is fast..

.. but somewhat surprisingly, this is even faster:

 

// counts how many times searchText occurs by measuring how much Replace() removed; > 0 means the line contains it
return ((line.Length - line.Replace(searchText, String.Empty).Length) / searchText.Length) > 0;

 

I came across this gem on this excellent benchmarking site. I had Contains() up on top in Intel’s VTune performance analyzer in the spin-overhead section.

After applying this, the spin-overhead on a test scenario was reduced from 240s to 40s for the very same run. Instead of a full day, the results now took under one hour to compute. And the data set had actually grown by 40% since the previous tests, as new cases had to be evaluated.

Obviously stripping out any stored members in your measurements is going to win you space and time. After the initial cut, the deferred de-allocation started playing a role. I played around with the GC but did not really find that it helped much. Instead I came across another setting that you can use: server-based garbage collection:

 

<!-- in app.config, under <configuration><runtime> -->
<gcServer enabled="true"/>

 

Afterwards, I switched the build to 64-bit to avoid any other memory problems altogether.

 

The media desert

I am not a journalist or reporter

I am a journal-ismist. And probably not a very good one, but I leave that judgment up to you.

Before we dive into the question of what is good and what is bad, I’ll first clear up the terminology. Obviously, journalism is what journalists do. It’s a profession. It’s reporting on the truth of the day. Or at least, covering a story that could plausibly be the truth.
TL;DR: how about facts instead of news?

Continue reading The media desert

Vectored Exception Handling and OutputDebugString

Try me

If you write high-performance game code in C++, there is an important language feature that is usually simply not available: structured exception handling, in the form of try / catch statement blocks:

try
{
  /* code goes here that throws an exception, like so: */
  throw MyException(/*arguments*/);
}
catch (const MyException& e)
{
  /* handle exception e that was thrown in the try block */
}

A quarrel of stances

Firstly, try/catch is a bit broken in C++ because the language ‘forgot’ to add ‘finally’, which is present in Java and C# to handle situations where all code paths converge at the end of the function block. In those languages, ‘finally’ statement blocks are very explicit about releasing resources like file handles or memory buffers after the exceptions have been handled. Stroustrup defends this omission from his language design by stating that the problem can be solved using RAII (Resource Acquisition Is Initialization), which is a valid, more object-oriented alternative.

While that is true, it forces people to generate class-structure overhead, which you have to keep under control if you’re writing game code. Secondly, Microsoft went so far as to actually add a ‘finally’ syntax (__finally) as a Microsoft-specific compiler extension. Thirdly, Java, C# and Ruby all provide such a statement block (Ruby calls it ensure).

But apart from language syntax issues, these are the real reasons to stay away from try/catch blocks in game code:

  1. try/catch can lead to exceptions being thrown from one call up to the parent call, and so on. The instruction pointer typically jumps around in the stack, meaning that your code cache gets utterly trashed. Performance-wise, this is by far the worst way of handling exceptional conditions.
  2. The definition of what exactly should be understood by the notion of an ‘exception’ is not always clear to all programmers, and can become quite a philosophical debate. I’ll argue here that handling certain use cases as exceptions betrays that your code reserves a certain bias as to what is an exceptional operational condition and what is not. For game code this is weird, since you know all conditional code paths at all times. The last thing that should happen is to have control yanked from under your feet by some 3rd-party library call that bubbles an unknown exception up to the surface at totally the wrong place. And to make matters even more complex: there are 2 types of exceptions: system (SEH) exceptions and standard C++ exceptions.
  3. Stack unwinding during exception catching does not travel across threads, so exceptions have to be explicitly handed over to the proper thread in multi-threaded applications. In the end, programmers end up with a bit of a garbage bin of catch clauses at the top of the application threads. It’s obviously bad, but it happens more than you think.
  4. try/catch usage implies that you can trace back to the object type that generated the exception. This requires Run-Time Type Information (RTTI) to be enabled during compilation, so that you are able to use the typeid operator and dynamic_cast. Obviously the performance penalty of dynamic_cast is nasty and these operators trash the address registers and cache lines, but the increased footprint of the executable and the associated memory cost per object also add to the wrong side of the equation. And since we’re talking exceptions, most of your code should not even need it.

Uh, wait..

Ok, so basically game code is totally in love with RAII, though it might not always be implemented in a ‘nice’ (if such a thing should exist) Object Oriented approach. But wait! You said there still are system exceptions. What about those?

Indeed. The system sometimes throws exceptions that relate to OS operations that fail, for example failing to load a DLL. In such cases, it can be interesting to trap the exception somehow, without having to resort to the whole RTTI/try/catch overhead. On Windows this can be done using Microsoft’s Vectored Exception Handling.

Basically it allows the application to install additional custom exception handlers that get called for any exception code the system raises, using the following functions:

LONG WINAPI VectoredExceptionHandler(PEXCEPTION_POINTERS /*pExceptionInfo*/)
{ /* inspect the exception here; typically return EXCEPTION_CONTINUE_SEARCH */ }

PVOID handle = AddVectoredExceptionHandler(1, VectoredExceptionHandler); /* 1 = call this handler first */
RemoveVectoredExceptionHandler(handle); /* takes the handle returned by Add, not the function */

Check the MSDN for details.

Deja vu

The point I wanted to make in this post is that while you’re in your VectoredExceptionHandler function, there is a problem with using OutputDebugString(): it again raises an exception when no debugger is attached.

It does, however, seem to work when the debugger is attached. Suspicious as this is, at first I blamed my own code, and it had me immediately looking for uninitialized values or other memory corruption, but nothing came up, and the same code runs fine in other (frequently executed) code paths. After isolating it into a test program, it turns out it is actually re-entering the VEH exception handler, and thus causes a stack overflow.

To conclude, this is not just an OutputDebugString() issue of course; it can happen at any point in your exception handler. As soon as you make a library call that depends on standard or system libraries, the VEH is vulnerable to re-entrant code. So guard your VEHs against re-entrant code paths, or you may end up never leaving the VEH at all :)

Cheers!

Natvis Matrix visualizer

Long story short,

I moved to Visual Studio 2012 and then realized that all that tweaky autoexp.dat scripting that sports those fancy instance viewers during debugging was for naught. Since I haven’t moved up to VS2013 yet, I’m a bit stuck with the “new and improved” Natvis system. Natvis stands for Native Visualizer, because your visualizer can be compiled into a DLL (e.g. from C# code), which is a performance move compared to the purely scripted approach (using autoexp.dat) that was in use before. The scripting was arguably hairy, and after a while, hair-pulling, so it’s only reasonable that Microsoft made an effort to improve things. So, out with the old, in with the new!

I encountered 3 problems with Natvis. Firstly, the original feature set somehow got trashed, and only a thin set of features remains. This means, for example, that some information can no longer be displayed, or that you can’t format it correctly. Secondly, there are no conversion tools; everything that once worked is gone (although someone claims that autoexp.dat can be re-enabled for native edit-and-continue debugging in VS2013). Thirdly, if you don’t want to jump into and out of C# projects to change your debugger, you can use... wait for it... XML-based .natvis scripts instead.

That’s right. I used XML and scripting in the same sentence. Off with my head!

A Matrix Class

Suppose you have a matrix class, say:

template<typename T, int rows, int cols>
struct MyMatrix
{
    // your matrix members and methods here..
};

That’s all great, but how do we visualize it? It seems the .natvis system can only list array items (or other containers) if the count is known. But in the case above, MyMatrix has template arguments for the row and column dimensions and is of rank 2. In fact, the matrix structure can be a recursive template type definition (and to be fair, that’s what I am actually using at the moment, but I omitted that for clarity). The template arguments can be captured using $T1, $T2, $T3, etc., but here’s where things get interesting: you have to put them in curly brackets (e.g. {$T1}) in the DisplayString element to fetch their values. Of course, when expanding the elements, the curly brackets should not be used! That took me a while to find out.

The second point of interest, and perhaps the gist of this post: all examples out there refer to internal members of the structs, but what if you have a recursive type? Well, it was mentioned only in passing in the official documentation, but you can use the this pointer just as well, and index-cast it any way you fancy, even using template types. This gives you access to anything that might be defined in the class/struct. The former format specifiers to print the values in a particular format are gone (apart from ,su for character strings, it seems). Attempts to refer to other scoped types (i.e. (othertype*)this) have failed so far, but maybe I made a typo.

 

A solution

I came up with the following. It’s far from perfect, and it tends to clutter the debug view with all the floats (because “,g” no longer works), but it does show how to expand elements in a 2-dimensional ‘array’ behind the this pointer.

<?xml version="1.0" encoding="utf-8"?>
 <!-- Place file into My Documents/Visual Studio 2012/Visualizers/ -->
<AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
 <Type Name="MyMatrix&lt;*,*,*&gt;">
 <DisplayString >[{$T2}x{$T3}]({(*(($T1*)this))}, {*((($T1*)this)+1)}, {*((($T1*)this)+2)}, {*((($T1*)this)+3)}),({*((($T1*)this)+4)},{*((($T1*)this)+5)},{*((($T1*)this)+6)},{*((($T1*)this)+7)}),({*((($T1*)this)+8)},{*((($T1*)this)+9)},{*((($T1*)this)+10)},{*((($T1*)this)+11)}),({*((($T1*)this)+12)},{*((($T1*)this)+13)},{*((($T1*)this)+14)},{*((($T1*)this)+15)})</DisplayString>
   <Expand>
     <ArrayItems>
       <Direction>Forward</Direction>
       <Rank>2</Rank>
       <Size>$T2</Size>
       <ValuePointer>(($T1*)this)</ValuePointer>
     </ArrayItems>
   </Expand>
 </Type>
</AutoVisualizer>

 

Look, html-ified < and > brackets!

 

You typically put a *.natvis file containing such scripts under the My Documents/Visual Studio 2012/Visualizers folder. You don’t have to restart Visual Studio, just restart your debugging session and it should work. If it does not work (i.e. you get the standard visualizer for a typed instance when you hover above it during debugging), you have to dig into the XML to find the problem.

Room for Improvement

The DisplayString is hard-coded for a 4×4 matrix. The obvious approach is to put conditional clauses for every possible combination of {$T2}x{$T3}, but that just sucks as much as what I have now. Better would be to have some sort of iteration going on in the DisplayString, so that its length depends on the actual type, but that does not seem to be supported. Also, being able to split strings across multiple lines would be nice.

I must say that during debugging, I found peculiar recursive behavior when improperly scoping the this pointer in brackets, so watch out for that. I expected 16 repetitions at some point, but only 9 displayed in the visualizer. This can be investigated further, possibly leading to a recursive debug visualizer, so that larger types can be viewed as a collation of smaller types (i.e. a 4×4 = 3×3+7 = 2×2 + 5 + 7 or somesuch).

I tested this on 4×4 matrices. For 2×2 or 3×3 matrices to work, you obviously need to remove or shorten the list of elements in the DisplayString (or make the list size-dependent). For non-square matrices I’m not sure whether the correct size is automatically inferred from a given dimension (row or col) and the size of the struct. If not, you have to drop the Rank element and set the size to row*col.

Conclusion

This demonstrates how to make .natvis display a matrix array of values (or any kind of list) with the limited feature set, regardless of the type structure underneath it.

Some of the reasons for switching from the autoexp.dat syntax to something more modern make sense. You can write visual debugging tools for your data, which is fantastic, and the performance improvement is undoubtedly beneficial in any debugging session. But I think most people are fine with using simple tools, and being able to script them is something that has been downplayed a bit too much here. That the ‘old’ autoexp.dat system still functions under certain conditions makes forcing people to rewrite their visualizers in XML ill-founded and unreasonable.

The bad bad bad idea of trying to frame scripts in XML expressions has bitten any seasoned programmer at least once (and then hopefully we remember the pain long enough), but seeing even Microsoft fall into that trap reminds us to be thoughtful and repeat “I shall not script in XML” to ourselves every once in a blue moon. Ruby or Python would have made more sense.

To be fair, the natvis feature set was extended in Visual Studio 2013 – still XML though – and this article is written from a VS2012 perspective. So it might be useful for people like me who can’t find a good reason to update every half-moon to a new VS version. Would be nice if VS2012 could be patched up to 2013 levels regarding these debugger issues, but that’s another story.

If it’s useful, let me know how it fares, and of course all comments are welcome.

Achievement

 

I’ve been coding again for a while now, and yesterday (which means half a year ago, since this post has been sitting in my drafts section that long) I managed to tick off another ancient to-do on my list. Time for me to give back a few things that I learned the hard way.

The short of it: I wrote a little effect intro containing a number of objects rendered in 2 passes: one pass produces a shadow map with percentage-closer filtering (single light source), and one renders a spotlight-lit normal map in hardware. It uses render-to-texture and alpha testing. Nothing truly incredible or earth-shattering, but I’m quite proud of it because I feel I finally crossed a line. Actually 2 lines: the technical challenge, and the fact that I finally bulldozed through my own ignorance.

The vertex buffer trickery earlier in this blog is still totally there, so I guess that works well.

Here’s my list:

  • The creation of vertex buffers creates and destroys a separate worker thread in DirectX9. It is best to minimize the creation of buffers per frame by caching / reusing them. The performance gain in debug mode is dramatic.
  • FVF cannot be mapped to something meaningful if we use tangent / normal / binormal formats. Use D3DVERTEXELEMENT9 declarations instead.
  • A small issue I learned: if you apply the template construction for tangent-space coordinates (normal map), the order in the vertex definition matters, and you must make sure the tangent element is declared after all the fixed-pipeline elements (position, normal, color, texcoord).
  • I used a D3DFMT_R32F type surface for the depth buffer / shadow map. It took me a while to realize I could only read the red component. It also supports alpha. D3DFMT_A32B32G32R32F works equally well, but I could not manage to store/read from the channels g and b (like e.g. w)
  • Unless I overlooked the obvious, it seems you can’t use the POSITION input in a pixel shader, since the GPU will have consumed it. If you need it, you have to pass it twice: once as the position input and once as a TEXCOORD (which will be interpolated). That’s quite ridiculous.
  • HLSL matrices are column-major by default. It usually means transposing your matrices.
  • There are 2 ways to upload the TBN values. A smooth surface (curves, spheres) needs a new tangent per fragment, whereas a non-smooth surface (polygonal) needs a tangent per triangle. In the latter case, since we interpolate between vertices, we still have to duplicate them in each vertex, and you can’t share the vertices because they may belong to neighboring non-smooth triangles with different normals. In the first case, the computation happens in the vertex shader (re-normalize in the fragment shader!), whereas the latter case can use pre-computed normals.
  • Normal maps behave badly in typical cases like spheres that converge at zenith points. Use appropriate topologies to get around this (e.g. a geosphere).
  • Arrays seem to be only supported locally in shaders? Passing values into them from outside did not seem to work. I came across this issue when setting multiple light sources for a fragment shader.
  • It seems bind / render / unbind is mandatory, even if you render multiple objects with the same shaders. I guess one possible way to optimize draw-call performance is to bundle objects together using vertex list tunnels.
  • I wasted a bit of time trying to combine the depth output of my shaders with forward rendered lines with z-test clipping. I solved it by using a vertex shader for the lines as well. For now I was a bit too focused on getting the shadows to work to really dig into this, but I’m pretty sure this isn’t necessary.
  • The tex2Dproj function only works from profile ps_4_0 onwards :/. Implementing PCF on ps_3_0 hardware means cutting the filter back to 9 samples instead of 16, with noticeable loss of quality.
  • Filtering a 4×4 kernel in 2 for loops did not work for me on ps_3_0. Not quite sure why; maybe the compiler ran out of registers? Unrolling the loops worked fine.

If you have any remarks, ideas, questions or answers, I’d be glad to hear about them!

That said, this achievement has only kicked my enthusiasm up a few more notches and kept my coding skills brimming with new ideas. Here’s what happened in the second half of the year: After this little PC shadow mapping try-out I added another couple of other tricks to my proverbial bag:

  • I redid the whole DirectX setup, but now with the OpenGL shader model!
  • I built my own little demo rendering environment.
  • I gave a crash course on C++ AMP, yay!
  • I have Oculus Rift working on OpenGL, also yay. Might post some code although it’s pretty easy to port from the tutorial /SDK code.
  • I literally went all-out and turned my knowledge of GPGPU computing upside-down, writing all sorts of stuff in CUDA and OpenCL, including direct integration with DirectX and OpenGL respectively. CUDA is blazingly fast and rather well designed, while OpenCL is just as clean as you would expect from OpenGL’s cousin, and actually a little friendlier. I’m curious where that leaves us with AMD’s new Mantle spec. My experiences with C++ AMP were also OK. I quite liked the fact that it’s totally integrated into your compilation, but debugging this shit remains a drag.
  • I now have this ‘R’ language on my radar for some reason.
  • Implemented a full testing bench for thousands of tests (including the publicly available bg, doa, sc1, wc3 etc.. maps) and ran it on various kinds of path finding algorithms. Lots of data to read and lots of tweaking ahead. I also did work on path-finding and path smoothing, and I’m totally digging this almost theoretical shit! I should be able to publish something out of this soon.
  • The whole testing bench is also working for 3D models (PLY), includes a voxelizer and path finder and ray-caster. All of this is working pretty fast, but there’s still room for improvement. The indexing scheme and octile storage structures are blazingly fast.
  • For DragonCommander, after refactoring the hell out of some parts, I finally got around to finishing the path-reuse scheme and adding orientation-based position selection, which works remarkably well. I’m anxious to see how close I can get to good group behavior, but I’m rather short on time. I kept a full diary of my progress, so I should be able to write something on this too.
  • Devised a little tracking routine to track the motion of n unrelated and unsorted objects in k-space, by estimating frame coherence based on path direction, velocity and acceleration. This routine was used in an IPEM project and will probably be published soon.

I might start posting code fragments or at least some nifty demo material here if there’s a need for it, but at the moment I’m knee-deep in project work and paper-publishing deadlines, and I’ll stay in that zone for a little while longer.

My wish list for next year: something with robotics? Drones perhaps?

Anyway, I wish you all a warm and safe X-mas, or at least a happy new year! Cheers!

South of here

 

In July we got ourselves a well-deserved holiday. Well, I was forced to take it at work, to be more precise, but you don’t hear me complaining. I’m actually enjoying it fully with my wife and kid, since we’re currently in the one single spot in Europe that has amazing sun, blue sky + water and bearable heat, combined with a great outdoors, good food and friendly people. I guess if you belong to my ‘friends’ category, the envy types are already going ballistic over this, since it’s all rain and downpour in the north. :)

Some thoughts about the time passed. Last year, after headaches and financial worry, we wrestled ourselves through our house-reconstruction. That went extremely well, actually. I’m now praying that the financial woes that are raging in the south of Europe are not going to affect our situation, but deep down, I can’t believe that won’t be the case. I’ve read far too much of ZeroHedge to keep an ignorant stance on the matter. Time will tell how this will develop.

In May of this year (2012) we got ourselves a lovely son, Arthur, who is 2 months old at the time of writing. He’s “growing like cabbage” (a valid Dutch expression which I’m hoping translates into correct English) and we are very lucky that he eats and sleeps like a pro. Even in this fairly hot summer at the Côte d’Azur, he’s managing unbelievably well, laughs a lot, and apart from not exposing him too much to the burning sun between noon and 4 pm, we don’t have to do anything special to keep him happy. A totally adorable darling he is. The women we encounter, without exception, all fell victim to his charms, overloading us with compliments. The kid’s more of a fad than a space rock star!

Having such a happy little child really changed a few things in my life. It puts more focus on my relationship, makes me work harder to get things ready and happening, and there is of course the feeding routine, the nappies and diapers, the bathing ritual, singing songs and acting like an all-round idiot. Some of these things are easier to do than others. I used to be convinced that breaking into my sacred 8 hours of sleep would wreck me, and while some nights can definitely be rough, this proves to be untrue for the most part. Isabelle takes a lot of the work out of my hands, so of course that skews the picture a bit, but the gist of it is still true. She often tells me of her admiration for single mothers who have to cope on their own, and I can totally relate to that. The total net effect of the 3 of us surviving through all of the days and nights (and this totally surprised me) is the realization that a human is much stronger than he thinks he is, and can bear much more than he thought possible. I generalized the statement because I can’t believe I’m that special at all.

Let me explain: up until recently, I’d been having a pretty minimalistic view on my personal capacities. I’m quite a sensitive person, and when people repeatedly tell me how hellish it would be to care for a small baby, well, you start to accept a mantra if people repeat it to you enough. I sympathized and nodded, smiling politely as they told their stories, and reserved an ever diminishing amount of hope that just maybe things would be different for me if I ever became fortunate enough to have kids. This was tough. And I mean really tough. For more than 10 years on end, people have been giving me lectures on what it means to have kids. I hold no grudge against them of course, but it eats at your confidence, and when you are looking forward to the experience of fatherhood, it also eats part of your idealistic world view. After a certain amount of time, the sincere hopes and longings I started out with gradually were buried under layers of fearful goo, leaving me a mechanical hunch to pursue a faint shadow of what once were pristine motives, only adding to the doubts. Add to this a few biological fertility problems, and you may start to ‘get’ the overall picture of what we went through.

And then Isabelle did get pregnant, and Arthur did get born, and all of a sudden you get to experience everything first-hand. This was nothing short of an emotional wall collapsing. Suddenly the hopes and yearnings that were buried in our deepest caves and canyons resonated firmly through our fibers. All the reasons and ideas literally came back to life, and one by one I started to revisit all the things people had told us about. And one by one, those absolute truths crumbled to pieces. Yes, it’s an added responsibility, and yes, you have to adjust your daily life and give up things, and yes, things smell bad sometimes, but it is by no means hard. Maybe it’s because I’ve heard everything that there is to hear about having babies for the last 15 years; I had plenty of time imagining what that would be like. Or maybe it’s because I’m older than most parents are when they get kids and can, to some extent, let go of my favorite occupations more easily. (It still stings sometimes.) Or maybe it is because enduring all the psychological hardship is finally paying dividends. Or maybe it’s simply because Arthur is just a super kid. Whatever the reason, what I can say is that I tremendously enjoy being a father. It has made me stronger and a bit more confident. I’m giving it my very best, and I know it’s working out when I hear the little guy laugh out loud, content, like he does every day. That’s in fact really all the feedback I need to regain my composure. It wipes out all that effort of storing my hopes away in that private place full of dusty cobwebs. It slowly dissolves the feeling of driving with the handbrake on that I’ve lived with for the last few years.

In the process of all this, and next to my increased interest in following the financial developments around the world, I also discovered that I grew fond of writing things down. For a lot of people (including me) structuring thoughts into strings of words realigns our minds to whatever meaning speaks from them. For those people, writing can be a means to grow, to build on top of what is already written. Mind you, there is absolutely no importance tied to the amount of people reading what I write here, nor do I care if people ‘like’ what I’ve written. It’s nice to hear that, of course. But if that expels me from the facebook generation, so be it. The important thing for me has already happened. I wrote down this text. And realizing the metaphysical importance of that process is another leap. As you write, you tend to reform sentences, replace words, restructure the content, elaborate or cut out parts that have no added value. But the operations on the text also reflect a mental transformation. If you practice this a lot, you automatically evolve your brain patterns, too. It’s totally unsharable, but probably the most important aspect. I guess you could compare it to sitting at a bar. The first few times, you feel a bit uneasy starting conversations, but as you tend to repeat the act, you grow more proficient at talking to people. I just never was the type to sit in a bar much, and with writing, there’s the added benefit that no one has to go through the various drafts of your musings. At university, besides “nerding” around in the demo-scene and playing chess on-line, the poetry mailing list was one of my favorite time-sinks. If people would read the contents of it now – I secretly hope that nothing of it survived the test of time, though in this digital era you never know – they would probably have a good laugh. I suspect the unbridled creative nature of that occupation helped me to develop a taste for writing.

So yeah. Confidence. Not an easy topic to write about. It’s now been 2 months since I wrote the previous parts, and I’m still gathering bits and pieces every day, and at the same time I’m learning to blend into this city of surrealism with increasing success. The process teaches me that protective environments have to be temporary, or they do more harm than good. I know I still have a lot of work to do, but I’m out there pushing for it. So here it is. My honest account, shared and published. I win this war. All that remains are the battles to survive, and I know these. Here I come again.

More good news reached us today. One of my best friends had a new baby! A little girl who already has a big sister and 2 great parents, whom we don’t get to see often these days, which is a shame. Isabelle and I wish the whole family all the best!