30 Sept 2008

Line-counting

Some line-count trivia:

Nebula3 Foundation Layer      66,502 lines
Nebula3 Render Layer          73,465 lines
Nebula3 Application Layer     22,706 lines
Nebula3 Addons                32,847 lines
Nebula3 All                  195,520 lines

Nebula2                      239,279 lines
Mangalore                    181,592 lines
N2 + Mangalore               420,871 lines

The N3 line-count includes the current code for 3 platforms (Win32, Xbox360 and Wii), while the N2+Mangalore count only includes one platform (Win32). Looks like N3 will end up a lot leaner than the old code, which is a good thing :)

29 Sept 2008

Nebula3 September SDK

Here's the new SDK:

N3SDK_Sep2008.exe.

Please check my previous post for details, which got f*cked up pretty badly by importing it from Google Docs :(

27 Sept 2008

What's New in the September Nebula3 SDK


I finally got around to packing a new N3 SDK together. I'll upload it on Monday when I'm back in the office; in the meantime here's a rough What's New list. A lot of under-the-hood-stuff has changed, and I had to remove a few of the fancy front-end-features for now (for instance, the N2 character rendering had to be removed when I implemented the multi-threaded renderer, and the shader lighting code is broken at the moment). I'll take care of this front-end stuff in the next release.

General Stuff


  • changes to enable mixing Nebula2 and Nebula3 code, mainly macro names are affected (DeclareClass -> __DeclareClass, ImplementSingleton -> __ImplementSingleton etc...)
  • started to remove #ifndef/#define/#endif include guards since pretty much all relevant compilers (VStudio, GCC, Codewarrior) support #pragma once 
  • moved identical Win32 and Xbox360 source code into a common Win360 namespace to eliminate code redundancies
  • added a new Toolkit Layer which contains helper classes and tools for asset export
  • added and fixed some Doxygen pages

Build System

  • re-organized VStudio solution structure, keeps all dependent projects in the same solution, so it's no longer necessary to have several VStudios open at the same time
  • it's possible now to import VStudio projects through the .epk build scripts (useful for actual Nebula3 projects which do not live under the Nebula3 SDK directory)
  • new "projectinfo.xml" file which defines project- and platform-specific attributes for the asset batch-export tools
  • split the export.zip archive into one platform-neutral and several platform-specific archives (export.zip contains all platform-independent files, export_win32.zip, export_xbox360.zip, export_wii.zip contain the platform-specific stuff)
  • added general multiplatform-support to the asset-pipeline (e.g. "msbuild /p:Platform=xbox360" to build Xbox360-assets)
  • new command-line build tools (with source):
    • audiobatcher3.exe (wraps audio export)
    • texturebatcher3.exe (wraps texture export)
    • shaderbatcher3.exe (wraps shader compilation)
    • buildresdict.exe (generates resource dictionary files)
    • these tools mostly just call other build tools (like xactbld3.exe, nvdxt.exe, or build tools for game-console SDKs)
  • note that the public N3-SDK only contains Win32 support for obvious legal reasons 

Foundation Layer

  • fixed thread-safety bugs in Core::RefCounted and Util::Proxy refcounting code
  • added WeakPtr<> class for better handling of cyclic references
  • added type-cast methods to Ptr<>
  • simplified the System::ByteOrder class interface
  • added platform-specific task-oriented "virtual CPU core ids" (e.g. MainThreadCore, RenderThreadCore, etc...)
  • added a System::SystemInfo class
  • added Threading::ThreadId type and static Threading::Thread::GetMyThreadId() method
  • proper thread names are now visible in the VStudio debugger and other debugging tools
  • SetThreadIdealProcessor() is now used to assign threads to available CPU cores on the Win32 platform
  • new HTTP debug page for the Threading subsystem (currently only lists the active Nebula3 threads)
  • MiniDump support: crashes, n_assert() and n_error() now write MiniDump files on the Win32 platform
  • new Debug subsystem for code profiling:
    • offers DebugTimer and DebugCounter objects
    • HTTP debug page allows inspecting DebugTimers and DebugCounters at runtime
  • new Memory::MemoryPool class for allocation of same-size memory blocks (speeds up allocation and reduces heap fragmentation)
  • some new and renamed methods in Math::matrix44
  • Http subsystem now runs in its own thread
  • added SVG support to Http subsystem (Http::SvgPageWriter and Http::SvgLineChartWriter)
  • added IO::ExcelXMLReader stream reader class for reading XML-formatted MS Excel spreadsheet files
  • added Behaviour mode to Messaging::AsyncPort, defining how the handler thread should wait for new messages:
    • WaitForMessage: block until message arrives
    • WaitForMessageOrTimeOut: block until message arrives or time-out is reached
    • DoNotWait: do not wait for messages
  • added Remote subsystem, allows remote-controlling N3 applications through a TCP/IP connection
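
To illustrate the three AsyncPort wait behaviours listed above, here is a rough sketch in plain standard C++. This is not the Nebula3 code: MessageQueue, Dequeue() and the use of std::string as a message are made-up stand-ins, and only the three behaviour names come from the list above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the three wait behaviours (not Nebula3's actual API).
enum class Behaviour { WaitForMessage, WaitForMessageOrTimeOut, DoNotWait };

// A tiny thread-safe queue; std::string stands in for a real message object.
class MessageQueue {
public:
    // Called by sender threads.
    void Enqueue(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(msg));
        }
        cond_.notify_one();
    }

    // Called by the handler thread. Returns false if no message was available.
    bool Dequeue(Behaviour behaviour, std::string& out,
                 std::chrono::milliseconds timeout = std::chrono::milliseconds(0)) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        auto hasMessage = [this] { return !queue_.empty(); };
        switch (behaviour) {
            case Behaviour::WaitForMessage:
                // block until a message arrives
                cond_.wait(lock, hasMessage);
                break;
            case Behaviour::WaitForMessageOrTimeOut:
                // block until a message arrives or the time-out is reached
                cond_.wait_for(lock, timeout, hasMessage);
                break;
            case Behaviour::DoNotWait:
                // just look at what is already in the queue
                break;
        }
        if (queue_.empty()) {
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<std::string> queue_;
};
```

A handler thread which has other work to do each frame would probably use DoNotWait or a short time-out, while a thread which only exists to serve messages can simply block in WaitForMessage.
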

Render Layer

  • moved rendering into its own thread (InternalGraphics subsystem on the render-thread side, and Graphics front-end subsystem on the main-thread side)
  • added CoreAnimation and Animation subsystems (under construction)
  • added UI subsystem for simple user interfaces (under construction)
  • added CoreAudio and Audio subsystems (under construction):
    • CoreAudio is the back-end and runs in its own thread
    • Audio is the "client-side" front-end in the main-thread (or any other thread)
    • designed around XACT concepts
    • comes with XACT wrapper implementation
  • added CoreGraphics::TextRenderer and CoreGraphics::ShapeRenderer classes, both intended for rendering debug visualizations
  • added debug rendering subsystem (currently under the Debug namespace)
  • Frame subsystem: FramePostEffects may now contain FrameBatches
  • Input subsystem: disconnected XInput game-pad slots now only check every 0.5 seconds for connected game-pads
  • Resources subsystem: added ResourceAllocator/ResourceLump system to prepare for true resource streaming on console-platforms

Application Layer and Addons

  • removed CoreFeature (this stuff had to go into the GameApplication class to prevent some chicken-egg problems)
  • added NetworkFeature (under construction)
  • added UIFeature (under construction)
  • new CoreNetwork and Multiplayer addon wrapper subsystems for RakNet

Please note the special RakNet licensing conditions. Basically, RakNet is not free if used for a commercial project (http://www.jenkinssoftware.com/). Licensing details for 3rd party libs can be found on the Nebula3 documentation main page.

Stuff I want to do soon

  • fix the shader lighting code
  • add more shaders to bring the shader-lib up-to-par with N2
  • finish the CoreAnimation and Animation subsystems
  • design and implement proper skinned character rendering subsystem
  • add missing functionality to Audio subsystems (for instance sound categories) 
  • make shaders SAS compatible so they work with tools like FXComposer
  • implement a proper resource-streaming system on the 360 (as proof of concept)
  • optimize messaging (use delegate-mechanism for dispatching, optimize message object creation, add double buffering behaviour to AsyncPort for less thread-synchronization overhead)

20 Sept 2008

Adding functionality to threaded subsystems

Moving subsystems into their own thread introduces restrictions on how other threads can interact with the subsystem. It is no longer possible to simply invoke methods on objects running in the context of a threaded subsystem. The only way to interact with the subsystem is by sending messages to it. From a system design point-of-view this is a good thing. There's a very clear demarcation line defined by the message protocol to interact with the subsystem. It is pretty much impossible to invoke undocumented functionality from the outside and it is complicated to "accidentally" use the subsystem's functionality in a way not intended by the subsystem's designer.

But of course those restrictions also have their dark side. Tasks which either require a lot of communication or which require exact synchronization are better not spread across threads. Although the messaging system is fast (and will remain an optimization hotspot) it is not free, so it's not a good idea to send thousands (or even hundreds) of messages around per-frame. Also, a message sender should never wait for the completion of a message to work around the synchronization problem (at least not while the game loop is running), as this would pretty much nullify the advantage of running the subsystem in its own thread.

Nebula3 offers a relatively simple way to add functionality which should run in the context of a subsystem thread. The basic idea is to create a new message-handler class (which is running in the subsystem's thread) and a new set of messages which can be processed by an instance of the new handler-class.
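
Stripped down to the bare idea, the handler-plus-messages pattern looks roughly like the sketch below. None of these names are the real Nebula3 classes: Message, PrintText, Handler, PrintTextHandler and SubsystemPort are hypothetical, dynamic_cast stands in for whatever message-id scheme a real system would use, and the loop which would run on the subsystem thread is reduced to a ProcessPending() call.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class for everything sent to the subsystem.
struct Message {
    virtual ~Message() = default;
};

// A new message type, added together with the new handler.
struct PrintText : Message {
    explicit PrintText(std::string t) : text(std::move(t)) {}
    std::string text;
};

// Base class for objects which process messages inside the subsystem thread.
class Handler {
public:
    virtual ~Handler() = default;
    // Returns true if the message was recognized and processed.
    virtual bool HandleMessage(const Message& msg) = 0;
};

// The new handler, it only understands PrintText messages.
class PrintTextHandler : public Handler {
public:
    bool HandleMessage(const Message& msg) override {
        const PrintText* m = dynamic_cast<const PrintText*>(&msg);
        if (m == nullptr) {
            return false;
        }
        handled.push_back(m->text);
        return true;
    }
    std::vector<std::string> handled;  // only touched on the subsystem side
};

// The only door into the subsystem: other threads call Send(), nothing else.
class SubsystemPort {
public:
    // Assumed to be called during setup, before any messages are sent.
    void AttachHandler(std::shared_ptr<Handler> handler) {
        assert(handler != nullptr);
        handlers_.push_back(std::move(handler));
    }

    // Called from any client thread.
    void Send(std::unique_ptr<Message> msg) {
        assert(msg != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(std::move(msg));
    }

    // Meant to be called from the subsystem thread's loop.
    // Returns the number of messages which found a handler.
    int ProcessPending() {
        std::deque<std::unique_ptr<Message>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(queue_);
        }
        int handledCount = 0;
        for (const auto& msg : pending) {
            for (const auto& handler : handlers_) {
                if (handler->HandleMessage(*msg)) {
                    ++handledCount;
                    break;
                }
            }
        }
        return handledCount;
    }

private:
    std::mutex mutex_;
    std::deque<std::unique_ptr<Message>> queue_;
    std::vector<std::shared_ptr<Handler>> handlers_;
};
```

The point of the design is that adding new functionality never touches SubsystemPort itself: a new feature is just one more Handler subclass plus the message types it understands.
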

We recently did this to add debug-visualization capability to Nebula3. We wanted to have a simple way to (a) render debug text, and (b) render shapes (cubes, spheres, etc...) to make it simple to render debug-visualizations from anywhere in Nebula3.

The whole system is split into 3 parts:

  • The front-end classes running on the client-side (client-side means: every thread other than the render thread):
    • the Debug::DebugTextRenderer singleton offers text rendering
    • the Debug::DebugShapeRenderer singleton offers shape rendering
    • both are thread-local singletons, each thread which wants to render debug text or shapes needs to instantiate those
  • The back-end classes running in the render-thread:
    • CoreGraphics::TextRenderer
    • CoreGraphics::ShapeRenderer
    • these singletons implement the actual text- and shape-rendering functionality and are also platform-specific (under Windows, they use D3DX methods to do their jobs)
  • The communication components:
    • the Debug Render message protocol, this is a NIDL-XML-file (Nebula Interface Definition Language) which defines 2 messages: RenderDebugText and RenderDebugShapes
    • the DebugGraphicsHandler object, whose class is derived from Messaging::Handler, runs in the render thread, and processes the above 2 messages 

This is how the system works:

  1. the main thread instructs the GraphicsInterface singleton (which creates and manages the render-thread) to add a DebugGraphicsHandler object (that's at least how it SHOULD work; at the moment, the GraphicsHandler simply creates and attaches a DebugGraphicsHandler on its own)
  2. client threads create one DebugTextRenderer and one DebugShapeRenderer singleton if they want to do debug visualization
  3. a client-thread directly calls one of the DebugTextRenderer or DebugShapeRenderer methods to render text or shapes
  4. the DebugTextRenderer and DebugShapeRenderer singletons collect a whole frame's worth of text elements and shapes and, once per frame, each create a single message (RenderDebugText and RenderDebugShapes respectively). So at most 2 messages are sent into the render thread per-frame from each client-thread, not one message per shape and text element. That's a very important optimization!
  5. Once per render-frame, the DebugGraphicsHandler processes incoming RenderDebugText and RenderDebugShapes by calling the CoreGraphics::TextRenderer and CoreGraphics::ShapeRenderer singletons

That's it basically. Nebula3 applications can add their own functionality to subsystem threads by following the described pattern.
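
The batching from step 4 is the part most worth copying, so here is a minimal sketch of it. The names are made up for illustration (TextElement and RenderDebugTextMsg are simplified stand-ins, DebugTextCollector and its SendFunc callback are hypothetical), and skipping the send when nothing was collected is an assumption of this sketch, not something the real code is known to do.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the real text element and message types.
struct TextElement {
    std::string text;
    float x;
    float y;
};

struct RenderDebugTextMsg {
    std::vector<TextElement> elements;
};

// Client-side collector: many cheap calls per frame, one message per frame.
class DebugTextCollector {
public:
    // The callback stands in for "send this message to the render thread".
    using SendFunc = std::function<void(RenderDebugTextMsg)>;

    explicit DebugTextCollector(SendFunc send) : send_(std::move(send)) {
        assert(send_);
    }

    // Only appends to a local array, no cross-thread traffic happens here.
    void AddText(const std::string& text, float x, float y) {
        pending_.push_back(TextElement{text, x, y});
    }

    // Called once per frame: ships everything collected so far as ONE message.
    void OnFrame() {
        if (pending_.empty()) {
            return;  // assumption: nothing collected, nothing sent
        }
        RenderDebugTextMsg msg;
        msg.elements.swap(pending_);  // pending_ is empty again afterwards
        send_(std::move(msg));
    }

private:
    SendFunc send_;
    std::vector<TextElement> pending_;
};
```

No matter how many AddText() calls happen during a frame, the render thread only ever sees a single message from this collector for that frame.
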

With the first naive implementation we stumbled across an obvious problem: when the main-thread runs slower than the graphics thread, debug shapes and text would start to flicker, since the render thread would only receive render-debug-messages every other frame. So we had to add a way to identify shapes and text elements by their origin-thread-id, and keep them around until the next message comes in from the same thread, but this was a trivial thing to do.
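
The flicker fix boils down to a map from thread id to "the last batch this thread sent". A rough sketch, with hypothetical names throughout (ThreadId as a plain integer, Shape reduced to a name, DebugShapeStore and InvalidThreadId invented for this example):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ThreadId = unsigned int;       // stand-in for a real thread id type
const ThreadId InvalidThreadId = 0;  // hypothetical "no thread" value

struct Shape {
    std::string name;  // a real shape would carry a transform, color, etc.
};

// Render-thread side: remembers the latest batch of shapes per sender thread.
class DebugShapeStore {
public:
    // Called when a batch arrives; replaces whatever this thread sent before.
    void OnMessage(ThreadId sender, std::vector<Shape> shapes) {
        assert(sender != InvalidThreadId);
        shapesByThread_[sender] = std::move(shapes);
    }

    // Called once per render frame, whether or not new messages came in.
    std::vector<Shape> CollectForFrame() const {
        std::vector<Shape> all;
        for (const auto& entry : shapesByThread_) {
            all.insert(all.end(), entry.second.begin(), entry.second.end());
        }
        return all;
    }

private:
    std::map<ThreadId, std::vector<Shape>> shapesByThread_;
};
```

If the render thread draws two frames between messages from a slow client thread, both frames get the same shapes from CollectForFrame() instead of one frame with shapes and one without.
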

A positive effect is that debug visualization no longer needs to happen at a specific point in the render loop. This was a problem in Nebula2/Mangalore where classes had to provide an "OnRenderDebug()" method which was called by the rendering system from within the render loop. Instead debug visualization can now happen from anywhere in the code (although at the cost of some more memory and communications overhead, but debug visualization especially is an area where convenience and ease-of-use are more important than raw performance).

FYI, this is what the NIDL-file looks like, which defines the messages of the DebugRender protocol:

<?xml version="1.0" encoding="utf-8"?>
<Nebula3>
    <Protocol namespace="Debug" name="DebugRenderProtocol">
        <!-- dependencies -->
        <Dependency header="util/array.h"/>
        <Dependency header="threading/threadid.h"/>
        <Dependency header="coregraphics/textelement.h"/>
        <Dependency header="debugrender/debugshaperenderer.h"/>

        <!-- render text string on screen for debugging -->
        <Message name="RenderDebugText" fourcc="rdtx">            
            <InArg name="ThreadId" type="Threading::ThreadId"/>
            <InArg name="TextElements" type="Util::Array<CoreGraphics::TextElement>" />
        </Message>

        <!-- render debug shapes -->
        <Message name="RenderDebugShapes" fourcc="rdds">
            <InArg name="ThreadId" type="Threading::ThreadId"/>
            <InArg name="Shapes" type="Util::Array<CoreGraphics::Shape>" />
        </Message>

    </Protocol>
</Nebula3>    
    
This will be compiled by the Nebula3 NIDL-compiler-tool into one C++ header and one source file (debugrenderprotocol.h and debugrenderprotocol.cc). 

I hope to have a new source drop out "really-soon-now", so you can check for yourself what I'm actually talking about :)

8 Sept 2008

Mercenaries 2

I'm currently having a lot more fun with Mercs 2 than I ever had with GTA4. Proves that blowing up shit is a lot more important than story in sandbox games. At least to me, heh. The game is a bit rough around the edges and has a number of minor bugs and glitches, but all the important stuff (controls, gun feedback, immersion, frequency of oh-shit moments) is much better than in GTA (IMHO of course). Battling a group of 3 or 4 heavy tanks in an army-occupied city only with RPGs and C4 is an absolutely exhausting experience, but so much fun with buildings crumbling left and right, tank shells and RPGs whizzing by and the distinctive sound of distant sniper fire over the general combat noise. Of course the player could opt to level the entire city with a few air strikes, but that would cost a lot of civilian lives and wouldn't be well received by the Guerilla. And besides, air strikes are freaking expensive ;)