Santiago Orgaz's blog about xNormal, CGI and 3D graphics

Saturday, May 16, 2009

xn 3.16.10 / xn4.0 alpha 1 soon

xNormal 3.16.10 has been released. It solves the AO/bent faceted/hard edges problems as well as other bugs.

Also, I plan to release the xn 4.0 Alpha 1 this summer for linux, Windows, MacOSX and Solaris/OpenSolaris. The objectives for this alpha version are:

1. To test the portability and catch potential platform problems early.
2. To show the new native look-and-feel standard UI ( and not that ugly "thing" I created for xn3 ).
3. To implement basic realtime custom shaders ( GLSL for now ) + a functional 3D viewport.
4. To implement basic offline custom shaders ( normal mapping for now ).
5. To implement a basic multithreaded tool ( the HM2NM probably ).

Stay tuned!

Monday, April 20, 2009

xNormal 3.16.8 released


xNormal 3.16.8 released:

- Added 3dsmax 2010 support.
- Improved rendering speed for bent normals
- Solved a lot of bugs ( Maya tangents, DX10 starfield, weld vertices, ps i|10n plugin folder, etc.. ).

Next step: June, xNormal 4.0 Alpha 1 ( Windows, linux... and perhaps MacOSX/Solaris ) + perhaps a hw-accelerated ray tracing surprise :p

Friday, March 27, 2009

Raytracing: what's coming up!

Ok, today is Intel's Larrabee ISA presentation at GDC 2009. Here's a good summary:


1. Larrabee is a discrete GPU aimed at competing with the NVIDIA GeForce 2XX/3XX and the ATI Radeon 4XXX/5XXX. Although it's oriented to rasterization ( it's gonna be DirectX10/11 and OpenGL3 compatible ), it has a very interesting GPGPU architecture which makes it unique.... and also a very good thing: if DirectX 12 appears, it will support it because Larrabee uses a software renderer internally!... so it could be improved just by updating its drivers or firmware.


2. Larrabee is based on Intel's x86 SSE and many-core architecture. Well, it's slightly different... it has 512-bit registers ( 16 single precision floats, 8 double precision ). In comparison, a Core 2/i7 uses just 128 bits. Larrabee's SSE also has more advanced features like mask registers, improved conditionals/branching, a recursive function call stack and /16/24/32/48 cores. It also executes many more SSE instructions per cycle than a Core i7 ( I think 160 vs 8 ). It's rumored to have 2 Teraflops of raw power ( like 2 or 3 Radeon 4870s ).

You can see more info about the LRBni SSE instructions here:
http://software.intel.com/en-us/articles/prototype-primitives-guide/

typedef struct { float v[16]; } _M512;
typedef struct { double v[8]; } _M512D;
typedef struct { int v[16]; } _M512I;
typedef unsigned short __mmask;

MADDN132_{PS,PD} – Multiply, Add and Negate Vectors
Performs an element-by-element multiplication between vector v1 and vector v3, adds the result to vector v2, and negates the sum.
_M512 _mm512_madd132_ps(_M512 v1, _M512 v2, _M512 v3)
_M512 _mm512_mask_madd132_ps(_M512 v1, __mmask k1, _M512 v2, _M512 v3)
_M512D _mm512_madd132_pd(_M512D v1, _M512D v2, _M512D v3)
_M512D _mm512_mask_madd132_pd(_M512D v1, __mmask k1, _M512D v2, _M512D v3)
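
To make the description above a bit more concrete, here is a minimal scalar sketch in Python of what the text says happens in each of the 16 lanes. This is not Intel's code: the names `maddn132` and `mask_maddn132` are hypothetical, and the idea that masked-off lanes simply keep the old value of v1 is an assumption on my side.

```python
LANES = 16  # a 512-bit register holds 16 single-precision floats

def maddn132(v1, v2, v3):
    """Per lane: multiply v1 by v3, add v2, then negate the sum."""
    return [-(a * c + b) for a, b, c in zip(v1, v2, v3)]

def mask_maddn132(v1, k1, v2, v3):
    """Masked variant: bit i of k1 says whether lane i is computed.
    Assumption: lanes whose mask bit is 0 keep the old value of v1."""
    full = maddn132(v1, v2, v3)
    return [full[i] if (k1 >> i) & 1 else v1[i] for i in range(LANES)]

v1 = [float(i) for i in range(LANES)]
v2 = [1.0] * LANES
v3 = [2.0] * LANES

result = maddn132(v1, v2, v3)                 # lane i holds -(i * 2 + 1)
masked = mask_maddn132(v1, 0x0003, v2, v3)    # only lanes 0 and 1 are computed
```

The interesting part is the mask: a single 16-bit value decides per lane whether the operation applies, which is what makes branching inside a vectorized loop cheap.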

3. Larrabee has a good cache architecture: each core has 256KB of cache, connected by a 1024-bit ( 512 x 2 ) bi-directional ring bus. The cache is good to hide the video memory latency and also to reduce the programming complexity. It must be big to allow fast traversal of the ray tracing structures!


4. Larrabee has a complete virtual memory system ( like a CPU ). This is good to manage a scene of zillions of polygons without getting a nice "out of memory" error.

5. Larrabee is made in 32nm using a base frequency of 2.5GHz... In comparison, some Radeons have severe problems passing 1GHz at 40nm. The estimated TDP is 300W ... so Larrabee could use a very efficient and innovative cooling system.. like this ionic wind one :p



Or gallium metal ones, like they did in this ATI X850 experiment:


Or the ones explained here:
http://www.electronics-cooling.com/articles/2005/2005_nov_article2.php

6. Larrabee could use low-voltage 7Ghz GDDR5 memory.


7. Larrabee is gonna be very easy to program via Intel Compiler, Thread Blocks, OpenMP, VTune/GPA. Here is a screenshot of their graphics performance analyzer profiler being adapted for Larrabee:


8. Intel bought Project Offset's company to make the 1st game using Larrabee. Here is a preview :p




Sooooooooo.... I have only good words for Larrabee ( on paper )... I see lots of possibilities for ray tracing using it! I can't wait to see a photo of the PCB!

On the other hand, NVIDIA is busy these days preparing a new thing called NViRT ( NVIDIA Ray tracing API ). It's the API used to run this Siggraph 2008 scene using CUDA ray tracing:

It's almost ready. You can find a preview here:
http://realtimerendering.com/downloads/NVIRT-Overview.pdf

It seems very good, specialized and fast. I can't wait to use it in xNormal.

Meanwhile, a new ray tracing "actor" has appeared on the scene: Caustic Graphics ( www.caustic.com ). This is their hardware-accelerated graphics card, called CausticOne:

It's just a prototype for developers... which could explain the strange SO-DIMMs and JTAG connectors.

Here's a video showing the real PCB:



Its API is very versatile and it can be used for both realtime applications and offline renderers.
I'm pretty sure I would be able to improve the xNormal speed with that a lot!
Well... we need to wait to see more.

I must also mention an incredible discovery made this week... A graphene oscillator.
Graphene is just a form of carbon discovered recently ( I think in 2004 ) which has very interesting electrical and thermal properties:

It's just a layer of atoms of carbon as you can see :p


The famous "carbon nanotubes" are made of graphene :


But it's the strongest material ever discovered ( a thin film of it can stop a bullet ), it's harder than diamond, it's cheap, it can be used to protect against heat and cold ( much more than ceramic materials ), it can be a superconductor or a semiconductor... and, now, it can oscillate like quartz but up to 1THz!

Here's a picture of the prototype:

The authors claim that this technology could be implemented in two years for desktop computers! Imagine a CPU of 1 terahertz ( 1000GHz ) consuming 2 Watts!

You can read more about this amazing discovery here:
http://web.mit.edu/newsoffice/2009/graphene-palacios-0319.html

Of course, with a 1THz CPU I would be able to accelerate the ray tracing for xNormal a lot :p

Soooooooooo... all this news is good news for the future of ray tracing!

Tuesday, March 17, 2009

xNormal 3.16.7 released



xNormal 3.16.7 is available for download.

  • The software AO rendering is now 200% faster.
  • Solved a bug that could cause the Simple AO tool to hang when the user enables the “CPU rendering” option.
  • Now, by default, the mesh cage will be created using averaged vertex normals.
  • Added the “break” and “weld” buttons to the 3D viewer's cage editor.
  • Added some words about mesh T-junctions in the documentation.
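
For readers wondering what a cage built from averaged vertex normals looks like in practice, here is a minimal Python sketch. It is not xNormal's implementation: `build_cage` and its helpers are hypothetical names, and the plain unweighted average of the face normals is an assumption ( a real tool may weight them by angle or area ).

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v) if length > 0.0 else (0.0, 0.0, 0.0)

def face_normal(p0, p1, p2):
    # Cross product of two triangle edges, normalized.
    e1 = tuple(b - a for a, b in zip(p0, p1))
    e2 = tuple(b - a for a, b in zip(p0, p2))
    return normalize((e1[1] * e2[2] - e1[2] * e2[1],
                      e1[2] * e2[0] - e1[0] * e2[2],
                      e1[0] * e2[1] - e1[1] * e2[0]))

def build_cage(vertices, triangles, distance):
    """Push every vertex outwards along the average of its face normals."""
    sums = [[0.0, 0.0, 0.0] for _ in vertices]
    for i0, i1, i2 in triangles:
        n = face_normal(vertices[i0], vertices[i1], vertices[i2])
        for idx in (i0, i1, i2):
            for axis in range(3):
                sums[idx][axis] += n[axis]
    cage = []
    for pos, total in zip(vertices, sums):
        n = normalize(total)
        cage.append(tuple(p + distance * c for p, c in zip(pos, n)))
    return cage

# Two triangles meeting at a hard 90-degree edge ( vertices 0 and 1 are shared ).
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
triangles = [(0, 1, 2), (0, 3, 1)]
cage = build_cage(vertices, triangles, 0.5)
```

Because the shared vertices move along one averaged direction instead of one direction per face, the cage stays closed along hard edges and the rays don't leave gaps there.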

Tuesday, March 10, 2009

The future of realtime graphics

What could the future of next-next-generation realtime graphics look like? Well... somebody could say "mix rasterization and ray-tracing using triangular/quad patches displacement mapping". Ok, there's nothing wrong with that... but I think the future could be a bit different :D

I think the future could be a sparse octree of VOXELS.
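
To show what "sparse" means here, this is a minimal Python sketch of such a structure. It's only an illustration under my own assumptions: the `SparseOctree` class and its dictionary-based nodes are hypothetical and have nothing to do with any real engine.

```python
class SparseOctree:
    """Sparse voxel octree: child nodes exist only where voxels exist."""

    def __init__(self, depth):
        self.depth = depth      # the grid is 2**depth voxels per side
        self.root = {}          # node = dict mapping child index (0-7) to child
        self.node_count = 1

    @staticmethod
    def _child_index(x, y, z, level):
        # One bit of each coordinate selects one of the 8 octants.
        return ((((x >> level) & 1) << 2) |
                (((y >> level) & 1) << 1) |
                ((z >> level) & 1))

    def insert(self, x, y, z, value):
        node = self.root
        for level in range(self.depth - 1, -1, -1):
            child = self._child_index(x, y, z, level)
            if level == 0:
                node[child] = value         # leaf: store the voxel payload
            else:
                if child not in node:
                    node[child] = {}        # allocate interior node on demand
                    self.node_count += 1
                node = node[child]

    def lookup(self, x, y, z):
        node = self.root
        for level in range(self.depth - 1, -1, -1):
            child = self._child_index(x, y, z, level)
            if child not in node:
                return None                 # empty space costs nothing
            node = node[child]
        return node

tree = SparseOctree(depth=12)               # addresses a 4096 x 4096 x 4096 grid
tree.insert(0, 0, 0, (255, 0, 0, 255))
tree.insert(4095, 4095, 4095, (0, 255, 0, 255))
```

Two voxels in a 4096-per-side grid cost only a couple dozen nodes here, while empty regions are never allocated at all.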

In the past, some games like Comanche 3 and Outcast used voxels with very good results:







Remember we used a 486 to play them... now we have much more powerful computers...

Recently, Jon Olick ( Zelex ) from id Software demonstrated a realtime rendering of a sparse octree of voxels:





Also you can see the always funny Voxelstein 3D:



Crysis is using voxels to render terrain holes and caves:




A product called "Unlimited Detail" is experimenting with a voxel engine too:


http://www.tkarena.com/Articles/tabid/59/ctl/ArticleView/mid/382/articleId/38/Death-of-the-GPU-as-we-Know-It.aspx

I'm also writing a voxel renderer ( using OpenCL ) with global illumination and radiosity for the xNormal 4's realtime viewer :D

The main problem with this technique is the amount of data to manage: a 4k x 4k x 4k 3D space holds about 64 billion voxels, so it occupies 64GB at just one byte per voxel. You need to multiply that by 7 bytes ( 1 voxel = 1 RGBA color + 1 normal ). So that's 448GB!
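
The estimate is easy to check with a few lines of Python. The sketch assumes that "4k" means 4096 voxels per side, that sizes are counted in binary gigabytes, and that a voxel takes 7 bytes ( 4 for RGBA plus 3 for the normal ); `voxel_grid_bytes` is just a helper name I made up. Whatever the exact rounding, a dense grid lands in the hundreds of gigabytes.

```python
def voxel_grid_bytes(side, bytes_per_voxel):
    """Size of a dense, uncompressed cubic voxel grid."""
    return side ** 3 * bytes_per_voxel

GIB = 1024 ** 3
side = 4096                                       # assumption: 4k = 4096

one_byte_gib = voxel_grid_bytes(side, 1) / GIB    # 64.0
total_gib = voxel_grid_bytes(side, 7) / GIB       # 448.0
```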

Fortunately, the data can be compressed and streamed from an SSD:



SSDs will be very good to solve that problem because they can reach really amazing read speeds ( almost 1GB/s ) and their access time ( 0.1ms ) rocks!

Once we have a method to render the compressed voxels and can stream them, it would be possible to perform the radiosity computations to achieve physically-correct realtime lighting. This video shows an approximation of that:



The final thing to solve would be animation... but it won't be a problem. Vertex/geometry shaders could be executed on triangular meshes... then the triangles can be converted to voxels on the fly using a simple ( and parallel ) algorithm... and then a sparse octree can be created on the fly.
If you just need some animations or moving objects, a precomputed set of voxels can be used... so you really just need to transform the camera's rays into the local space of the object.
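
The "rays into local space" idea can be sketched like this in Python. It assumes a rigid object whose placement is a rotation plus a translation ( no scaling ), and all the function names are hypothetical. Instead of moving the voxels, the ray is moved with the inverse transform and then traced against the static, precomputed grid.

```python
import math

def make_rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_vec(m, v):
    return tuple(sum(m[r][k] * v[k] for k in range(3)) for r in range(3))

def ray_to_local(origin, direction, rotation, translation):
    """If world = R * local + t, then local = R^T * (world - t).
    Directions are only rotated, never translated."""
    inv_rot = transpose(rotation)   # inverse of a pure rotation is its transpose
    shifted = tuple(o - t for o, t in zip(origin, translation))
    return mat_vec(inv_rot, shifted), mat_vec(inv_rot, direction)

# Object rotated 90 degrees around Z and moved to x = 10.
rotation = make_rotation_z(math.pi / 2)
translation = (10.0, 0.0, 0.0)
local_origin, local_dir = ray_to_local((10.0, 5.0, 0.0), (0.0, -1.0, 0.0),
                                       rotation, translation)
# local_origin is approximately (5, 0, 0) and local_dir approximately (-1, 0, 0)
```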

Finally, to solve the blocky appearance, a special form of trilinear interpolation could be done. See this:

Ok... a bit blocky... but now see what happens if we apply some kind of trilinear interpolation:

Much better, isn't it? More info here:

http://www.cs.utah.edu/~wyman/research/fall2003/index.html
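
I don't detail the "special form" here, so as a reference this is just the standard trilinear interpolation in Python, assuming a dense grid of scalar samples stored as nested lists and coordinates that stay inside the grid ( `trilinear` is a hypothetical helper name ).

```python
def trilinear(grid, x, y, z):
    """Blend the 8 samples around (x, y, z); grid[i][j][k] holds scalars.
    Assumes 0 <= x, y, z <= size - 1 along each axis."""
    x0, y0, z0 = int(x), int(y), int(z)
    x1 = min(x0 + 1, len(grid) - 1)
    y1 = min(y0 + 1, len(grid[0]) - 1)
    z1 = min(z0 + 1, len(grid[0][0]) - 1)
    fx, fy, fz = x - x0, y - y0, z - z0

    def lerp(a, b, t):
        return a + (b - a) * t

    # Interpolate along X on the four edges, then along Y, then along Z.
    c00 = lerp(grid[x0][y0][z0], grid[x1][y0][z0], fx)
    c10 = lerp(grid[x0][y1][z0], grid[x1][y1][z0], fx)
    c01 = lerp(grid[x0][y0][z1], grid[x1][y0][z1], fx)
    c11 = lerp(grid[x0][y1][z1], grid[x1][y1][z1], fx)
    c0 = lerp(c00, c10, fy)
    c1 = lerp(c01, c11, fy)
    return lerp(c0, c1, fz)

# 2 x 2 x 2 grid, empty except for one solid corner.
grid = [[[0.0, 0.0], [0.0, 0.0]],
        [[0.0, 0.0], [0.0, 1.0]]]
center = trilinear(grid, 0.5, 0.5, 0.5)
corner = trilinear(grid, 1.0, 1.0, 1.0)
```

Sampling the density like this ( instead of taking the nearest voxel ) is what turns the hard cubes into a smooth surface when the ray looks for the threshold crossing.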

Sooooooooooo.... I think the future is voxels. They offer good detail. With streaming and compression we could render infinite detail. They render fast.
They are compatible with transparent surfaces. You can use raycasting, raytracing and global illumination. They can be animated, you can use constructive solid geometry to apply booleans... Imagine a 100% destructible environment... They HAVE LOTS OF POSSIBILITIES, that's for sure!

Friday, February 06, 2009

xNormal 3.16.6

xNormal 3.16.6 has been released.

Some bugs were fixed:

- OpenGL ".0f" error in the Simple GPU AO tool.
- Misalignment affecting AO, normal maps, cavity and bent normals.
- Minor ones.

I also added a useful new parameter ( bias ) to the HM2NM tool for radiosity normal mapping.

Saturday, January 10, 2009

3D r.e.d.e.f.i.n.e.d

The following post is not necessarily related to xNormal... but you know I like to talk about other things different than xNormal sometimes.

I like films. I read some time ago about James Cameron's next film, called Avatar. I think James always tries to innovate and to surprise us. He did it with Aliens, Terminator and Titanic. These last two used very innovative CGI effects... and, of course, all of them were excellent movies! But his next project just makes me drool! Avatar. Avatar. Avatar! ( a sci-fi action movie )

Keep this name in mind because it's gonna S.H.O.C.K you. Movies will never be the same! Before explaining why, let me mention a related thing ... and you'll see why I must mention it now:

Today NVIDIA presented its 3D glasses at CES ( I know, ELSA had something similar some time ago for the NVIDIA cards too ... but these ones are much improved ):


As you can see, these are a pair of active-shutter glasses, not the old blue-red ones ( do you remember Biff's gang in Back to the Future? :D ). To achieve the 3D effect correctly you'll also need a special 120Hz LCD ( so each eye gets its own image at 60Hz ). There are also some 3D monitors which generate the effect automatically, but you need to be exactly in front of them or the 3D effect will be lost... I think the glasses are a better solution because they allow you to move and turn your head with more freedom.

More info here:
http://www.fudzilla.com/index.php?option=com_content&task=view&id=11324&Itemid=1

By the way, DisplayPort 1.2 is gonna include native hardware support for stereoscopy ( two different video signals, one for the right eye and another for the left one ).

That technology is very interesting. I don't know if you can see the 3D effect using simple cross-eye image pairs ( some people simply cannot without getting a nice headache ) ... but it's amazing if you can overlap these images:


You can find more of these wonderful photographs at

http://digital-photography-school.com/blog/9-crazy-cross-eye-3d-photography-images-and-how-to-make-them/

Why am I telling you all this? Because Avatar is gonna use this kind of 3D technology! ( although I don't know if NVIDIA is involved in the Avatar game ... probably not ). See this video with a very good explanation of the cameras which are being used to make Avatar and why you'll need a pair of polarized glasses:



The idea is to use two digital cameras... one is gonna film for the right eye and the other for the left eye. Then, the image will be combined by the cinema's projector or a 120Hz monitor in the case of a PC. To achieve that, a pair of polarized glasses is needed... so the incoming light can be directed to your right or your left eye. The red-blue glasses only generate ugly and cheesy artifacts! Polarized ones are much better in my opinion.

The only problem is that the cinema's projector needs to be adapted to this new technology... they will use two independent projectors: one will project the left eye frames and the other the ones for the right eye. The frames will overlap on the screen... so the polarized glasses are used to filter the overlapped images for your eyes!
The main question I have about this is: are you going to need to buy your own glasses, or will the cinema provide them? I prefer to buy my own polarized glasses from a site like Amazon:

http://www.amazon.com/TV-Eyes-3-D-Glasses-Watch/dp/B000W9Y294/ref=tag_tdp_sv_edpp_i/188-3984203-8836026

I don't like to touch objects that some stranger has used :p
There's also another problem... Avatar is expected to come out by summer 2009 ( or December 2009... it was delayed to avoid competition from other films like Harry Potter or Star Trek XI and to finish last-minute things ). By then, the most important cinemas in the world must already be adapted for this amazing new 3D technology ( like IMAX did some time ago ) .. and that costs a lot of money and time!

Now let's talk a bit about the Avatar movie...
Here's a concept sketch showing one of the exotic world's creatures:


And the teaser poster:



The synopsis is something like this: a space marine enters a wormhole and travels to an exotic world full of wildlife. There are some intelligent creatures there enslaving others. He will help the oppressed to fight for their freedom... or something similar! ( the plot is not yet unveiled ... but it reminds me of "The Time Machine" movie )

You can find many more details about the 3D system and the film in this excellent Variety article:


http://www.variety.com/article/VR1117983864.html?categoryid=1043&cs=1

I can't wait to watch it in 3D!

xNormal 3.16.4


xNormal 3.16.4 has been released!

- Added SBM mesh importer/exporter support for Maya 8.5, 2008 and 2009.

- Added an option to see the vector displacement map's seams in the 3D viewer

- Added a new tool: the SBM file converter.

- Added an occluded/unoccluded color to the Simple GPU AO tool.

- Now it's possible to resize the DX9 viewport window. Solved some D3DERR_DEVICE_LOST problems too.

- Solved some bugs ( AO from command line, unsigned height map, unnecessary tangent computation, etc... )

Thursday, November 27, 2008

I like Ubuntu linux !



After several years using proprietary and non-free OSs, I'm going to try something different. xNormal 4.0 is gonna be ( mainly ) programmed using Ubuntu linux. I'm using it now and it's currently the operating system I like the most!

Some reasons:

- It's free. We're in the middle of an economic crisis ( read well... crisis... not crysis :p ). Anything that saves money is welcome. You can choose to pay lots of money for an operating system or just download a linux distro for free. I think the choice is clear... linux wins!

If you use a Beowulf cluster then this can help you to save lots of money. I plan to improve the distributed renderer for xNormal 4.0. A company could build a small Beowulf with old, obsolete computers ( P4s, Athlon XPs, etc... ) to create a nice and cheap rendering farm. If you put linux there it will be even cheaper because you'll save like 20 or 30 OS licenses!

People think that linux has higher maintenance costs than other OSs... but I think today they are more or less the same. We moved from a command line/text file-governed linux to a visual, dialog-based one... with Ubuntu it's much easier to set up and configure everything.

- It's easy to use. Remember Ubuntu's slogan..... Ubuntu is "linux for human beings". You won't need to use the command line/configuration files as happens with other, more complicated linux distros. Almost everything in Ubuntu is dialog-based, easy to understand, fast to find and intuitive. You can share files across the network, manage the screen, change file permissions or alter some settings with just a few clicks.

- It's secure. There are almost no viruses/trojans due to the OS structure ( and also because there is a smaller user base... let's be objective ). You also have a good network firewall, a free antivirus and a friendly administrator security system ( click/password confirmation.. like UAC but better ).

- It runs like a charm. With the exception of very rare hardware, I think almost all the "common" hardware should work. I had no problems with SerialATA/USB/mouse/keyboard and both NVIDIA and ATI have excellent graphics drivers. If you have a problem you can inform the Ubuntu team using the hardware test tool.

The system uses only 80MB of RAM! It occupies 3-4GB on the hard disk and the installation media is just one CD ( sounds trivial but today it's hard to find an OS with an installer using less than 640MB ). The installation CD is a LiveCD which you can use to test the OS or to boot without having to install it!

And the most important part... you DON'T need a supercomputer to run it! There's no need to buy a new CPU, motherboard, RAM, graphics card, hard disk ... nothing of that.

- The updating/packaging system is fantastic. It's called Synaptic. You can install, uninstall or upgrade hardware drivers and applications with just a few clicks... and the service is free too! You don't need to subscribe, to pay, to pass a weird piracy test system... none of that!

- There are several desktop systems available. Other OSs are tied to a specific look and UI behavior. You have flavors of Ubuntu with Gnome ( Ubuntu ), KDE ( Kubuntu ) or Xfce ( Xubuntu )... so you can choose the one you like the most. I prefer Gnome but well...

On the other hand, it includes Compiz which is great for 3D desktop effects.... and it's completely superior to anything you've seen.

- Includes the software the user needs. Gimp(to manipulate images), Firefox/flash player/java plugin(to browse the Web), OpenOffice(to write and read documents)/PDF/PPT viewer, Brasero(to burn CDs/DVDs) and Totem(media player). You have also the standard(but useful) set of applications: calculator, text editor, some games, thumb viewer, screensavers, administration tools, etc...

Optionally, you can also download gcc/g++ ( probably one of the best compilers you can find ), Netbeans, Eclipse, wxFormBuilder/wxGlade, JDK, Mono, MySQL, PHP, OpenGL, OpenAL, etc... to develop applications.

There is also a lot of 3rd party software available: Maya, XSI, Quake Wars, Doom3, Unreal Tournament, Second Life, 7Zip, ClamAV, aMule, etc... You can also use Wine to run your Windows programs in linux...

I think porting an application to linux is not very hard if you plan it well and you try to avoid non-portable libraries. So what more could I say? I like Ubuntu and now it's my default desktop OS... and I'm going to develop almost all the code of xNormal 4 with it!

Ubuntu is ready to conquer the desktop and your heart! Give it a chance as I did!

Friday, November 21, 2008

xNormal 3.16.3

xNormal 3.16.3 has been released.

It includes realtime true vector displacement mapping for the DX10 graphics driver.

This is an example using just normal mapping:


And this is with vector displacement mapping applied:


It's a convincing trick, isn't it? :D

This technique works with DX9 SM3.0 or DX10 and consumes much less video memory than a statically-tessellated mesh. It's suitable for animation too. As you can see in this image, I'm using micropolygons to render:


There is no trick like parallax mapping, relief mapping or prisms.... it's true vector displacement mapping using the ZBuffer and realtime tessellation. By the way... the sphere has 162 faces and it's using a 512x512 VDM, so it occupies about 1MB in memory ( vs the 200MB of an 18M poly statically-tessellated mesh... ) and runs very fast on a modest GF8500GT.
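
For those who haven't seen vector displacement before, the core operation can be sketched in a few lines of Python. This is not the shader xNormal uses: `displace` and `vdm_texture_bytes` are hypothetical names, and I'm assuming a tangent-space VDM and 4 bytes per texel just to illustrate the idea and the memory figure.

```python
def displace(position, tangent, bitangent, normal, vdm_sample, scale=1.0):
    """Move a tessellated vertex by a full 3D vector read from the map.
    Assumption: the vector is stored in tangent space ( T, B, N basis )."""
    dx, dy, dz = vdm_sample
    return tuple(
        p + scale * (dx * t + dy * b + dz * n)
        for p, t, b, n in zip(position, tangent, bitangent, normal)
    )

def vdm_texture_bytes(width, height, bytes_per_texel):
    return width * height * bytes_per_texel

# A vertex on the +X side of a sphere, pushed out and also sideways.
new_pos = displace(position=(1.0, 0.0, 0.0),
                   tangent=(0.0, 1.0, 0.0),
                   bitangent=(0.0, 0.0, 1.0),
                   normal=(1.0, 0.0, 0.0),
                   vdm_sample=(0.5, 0.0, 0.25))

map_bytes = vdm_texture_bytes(512, 512, 4)   # assumption: 4 bytes per texel
```

A classic height map could only move the vertex along the normal; the sideways component is what allows overhangs and undercuts.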

Here's the video:


Currently I'm using a brute force approach, but I'm gonna make it adaptive using the distance to the camera.
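
One possible way to make it adaptive, sketched in Python. This is purely illustrative and not what the next version will necessarily do: the function name, the default values and the rule "drop one subdivision level every time the distance doubles" are all assumptions.

```python
import math

def tessellation_level(distance, max_level=6, full_detail_distance=1.0):
    """Pick a subdivision level from the distance to the camera.
    Hypothetical rule: lose one level each time the distance doubles."""
    if distance <= full_detail_distance:
        return max_level
    drop = int(math.log2(distance / full_detail_distance))
    return max(0, max_level - drop)

levels = [tessellation_level(d) for d in (0.5, 1.0, 2.0, 4.0, 64.0, 1000.0)]
```

Near patches keep the full micropolygon density, while far ones fall back to the coarse base mesh, where the extra triangles would be smaller than a pixel anyway.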

The 3.16.3 also corrects some bugs ( fixed problems with the command line, cavity+matchUVs , Simple GPU AO cosine modes and Windows 2000 ).