from: http://jacksondunstan.com/articles/1864
Speed Up Alpha Textures With Stage3D By 4x
Now that we know how to use textures with an alpha channel in rendering Stage3D scenes, let’s see if we can cut the performance cost so we can use them more often. Today’s article will show some tricks to optimize your rendering loop.
The following test app started with the test app from last time and has some modifications made to it:
- New option to switch between “original” and “fast” sorting
- Rendering is now in two stages. First, view frustum culling is done to form a Vector of visible objects. Second, the visible objects are drawn.
- Opaque texture and sorting options removed for simplicity’s sake
- enableErrorChecking no longer set on the Context3D
The “fast” sorting option is at the heart of this article’s optimization. You’ll recall that using alpha textures necessitates a back-to-front sort of the 3D objects in the scene. There are two ways that the “fast” sorting option speeds this up:
- Use Skyboy’s fastSort rather than Vector.sort to sort the 3D objects on a cached “distance from camera” field of the cube
- Sort only the 3D objects that pass the view frustum culling step. Don’t bother sorting objects that will never be drawn.
Both of these are important optimizations, but the second is the major algorithmic change. Here’s the difference between the “original” and “fast” sorts: (pseudo-code)
///////////
// Original
///////////

// Sort all cubes
allCubes.sort(backToFront);

// Draw all cubes that are in the view frustum
for each (cube in allCubes)
{
	if (cube.isInViewFrustum())
	{
		draw(cube);
	}
}

///////
// Fast
///////

// Make a list of all cubes that are in the view frustum
visibleCubes = [];
for each (cube in allCubes)
{
	if (cube.isInViewFrustum())
	{
		visibleCubes.push(cube);
	}
}

// Sort just those cubes
visibleCubes.sort(backToFront);

// Draw the sorted, visible cubes
for each (cube in visibleCubes)
{
	draw(cube);
}
There are two main “wins” here. First, sorting fewer 3D objects is clearly going to be faster. Second, good comparison-based sorting algorithms take on the order of N * log2(N) steps, where N is the number of objects to sort. That grows faster than N itself, so each 3D object being sorted adds more than one step to the sort, and culling before sorting matters more and more as the number of 3D objects increases.
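To put rough numbers on that, here is a worked example under an idealized cost model of exactly N * log2(N) steps. It is not a measurement of either sort function. The first row is the full 32x32x32 grid from the test app; the second is a hypothetical camera angle that culls all but one eighth of the cubes.

```latex
\begin{align*}
N = 32768 &: \quad N \log_2 N = 32768 \times 15 = 491520 \\
N = 4096  &: \quad N \log_2 N = 4096 \times 12 = 49152
\end{align*}
```

In this model, sorting one eighth as many cubes takes one tenth as many steps, so the savings from culling first are better than proportional.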
Now let’s take a look at the test app:
package
{
	import skyboy.utils.fastSort;

	import com.adobe.utils.*;

	import flash.display.*;
	import flash.display3D.*;
	import flash.display3D.textures.*;
	import flash.events.*;
	import flash.geom.*;
	import flash.text.*;
	import flash.utils.*;

	/**
	 * Test of faster ways of drawing alpha textures with Stage3D
	 * @author Jackson Dunstan, http://JacksonDunstan.com
	 */
	public class FasterAlphaTextures extends Sprite
	{
		/** UI Padding */
		private static const PAD:Number = 5;

		/** Number of cubes per dimension (X, Y, Z) */
		private static const NUM_CUBES:int = 32;

		/** Number of total cubes */
		private static const NUM_CUBES_TOTAL:int = NUM_CUBES*NUM_CUBES*NUM_CUBES;

		/** Positions of all cubes' vertices */
		private static const POSITIONS:Vector.<Number> = new <Number>[
			// back face - bottom tri
			-0.5, -0.5, -0.5,
			-0.5, 0.5, -0.5,
			0.5, -0.5, -0.5,
			// back face - top tri
			-0.5, 0.5, -0.5,
			0.5, 0.5, -0.5,
			0.5, -0.5, -0.5,

			// front face - bottom tri
			-0.5, -0.5, 0.5,
			-0.5, 0.5, 0.5,
			0.5, -0.5, 0.5,
			// front face - top tri
			-0.5, 0.5, 0.5,
			0.5, 0.5, 0.5,
			0.5, -0.5, 0.5,

			// left face - bottom tri
			-0.5, -0.5, -0.5,
			-0.5, 0.5, -0.5,
			-0.5, -0.5, 0.5,
			// left face - top tri
			-0.5, 0.5, -0.5,
			-0.5, 0.5, 0.5,
			-0.5, -0.5, 0.5,

			// right face - bottom tri
			0.5, -0.5, -0.5,
			0.5, 0.5, -0.5,
			0.5, -0.5, 0.5,
			// right face - top tri
			0.5, 0.5, -0.5,
			0.5, 0.5, 0.5,
			0.5, -0.5, 0.5,

			// bottom face - bottom tri
			-0.5, -0.5, 0.5,
			-0.5, -0.5, -0.5,
			0.5, -0.5, 0.5,
			// bottom face - top tri
			-0.5, -0.5, -0.5,
			0.5, -0.5, -0.5,
			0.5, -0.5, 0.5,

			// top face - bottom tri
			-0.5, 0.5, 0.5,
			-0.5, 0.5, -0.5,
			0.5, 0.5, 0.5,
			// top face - top tri
			-0.5, 0.5, -0.5,
			0.5, 0.5, -0.5,
			0.5, 0.5, 0.5
		];

		/** Texture coordinates of all cubes' vertices */
		private static const TEX_COORDS:Vector.<Number> = new <Number>[
			// back face - bottom tri
			1, 1,
			1, 0,
			0, 1,
			// back face - top tri
			1, 0,
			0, 0,
			0, 1,

			// front face - bottom tri
			0, 1,
			0, 0,
			1, 1,
			// front face - top tri
			0, 0,
			1, 0,
			1, 1,

			// left face - bottom tri
			0, 1,
			0, 0,
			1, 1,
			// left face - top tri
			0, 0,
			1, 0,
			1, 1,

			// right face - bottom tri
			1, 1,
			1, 0,
			0, 1,
			// right face - top tri
			1, 0,
			0, 0,
			0, 1,

			// bottom face - bottom tri
			0, 0,
			0, 1,
			1, 0,
			// bottom face - top tri
			0, 1,
			1, 1,
			1, 0,

			// top face - bottom tri
			0, 1,
			0, 0,
			1, 1,
			// top face - top tri
			0, 0,
			1, 0,
			1, 1
		];

		/** Triangles of all cubes */
		private static const TRIS:Vector.<uint> = new <uint>[
			2, 1, 0,    // back face - bottom tri
			5, 4, 3,    // back face - top tri
			6, 7, 8,    // front face - bottom tri
			9, 10, 11,  // front face - top tri
			12, 13, 14, // left face - bottom tri
			15, 16, 17, // left face - top tri
			20, 19, 18, // right face - bottom tri
			23, 22, 21, // right face - top tri
			26, 25, 24, // bottom face - bottom tri
			29, 28, 27, // bottom face - top tri
			30, 31, 32, // top face - bottom tri
			33, 34, 35  // top face - top tri
		];

		[Embed(source="flash_logo_alpha.png")]
		private static const TEXTURE:Class;

		private static const TEMP_DRAW_MATRIX:Matrix3D = new Matrix3D();

		private var context3D:Context3D;
		private var vertexBuffer:VertexBuffer3D;
		private var vertexBuffer2:VertexBuffer3D;
		private var indexBuffer:IndexBuffer3D;
		private var program:Program3D;
		private var texture:Texture;
		private var camera:Camera3D;
		private var cubes:Vector.<Cube> = new Vector.<Cube>();

		private var fps:TextField = new TextField();
		private var lastFPSUpdateTime:uint;
		private var lastFrameTime:uint;
		private var frameCount:uint;
		private var driver:TextField = new TextField();
		private var draws:TextField = new TextField();

		private var tempCameraPosX:Number;
		private var tempCameraPosY:Number;
		private var tempCameraPosZ:Number;

		private var fastSorting:Boolean;
		private var visibleCubes:Vector.<Cube> = new <Cube>[];

		public function FasterAlphaTextures()
		{
			stage.align = StageAlign.TOP_LEFT;
			stage.scaleMode = StageScaleMode.NO_SCALE;
			stage.frameRate = 60;

			var stage3D:Stage3D = stage.stage3Ds[0];
			stage3D.addEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			stage3D.requestContext3D(Context3DRenderMode.AUTO);
		}

		protected function onContextCreated(ev:Event): void
		{
			// Setup context
			var stage3D:Stage3D = stage.stage3Ds[0];
			stage3D.removeEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			context3D = stage3D.context3D;
			context3D.configureBackBuffer(
				stage.stageWidth,
				stage.stageHeight,
				0,
				true
			);

			// Setup camera
			camera = new Camera3D(
				0.1, // near
				100, // far
				stage.stageWidth / stage.stageHeight, // aspect ratio
				40*(Math.PI/180), // vFOV
				-6, -8, 6, // position
				0, 0, 0, // target
				0, 1, 0 // up dir
			);

			// Setup cubes
			for (var i:int; i < NUM_CUBES; ++i)
			{
				for (var j:int = 0; j < NUM_CUBES; ++j)
				{
					for (var k:int = 0; k < NUM_CUBES; ++k)
					{
						cubes.push(new Cube(i*2, j*2, -k*2));
					}
				}
			}

			// Setup UI
			fps.background = true;
			fps.backgroundColor = 0xffffffff;
			fps.autoSize = TextFieldAutoSize.LEFT;
			fps.text = "Getting FPS...";
			addChild(fps);

			driver.background = true;
			driver.backgroundColor = 0xffffffff;
			driver.text = "Driver: " + context3D.driverInfo;
			driver.autoSize = TextFieldAutoSize.LEFT;
			driver.y = fps.height;
			addChild(driver);

			draws.background = true;
			draws.backgroundColor = 0xffffffff;
			draws.text = "Getting draws...";
			draws.autoSize = TextFieldAutoSize.LEFT;
			draws.y = driver.y + driver.height;
			addChild(draws);

			var buttonsTopY:Number = makeButtons(
				"Move Forward", "Move Backward", null,
				"Move Left", "Move Right", null,
				"Move Up", "Move Down", null,
				"Yaw Left", "Yaw Right", null,
				"Pitch Up", "Pitch Down", null,
				"Roll Left", "Roll Right"
			);

			var fastSortingCB:Sprite = makeCheckBox(
				"Fast Sorting?:",
				fastSorting,
				onFastSortingChecked
			);
			fastSortingCB.x = PAD;
			fastSortingCB.y = buttonsTopY - fastSortingCB.height - PAD;
			addChild(fastSortingCB);

			var assembler:AGALMiniAssembler = new AGALMiniAssembler();

			// Vertex shader
			var vertSource:String = "m44 op, va0, vc0\nmov v0, va1\n";
			assembler.assemble(Context3DProgramType.VERTEX, vertSource);
			var vertexShaderAGAL:ByteArray = assembler.agalcode;

			// Fragment shader
			var fragSource:String = "tex oc, v0, fs0 <2d,linear,mipnone>";
			assembler.assemble(Context3DProgramType.FRAGMENT, fragSource);
			var fragmentShaderAGAL:ByteArray = assembler.agalcode;

			// Shader program
			program = context3D.createProgram();
			program.upload(vertexShaderAGAL, fragmentShaderAGAL);

			// Setup buffers
			vertexBuffer = context3D.createVertexBuffer(36, 3);
			vertexBuffer.uploadFromVector(POSITIONS, 0, 36);
			vertexBuffer2 = context3D.createVertexBuffer(36, 2);
			vertexBuffer2.uploadFromVector(TEX_COORDS, 0, 36);
			indexBuffer = context3D.createIndexBuffer(36);
			indexBuffer.uploadFromVector(TRIS, 0, 36);

			// Setup textures
			var bmd:BitmapData = (new TEXTURE() as Bitmap).bitmapData;
			texture = context3D.createTexture(
				bmd.width,
				bmd.height,
				Context3DTextureFormat.BGRA,
				true
			);
			texture.uploadFromBitmapData(bmd);

			// Start the simulation
			addEventListener(Event.ENTER_FRAME, onEnterFrame);
		}

		private function makeButtons(...labels): Number
		{
			var curX:Number = PAD;
			var curY:Number = stage.stageHeight - PAD;
			for each (var label:String in labels)
			{
				if (label == null)
				{
					curX = PAD;
					curY -= button.height + PAD;
					continue;
				}

				var tf:TextField = new TextField();
				tf.mouseEnabled = false;
				tf.selectable = false;
				tf.defaultTextFormat = new TextFormat("_sans");
				tf.autoSize = TextFieldAutoSize.LEFT;
				tf.text = label;
				tf.name = "lbl";

				var button:Sprite = new Sprite();
				button.buttonMode = true;
				button.graphics.beginFill(0xF5F5F5);
				button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
				button.graphics.endFill();
				button.graphics.lineStyle(1);
				button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
				button.addChild(tf);
				button.addEventListener(MouseEvent.CLICK, onButton);
				if (curX + button.width > stage.stageWidth - PAD)
				{
					curX = PAD;
					curY -= button.height + PAD;
				}
				button.x = curX;
				button.y = curY - button.height;
				addChild(button);

				curX += button.width + PAD;
			}

			return curY - button.height;
		}

		public static function makeCheckBox(
			label:String,
			checked:Boolean,
			callback:Function,
			labelFormat:TextFormat=null): Sprite
		{
			var sprite:Sprite = new Sprite();

			var tf:TextField = new TextField();
			tf.autoSize = TextFieldAutoSize.LEFT;
			tf.text = label;
			tf.background = true;
			tf.backgroundColor = 0xffffff;
			tf.selectable = false;
			tf.mouseEnabled = false;
			tf.setTextFormat(labelFormat || new TextFormat("_sans"));
			sprite.addChild(tf);

			var size:Number = tf.height;

			var background:Shape = new Shape();
			background.graphics.beginFill(0xffffff);
			background.graphics.drawRect(0, 0, size, size);
			background.x = tf.width + PAD;
			sprite.addChild(background);

			var border:Shape = new Shape();
			border.graphics.lineStyle(1, 0x000000);
			border.graphics.drawRect(0, 0, size, size);
			border.x = background.x;
			sprite.addChild(border);

			var check:Shape = new Shape();
			check.graphics.lineStyle(1, 0x000000);
			check.graphics.moveTo(0, 0);
			check.graphics.lineTo(size, size);
			check.graphics.moveTo(size, 0);
			check.graphics.lineTo(0, size);
			check.x = background.x;
			check.visible = checked;
			sprite.addChild(check);

			sprite.addEventListener(
				MouseEvent.CLICK,
				function(ev:MouseEvent): void
				{
					checked = !checked;
					check.visible = checked;
					callback(checked);
				}
			);

			return sprite;
		}

		private function onButton(ev:MouseEvent): void
		{
			var mode:String = ev.target.getChildByName("lbl").text;
			switch (mode)
			{
				case "Move Forward":
					camera.moveForward(1);
					break;
				case "Move Backward":
					camera.moveBackward(1);
					break;
				case "Move Left":
					camera.moveLeft(1);
					break;
				case "Move Right":
					camera.moveRight(1);
					break;
				case "Move Up":
					camera.moveUp(1);
					break;
				case "Move Down":
					camera.moveDown(1);
					break;
				case "Yaw Left":
					camera.yaw(-10);
					break;
				case "Yaw Right":
					camera.yaw(10);
					break;
				case "Pitch Up":
					camera.pitch(-10);
					break;
				case "Pitch Down":
					camera.pitch(10);
					break;
				case "Roll Left":
					camera.roll(10);
					break;
				case "Roll Right":
					camera.roll(-10);
					break;
			}
		}

		private function onFastSortingChecked(checked:Boolean): void
		{
			fastSorting = !fastSorting;
		}

		private function sortByCameraDistance(a:Cube, b:Cube): int
		{
			var deltaX:Number = a.posX - tempCameraPosX;
			var deltaY:Number = a.posY - tempCameraPosY;
			var deltaZ:Number = a.posZ - tempCameraPosZ;
			var aDist:Number = deltaX*deltaX + deltaY*deltaY + deltaZ*deltaZ;

			deltaX = b.posX - tempCameraPosX;
			deltaY = b.posY - tempCameraPosY;
			deltaZ = b.posZ - tempCameraPosZ;
			var bDist:Number = deltaX*deltaX + deltaY*deltaY + deltaZ*deltaZ;

			return bDist - aDist;
		}

		private function sortFast(): void
		{
			// Cache camera position
			tempCameraPosX = camera.positionX;
			tempCameraPosY = camera.positionY;
			tempCameraPosZ = camera.positionZ;

			// Only add cubes that pass frustum culling to visible list
			var numVisibleCubes:int;
			visibleCubes.length = 0;
			for each (var cube:Cube in cubes)
			{
				if (camera.isSphereInFrustum(cube.sphere))
				{
					visibleCubes[numVisibleCubes++] = cube;

					// Compute distance of cube to camera
					var deltaX:Number = cube.posX - tempCameraPosX;
					var deltaY:Number = cube.posY - tempCameraPosY;
					var deltaZ:Number = cube.posZ - tempCameraPosZ;
					cube.camDist = deltaX*deltaX + deltaY*deltaY + deltaZ*deltaZ;
				}
			}

			// Sort all visible cubes
			fastSort(visibleCubes, "camDist", Array.NUMERIC);
		}

		private function sortOriginal(): void
		{
			// Sort all cubes
			tempCameraPosX = camera.positionX;
			tempCameraPosY = camera.positionY;
			tempCameraPosZ = camera.positionZ;
			cubes.sort(sortByCameraDistance);

			// Only add cubes that pass frustum culling to visible list
			var numVisibleCubes:int;
			visibleCubes.length = 0;
			for each (var cube:Cube in cubes)
			{
				if (camera.isSphereInFrustum(cube.sphere))
				{
					visibleCubes[numVisibleCubes++] = cube;
				}
			}
		}

		private function onEnterFrame(ev:Event): void
		{
			// Set up rendering
			context3D.setProgram(program);
			context3D.setVertexBufferAt(0, vertexBuffer, 0, Context3DVertexBufferFormat.FLOAT_3);
			context3D.setVertexBufferAt(1, vertexBuffer2, 0, Context3DVertexBufferFormat.FLOAT_2);
			context3D.setTextureAt(0, texture);
			context3D.clear(0.5, 0.5, 0.5);
			context3D.setBlendFactors(
				Context3DBlendFactor.SOURCE_ALPHA,
				Context3DBlendFactor.ONE_MINUS_SOURCE_ALPHA
			);

			// Cull and sort
			var beforeCullingTime:int = getTimer();
			if (fastSorting)
			{
				sortFast();
			}
			else
			{
				sortOriginal();
			}
			var afterCullingTime:int = getTimer();

			// Draw visible cubes
			var worldToClip:Matrix3D = camera.worldToClipMatrix;
			var drawMatrix:Matrix3D = TEMP_DRAW_MATRIX;
			var numDraws:int;
			for each (var cube:Cube in visibleCubes)
			{
				cube.mat.copyToMatrix3D(drawMatrix);
				drawMatrix.prepend(worldToClip);
				context3D.setProgramConstantsFromMatrix(
					Context3DProgramType.VERTEX,
					0,
					drawMatrix,
					false
				);
				context3D.drawTriangles(indexBuffer, 0, 12);
				numDraws++;
			}
			context3D.present();

			// Update stat displays
			draws.text = "Draws: " + numDraws + " / " + NUM_CUBES_TOTAL
				+ " (" + (100*(numDraws/NUM_CUBES_TOTAL)).toFixed(1) + "%)\n"
				+ "Culling Time: " + (afterCullingTime-beforeCullingTime);

			frameCount++;
			var now:int = getTimer();
			var elapsed:int = now - lastFPSUpdateTime;
			if (elapsed > 1000)
			{
				var framerateValue:Number = 1000 / (elapsed / frameCount);
				fps.text = "FPS: " + framerateValue.toFixed(1);
				lastFPSUpdateTime = now;
				frameCount = 0;
			}
			lastFrameTime = now;
		}
	}
}

import flash.geom.*;

class Cube
{
	private static var NEXT_ID:int = 0;

	public var id:int = NEXT_ID++;
	public var posX:Number;
	public var posY:Number;
	public var posZ:Number;
	public var mat:Matrix3D;
	public var sphere:Vector3D;
	public var camDist:Number;

	public function Cube(x:Number, y:Number, z:Number)
	{
		posX = x;
		posY = y;
		posZ = z;

		mat = new Matrix3D(
			new <Number>[
				1, 0, 0, x,
				0, 1, 0, y,
				0, 0, 1, z,
				0, 0, 0, 1
			]
		);

		sphere = new Vector3D(x, y, z, 2);
	}
}
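The listing calls camera.isSphereInFrustum(cube.sphere), but the Camera3D class is not part of this listing. As a rough idea of what such a test usually does, here is a minimal sketch of a sphere-versus-planes check. FrustumSketch, its planes parameter, and the plane encoding are assumptions made for illustration, not the actual Camera3D code. The sphere layout (center in x, y, z and radius in w) matches how Cube builds its sphere.

```actionscript
package
{
	import flash.geom.Vector3D;

	/**
	 * Hypothetical helper for illustration only; not the article's Camera3D
	 */
	public class FrustumSketch
	{
		/**
		 * @param planes Six inward-facing frustum planes (assumed encoding):
		 *               (x, y, z) is the unit normal and w is the offset, so
		 *               a point p is inside when normal.dot(p) + w >= 0
		 * @param sphere Center in (x, y, z) and radius in w, like Cube.sphere
		 * @return false if the sphere is entirely behind any one plane
		 */
		public static function isSphereInFrustum(
			planes:Vector.<Vector3D>,
			sphere:Vector3D): Boolean
		{
			for each (var plane:Vector3D in planes)
			{
				// Signed distance from the sphere's center to the plane
				// (Vector3D.dotProduct only uses x, y, and z)
				var dist:Number = plane.dotProduct(sphere) + plane.w;

				// Farther behind the plane than its radius: not visible
				if (dist < -sphere.w)
				{
					return false;
				}
			}
			return true;
		}
	}
}
```

A check like this is conservative: a sphere just outside a corner of the frustum can still pass, which only costs an occasional extra draw and never hides a visible cube.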
I ran this test app in the following environment:
- Flex SDK (MXMLC) 4.6.0.23201, compiling in release mode (no debugging or verbose stack traces)
- Release version of Flash Player 11.2.202.235
- 2.4 GHz Intel Core i5
- Mac OS X 10.7.4
- NVIDIA GeForce GT 330M 256 MB
And here are the results I got:
Cubes Drawn | Original Time (ms) | Fast Time (ms) |
32768 | 53 | 30 |
0 | 40 | 10 |
These two tests show the two optimizations in full effect. When all of the cubes are visible (first test), both approaches end up sorting all the cubes since they all pass the view frustum check. Therefore the only optimization being applied is the switch from sorting with Vector.sort (which uses a compare function) to sorting with Skyboy’s fastSort function (which uses a “distance from camera” field). This alone makes sorting nearly twice as fast as it otherwise was (53 vs. 30).
The second case is where I’ve pointed the camera away from the cubes and none of them pass the view frustum check. In this case, zero cubes are being sorted in the “fast” method and all 32768 are being sorted in the “original” method. This results in a 3x speedup over the “fast” approach with all of the cubes present and a 4x speedup over the “original” method.
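On the first optimization: if pulling in fastSort isn't an option, the cached camDist field can still be used with the built-in Vector.sort. The following is a minimal sketch that was not benchmarked for this article. byCamDistDescending is a hypothetical name, and it assumes camDist has already been filled in for every visible cube, as sortFast does.

```actionscript
package
{
	/**
	 * Hypothetical comparator (not part of the test app) for Vector.sort.
	 * Orders objects with a larger cached camDist first, i.e. back to front.
	 * Parameters are typed as Object so the sketch stands alone; Cube
	 * instances from the test app have the camDist field it reads.
	 */
	public function byCamDistDescending(a:Object, b:Object): Number
	{
		// Negative result: a sorts before b (a is farther from the camera)
		if (a.camDist > b.camDist) return -1;

		// Positive result: a sorts after b (a is nearer to the camera)
		if (a.camDist < b.camDist) return 1;

		// Same distance: order between the two doesn't matter
		return 0;
	}
}
```

With that in place, the fastSort call in sortFast could be swapped for visibleCubes.sort(byCamDistDescending). The compare function still runs once per comparison, but it reads two cached fields instead of recomputing two squared distances.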
The above optimizations are just a couple of ways of improving performance when alpha textures are used in a 3D scene. If you have more techniques to suggest or have simply spotted a bug or have a suggestion, post a comment and let me know!
https://github.com/skyboy/AS3-Utilities/blob/master/skyboy/utils/fastSort.as