Hitting a Stable Frame Rate
Frame rate stability matters more than peak numbers. A game that averages 55 fps but hitches to 15 fps every few seconds feels far worse than one locked at a consistent 30 fps. Players perceive inconsistency as lag, stutter, or broken controls. The goal is not the highest possible frame rate but the most consistent one. If your target hardware cannot sustain 60 fps, lock to 30 fps and deliver every frame on time rather than letting the rate bounce between 30 and 60.
Frame rate problems fall into two categories: CPU-bound and GPU-bound. When the CPU cannot finish game logic and draw call submission within the frame budget, the GPU sits idle waiting for work. When the GPU cannot finish rendering within the frame budget, the CPU stalls waiting for earlier frames to complete. Diagnosing which side is the bottleneck is the first step toward fixing it. The Chrome DevTools Performance panel and WebGL-specific tools like Spector.js help identify which stage is over budget.
Measure Your Frame Budget
Open Chrome DevTools, go to the Performance panel, and record a few seconds of gameplay. The flame chart shows how long each frame takes and breaks the time down into scripting, rendering, painting, and idle. At a 60 fps target the budget is 16.67ms per frame (33.33ms at 30 fps), and frames that exceed it appear as long bars. Click on individual frames to see exactly which functions consumed the most time.
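Separate your timing into categories. Add performance.mark() and performance.measure() calls around your update loop, physics step, AI update, and render call. These timers measure CPU time only: the time around the render call is the cost of submitting draw calls, not the time the GPU spends executing them. If your update function takes 12ms and your render submission takes 2ms, the CPU cost is dominated by game logic and rendering optimizations will not help. If CPU work is small but frames still run long, suspect the GPU and confirm with a GPU-side tool.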
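One way to collect these per-category timings is sketched below. FrameProfiler and its method names are hypothetical, and the sketch uses performance.now() instead of performance.mark() and performance.measure() so the totals can be read directly in code. The clock is injectable, which is an assumption made here purely for testability.

```typescript
// Hypothetical helper: accumulates CPU time per named section of a frame.
class FrameProfiler {
  private totals = new Map<string, number>();
  private now: () => number;

  // The clock defaults to performance.now(); pass a fake clock in tests.
  constructor(now: () => number = () => performance.now()) {
    this.now = now;
  }

  // Runs fn, adds its elapsed time (ms) to the named section, returns fn's result.
  measure<T>(section: string, fn: () => T): T {
    const start = this.now();
    try {
      return fn();
    } finally {
      const elapsed = this.now() - start;
      this.totals.set(section, (this.totals.get(section) ?? 0) + elapsed);
    }
  }

  // Returns the totals for the frame just finished and resets for the next one.
  endFrame(): Record<string, number> {
    const report: Record<string, number> = {};
    this.totals.forEach((ms, section) => {
      report[section] = ms;
    });
    this.totals.clear();
    return report;
  }
}
```

In a game loop this might be used as profiler.measure("update", () => update(dt)) followed by profiler.measure("render", () => render()), with endFrame() called once per frame. The "render" figure covers CPU-side submission only.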
Build a simple in-game counter that displays frame time in milliseconds rather than frames per second. Frame time is more useful because it scales linearly: going from 16ms to 20ms is the same absolute increase as going from 50ms to 54ms, but the FPS equivalents (roughly 62 to 50 versus 20 to 18.5) make the first look like a far larger regression than the second. Frame time tells you exactly how much budget you have left or how far over you are.
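A minimal sketch of the data side of such a counter follows. FrameTimeTracker and msToFps are hypothetical names, and the 60-sample window is an arbitrary assumption. Drawing the numbers on screen is left out.

```typescript
// Hypothetical rolling window of recent frame times, in milliseconds.
class FrameTimeTracker {
  private samples: number[] = [];
  private capacity: number;

  constructor(capacity: number = 60) {
    this.capacity = capacity;
  }

  // Records one frame time and drops the oldest sample once the window is full.
  push(frameMs: number): void {
    this.samples.push(frameMs);
    if (this.samples.length > this.capacity) this.samples.shift();
  }

  // Mean frame time over the window (0 when empty).
  average(): number {
    if (this.samples.length === 0) return 0;
    let sum = 0;
    for (const s of this.samples) sum += s;
    return sum / this.samples.length;
  }

  // Longest frame in the window; hitches show up here before they move the average.
  worst(): number {
    let max = 0;
    for (const s of this.samples) if (s > max) max = s;
    return max;
  }
}

// Converts a frame time in milliseconds to frames per second.
function msToFps(frameMs: number): number {
  return 1000 / frameMs;
}
```

Feed push() the difference between consecutive requestAnimationFrame timestamps and display both average() and worst(). Showing the worst frame alongside the average makes intermittent hitches visible.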
Reduce Draw Calls
Every draw call (a gl.drawElements or gl.drawArrays invocation) has CPU overhead as the driver validates state and submits work to the GPU. As a rough guide, 1000 to 2000 draw calls per frame is manageable on desktop hardware, while on mobile you should aim to stay under about 200. Profile on your target devices, because the real limit varies with the driver and browser. Each draw call that you eliminate saves a roughly fixed amount of CPU time regardless of how many triangles it draws.
Static batching merges the geometry of non-moving objects that share the same material into a single vertex buffer. Instead of 50 draw calls for 50 static trees, you have one draw call for all 50 trees. Build these merged buffers at load time. The trade-off is increased memory usage because each tree's vertices are duplicated in the merged buffer rather than instanced.
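The merge step might look like the sketch below. The Mesh and Placement shapes and the mergeStatic function are hypothetical, and placements are limited to a translation to keep the example short. A real batcher would apply a full transform and also merge normals and UVs.

```typescript
// Hypothetical mesh shape: xyz positions plus triangle indices.
interface Mesh {
  positions: Float32Array; // x, y, z per vertex
  indices: Uint16Array;
}

// A static placement of a mesh in the world (translation only in this sketch).
interface Placement {
  mesh: Mesh;
  offset: [number, number, number];
}

// Merges all placements into one vertex buffer and one index buffer.
// Indices are widened to 32 bits because the merged vertex count can exceed 65535;
// WebGL 1 needs the OES_element_index_uint extension to draw with them.
function mergeStatic(placements: Placement[]): { positions: Float32Array; indices: Uint32Array } {
  let floatCount = 0;
  let indexCount = 0;
  for (const p of placements) {
    floatCount += p.mesh.positions.length;
    indexCount += p.mesh.indices.length;
  }

  const positions = new Float32Array(floatCount);
  const indices = new Uint32Array(indexCount);
  let floatCursor = 0;
  let indexCursor = 0;

  for (const p of placements) {
    // Index of this mesh's first vertex inside the merged buffer.
    const baseVertex = floatCursor / 3;
    const src = p.mesh.positions;
    for (let i = 0; i < src.length; i += 3) {
      positions[floatCursor++] = src[i] + p.offset[0];
      positions[floatCursor++] = src[i + 1] + p.offset[1];
      positions[floatCursor++] = src[i + 2] + p.offset[2];
    }
    // Rebase indices so they point at the copied vertices.
    for (let i = 0; i < p.mesh.indices.length; i++) {
      indices[indexCursor++] = p.mesh.indices[i] + baseVertex;
    }
  }
  return { positions, indices };
}
```

Run this once at load time and upload the result as a single static buffer. The duplicated vertices are the memory cost described above.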
Instanced rendering is the best approach for repeated objects like particles, grass blades, crowd members, and asteroids. You upload a single copy of the mesh and a buffer of per-instance transforms (position, rotation, scale). The GPU draws all instances in a single call, reading each instance's transform from the buffer. WebGL 2.0 supports instanced rendering natively with gl.drawElementsInstanced, and most engines expose it as a built-in feature.
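The sketch below shows the per-instance buffer and the instanced draw under several assumptions. The per-instance layout (position plus a uniform scale in one vec4) and the names Instance, packInstances, and drawInstanced are hypothetical. The InstancingGL interface is a locally declared subset of the WebGL 2 context so the example does not depend on DOM typings. The mesh's vertex array, index buffer, and shader program are assumed to be bound already.

```typescript
// Subset of the WebGL 2 context used below, declared locally (assumption: a real
// WebGL2RenderingContext provides all of these members).
interface InstancingGL {
  ARRAY_BUFFER: number;
  DYNAMIC_DRAW: number;
  FLOAT: number;
  TRIANGLES: number;
  UNSIGNED_SHORT: number;
  bindBuffer(target: number, buffer: unknown): void;
  bufferData(target: number, data: Float32Array, usage: number): void;
  enableVertexAttribArray(index: number): void;
  vertexAttribPointer(index: number, size: number, type: number,
                      normalized: boolean, stride: number, offset: number): void;
  vertexAttribDivisor(index: number, divisor: number): void;
  drawElementsInstanced(mode: number, count: number, type: number,
                        offset: number, instanceCount: number): void;
}

// Hypothetical per-instance data: world position and uniform scale.
interface Instance { x: number; y: number; z: number; scale: number; }

const FLOATS_PER_INSTANCE = 4;

// Packs instances into a flat array: [x, y, z, scale, x, y, z, scale, ...].
function packInstances(instances: Instance[]): Float32Array {
  const data = new Float32Array(instances.length * FLOATS_PER_INSTANCE);
  instances.forEach((inst, i) => {
    data.set([inst.x, inst.y, inst.z, inst.scale], i * FLOATS_PER_INSTANCE);
  });
  return data;
}

// Uploads the instance data and draws every instance with one call.
function drawInstanced(gl: InstancingGL, instanceBuffer: unknown, attribLocation: number,
                       data: Float32Array, indexCount: number): void {
  gl.bindBuffer(gl.ARRAY_BUFFER, instanceBuffer);
  gl.bufferData(gl.ARRAY_BUFFER, data, gl.DYNAMIC_DRAW);
  gl.enableVertexAttribArray(attribLocation);
  gl.vertexAttribPointer(attribLocation, FLOATS_PER_INSTANCE, gl.FLOAT, false, 0, 0);
  // Divisor 1: the attribute advances once per instance instead of once per vertex.
  gl.vertexAttribDivisor(attribLocation, 1);
  gl.drawElementsInstanced(gl.TRIANGLES, indexCount, gl.UNSIGNED_SHORT, 0,
                           data.length / FLOATS_PER_INSTANCE);
}
```

The vertex shader would read the vec4 attribute and apply it to each vertex. Full rotation needs a per-instance matrix, which occupies four consecutive attribute locations.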
For 2D games, sprite batching combines all sprites that share the same texture atlas into a single draw call. The renderer builds a temporary vertex buffer each frame with the quad vertices for every visible sprite, then draws them all at once. Phaser, PixiJS, and most 2D engines do this automatically, but you can sabotage it by using too many separate textures or by interleaving sprites from different atlases in the render order.
Minimize State Changes
Switching shader programs, binding different textures, and changing blend modes are expensive operations because the driver must validate the new state and the GPU may need to drain in-flight work before applying it. Minimizing these switches is often more impactful than reducing triangle count.
Sort your draw calls to minimize state changes. Group all objects that use the same shader program together. Within each shader group, sort by texture. Within each texture group, sort by other state like blend mode or depth test configuration. This ordering means the GPU switches state as few times as possible while still drawing everything.
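The sort order described above can be sketched as a three-level comparison. The DrawCall shape and its numeric IDs are hypothetical stand-ins for whatever handles your renderer uses, and countStateChanges exists only to show the effect of the sort.

```typescript
// Hypothetical draw call record; the IDs identify shared GPU state.
interface DrawCall {
  shaderId: number;
  textureId: number;
  blendMode: number;
}

// Returns a sorted copy: by shader first, then texture, then blend mode.
function sortByState(calls: DrawCall[]): DrawCall[] {
  return calls.slice().sort((a, b) =>
    a.shaderId - b.shaderId ||
    a.textureId - b.textureId ||
    a.blendMode - b.blendMode
  );
}

// Counts how many state switches a given submission order would cause.
function countStateChanges(calls: DrawCall[]): number {
  let changes = 0;
  for (let i = 1; i < calls.length; i++) {
    const prev = calls[i - 1];
    const cur = calls[i];
    if (cur.shaderId !== prev.shaderId) changes++;
    if (cur.textureId !== prev.textureId) changes++;
    if (cur.blendMode !== prev.blendMode) changes++;
  }
  return changes;
}
```

Shader comes first in the key on the assumption that program switches are the most expensive change. If profiling shows otherwise on your targets, reorder the comparison.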
Some engines call this sort order the "render queue" or "render priority." Assign materials to render groups at asset creation time so the runtime sort is fast. For transparent objects that must be sorted back-to-front for correct blending, maintain a separate list and sort only that list by camera distance each frame.
Cull Invisible Objects
Frustum culling skips objects that are entirely outside the camera's view frustum. Without it, the engine submits draw calls for every object in the scene regardless of visibility, wasting both CPU time (for draw call setup) and GPU time (for vertex processing before the clip stage discards the triangles). Every modern 3D engine includes frustum culling, but verify that it is enabled and working. A scene with 5000 objects might have only 200 visible at any moment.
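A common way to implement the test is to check each object's bounding sphere against the frustum planes, as in the sketch below. The Plane and Sphere shapes and isSphereVisible are hypothetical names. Planes are assumed to have unit-length normals pointing into the frustum, and extracting them from the camera matrices is not shown.

```typescript
// Plane in the form n . p + d = 0, with a unit normal pointing into the frustum,
// so points inside the frustum give n . p + d >= 0.
interface Plane { nx: number; ny: number; nz: number; d: number; }

// Bounding sphere in the same space as the planes.
interface Sphere { x: number; y: number; z: number; radius: number; }

// Returns false only when the sphere lies entirely behind at least one plane.
// The test is conservative: a sphere near a frustum corner may be kept even
// though it is not actually visible.
function isSphereVisible(sphere: Sphere, planes: Plane[]): boolean {
  for (const p of planes) {
    const signedDistance = p.nx * sphere.x + p.ny * sphere.y + p.nz * sphere.z + p.d;
    if (signedDistance < -sphere.radius) return false;
  }
  return true;
}
```

Filtering the scene with this test before building the draw list skips both the draw call setup and the vertex work for rejected objects.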
Distance culling removes objects beyond a maximum draw distance. This is cruder than frustum culling but effective for open-world scenes where thousands of small objects (rocks, grass, debris) exist at distances where they would be invisible anyway. Set per-object-type draw distances so important landmarks remain visible while small details fade out.
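Per-type draw distances can be as simple as a lookup table, as sketched here. The category names, the distances, and the SceneObject shape are all hypothetical values chosen for illustration.

```typescript
// Hypothetical per-category draw distances, in world units.
const maxDrawDistance: Record<string, number> = {
  landmark: 2000,
  tree: 400,
  debris: 60,
};

// Hypothetical scene object with a category and a world position.
interface SceneObject { kind: string; x: number; y: number; z: number; }

// Returns true when the object is close enough to the camera to be drawn.
function withinDrawDistance(obj: SceneObject, camX: number, camY: number, camZ: number): boolean {
  const limit = maxDrawDistance[obj.kind];
  // Categories without an entry are never distance-culled.
  if (limit === undefined) return true;
  const dx = obj.x - camX;
  const dy = obj.y - camY;
  const dz = obj.z - camZ;
  // Compare squared distances to avoid a square root per object.
  return dx * dx + dy * dy + dz * dz <= limit * limit;
}
```

To fade objects out instead of popping, a renderer could derive an alpha value from how close the distance is to the limit. That is not shown here.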
Occlusion culling skips objects that are hidden behind other opaque objects. A building behind a wall does not need to be rendered even if it is within the camera frustum. Occlusion culling is more complex to implement than frustum culling and has CPU overhead for the visibility queries, but it produces large gains in indoor scenes, urban environments, and any setting with significant occlusion.
Reduce Overdraw
Overdraw happens when the GPU shades a pixel multiple times because multiple objects overlap at that screen position. A naive rendering order draws distant objects first and near objects last, which means every pixel might be shaded three, five, or ten times. Sorting opaque objects front-to-back lets the depth buffer reject hidden fragments before the fragment shader runs, saving all that redundant work.
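A sketch of building a draw order that follows this rule is shown below. The Renderable shape and buildDrawOrder are hypothetical, and viewDepth is assumed to be the distance from the camera computed elsewhere. Transparent objects are appended back-to-front after the opaque ones, since blending needs that order.

```typescript
// Hypothetical renderable with a precomputed distance from the camera.
interface Renderable {
  name: string;
  viewDepth: number; // larger means farther from the camera
  transparent: boolean;
}

// Opaque objects nearest-first so the depth test can reject hidden fragments,
// then transparent objects farthest-first so blending composites correctly.
function buildDrawOrder(items: Renderable[]): Renderable[] {
  const opaque = items
    .filter(item => !item.transparent)
    .sort((a, b) => a.viewDepth - b.viewDepth);
  const transparent = items
    .filter(item => item.transparent)
    .sort((a, b) => b.viewDepth - a.viewDepth);
  return opaque.concat(transparent);
}
```

Sorting purely by depth conflicts with sorting by GPU state, so renderers often bucket opaque objects by coarse depth ranges and sort by state within each bucket. Which trade-off wins depends on whether the frame is limited by fragment work or by CPU submission.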
Transparent objects are the biggest overdraw offender because they cannot use the depth buffer for early rejection (they need to blend with what is behind them). Minimize transparent surface area in your scenes. Use alpha testing with a hard cutout threshold instead of alpha blending for foliage, fences, and other objects with sharp transparency edges. Alpha testing writes to the depth buffer and benefits from front-to-back sorting, while alpha blending does not.
Particle systems are notorious for overdraw. A fire effect with 500 overlapping, screen-filling, alpha-blended particles can consume the entire fragment shader budget by itself. Use fewer, smaller particles. Render particles at half resolution into an off-screen buffer and composite them into the main frame. Consider soft particles that fade near surfaces instead of hard clipping, which reduces the need for large, overlap-heavy particle counts.
Optimize Shaders
Fragment shaders run once for every pixel a draw call covers. A complex fragment shader on a full-screen quad runs millions of times per frame. Move any calculation that does not vary per pixel out of the fragment shader: into the vertex shader, which runs only once per vertex, or into JavaScript, where the value is computed once per frame or per draw call and passed in as a uniform. For example, light direction in world space does not change per pixel and should be a uniform, not a calculation in the fragment shader.
Use shader LOD to apply simpler shaders to distant or small objects. A character at the far end of a level does not need subsurface scattering skin shading, normal mapping, and specular highlights. A flat-shaded fallback shader with a baked lighting texture looks nearly identical at that distance and can run many times faster.
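Avoid divergent branching in fragment shaders. GPUs shade pixels in groups (called warps on NVIDIA hardware and wavefronts on AMD), and when pixels within one group take different paths through an if statement, the group executes both branches and discards the unused results. Replace simple conditionals with math: mix(a, b, step(threshold, value)) yields a when value is below threshold and b otherwise, without branching. Both a and b are always computed, so this only pays off when they are cheap. The benefit varies across GPUs and shader compilers, but it is a safe general practice for WebGL, where you do not control the hardware.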
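The equivalence between the conditional and the mix/step form can be checked outside a shader. The sketch below re-implements the GLSL step and mix built-ins as plain TypeScript functions for scalar inputs. It only demonstrates that the two forms select the same value and says nothing about GPU performance. The names selectWithBranch and selectBranchless are hypothetical.

```typescript
// Scalar emulation of GLSL step(edge, x): 0.0 when x < edge, otherwise 1.0.
function step(edge: number, x: number): number {
  return x < edge ? 0 : 1;
}

// Scalar emulation of GLSL mix(a, b, t): linear blend a * (1 - t) + b * t.
function mix(a: number, b: number, t: number): number {
  return a * (1 - t) + b * t;
}

// The conditional form as it would be written with an if-else in a shader.
function selectWithBranch(a: number, b: number, threshold: number, value: number): number {
  if (value < threshold) {
    return a;
  }
  return b;
}

// The arithmetic form: step produces a 0 or 1 weight and mix picks a or b.
function selectBranchless(a: number, b: number, threshold: number, value: number): number {
  return mix(a, b, step(threshold, value));
}
```

Note that a value exactly equal to the threshold selects b in both forms, because step returns 1.0 when x equals edge.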
Scale Resolution Dynamically
Dynamic resolution scaling renders the game at a lower resolution when the frame rate drops below the target and returns to full resolution when the GPU has headroom. This is one of the most effective adaptive quality techniques because fragment shader cost scales roughly linearly with pixel count, and pixel count scales with the square of the per-axis render scale. Rendering at 75% of native width and height leaves 0.75 x 0.75 = 56% of the pixels, reducing fragment work by about 44%. Rendering at 50% leaves 25% of the pixels, reducing it by 75%.
Implement dynamic resolution by creating a framebuffer object at a variable size, rendering your scene into it, and then blitting it to the full-size canvas with bilinear filtering. Monitor frame time each frame: if the previous frame exceeded your budget, reduce the render scale by a step (for example, 5%). If the previous frame was well under budget, increase the scale by a step. Use a smoothing window to avoid oscillating between resolutions every frame.
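The control loop described above might be sketched as follows. ResolutionScaler, its default values, and pixelFraction are hypothetical. The sketch smooths frame time with an exponential moving average where the text says smoothing window, and it treats "well under budget" as below 80% of the budget, which is an arbitrary assumption. Resizing the framebuffer and the final blit are not shown.

```typescript
// Hypothetical controller that picks a per-axis render scale from frame times.
class ResolutionScaler {
  scale = 1.0;
  private smoothedMs: number;
  private budgetMs: number;
  private minScale: number;
  private stepSize: number;
  private smoothing: number;

  constructor(budgetMs = 16.67, minScale = 0.5, stepSize = 0.05, smoothing = 0.1) {
    this.budgetMs = budgetMs;
    this.minScale = minScale;
    this.stepSize = stepSize;
    this.smoothing = smoothing;
    // Start at the budget so the first frames neither raise nor lower the scale.
    this.smoothedMs = budgetMs;
  }

  // Feeds in the last frame time and returns the scale to use for the next frame.
  update(frameMs: number): number {
    // Exponential moving average damps single-frame spikes.
    this.smoothedMs += (frameMs - this.smoothedMs) * this.smoothing;
    if (this.smoothedMs > this.budgetMs) {
      this.scale = Math.max(this.minScale, this.scale - this.stepSize);
    } else if (this.smoothedMs < this.budgetMs * 0.8) {
      this.scale = Math.min(1.0, this.scale + this.stepSize);
    }
    return this.scale;
  }
}

// Fraction of native pixels shaded at a given per-axis scale.
function pixelFraction(scale: number): number {
  return scale * scale;
}
```

Each frame, multiply the canvas width and height by the returned scale to size the off-screen framebuffer. The gap between the lower and raise thresholds acts as a dead zone that keeps the scale from flipping back and forth when frame time sits near the budget.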
The visual cost of resolution scaling is modest. Most players do not notice a 10 to 15% reduction, especially during fast-paced gameplay. UI elements should always render at native resolution by drawing them after the upscale step, so text and HUD elements remain crisp regardless of the 3D render resolution.
Stable frame rate comes from staying within your per-frame budget every frame. Measure first to find out whether the CPU or the GPU is over budget. Then reduce draw calls with batching and instancing, minimize state changes by sorting materials, cull what the player cannot see, cut overdraw, keep fragment shaders cheap, and scale resolution dynamically when the GPU is over budget.