AR Games on the Web

Updated June 2026

WebXR's augmented reality capabilities let you build games that blend virtual content with the player's real environment, all running in the browser. Players can place game boards on their kitchen table, defend their living room from virtual invaders, or grow a garden on their floor, with no app install required. This guide covers the AR-specific WebXR modules and how to use them for game development.

AR on the web works through the same WebXR Device API used for VR, but with a different session mode and a set of environment-sensing modules that give your game awareness of the physical world. The key difference from VR is that your rendered content is composited over the real environment rather than replacing it entirely. This means your game needs to understand where surfaces are, place objects convincingly on those surfaces, and ideally handle occlusion so real furniture can hide virtual objects behind it.

The devices that support WebXR AR each bring different capabilities. The Meta Quest 3 offers full passthrough AR with plane detection, hit testing, anchors, depth sensing, and hand tracking. The Apple Vision Pro is a passthrough headset with gaze-and-pinch input, but Safari's WebXR support initially shipped with immersive-vr only, so confirm immersive-ar availability at runtime before targeting it. Android phones running Chrome support hit testing through ARCore, with modules such as plane detection depending on the Chrome version and its flags. Each device gives you a different set of tools, so feature detection at runtime is essential.

Step 1: Start an Immersive AR Session

Request an immersive-ar session instead of immersive-vr. The session request includes optional and required features that tell the browser which AR modules your game needs. A typical AR game session might require "hit-test" and optionally request "plane-detection", "anchors", and "depth-sensing".
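A minimal sketch of how those feature lists might be assembled. The buildArSessionInit helper and the ArSessionInit type are hypothetical names introduced for illustration; only the requiredFeatures and optionalFeatures keys and the feature strings come from WebXR itself.

```typescript
// Local stand-in for the options object passed to requestSession().
interface ArSessionInit {
  requiredFeatures: string[];
  optionalFeatures: string[];
}

// Drop duplicate feature names while keeping their original order.
function unique(items: string[]): string[] {
  return items.filter((item, index) => items.indexOf(item) === index);
}

// Hypothetical helper: `required` features make the session request fail when
// unavailable, `optional` ones are granted only where the device supports them.
function buildArSessionInit(required: string[], optional: string[]): ArSessionInit {
  const requiredFeatures = unique(required);
  const optionalFeatures = unique(optional).filter(
    (feature) => requiredFeatures.indexOf(feature) === -1,
  );
  return { requiredFeatures, optionalFeatures };
}

// Note: "depth-sensing" also expects a depthSensing configuration entry in
// the same options object before the browser will enable it.
const sessionInit = buildArSessionInit(
  ["hit-test"],
  ["plane-detection", "anchors", "depth-sensing"],
);

// In a browser, from a user gesture such as a button click:
// const session = await navigator.xr.requestSession("immersive-ar", sessionInit);
```

Keeping the required list as short as possible means the session still starts on devices that only offer hit testing.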

When the AR session starts, the browser composites your rendered frame over the real world. In Three.js, create the WebGLRenderer with alpha: true and leave the scene background set to null. In Babylon.js, set the alpha component of the scene's clearColor to 0. Any pixel you leave fully transparent shows the real world behind it. Most framework AR helpers handle this automatically, but it is worth understanding so you do not accidentally render an opaque skybox that blocks the passthrough view.

The AR session's reference space should typically be "local-floor" for games that place content relative to the player's floor, or "unbounded" for experiences where the player can walk around freely and the content should remain anchored to physical locations. Neither is granted by default, so list the one you need among the session's required or optional features. The unbounded space is supported on headsets like the Quest 3 and Vision Pro, while phones typically support only local reference spaces.
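One way to cope with that device spread is to try reference space types in order of preference. The sketch below assumes a hypothetical pickReferenceSpace helper and a small local interface standing in for XRSession; requestReferenceSpace() itself is the real WebXR call, which rejects for types the session cannot provide.

```typescript
// Structural stand-in for the part of XRSession this sketch uses.
interface ReferenceSpaceSource<S> {
  requestReferenceSpace(type: string): Promise<S>;
}

// Try each reference space type in order and return the first one granted.
async function pickReferenceSpace<S>(
  session: ReferenceSpaceSource<S>,
  preferredTypes: string[],
): Promise<{ type: string; space: S }> {
  for (const type of preferredTypes) {
    try {
      const space = await session.requestReferenceSpace(type);
      return { type, space };
    } catch {
      // Not available on this device or not enabled for this session;
      // fall through to the next preference.
    }
  }
  throw new Error("None of the requested reference space types are available");
}

// Typical call inside an AR session:
// const { type, space } =
//   await pickReferenceSpace(session, ["unbounded", "local-floor", "local"]);
```

Returning the granted type alongside the space lets the game adjust its design, for example by keeping content close to the start position when only "local" is available.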

Provide a clear "Enter AR" button that communicates what the experience involves. Players are more willing to grant sensor permissions when they understand why the game needs camera and environment access. If the browser denies the session or a required feature is not available, fall back gracefully to a VR mode or a flat 3D preview rather than showing an error.
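The fallback decision can be sketched as a small function that runs before the button is shown. chooseExperienceMode and the ExperienceMode type are hypothetical names; isSessionSupported() is the real method on navigator.xr, modeled here by a local interface so the sketch stays self-contained.

```typescript
// Structural stand-in for navigator.xr.
interface XrSystemLike {
  isSessionSupported(mode: string): Promise<boolean>;
}

type ExperienceMode = "ar" | "vr" | "flat";

// Decide which experience to offer: AR if possible, then VR, then a flat
// 3D preview. A missing navigator.xr (passed as undefined) means "flat".
async function chooseExperienceMode(xr: XrSystemLike | undefined): Promise<ExperienceMode> {
  if (!xr) return "flat";
  const system: XrSystemLike = xr;

  // Treat a rejected support check the same as "not supported".
  const supports = async (mode: string): Promise<boolean> => {
    try {
      return await system.isSessionSupported(mode);
    } catch {
      return false;
    }
  };

  if (await supports("immersive-ar")) return "ar";
  if (await supports("immersive-vr")) return "vr";
  return "flat";
}

// In a browser:
// const mode = await chooseExperienceMode((navigator as any).xr);
// enterButton.textContent = mode === "ar" ? "Enter AR" : mode === "vr" ? "Enter VR" : "Play in 3D";
```

A positive support check does not guarantee the session request succeeds, since a required feature may still be missing, so the requestSession() call itself should also be wrapped in error handling.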

Step 2: Implement Hit Testing for Surface Placement

Hit testing is the foundation of most AR interactions. It lets you cast a ray from a device input source (controller, hand, or gaze) into the real world and find where that ray intersects with a physical surface. The result is a position and orientation in 3D space that you can use to place virtual objects.

To use hit testing, call session.requestHitTestSource() once the session starts to obtain an XRHitTestSource. The space you pass in defines where the ray originates. For controller-based AR, use the controller's target ray space. For phone-based AR, use the viewer reference space to cast a ray from the center of the screen. The browser handles the intersection calculation using its understanding of the environment's geometry.

Each frame, call frame.getHitTestResults(hitTestSource) to get an array of intersection results. Each result exposes getPose(), which returns a pose relative to the reference space you pass in. Position a reticle mesh (a simple ring or disc) at the first result's position and orient it to match the surface normal, and hide the reticle on frames where the array is empty. This gives the player visual feedback about where their content will be placed before they confirm.
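The per-frame reticle update can be sketched as follows. The interfaces are local stand-ins for XRFrame, XRHitTestResult, and XRPose, and Reticle is a hypothetical wrapper around whatever mesh your engine uses; getHitTestResults(), getPose(), and transform.matrix are the real WebXR members being modeled.

```typescript
// Minimal structural types for the WebXR objects this sketch touches.
interface PoseLike {
  transform: { matrix: Float32Array }; // column-major 4x4
}
interface HitResultLike<Space> {
  getPose(baseSpace: Space): PoseLike | null | undefined;
}
interface HitFrameLike<Source, Space> {
  getHitTestResults(source: Source): ReadonlyArray<HitResultLike<Space>>;
}

// Hypothetical engine-side reticle: a visibility flag plus a world matrix.
interface Reticle {
  visible: boolean;
  matrix: Float32Array;
}

// Copy the first hit's pose into the reticle, or hide it when nothing was hit.
// Returns true when the reticle currently marks a valid placement spot.
function updateReticle<Source, Space>(
  frame: HitFrameLike<Source, Space>,
  source: Source,
  referenceSpace: Space,
  reticle: Reticle,
): boolean {
  const results = frame.getHitTestResults(source);
  const pose = results.length > 0 ? results[0].getPose(referenceSpace) : undefined;
  if (!pose) {
    reticle.visible = false;
    return false;
  }
  // The pose already encodes position and surface-aligned orientation,
  // so copying the whole matrix places and orients the reticle in one step.
  reticle.matrix.set(pose.transform.matrix);
  reticle.visible = true;
  return true;
}

// Setup (browser only), assuming `session` was granted "hit-test":
// const viewerSpace = await session.requestReferenceSpace("viewer");
// const hitTestSource = await session.requestHitTestSource({ space: viewerSpace });
```

In Three.js, for example, the reticle mesh would have matrixAutoUpdate disabled so the copied matrix is used directly.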

When the player taps or pulls the trigger to place an object, create a new mesh at the reticle's position. Apply the hit test result's orientation so the object sits flush with the surface. For a table-top game, you might place a game board here. For a room-scale game, you might place a portal, a spawn point, or a piece of furniture. The placement action should feel immediate and precise since the reticle already showed the player exactly where the object would go.

Hit testing works best on well-lit, textured surfaces. Plain white walls and featureless floors give the device less visual information to track against, reducing hit test accuracy. Your game's instructions should mention this: "Point at a textured surface in good lighting for best results."

Step 3: Use Plane Detection for Environment Awareness

While hit testing finds individual intersection points, plane detection discovers entire surfaces in the player's environment and classifies them. Detected planes include floors, walls, tables, ceilings, and other flat surfaces, reported as polygons with position, orientation, and boundary vertices.

Enable plane detection by requesting the "plane-detection" feature when creating your session. The browser continuously scans the environment and reports detected planes through the XRFrame's detectedPlanes set. Each plane has a polygon boundary (an array of vertices defining its outline), an orientation (horizontal or vertical), and a semantic label when available (floor, wall, table, ceiling).
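As a sketch of reading that set each frame, the function below groups planes by their most specific known category. groupPlanes and the PlaneLike type are illustrative names; the property names mirror XRPlane, and semanticLabel in particular should be treated as an assumption because not every browser exposes it.

```typescript
// Structural stand-in for XRPlane.
interface PlaneLike {
  orientation?: string | null; // "horizontal" or "vertical" when known
  semanticLabel?: string;      // e.g. "floor", "wall", "table"; may be absent
  polygon: ReadonlyArray<{ x: number; y: number; z: number }>;
}

// frame.detectedPlanes is set-like, so only forEach is assumed here.
interface PlaneSetLike {
  forEach(callback: (plane: PlaneLike) => void): void;
}

// Group planes by semantic label, falling back to orientation, then "unknown".
// `detectedPlanes` is undefined when plane detection was not granted.
function groupPlanes(detectedPlanes: PlaneSetLike | undefined): { [category: string]: PlaneLike[] } {
  const groups: { [category: string]: PlaneLike[] } = {};
  if (!detectedPlanes) return groups;
  detectedPlanes.forEach((plane) => {
    const category = plane.semanticLabel ?? plane.orientation ?? "unknown";
    (groups[category] = groups[category] || []).push(plane);
  });
  return groups;
}

// Inside the frame loop:
// const groups = groupPlanes(frame.detectedPlanes);
// const floors = groups["floor"] || groups["horizontal"] || [];
```

Falling back from label to orientation keeps the same game logic usable on devices that report surfaces without classifying them.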

For game development, planes serve several purposes. Use horizontal planes as physics collision surfaces so virtual objects land on real tables and floors instead of falling through them. Use wall planes as boundaries for room-scale games, preventing virtual characters from walking through walls. Use the floor plane to define the play area, spawning enemies at the room's edges or placing game elements within the detected floor boundary.
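Keeping game elements inside the detected floor boundary comes down to a point-in-polygon test. The sketch below uses the standard ray-casting (even-odd) method and assumes the candidate point has already been transformed into the plane's own space, where the polygon lies flat and only x and z vary; isInsidePlanePolygon is a hypothetical helper name.

```typescript
// A point on the plane, expressed in the plane's own coordinates (y is 0).
interface PlanePoint {
  x: number;
  z: number;
}

// Even-odd ray casting: count how many polygon edges a ray from `point`
// toward +x crosses. An odd count means the point is inside.
function isInsidePlanePolygon(point: PlanePoint, polygon: ReadonlyArray<PlanePoint>): boolean {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const a = polygon[i];
    const b = polygon[j];
    const straddles = (a.z > point.z) !== (b.z > point.z);
    if (straddles) {
      // x coordinate where edge a-b crosses the horizontal line z = point.z
      const crossingX = ((b.x - a.x) * (point.z - a.z)) / (b.z - a.z) + a.x;
      if (point.x < crossingX) inside = !inside;
    }
  }
  return inside;
}

// A 2 m x 2 m floor patch centered on the plane origin.
const floorOutline: PlanePoint[] = [
  { x: -1, z: -1 },
  { x: 1, z: -1 },
  { x: 1, z: 1 },
  { x: -1, z: 1 },
];
const canSpawnAtCenter = isInsidePlanePolygon({ x: 0, z: 0 }, floorOutline);
const canSpawnOutside = isInsidePlanePolygon({ x: 2, z: 0 }, floorOutline);
```

The same test works for table tops, for example to reject a game board placement that would hang over the edge.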

Visualizing detected planes during development helps you understand what the device sees. Render each plane's boundary as a semi-transparent polygon with a wireframe outline. Color-code them by type: green for floors, blue for walls, yellow for tables. Remove these visualizations in your final build, but consider keeping a "scan room" mode that players can activate to see what the system has detected, similar to the room scan in the Quest 3's space setup.

Planes update over time as the device refines its understanding of the environment. A plane might start small and grow as the device scans more of the surface, or its boundary vertices might shift slightly. Your game should handle these updates gracefully. If a game object was placed on a plane that shifts, reposition the object to match the updated plane rather than leaving it floating in mid-air.
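One way to notice those updates is to compare each plane's lastChangedTime against the value seen on the previous frame. PlaneTracker below is a hypothetical helper, not part of WebXR; it only assumes that plane objects keep their identity between frames and expose lastChangedTime, as XRPlane does.

```typescript
// The only XRPlane member this sketch relies on.
interface TimestampedPlane {
  lastChangedTime: number;
}

// frame.detectedPlanes is set-like, so only forEach is assumed.
interface PlaneCollection<P> {
  forEach(callback: (plane: P) => void): void;
}

interface PlaneDiff<P> {
  added: P[];
  changed: P[];
  removed: P[];
}

// Remembers the last-seen timestamp of every plane and reports what changed.
class PlaneTracker<P extends TimestampedPlane> {
  private seen = new Map<P, number>();

  update(detected: PlaneCollection<P>): PlaneDiff<P> {
    const diff: PlaneDiff<P> = { added: [], changed: [], removed: [] };
    const current = new Map<P, number>();
    detected.forEach((plane) => {
      current.set(plane, plane.lastChangedTime);
      const previous = this.seen.get(plane);
      if (previous === undefined) diff.added.push(plane);
      else if (previous !== plane.lastChangedTime) diff.changed.push(plane);
    });
    this.seen.forEach((_time, plane) => {
      if (!current.has(plane)) diff.removed.push(plane);
    });
    this.seen = current;
    return diff;
  }
}

// Inside the frame loop:
// const diff = tracker.update(frame.detectedPlanes);
// diff.changed.forEach((plane) => repositionObjectsOn(plane)); // hypothetical game function
```

Reacting only to changed planes avoids rebuilding collision geometry for every surface on every frame.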

Step 4: Anchor Objects to the Real World

Anchors solve a fundamental problem in AR: keeping virtual objects in the right place as the device's spatial understanding evolves. Without anchors, a virtual chess piece placed on a table might drift a few centimeters over time as the tracking system refines its estimate of the table's position. Anchors lock objects to specific physical locations and automatically correct for tracking drift.

Create an anchor by calling frame.createAnchor() with a pose and a reference space. The call returns a promise that resolves to an XRAnchor, which maintains a stable position in the real world. Each frame, pass the anchor's anchorSpace to frame.getPose() to get its current pose (which may have been corrected by the tracking system) and update your game object's transform to match.
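The per-frame half of that loop might look like the sketch below. AnchoredObject and syncAnchoredObjects are hypothetical names; anchorSpace, trackedAnchors, and getPose() mirror the real XRAnchor and XRFrame members, modeled with local interfaces so the example runs on its own.

```typescript
// Structural stand-ins for XRAnchor and the parts of XRFrame used here.
interface AnchorLike<Space> {
  anchorSpace: Space;
}
interface AnchorFrameLike<Space> {
  trackedAnchors?: { has(anchor: AnchorLike<Space>): boolean };
  getPose(space: Space, baseSpace: Space): { transform: { matrix: Float32Array } } | null | undefined;
}

// Hypothetical pairing of a game object with the anchor that pins it.
interface AnchoredObject<Space> {
  anchor: AnchorLike<Space>;
  matrix: Float32Array; // column-major 4x4 world transform
  visible: boolean;
}

// Move every anchored object to its anchor's latest pose. Objects whose
// anchor is not tracked this frame are hidden rather than left stale.
function syncAnchoredObjects<Space>(
  frame: AnchorFrameLike<Space>,
  referenceSpace: Space,
  objects: AnchoredObject<Space>[],
): void {
  for (const object of objects) {
    const tracked = frame.trackedAnchors ? frame.trackedAnchors.has(object.anchor) : true;
    const pose = tracked ? frame.getPose(object.anchor.anchorSpace, referenceSpace) : undefined;
    if (!pose) {
      object.visible = false;
      continue;
    }
    object.matrix.set(pose.transform.matrix);
    object.visible = true;
  }
}

// Creating the anchor (browser only), for example when the player confirms placement:
// const anchor = await frame.createAnchor(pose.transform, referenceSpace);
// objects.push({ anchor, matrix: new Float32Array(16), visible: false });
```

Hiding an object while its anchor is untracked is one design choice; freezing it at its last known pose is a reasonable alternative for brief tracking losses.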

The anchor lifecycle requires management. Anchors consume device resources, so delete them when they are no longer needed using anchor.delete(). If a player removes a game object from their table, delete the associated anchor. If the player pauses the game, consider keeping anchors alive so the game can resume with objects in their original positions.
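A small registry keyed by game object makes that cleanup harder to forget. AnchorRegistry is a hypothetical helper; the only WebXR member it relies on is the anchor's delete() method.

```typescript
// The only XRAnchor member this sketch relies on.
interface DeletableAnchor {
  delete(): void;
}

// Tracks which anchor belongs to which game object and frees anchors
// as soon as their object goes away.
class AnchorRegistry<A extends DeletableAnchor> {
  private anchors = new Map<string, A>();

  // Attach an anchor to an object, releasing any anchor it replaces.
  attach(objectId: string, anchor: A): void {
    const existing = this.anchors.get(objectId);
    if (existing) existing.delete();
    this.anchors.set(objectId, anchor);
  }

  get(objectId: string): A | undefined {
    return this.anchors.get(objectId);
  }

  // Call when the player removes the object. Returns false if it had no anchor.
  release(objectId: string): boolean {
    const anchor = this.anchors.get(objectId);
    if (!anchor) return false;
    anchor.delete();
    this.anchors.delete(objectId);
    return true;
  }

  // Call when the game ends (not when it is merely paused).
  releaseAll(): void {
    this.anchors.forEach((anchor) => anchor.delete());
    this.anchors.clear();
  }

  get size(): number {
    return this.anchors.size;
  }
}
```

Pausing simply leaves the registry untouched, so objects resume in their original positions.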

Persistent anchors survive across sessions on supported devices. The Quest 3 supports anchor persistence, meaning a player can place a virtual trophy on their shelf, close the browser, and find the trophy in the same spot when they return. To use persistent anchors, request the "persistent-anchors" feature and use the session's persistentAnchors API to store and retrieve anchors by UUID. This opens up game designs where the player's physical space becomes a permanent game world that accumulates content over time.
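The store-and-retrieve flow could be sketched as below. Persistent anchors are not yet uniformly standardized, so every WebXR-facing name here (requestPersistentHandle, restorePersistentAnchor, and the persistentAnchors list) is an assumption modeled on the Quest browser's experimental API and should be verified against your target browser. persistAnchor, restoreAnchor, and KeyValueStore are hypothetical helpers; KeyValueStore matches the shape of localStorage.

```typescript
// ASSUMED experimental members; all optional so unsupported browsers degrade cleanly.
interface PersistableAnchor {
  requestPersistentHandle?: () => Promise<string>;
}
interface PersistentAnchorSession<A> {
  persistentAnchors?: ReadonlyArray<string>;
  restorePersistentAnchor?: (uuid: string) => Promise<A>;
}

// Same shape as window.localStorage, so it can be passed in directly.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Ask the browser for a persistent handle and remember it under `key`.
// Resolves to null where persistence is unavailable.
async function persistAnchor(
  anchor: PersistableAnchor,
  store: KeyValueStore,
  key: string,
): Promise<string | null> {
  if (!anchor.requestPersistentHandle) return null;
  const uuid = await anchor.requestPersistentHandle();
  store.setItem(key, uuid);
  return uuid;
}

// Look up the saved handle and restore its anchor in a later session.
// Resolves to null if nothing was saved or the device no longer knows the handle.
async function restoreAnchor<A>(
  session: PersistentAnchorSession<A>,
  store: KeyValueStore,
  key: string,
): Promise<A | null> {
  const uuid = store.getItem(key);
  if (!uuid || !session.restorePersistentAnchor) return null;
  if (session.persistentAnchors && session.persistentAnchors.indexOf(uuid) === -1) return null;
  return session.restorePersistentAnchor(uuid);
}

// Browser usage (names as assumed above):
// await persistAnchor(trophyAnchor, localStorage, "trophy");
// const restored = await restoreAnchor(session, localStorage, "trophy");
```

Because a restored anchor may come back null, the game should always have a path for asking the player to place the object again.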

Not all devices support anchors equally. On devices without anchor support, virtual objects will drift slightly over extended play sessions. Mitigate this by re-running hit tests periodically and nudging objects back to detected surfaces, or by designing games where slight positional drift does not affect gameplay.
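The periodic nudge can be sketched as a clamped step toward the latest hit-test position. nudgeToward, its parameters, and the thresholds are hypothetical tuning choices rather than anything WebXR prescribes.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

// Move `current` toward `target` by at most `maxStep` meters.
// Targets farther than `ignoreBeyond` meters are ignored, on the assumption
// that the new hit test landed on a different surface rather than revealing drift.
function nudgeToward(current: Vec3, target: Vec3, maxStep: number, ignoreBeyond: number): Vec3 {
  const dx = target.x - current.x;
  const dy = target.y - current.y;
  const dz = target.z - current.z;
  const distance = Math.sqrt(dx * dx + dy * dy + dz * dz);

  if (distance > ignoreBeyond) return current;
  if (distance <= maxStep) return { x: target.x, y: target.y, z: target.z };

  const scale = maxStep / distance;
  return {
    x: current.x + dx * scale,
    y: current.y + dy * scale,
    z: current.z + dz * scale,
  };
}

// Example: correct by at most 5 mm per check, ignoring jumps over 15 cm.
// object.position = nudgeToward(object.position, latestHitPosition, 0.005, 0.15);
```

Small steps keep the correction below what players notice, while the upper threshold stops an object from sliding onto a neighboring surface.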

Step 5: Add Depth Sensing and Occlusion

Depth sensing provides a per-pixel depth map of the real environment, enabling realistic occlusion. Without occlusion, virtual objects always render on top of the real world, even when a real object should logically be in front. This breaks the illusion instantly: a virtual ball sitting "behind" a real couch still appears to float in front of it.

Request the "depth-sensing" feature when creating your session, and include a depthSensing entry in the session options that lists your preferred usage modes and data formats. The browser provides depth data as either a CPU-accessible array or a GPU texture, depending on the usage mode you request. The GPU texture approach is more performant since you can use it directly in a custom shader without copying data between CPU and GPU.
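A sketch of the session options for depth sensing follows. withDepthSensing and the two local types are hypothetical; the depthSensing key and its usagePreference and dataFormatPreference lists, along with the string values shown, follow the WebXR depth sensing module as commonly documented.

```typescript
interface BaseSessionInit {
  requiredFeatures: string[];
  optionalFeatures: string[];
}

interface DepthSessionInit extends BaseSessionInit {
  depthSensing: {
    usagePreference: string[];      // tried in order by the browser
    dataFormatPreference: string[]; // tried in order by the browser
  };
}

// Return a copy of `init` that requests depth sensing as an optional
// enhancement, preferring GPU textures or CPU arrays as asked.
function withDepthSensing(init: BaseSessionInit, preferGpu: boolean): DepthSessionInit {
  const optionalFeatures = init.optionalFeatures.slice();
  if (optionalFeatures.indexOf("depth-sensing") === -1) {
    optionalFeatures.push("depth-sensing");
  }
  return {
    requiredFeatures: init.requiredFeatures.slice(),
    optionalFeatures,
    depthSensing: {
      usagePreference: preferGpu
        ? ["gpu-optimized", "cpu-optimized"]
        : ["cpu-optimized", "gpu-optimized"],
      dataFormatPreference: ["luminance-alpha", "float32"],
    },
  };
}

const depthInit = withDepthSensing(
  { requiredFeatures: ["hit-test"], optionalFeatures: ["anchors"] },
  true,
);

// After the session starts, session.depthUsage and session.depthDataFormat
// report which of the preferences the browser actually selected.
```

Listing both usage modes lets the browser fall back to whichever one the device implements instead of dropping depth sensing altogether.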

To implement occlusion, bring the real-world depth data into your rendering pipeline. Before drawing each virtual pixel, compare its depth (distance from the camera) against the real-world depth at the same screen position. If the real-world depth is closer, discard the virtual pixel since a real object is in front of it. In Three.js and Babylon.js, this can be implemented as a custom shader pass or by writing the depth data into the renderer's depth buffer before drawing your scene.
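The comparison itself is small enough to show on the CPU path, which is handy for coarse checks such as hiding a whole sprite or label. isOccluded and its bias parameter are hypothetical; getDepthInMeters() mirrors the CPU depth information object and is assumed here to take normalized view coordinates between 0 and 1.

```typescript
// Structural stand-in for the CPU depth information returned per view.
interface CpuDepthLike {
  getDepthInMeters(x: number, y: number): number;
}

// True when a real surface sits in front of a virtual point.
//   u, v               normalized view coordinates of the virtual point (0..1)
//   virtualDepthMeters distance from the camera to the virtual point
//   biasMeters         tolerance that prevents flicker when depths are nearly equal
function isOccluded(
  depth: CpuDepthLike,
  u: number,
  v: number,
  virtualDepthMeters: number,
  biasMeters: number = 0.02,
): boolean {
  // Points outside the view have no depth sample to compare against.
  if (u < 0 || u > 1 || v < 0 || v > 1) return false;

  const realDepthMeters = depth.getDepthInMeters(u, v);
  // Zero or NaN is treated as "no reading", so the virtual point stays visible.
  if (!(realDepthMeters > 0)) return false;

  return realDepthMeters + biasMeters < virtualDepthMeters;
}

// Inside the frame loop, with CPU-optimized depth:
// const depthInfo = frame.getDepthInformation(view);
// if (depthInfo) marker.visible = !isOccluded(depthInfo, u, v, distanceToMarker);
```

A fragment shader applies the same test per pixel by sampling the depth texture instead of calling getDepthInMeters().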

Depth sensing also enables physics interaction between virtual and real objects. If you have the depth map of a real table, you can generate an approximate collision mesh that follows the table's shape. Virtual balls can then roll off the real table's edge, bounce off real walls, and slide along real floors in a physically plausible way. The depth map updates each frame, so even moving real objects (like the player's hands) can interact with virtual physics, within the limits of the depth data's resolution and field of view.

Keep in mind that depth sensing is computationally expensive and not available on all devices. Android phones with ARCore support basic depth estimation but with lower resolution and accuracy than dedicated headsets. The Quest 3 provides high-quality depth data from its stereo depth cameras. Design your game to work without depth sensing (objects always render on top) and enable occlusion as an enhancement when the feature is available.
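Deciding which enhancements to switch on can be sketched by inspecting what the session was actually granted. detectCapabilities and ArCapabilities are hypothetical names; enabledFeatures mirrors the XRSession property, which older browsers may not expose.

```typescript
// The only XRSession member this sketch relies on.
interface FeatureSession {
  enabledFeatures?: ReadonlyArray<string>;
}

interface ArCapabilities {
  hitTest: boolean;
  planes: boolean;
  anchors: boolean;
  depth: boolean;
}

// Report which AR modules were granted for this session. Where the browser
// does not expose enabledFeatures, everything reports false, so probe the
// individual APIs (for example frame.detectedPlanes) on those browsers instead.
function detectCapabilities(session: FeatureSession): ArCapabilities {
  const enabled = session.enabledFeatures || [];
  const has = (feature: string): boolean => enabled.indexOf(feature) !== -1;
  return {
    hitTest: has("hit-test"),
    planes: has("plane-detection"),
    anchors: has("anchors"),
    depth: has("depth-sensing"),
  };
}

// After the session starts:
// const capabilities = detectCapabilities(session);
// renderer.useOcclusion = capabilities.depth; // hypothetical engine flag
```

Checking once at session start and storing the result keeps the per-frame code free of repeated feature tests.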

Key Takeaway

WebXR AR games use four core modules: hit testing for surface placement, plane detection for environment awareness, anchors for positional stability, and depth sensing for realistic occlusion. Each module is optional and must be feature-detected at runtime. Design your game around hit testing as the minimum viable AR capability, then layer in planes, anchors, and depth as progressive enhancements for devices that support them.