ARKit’s Motion Capture

Develop AR using ARKit + RealityKit

YLabZ
Aug 18, 2023

ARKit

ARKit is a framework developed by Apple that allows developers to create augmented reality (AR) experiences for iOS devices, including iPhones and iPads. Augmented reality is a technology that overlays digital information, such as images, videos, or 3D models, onto the real world in real time. ARKit provides tools and resources for developers to build interactive and immersive AR applications.

Here are some key features and aspects of ARKit:

  • Motion Tracking

ARKit uses the device's camera and motion-sensor data to track the movement of the device in 3D space. This enables virtual objects to be placed and anchored within the real world, and they stay in the correct position as the user moves around.

  • Environmental Understanding

ARKit can detect and understand the physical environment using techniques like plane detection. It can identify horizontal surfaces like floors and tables, allowing virtual objects to interact with real-world surfaces (see the configuration sketch at the end of this list).

  • Lighting Estimation

ARKit can estimate the lighting conditions in the environment, enabling virtual objects to cast realistic shadows and reflect the ambient lighting of the real world.

  • Face Tracking

ARKit provides face tracking capabilities, which are especially useful for applications that involve facial expressions and animations. This is commonly used in apps that add virtual masks or effects to a user's face.

  • Image and Object Recognition

ARKit supports the recognition of 2D images and 3D objects. This allows developers to trigger AR experiences based on specific visual markers, such as posters, logos, or objects (also covered in the sketch at the end of this list).

  • Integration with SceneKit and Metal

ARKit integrates with SceneKit and Metal, Apple’s graphics frameworks, allowing developers to render 3D content efficiently and with high quality.

  • Multiuser AR Experiences

ARKit supports collaborative sessions, where multiple users can interact with the same virtual objects in a shared real-world space.

  • Persistent AR

ARKit supports saving and loading AR experiences, which means users can revisit a location-based AR experience exactly as they left it.

  • LiDAR Support

Devices with LiDAR scanners (like some iPhone and iPad models) can leverage this technology for more accurate depth sensing, enhancing AR experiences.
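
Several of the capabilities above boil down to a few lines of session configuration. Below is a minimal sketch, assuming an iOS app that owns an ARSession and an asset-catalog resource group named "AR Resources" for reference images; both names are illustrative.

import ARKit

// A minimal sketch: run a world-tracking session with horizontal plane
// detection, LiDAR mesh reconstruction where supported, and 2D image
// detection from an asset-catalog group ("AR Resources" is assumed).
func runWorldTracking(on session: ARSession) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal]

    // Scene reconstruction requires a LiDAR-equipped device.
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        configuration.sceneReconstruction = .mesh
    }

    // Trigger experiences from known 2D images (posters, logos, ...).
    if let images = ARReferenceImage.referenceImages(inGroupNamed: "AR Resources",
                                                     bundle: nil) {
        configuration.detectionImages = images
    }

    session.run(configuration)
}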

ARKit is widely used by developers to create a diverse range of AR applications, from gaming and entertainment to education, shopping, navigation, and industrial applications. It’s part of Apple’s efforts to bring AR technology to mainstream users and provide developers with the tools they need to build compelling and engaging AR experiences on iOS devices.

RealityKit

RealityKit is a high-level framework developed by Apple that simplifies the creation of augmented reality (AR) experiences on Apple devices. It provides a set of tools, APIs, and components to seamlessly blend virtual objects with the real world. Here’s a summary of RealityKit’s key features:

  • Easy AR Development

RealityKit abstracts complex AR tasks, making it easier for developers to create AR applications without deep expertise in graphics or 3D programming.

  • Entity-Component System

RealityKit uses an entity-component system to manage virtual objects in the AR environment. Entities are the objects in the scene, and components define their attributes and behaviors. This modular approach allows for easy customization and extensibility (a short sketch follows this list).

  • Realistic Rendering

RealityKit’s rendering engine produces realistic visuals by applying advanced lighting, shading, and physics effects to virtual objects. This ensures that virtual content blends seamlessly with the real world.

  • Spatial Audio

The framework supports spatial audio, allowing sounds to emanate from specific virtual positions in the AR scene, enhancing immersion.

  • Swift Integration

RealityKit is designed to work seamlessly with Swift, Apple’s programming language. This integration streamlines the development process and allows for clean, expressive code.

  • Animations and Physics

Developers can easily create animations and define physical behaviors for virtual objects, making the AR experience dynamic and interactive.

  • AR Interaction

RealityKit enables user interactions with virtual content through gestures, taps, and device movements, enhancing engagement.

  • SwiftUI Integration

RealityKit can be integrated into SwiftUI interfaces, allowing developers to seamlessly combine AR experiences with native app UIs.

  • AR Quick Look

This feature lets users preview 3D models in AR directly from apps, websites, or messages, enhancing product visualization.

  • Xcode Integration

RealityKit is integrated into Xcode, Apple’s development environment, providing a comprehensive toolkit for building AR applications.

  • Collaboration with ARKit

RealityKit works hand in hand with ARKit, Apple’s framework for AR tracking and scene understanding.

ARKit provides the real-world data and spatial context, while RealityKit handles rendering and interaction with virtual content.
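
To make the entity-component model concrete, here is a minimal sketch that builds a single entity and attaches components to it; the box geometry, color, and component choices are illustrative, not required.

import RealityKit

// A minimal sketch of the entity-component system: one entity, several
// components that define how it looks and behaves.
func makeBoxEntity() -> ModelEntity {
    let box = ModelEntity(
        mesh: .generateBox(size: 0.1),                               // visual geometry
        materials: [SimpleMaterial(color: .blue, isMetallic: false)] // appearance
    )
    box.components.set(CollisionComponent(
        shapes: [.generateBox(size: [0.1, 0.1, 0.1])]                // collision shape
    ))
    box.components.set(InputTargetComponent())                       // receives input on visionOS
    return box
}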

In essence, RealityKit abstracts the technical complexities of AR development, allowing developers to focus on creating engaging, interactive, and visually stunning AR experiences for users of Apple devices.

RealityKit & ARKit

RealityKit and ARKit work together to create immersive augmented reality (AR) experiences on Apple devices. ARKit is a framework that provides tools for building AR applications by detecting and tracking real-world objects and surfaces, while RealityKit focuses on rendering and interacting with virtual objects in the AR environment. Here’s how they collaborate:

AR Scene Management
— ARKit is responsible for tracking the device’s position and orientation in the real world and detecting features like surfaces and objects.
— ARKit provides information about the environment, such as camera images, depth information, and tracking status.
— RealityKit uses this information to create and update the AR scene. It places virtual entities in the AR space, aligning them with the detected real-world features.

Entity-Component System
— RealityKit’s entity-component system allows developers to create and manage virtual entities (objects) in the AR scene.
— These entities are placed and oriented based on the data provided by ARKit. For example, a virtual object can be anchored to a detected surface.
— Components define the attributes and behaviors of entities. For instance, a component might define the visual appearance, physical properties, and interaction behaviors of an entity.
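
As a sketch of that flow on iOS, assuming an existing ARView, anchoring a virtual object to an ARKit-detected horizontal plane looks roughly like this:

import RealityKit

// A minimal sketch: place a virtual box on the first horizontal plane
// ARKit detects. AnchorEntity bridges ARKit's plane detection and
// RealityKit's scene graph.
func placeBox(in arView: ARView) {
    let anchor = AnchorEntity(plane: .horizontal)
    let box = ModelEntity(mesh: .generateBox(size: 0.1),
                          materials: [SimpleMaterial(color: .red, isMetallic: false)])
    anchor.addChild(box)
    arView.scene.addAnchor(anchor)
}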

Visual Rendering
— RealityKit uses its rendering engine to display virtual entities in the AR environment.
— It applies realistic lighting and shading effects, making virtual objects blend seamlessly with the real-world surroundings.
— The rendering engine takes advantage of ARKit’s lighting information and environmental textures to enhance the visual quality.

Interaction and Animation
— ARKit supplies the spatial data, such as hit-test and raycast results, that maps user input like taps and gestures onto positions in the real world.
— RealityKit uses this input to trigger animations, responses, and interactions with virtual entities.
— Developers can use RealityKit’s animation and physics APIs to create interactive and dynamic behaviors for virtual objects.
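
For example, RealityKit's built-in transform animation needs no custom animation assets. This sketch nudges an entity 20 cm upward over one second:

import RealityKit

// A minimal sketch: animate an entity 20 cm upward using RealityKit's
// built-in move(to:relativeTo:duration:timingFunction:) API.
func floatUpward(_ entity: Entity) {
    var target = entity.transform
    target.translation.y += 0.2
    entity.move(to: target,
                relativeTo: entity.parent,
                duration: 1.0,
                timingFunction: .easeInOut)
}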

Spatial Audio
— RealityKit supports spatial audio, allowing virtual objects to emit sounds from their positions in the AR environment.
— This enhances the sense of realism and immersion in AR experiences.
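
A minimal sketch of spatial audio, assuming an audio file named "engine.mp3" in the app bundle:

import RealityKit

// A minimal sketch: play a sound from an entity's position so it is
// spatialized in the AR scene. "engine.mp3" is an assumed bundle asset.
// The returned controller can pause or stop playback.
func playEngineSound(on entity: Entity) throws -> AudioPlaybackController {
    let resource = try AudioFileResource.load(named: "engine.mp3")
    return entity.playAudio(resource)
}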

Swift and Xcode Integration
— RealityKit is designed to work seamlessly with Swift, Apple’s programming language, and integrates with Xcode, Apple’s development environment.
— Developers can easily combine ARKit’s scene understanding capabilities with RealityKit’s rendering and interaction features.

AR Quick Look
— RealityKit powers AR Quick Look, a feature that allows users to view 3D models in real-world environments through the camera, combining ARKit's tracking with RealityKit's rendering.

In summary, ARKit handles the real-world tracking and scene understanding aspects of AR experiences, while RealityKit focuses on creating, rendering, and interacting with virtual objects in the AR environment. This collaboration between the two frameworks enables developers to build engaging and immersive AR applications for Apple devices.

RealityView in visionOS: Overview

RealityView is a SwiftUI view in visionOS, Apple's platform for immersive spatial experiences. RealityView allows you to seamlessly integrate rich 3D content created with RealityKit, including content produced in Reality Composer Pro, into your visionOS app. Through its content parameter, which conforms to RealityViewContentProtocol, you can manipulate 3D entities, add or remove them from your view, and respond to view updates.

Here’s a simplified example of how you can use RealityView to display a custom ModelEntity using SwiftUI:

import SwiftUI
import RealityKit

struct ModelExample: View {
    var body: some View {
        RealityView { content in
            // Load the model asynchronously from the app bundle.
            if let robot = try? await ModelEntity(named: "robot") {
                content.add(robot)
            }
            Task {
                // Asynchronously configure content after rendering.
            }
        }
    }
}
  • The RealityView closure is asynchronous (async) and is used to load content in the background. You can load content from your app's bundle or a URL. The closure takes a parameter (content) that conforms to RealityViewContentProtocol, which is used to add and remove RealityKit entities.
  • While your content is loading, RealityView automatically displays a placeholder view, which you can customize using the optional placeholder parameter. Asynchronous loading is recommended to avoid app hangs.
  • You can use the optional update closure to update your RealityKit content in response to changes in your view's state (see the sketch after this list).
  • On visionOS, RealityView showcases RealityKit content inline in true 3D space, occupying the available space in your app’s 3D boundaries. This is represented by the RealityViewContent type.
  • By default, RealityView has a flexible size and doesn’t adjust its size based on the RealityKit content it displays.
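
Putting those closures together, here is a minimal sketch, assuming a "robot" model in the app bundle, that uses the async make closure, the update closure, and a custom placeholder:

import SwiftUI
import RealityKit

// A minimal sketch: RealityView with a loading placeholder and an
// update closure driven by SwiftUI state.
struct ScaledModelView: View {
    @State private var scale: Float = 1.0

    var body: some View {
        RealityView { content in
            // Runs once, asynchronously, to build the initial content.
            if let robot = try? await ModelEntity(named: "robot") {
                robot.name = "robot"
                content.add(robot)
            }
        } update: { content in
            // Runs when SwiftUI state the view reads (here, `scale`) changes.
            content.entities.first(where: { $0.name == "robot" })?
                .setScale([scale, scale, scale], relativeTo: nil)
        } placeholder: {
            // Shown while the make closure is still loading content.
            ProgressView()
        }
    }
}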

For more advanced uses, such as subscribing to RealityKit events, performing coordinate conversions, or working with AR capabilities, refer to RealityViewContentProtocol and the types that conform to it.

In essence, RealityView bridges the gap between the immersive 3D content of RealityKit and the SwiftUI-based user interface in visionOS, allowing you to seamlessly integrate interactive 3D experiences into your app.

Putting everything together

In visionOS, the integration of RealityKit, ARKit, and RealityView creates a cohesive framework for building immersive augmented reality experiences. Let’s break down how these components interact:

RealityKit — Build 3D Models

RealityKit is a high-level framework that simplifies the creation of AR and VR experiences. It provides tools for rendering 3D content, handling animations, managing lighting, and more. In the context of visionOS, you can create and manipulate 3D entities, apply materials, animations, and behaviors to them, and define the overall visual and interactive aspects of your AR environment.

ARKit — Understand Your Environment

ARKit is Apple’s augmented reality framework that enables you to detect and track real-world objects, map the environment, and place virtual content in a user’s physical space. It provides essential functionalities like plane detection, image tracking, face tracking, and world mapping. ARKit is responsible for understanding the real-world context in which your AR experience is taking place.

RealityView — Interaction with 3D Models using SwiftUI

RealityView is the bridge between the high-level 3D rendering capabilities of RealityKit and the low-level AR functionality of ARKit within the visionOS environment. It allows you to embed RealityKit content directly into your visionOS app using SwiftUI. RealityView exposes its content through RealityViewContentProtocol, which enables you to interact with RealityKit content from SwiftUI views. This means you can control the placement, behavior, and interactions of your 3D entities using SwiftUI's declarative syntax.

Here’s how they interact:

  • Creating and Manipulating 3D Content

With RealityKit, you can design and customize 3D entities, apply textures and animations, and set up interactions. This content is rendered visually using RealityKit’s rendering engine.

  • Defining the AR Environment

ARKit works in the background to understand the real-world environment by detecting surfaces (like tables or floors), recognizing images, and tracking the device’s position and orientation. This information is crucial for placing virtual objects realistically.

  • Embedding RealityKit Content

The RealityView in visionOS allows you to integrate your RealityKit content seamlessly into your SwiftUI-based user interface. You can use SwiftUI's familiar syntax to define how the 3D content should appear and behave within the AR environment.

  • Interactions and Behavior

You can define interactions and behaviors for your 3D content using RealityKit. This can include animations, physics simulations, and even user interactions like tap gestures (see the sketch after this list).

  • Real-World Mapping and Positioning

ARKit provides the necessary data to correctly position and anchor your RealityKit content in the real world. This ensures that your virtual objects appear and behave as if they exist in the user’s physical environment.
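
As a sketch of the interaction piece on visionOS, a SwiftUI gesture can be targeted directly at RealityKit entities. Here, tapping the model (the "robot" name is an assumption) spins it a quarter turn:

import SwiftUI
import RealityKit

// A minimal sketch: a tap gesture targeted at RealityKit entities.
// The entity needs collision shapes and an InputTargetComponent to
// receive input.
struct TappableModelView: View {
    var body: some View {
        RealityView { content in
            if let model = try? await ModelEntity(named: "robot") {
                model.generateCollisionShapes(recursive: true)
                model.components.set(InputTargetComponent())
                content.add(model)
            }
        }
        .gesture(
            TapGesture()
                .targetedToAnyEntity()
                .onEnded { value in
                    // Rotate the tapped entity 90 degrees around the y-axis.
                    var target = value.entity.transform
                    target.rotation *= simd_quatf(angle: .pi / 2, axis: [0, 1, 0])
                    value.entity.move(to: target,
                                      relativeTo: value.entity.parent,
                                      duration: 0.5)
                }
        )
    }
}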

In essence, RealityView acts as the conduit that brings together the artistic and interactive capabilities of RealityKit and the spatial awareness and tracking capabilities of ARKit. This integration enables you to build immersive and engaging AR experiences where virtual content seamlessly interacts with the real world.

Tools

  • Reality Composer Pro — visionOS apps

Meet the all-new Reality Composer Pro, designed to make it easy to preview and prepare 3D content for your visionOS apps. It integrates tightly with the Xcode build process to preview and optimize your visionOS assets.

  • Reality Composer for iOS — iOS & iPadOS

Reality Composer for iOS and iPadOS makes it easy to build, test, tune, and simulate AR experiences for iPhone or iPad.

  • Reality Converter — macOS (assets for visionOS, iOS & iPadOS)

Convert common 3D file formats (.obj, .gltf, and .usd) to USDZ, then view and customize the result on your Mac. You can even preview your USDZ object under a variety of lighting and environment conditions with built-in IBL options.

In the next article we will show how to write the full app … please stay tuned. In the meantime, here is a simple SwiftUI example that demonstrates how RealityKit, ARKit, and RealityView work together to display a 3D model in an AR environment using visionOS:

import SwiftUI
import RealityKit
import ARKit

struct ARViewContainer: View {
    var body: some View {
        // RealityView hands us a content object that conforms to
        // RealityViewContentProtocol.
        RealityView { content in
            // Load a 3D model asynchronously from the app bundle.
            if let modelEntity = try? await ModelEntity(named: "ship") {
                // Add the loaded model entity to the RealityView.
                content.add(modelEntity)
            }
        }
    }
}

@main
struct ARApp: App {
    var body: some Scene {
        WindowGroup {
            ARViewContainer()
        }
    }
}

In this example, we’re creating a SwiftUI app that uses RealityView to embed a 3D model (named “ship”) from RealityKit into an AR environment using ARKit. The ARViewContainer is a SwiftUI view that uses RealityView to display the 3D content.

Here’s a step-by-step breakdown of what’s happening:

  1. ARViewContainer is defined as a SwiftUI view. Inside its body, we create a RealityView. This is where the interaction between RealityKit and ARKit occurs.
  2. Inside the RealityView's closure, we load a 3D model entity asynchronously using the ModelEntity(named:) initializer. This initializer loads the model from the app's bundle.
  3. If the model entity is successfully loaded, we add it to the RealityView's content using the add(_:) method. This places the 3D model in the AR environment.
  4. In the ARApp struct, we set up the SwiftUI app's entry point. The ARViewContainer is embedded in the WindowGroup.

This code creates a simple AR app that displays a 3D model in an AR environment. The RealityView manages the integration of the 3D content with the AR environment provided by ARKit. It handles the coordination between RealityKit’s rendering capabilities and ARKit’s spatial tracking and mapping functionalities.

Please stay tuned for the full code walkthrough in the next part …

~Ash

Please learn more about us …
