Introduction to RealityKit on iOS — Entities, Gestures, and Ray Casting | by Anupam Chugh

Leveraging the RealityKit, Vision, and PencilKit frameworks. Time to say goodbye to SceneKit?


The introduction of iOS 13 brought a major upgrade to Apple’s augmented reality framework. ARKit 3 arrived with plenty of interesting new features — people occlusion, motion capture, simultaneous front and back camera, and collaborative session support. These enhancements strongly indicate Apple’s ambition to push AR immersion even further.

Up until iOS 12, we had SceneKit, SpriteKit, and Metal as the primary rendering frameworks. Among these, SceneKit, the 3D graphics framework, had been the most logical choice for building ARKit apps.

While plenty of enhancements to the SceneKit framework were expected at WWDC 2019, Apple surprised us by introducing a completely new and independent 3D engine framework — RealityKit, which lets developers create AR experiences and scenes more easily than ever. It also comes with a companion utility app, Reality Composer, which allows us to create our own 3D objects and customizations.

The goal of this article is to get you started with RealityKit and set you up to build awesome augmented reality-based applications. We’ll start off by setting up an Xcode project for our AR-based iOS application, followed by a short tour through the key components of the RealityKit framework.

As we work through this tutorial, we’ll put the various pieces together to end up with a pretty cool AR application that lets users add 3D models and structures to RealityKit’s virtual scene and interact with them using gestures.

Additionally, we’ll set up a drawing canvas view for handling user input. In this case, the user input will consist of digits inferred using the MNIST Core ML model, which will then be converted into 3D text that eventually gets placed in the virtual scene.

Besides RealityKit and ARKit, we’ll be using the following iOS frameworks in our application:

  • PencilKit — A drawing framework introduced in iOS 13 that lets us create custom, canvas-based applications. We’ll leverage this framework to handle the input.
  • SwiftUI and Vision — SwiftUI is the popular new declarative UI framework, and Vision abstracts complex computer vision algorithms behind an easy-to-use API.
Key players of our application

To start off, open Xcode 11 or above and create a new project. Go to the iOS tab and choose the Augmented Reality App template. In the wizard, make sure to choose RealityKit as the technology and SwiftUI as the user interface, as shown below:

SwiftUI support isn’t available for SceneKit — another indication of Apple’s inclination toward RealityKit.

If you look at the left panel in Xcode, you’ll see a file named Experience.rcproject. This is a Reality Composer file. By default, it comes with a single scene consisting of a steel box. You can create your own scenes with custom models, 3D assets, and effects.

The starter project that you’ve just created consists of an ARView, in which the box entity is loaded and added to the anchor of the ARView. Upon building the project, the following box will be displayed in the middle of your AR app’s screen:

A glimpse of the default box from the Reality Composer scene file.

The starter project is devoid of any gestures and interactions with the virtual scene. As we go along, instead of using Reality Composer to assemble scenes and structures, we’ll create our own 3D entities programmatically. But before we do that, let’s talk about the core components that make up a RealityKit scene and tackle the fancy terms — scenes, entities, anchors, and so on.

RealityKit’s ARView is the view responsible for handling the AR experience. From setting up the onboarding experience (more on this later) to configuring ARKit configurations, the camera, and interactions, everything goes through the ARView.

Every ARView consists of a single scene — a read-only instance over which we add our AnchorEntities.

From Apple Docs

An Entity is the most important component of RealityKit. All objects in a RealityKit scene are entities. An AnchorEntity is the root of all entities. Similar to ARKit’s ARAnchor, it’s responsible for holding the entities and their children.

We can add Components to an entity to further customize it. A ModelComponent lets us define the geometry of the 3D object, and a CollisionComponent lets us handle collisions between objects.

RealityKit makes it very easy to generate simple 3D shapes, such as boxes, spheres, planes, and text.

The following code shows how to create a ModelEntity that represents a cube:

let box = MeshResource.generateBox(size: 0.3) // size in meters
let material = SimpleMaterial(color: .green, isMetallic: true)
let entity = ModelEntity(mesh: box, materials: [material])

The Material protocol is used to set the color and texture of the entity. Currently, the three built-in types of Material available in RealityKit are:

  • SimpleMaterial — For setting the color and whether or not the entity is metallic.
  • OcclusionMaterial — An invisible material that hides objects rendered behind it.
  • UnlitMaterial — This type of material doesn’t react to lights in the AR scene.
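As a quick sketch, the three material types can be created as shown below (the sphere mesh and colors here are purely illustrative, not from the app we’re building):

```swift
import RealityKit
import UIKit

// Illustrative examples of the three built-in material types:
let simple = SimpleMaterial(color: .green, isMetallic: true) // colored, metallic finish
let occlusion = OcclusionMaterial()                          // invisible; masks entities behind it
let unlit = UnlitMaterial(color: .blue)                      // unaffected by scene lighting

// Any of them can then be applied to a model entity:
let sphere = ModelEntity(mesh: .generateSphere(radius: 0.1),
                         materials: [simple])
```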

An entity is added to the scene in the following way:

let anchor = AnchorEntity(plane: .horizontal)
anchor.addChild(entity)
arView.scene.addAnchor(anchor)

In order to add the entity to the virtual scene, we need to make sure that it conforms to the HasAnchoring protocol or is added as a child to an anchor that does, as we did above.

So the following won’t work, since ModelEntity doesn’t conform to the HasAnchoring protocol:

arView.scene.anchors.append(entity) // this won't work

Before we create our first custom entity and add it to the scene, let’s see what ARCoachingOverlayView is and how to integrate it into our ARView.

The ARCoachingOverlayView is used to give visual instructions to the user in order to facilitate ARKit’s world tracking. For this, we need to add this view as a subview of the ARView and set the goal property, which specifies the tracking requirements — horizontalPlane, verticalPlane, anyPlane, or tracking (tracks feature points). Once the goal is met, the ARCoachingOverlayView is dismissed.

A glimpse of the coaching overlay view
extension ARView: ARCoachingOverlayViewDelegate {
    func addCoaching() {
        let coachingOverlay = ARCoachingOverlayView()
        coachingOverlay.delegate = self
        coachingOverlay.session = self.session
        coachingOverlay.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        coachingOverlay.goal = .anyPlane
        self.addSubview(coachingOverlay)
    }

    public func coachingOverlayViewDidDeactivate(_ coachingOverlayView: ARCoachingOverlayView) {
        // Ready to add entities next?
    }
}

The delegate’s coachingOverlayViewDidDeactivate function gets triggered once the goal is met. The coaching overlay is automatic by default. This means that if, during the session, the feature points or the plane are lost, onboarding would start again. You can prevent this by making it a one-off operation and disabling the automatic behavior with coachingOverlayView.activatesAutomatically = false.
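For example, a one-off overlay could be configured roughly like this (a small sketch; once automatic activation is off, the overlay has to be shown manually):

```swift
coachingOverlay.activatesAutomatically = false  // don't restart onboarding when tracking is lost
coachingOverlay.setActive(true, animated: true) // show the overlay once, manually
```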

Next, simply invoke the addCoaching function from above on the ARView instance, as shown below:

struct ARViewContainer: UIViewRepresentable {

    func makeUIView(context: Context) -> ARView {

        let arView = ARView(frame: .zero)
        arView.addCoaching()

        let config = ARWorldTrackingConfiguration()
        config.planeDetection = .horizontal
        arView.session.run(config, options: [])

        return arView
    }

    func updateUIView(_ uiView: ARView, context: Context) {}
}

Next up, we’ll create a custom entity and add it to the scene once the ARCoachingOverlayView is dismissed.

We can create our own Entity subclasses of custom shapes and sizes by conforming to the HasModel and HasAnchoring protocols. Additionally, the HasCollision protocol is used to enable interactions with the entity — ray casting (more on this later), gesture handling (scale, translate, rotate), and so on.

The following code shows how to create a custom box entity:

class CustomBox: Entity, HasModel, HasAnchoring, HasCollision {

    required init(color: UIColor) {
        super.init()
        self.components[ModelComponent.self] = ModelComponent(
            mesh: .generateBox(size: 0.1),
            materials: [SimpleMaterial(
                color: color,
                isMetallic: false)])
    }

    convenience init(color: UIColor, position: SIMD3<Float>) {
        self.init(color: color)
        self.position = position
    }

    required init() {
        fatalError("init() has not been implemented")
    }
}

There’s also a convenience initializer that allows us to specify the position of the entity in the scene relative to the camera:

let box = CustomBox(color: .yellow)
// or, positioned relative to the camera:
let box = CustomBox(color: .yellow, position: [-0.6, -1, -2])
self.scene.anchors.append(box) // self is arView
A box placed at a certain distance from the camera

Now we’ve added an entity to our AR scene, but we can’t perform any interactions with it yet! To do that, we’ll need to add gestures, which we’ll explore next.

RealityKit provides us with a bunch of built-in gesture interactions. Specifically, it allows scaling, rotating, and translating entities in the AR scene. To enable gestures on an entity, we need to make sure it conforms to the HasCollision protocol (which we did in the previous section).

Also, we need to install the relevant gestures (scale, translate, rotate, or all) on the entity in the following way:

let box = CustomBox(color: .yellow, position: [-0.6, -1, -2])
self.installGestures(.all, for: box)
box.generateCollisionShapes(recursive: true)


The generateCollisionShapes function generates the shape of the entity’s collision component with the same dimensions as its model component. The collision component is responsible for interactions with the entity.

To install multiple specific gestures, we invoke the method with the list of gestures in an array, as shown below:

arView.installGestures([.rotation, .scale], for: box)

With this, our entity is ready to be interacted and played around with in the AR scene.

Adding an entity to another entity

We can also add child entities to an existing entity and position them relative to it. Let’s extend our current example by adding a 3D text mesh on top of the box, as shown below:

let mesh = MeshResource.generateText(
    "RealityKit", // placeholder string; the app will eventually use the recognized digit here
    extrusionDepth: 0.1,
    font: .systemFont(ofSize: 2),
    containerFrame: .zero,
    alignment: .left,
    lineBreakMode: .byTruncatingTail)

let material = SimpleMaterial(color: .purple, isMetallic: false)
let entity = ModelEntity(mesh: mesh, materials: [material])
entity.scale = SIMD3<Float>(0.03, 0.03, 0.1)

box.addChild(entity)
entity.setPosition(SIMD3<Float>(0, 0.05, 0), relativeTo: box)

The following is a glimpse of our RealityKit application with the text positioned above the box:

As a note, the real-world environment affects the lighting of the entities. The same box that looks pale yellow in the above illustration would look brighter in different surroundings.

Now that we’ve added interactivity to the entities and created a 3D text mesh, let’s move on to the final piece of RealityKit — ray casting.

Ray casting, much like hit testing, helps us find a 3D point in an AR scene from a screen point. It converts 2D points on your touch screen into real 3D coordinates by using ray intersection to find the point on a real-world surface.

Although hitTest is out there in RealityKit for compatibility causes, ray casting is the popular methodology, because it constantly refines the outcomes of tracked surfaces within the scene.

We’ll extend the above application to allow touch gestures in the SwiftUI-hosted ARView to be converted into 3D points, where we’ll eventually place the entities.

Currently, SwiftUI’s TapGesture doesn’t return the location in the view where it was pressed. So we’ll fall back on the UIKit framework to help us find the 2D location of the tap gesture.

In the following code, we’ve set up our UITapGestureRecognizer in the ARView, as shown below:

  • Take note of the findEntities function — this helps us find nearby entities in 3D space based on the 2D screen point.
  • The setupGestures method will be invoked on our ARView instance.
  • The makeRaycastQuery creates an ARRaycastQuery, into which we pass the point from the screen. Optionally, you can pass the center point of the screen if you intend to simply add entities at the center of the screen every time. Additionally, you set the plane type (existing or estimated) and the orientation (horizontal, vertical, or any).
  • The results returned from ray casting are used to create an AnchorEntity, onto which we add our box entity with the text.
  • overlayText is what we’ll receive from the user input as the label for the 3D text (more on this later).
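The tap-to-ray-cast flow described above can be sketched roughly as follows (handleTap is a hypothetical name; setupGestures and CustomBox follow the earlier sections, and the query parameters are one possible choice):

```swift
import ARKit
import RealityKit

extension ARView {
    func setupGestures() {
        let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        self.addGestureRecognizer(tap)
    }

    @objc func handleTap(_ recognizer: UITapGestureRecognizer) {
        let point = recognizer.location(in: self)

        // Convert the 2D screen point into a 3D point on an estimated plane.
        guard let query = self.makeRaycastQuery(from: point,
                                                allowing: .estimatedPlane,
                                                alignment: .any),
              let result = self.session.raycast(query).first else { return }

        // Anchor a box at the ray-cast hit in the real world.
        let anchor = AnchorEntity(raycastResult: result)
        anchor.addChild(CustomBox(color: .yellow))
        self.scene.addAnchor(anchor)
    }
}
```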

Before we jump onto PencilKit for creating input digits, let’s modify the ARViewContainer that loads the ARView with the changes we’ve made so far.

In the following code, the Coordinator class is added to the ARViewContainer in order to allow data to flow from the PencilKit view to the ARView.
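A rough sketch of that Coordinator wiring is shown below (the exact property and initializer names beyond overlayText are assumptions, not the article’s original gist):

```swift
import SwiftUI
import RealityKit

struct ARViewContainer: UIViewRepresentable {
    @Binding var overlayText: String

    // Shared object through which the recognized digit reaches the ARView.
    class Coordinator {
        var overlayText: String = ""
    }

    func makeCoordinator() -> Coordinator { Coordinator() }

    func makeUIView(context: Context) -> ARView {
        let arView = ARView(frame: .zero)
        context.coordinator.overlayText = overlayText
        return arView
    }

    func updateUIView(_ uiView: ARView, context: Context) {
        // Keep the coordinator in sync with the latest recognized digit.
        context.coordinator.overlayText = overlayText
    }
}
```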

The overlayText is picked up by the ARView scene from the Coordinator class. Next up, PencilKit meets the Vision framework.

PencilKit is the new drawing framework introduced in iOS 13. In our app, we’ll let the user draw digits on PencilKit’s canvas and classify those handwritten digits by feeding the Core ML MNIST model to the Vision framework.

The following code sets up the PencilKit view (PKCanvasView) in SwiftUI:

struct PKCanvasRepresentation: UIViewRepresentable {

    let canvasView = PKCanvasView()

    func makeUIView(context: Context) -> PKCanvasView {
        canvasView.tool = PKInkingTool(.pen, color: .secondarySystemBackground, width: 40)
        return canvasView
    }

    func updateUIView(_ uiView: PKCanvasView, context: Context) {}
}
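The classification step itself can be sketched with Vision roughly as follows (the MNISTClassifier class name is an assumption — use whatever class Xcode generates for the bundled .mlmodel file):

```swift
import CoreML
import Vision
import UIKit

// Classify a rendered snapshot of the PencilKit canvas with the MNIST model.
func classifyDigit(from canvasImage: UIImage, completion: @escaping (String) -> Void) {
    guard let cgImage = canvasImage.cgImage,
          let mnist = try? MNISTClassifier(configuration: MLModelConfiguration()),
          let model = try? VNCoreMLModel(for: mnist.model) else { return }

    let request = VNCoreMLRequest(model: model) { request, _ in
        // The top classification result is the predicted digit.
        guard let top = (request.results as? [VNClassificationObservation])?.first
        else { return }
        completion(top.identifier)
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```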

Now it’s time to merge the ARView and PKCanvasView in our ContentView. By default, SwiftUI views occupy the maximum space available to them. Hence, each of these views will take up roughly half of the screen.

The code for the ContentView.swift file is given below:
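A condensed sketch of what ContentView.swift looks like, stacking the AR view above the PencilKit canvas (the state name and container initializer are assumptions):

```swift
import SwiftUI

struct ContentView: View {
    @State private var overlayText: String = ""

    var body: some View {
        // Each view gets roughly half the screen by default.
        VStack {
            ARViewContainer(overlayText: $overlayText)
            PKCanvasRepresentation()
        }
        .edgesIgnoringSafeArea(.all)
    }
}
```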

The following code does the styling for the SwiftUI button:

struct MyButtonStyle: ButtonStyle {
    var color: Color = .green

    public func makeBody(configuration: MyButtonStyle.Configuration) -> some View {
        configuration.label
            .background(RoundedRectangle(cornerRadius: 5).fill(color))
            .shadow(color: .black, radius: 3)
            .opacity(configuration.isPressed ? 0.5 : 1.0)
            .scaleEffect(configuration.isPressed ? 0.8 : 1.0)
    }
}

Finally, our app is ready! A demonstration of a working RealityKit + PencilKit iOS application is given below:

An output from an iPad

Once the digit is extracted from the PencilKit drawing, all we do is a ray cast from the point where the ARView is touched on the screen to create an entity on the plane. Currently, the entities don’t support collision and can be dragged in and out of one another. We’ll handle collisions and more interactions in a subsequent tutorial, so stay tuned!

RealityKit is here to abstract away a lot of boilerplate code, allowing developers to focus on building more immersive AR experiences. It’s fully written in Swift and has arrived as a replacement for SceneKit.

Here, we took a good look at RealityKit’s entities and components and saw how to set up a coaching overlay. Additionally, we created our own custom entity and child entities. Subsequently, we dug into the 3D gestures currently supported in RealityKit, installed them on the entities, and then explored ray casting. Finally, we integrated PencilKit to handle user input and used the Vision framework to predict the hand-drawn digits.

The full source code, along with the MNIST Core ML model, is available in this GitHub repository.

Moving on from here, we’ll explore the other interesting functionalities available in RealityKit. Loading different kinds of objects, adding sounds, and the ability to perform and detect collisions will be up next.

That’s it for this one. Thanks for reading.
