
Release v0.8.9

The new release includes support for flipping the input video stream from a camera. This feature is designed to accommodate various camera configurations, such as back or selfie positions, portrait orientation, or mirroring on immersive displays. When combined with transposition, flipping along X or Y axes allows for any necessary transformation of the input stream into the desired state.

Video flipping is handled within the capturing module early in the processing pipeline. It can be requested during engine initialization by providing the flip field within the VideoParams interface to the setup() method.

Since the SDK now provides a mechanism to flip the input video stream directly, the mirroring functionality in renderers and render plugins is obsolete and has been removed. This change also helps prevent unintended double-mirroring. The mirror option is removed from the base CanvasRenderer, the underlying ResponsiveCanvas, and inherently from all their derivatives:

And their specializations like:

Mirroring is no longer needed in several plugins and was removed from them as well:

Examples distributed with the SDK are updated to demonstrate these changes. The simplest way to flip the feed for the selfie camera, for instance, is as follows:

let rear = urlParams.has("rear");
await engine.setup({
  rear, flip: { x: !rear, y: false },
  size: { width: 1920, height: 1080 } });
cameraSwitch.onclick = async () => {
  cameraSwitch.disabled = true;
  rear = !rear;
  await engine.setup({
    rear, flip: { x: !rear, y: false },
    size: { width: 1920, height: 1080 } });
  await engine.start();
  cameraSwitch.disabled = false;
};

Pose fitting plugins (including their twin counterparts) now only align well-visible bones. A bone is aligned with pose keypoints only when both ends meet the required reliability and visibility thresholds. If these thresholds are not met, the bone will remain in its rest pose. Updated plugins are:

This feature is enabled by default; it can be disabled via the fitVisible constructor parameter.

To maintain accuracy when the same node is repeatedly assigned to a plugin, pose fitting plugins now return the skeleton to its rest pose in setNode() before parsing the armature and caching its metrics. We’ve found this to be necessary in certain usage patterns.

Babylon.js renderers now execute engine.endFrame() upon scene rendering completion. This action is necessary because certain Babylon.js effects rely on the global frame counter being incremented, which is an undocumented side effect of endFrame(). Since calling this method was not strictly required within the standard render loop for general use, it had not been included previously.

If the pipeline is only partially initialized, the Engine’s init() method returns false. When this occurs, the Engine also emits an init event with a false boolean argument. You can subscribe to this event using a handler with a (boolean) => void signature.
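For example, an application could retry initialization a few times before giving up. This is an illustrative sketch, not SDK code: the minimal EngineLike interface and the retryInit helper below are our own stand-ins, assuming only that init() resolves to a boolean as described above.

```typescript
// Minimal stand-in for the part of the Engine API relied on here:
// init() resolves to false when the pipeline is only partially initialized.
interface EngineLike {
  init(): Promise<boolean>;
}

// Retry initialization up to `attempts` times with a small delay between tries.
async function retryInit(
  engine: EngineLike, attempts = 3, delayMs = 500
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await engine.init()) return true;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

With the real Engine one would additionally subscribe to the init event to surface partial-initialization failures in the UI.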

Release v0.8.7

We have added another extra utility module to the Engeenee SDK - @geenee/repacker. Repacker remuxes input video files to improve compatibility with various video players and media-sharing apps. Additionally, it fixes common issues and artifacts of MediaRecorder outputs. The most common trait of video containers recorded in a browser is file fragmentation: the browser records video in chunks and flushes them periodically, so, for example, we get a fragmented mp4 file that may not be supported by some sharing apps, or they will truncate the video to the first chunk/fragment of the mp4 file. Repacker remuxes a file into a whole monolithic container. Being compiled into wasm, it’s blazingly fast and memory efficient: utilizing an efficient memory map and zero-copy strategies, it remuxes buffers in place, avoiding redundant deep copies and conversions. Repacker supports the mp4 container and the h264/avc1 video codec. The set of supported formats is limited to keep the size of the wasm binary as small as possible (mp4+h264 is the most widely used combination). Support for more containers and codecs may be added in the future.

Repacker is very straightforward to use and essentially has one method implementing remuxing of a media file stored in memory. First we need to initialize an instance of Repacker:

import { Repacker } from "@geenee/repacker";
const repacker = new Repacker();
await repacker.init();

The asynchronous init() method must complete before Repacker can be used; it downloads and initializes the corresponding .wasm module distributed with the package. This module contains the codecs, muxers, and the native implementation of stream remuxing. One needs to put the .wasm module in the root of a web app, e.g. into the public/ folder, as we usually do with bodyutils.wasm.

After repacker.init() has successfully completed, you can call repacker.remuxBlob(blob) or repacker.remuxBuf(buf) to fix the fragmentation of any mp4 file. For example, it’s recommended to remux files captured with Recorder from @geenee/armature:

let blob = await recorder?.stop();
blob = blob && await repacker.remuxBlob(blob);

The whole capture + remux pipeline may look like this:

// Recorder
const repacker = new Repacker();
await repacker.init();
const recorder = new UniRecorder("video/mp4;codecs=avc1,opus", 8 * 1024 * 1024);
const recordBtn = Button({ size: "icon-lg", variant: "outline",
  class: "absolute top-4 left-4 z-1 bg-opacity-60",
  onclick: async () => {
    recorder?.start(renderer, 0, renderer.audioTrack);
    setTimeout(async () => {
      let blob = await recorder?.stop();
      blob = blob && await repacker.remuxBlob(blob);
      if (!blob)
        return;
      const url = URL.createObjectURL(blob);
      const link = document.createElement("a");
      link.hidden = true;
      link.href = url;
      link.download = "capture.mp4";
      link.click();
      link.remove();
      URL.revokeObjectURL(url);
    }, 5000);
  }
});

To increase the stability of web applications built using the Engeenee SDK, we have implemented several fail-safe features. The SDK now monitors the health of the GPU contexts and the camera being utilized. While the SDK itself is free from memory leaks and undefined behaviors, external code or third-party packages within an application might occasionally fail to release GPU resources (such as textures). Furthermore, the camera connection or its driver may experience temporary failures and subsequent automatic recovery. These new features are designed to mitigate the impact of such external issues. Given the specific nature of the browser environment, where a memory leak in one context can lead to failure in another, the SDK monitors the state of all controlled resources critical for its lifecycle. A global exception is triggered if any of these resources fail.

We track the health of the following contexts:

  • GPGPU context which is used by the tracking engine for neural networks inference and image post- and pre-processing.
  • GPU context of the renderer.
  • Camera capturing context.

Adding the following line of code at the very beginning of index.ts enables the page to catch these exceptions and automatically reload, recovering from the failure:

window.onerror = () => { window.location.reload(); };
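A plain reload handler can loop forever if the failure is persistent. One possible refinement (our own suggestion, not part of the SDK) is to rate-limit reloads. The shouldReload helper below is a pure decision function that could be wired to window.onerror together with a timestamp kept in sessionStorage:

```typescript
// Decide whether an automatic reload is allowed, given the timestamp of the
// previous reload (or null if none) and a minimum interval between reloads.
function shouldReload(
  lastReloadMs: number | null, nowMs: number, minIntervalMs = 10000
): boolean {
  return lastReloadMs === null || nowMs - lastReloadMs >= minIntervalMs;
}

// Hypothetical wiring (browser only): remember the last reload time in
// sessionStorage so a persistent failure doesn't cause an endless reload loop.
// window.onerror = () => {
//   const last = Number(sessionStorage.getItem("lastReload")) || null;
//   if (shouldReload(last, Date.now())) {
//     sessionStorage.setItem("lastReload", String(Date.now()));
//     window.location.reload();
//   }
// };
```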

Release v0.8.6

We’ve updated all internally used packages to the latest versions. This release doesn’t introduce any API or logic changes, nor any updates to body tracking or image/video post-processing. Basically, only the build system was updated: esbuild and rollup are bumped to the latest versions, C++ modules implementing heavy math computations switched from C++14 to C++17, and we’ve updated their native dependencies as well.

One can migrate from the previous version of the SDK without any code changes. All examples are updated to the 0.8.6 version. They can be found on the getting started page.

Release v0.8.5

To provide more efficient and optimized WASM modules that perform all heavy computations in body tracking pipelines we’ve switched to the newer version of the Emscripten SDK. More recent versions of Emscripten dropped support of ES2017 and now require some features from ES2020. Therefore, starting from this release we are bundling all packages of the SDK in ES2020 modules. Required ES2020 features are widely available across major browsers so there shouldn’t be any compatibility issues when switching to the newer ECMAScript.
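If an application’s build pipeline pins an older ECMAScript target, it may downlevel the SDK bundles in ways the Emscripten output doesn’t support. As an illustration only (the exact configuration depends on your setup), a tsconfig would need a target of at least ES2020:

```json
// tsconfig.json (fragment) - ensure the target is at least ES2020
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ES2020"
  }
}
```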

Release v0.8.3

In this release we introduce the concept of a priority score for tracking targets. During the detection phase of a tracking pipeline, several regions of interest passing the confidence threshold may be detected. The number of targets selected for further tracking is limited by the computational resources of a user’s device. The web runtime can track only one target because of the performance limitations of browsers, especially on mobile. By default, the region of interest with the highest confidence score is chosen for tracking.

We’ve added a set of utilities that make the process of target selection more flexible and controllable outside of the Engeenee SDK. Namely, processors now accept an additional optional parameter - a rate function that evaluates the priority measure of detected regions of interest. Detections with higher rates are considered first for further tracking. For example, such a function may guide the selection of tracking targets to prioritize bigger or more centered RoIs, or downweight the score according to custom rules. The default implementation forwards the confidence score as the rate. Targets with a rate <= 0.0 are completely ignored by the tracking pipeline.
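As a minimal illustration of the idea (using a simplified RoI shape of our own, not the SDK’s types), a rate function that prefers larger detections while dropping low-confidence ones could look like:

```typescript
// Simplified RoI stand-in: normalized dimensions plus a confidence score.
interface SimpleRoI {
  width: number;   // normalized [0, 1]
  height: number;  // normalized [0, 1]
  score: number;   // detector confidence [0, 1]
}

// Prefer larger RoIs; return a non-positive rate to drop weak detections,
// since targets rated <= 0.0 are ignored by the pipeline.
const rateBySize = (roi: SimpleRoI, minScore = 0.3): number =>
  roi.score < minScore ? 0 : roi.score * roi.width * roi.height;
```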

PoseProcessor can tune target selection using the rateRoI() parameter. It is called for every detected target and returns its priority score. The tracking pipeline sorts RoIs by the evaluated priority scores, filters out those with a rate <= 0, and chooses the top N (1 for web) RoIs for tracking. RoIs are usually defined in normalized coordinates; therefore, the callback takes the aspect ratio as a parameter, so the implementation can use it to calculate correct measures.

We provide a set of simple predefined priority measures that can be used as PoseParams.rateRoI:

As an example of a custom priority measure that one can implement, this is the code of the rateByRoI method:

const rateByRoI = (roi: BodyCircle, ratio: number) => {
  // Primary axis (incline)
  const axis: Coord2D = [
    roi.top[0] - roi.center[0],
    roi.top[1] - roi.center[1]];
  // Correct aspect ratio
  axis[0] *= ratio;
  // Radius relative to image height
  const r = Math.sqrt(axis[0] ** 2 + axis[1] ** 2);
  // Unit length axis
  axis[0] /= r;
  axis[1] /= r;
  // Ignore RoIs rotated more than acos(0.5)=60deg
  const penaltyA = axis[1] >= -0.5 ? 1.5 + axis[1] : 0;
  // Penalize score by size if r<=0.25 of image height
  const penaltyR = 0.75 + Math.min(r, 0.25);
  // Penalize score by center with factor 0.5 on the edge
  const penaltyC: Coord2D = [
    1 - 0.5 * Math.abs(roi.center[0] - 0.5),
    1 - 0.5 * Math.max(0.5 - roi.center[1], 0)];
  // Assume score >= 0.75 is 100% confident
  const score = Math.min(roi.score / 0.75, 1.0);
  // Rate detection
  return score * penaltyR * penaltyC[0] * penaltyC[1] - penaltyA;
}

FaceProcessor can tune target selection using the rateRoI() parameter. It is called for every detected target and returns its priority score. The tracking pipeline sorts RoIs by the evaluated priority scores, filters out those with a rate <= 0, and chooses the top N (1 for web) RoIs for tracking. RoIs are usually defined in normalized coordinates; therefore, the callback takes the aspect ratio as a parameter, so the implementation can use it to calculate correct measures.

We provide a set of simple predefined priority measures that can be used as FaceParams.rateRoI:

As an example of a custom priority measure that one can implement, this is the code of the rateByRoI method:

const rateByRoI = (roi: FaceRect) => {
  // Normalized height
  const h = roi.box[1][1] - roi.box[0][1];
  // Center
  const center: Coord2D = [
    0.5 * (roi.box[0][0] + roi.box[1][0]),
    0.5 * (roi.box[0][1] + roi.box[1][1])];
  // Penalize score by height if h<=0.5
  const penaltyH = 0.5 + Math.min(h, 0.5);
  // Penalize score by center with factor 0.5 on the edge
  const penaltyC: Coord2D = [
    1 - 0.5 * Math.abs(center[0] - 0.5),
    1 - 0.5 * Math.abs(center[1] - 0.5)];
  // Assume score >= 0.75 is 100% confident
  const score = Math.min(roi.score / 0.75, 1.0);
  // Rate detection
  return score * penaltyH * penaltyC[0] * penaltyC[1];
}
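The measure above can be sanity-checked offline with synthetic RoIs. In this self-contained sketch, Coord2D and FaceRect are minimal local stand-ins for the SDK types (a normalized box plus a confidence score), and the measure itself is reproduced from the example above:

```typescript
type Coord2D = [number, number];
// Minimal stand-in for the SDK's FaceRect: a normalized box and a score.
interface FaceRect { box: [Coord2D, Coord2D]; score: number; }

const rateFace = (roi: FaceRect) => {
  const h = roi.box[1][1] - roi.box[0][1];        // normalized height
  const center: Coord2D = [
    0.5 * (roi.box[0][0] + roi.box[1][0]),
    0.5 * (roi.box[0][1] + roi.box[1][1])];
  const penaltyH = 0.5 + Math.min(h, 0.5);        // size penalty if h <= 0.5
  const penaltyC: Coord2D = [                     // center penalty, 0.5 on edge
    1 - 0.5 * Math.abs(center[0] - 0.5),
    1 - 0.5 * Math.abs(center[1] - 0.5)];
  const score = Math.min(roi.score / 0.75, 1.0);  // score >= 0.75 is confident
  return score * penaltyH * penaltyC[0] * penaltyC[1];
};

// A big centered face should outrank a small face near the corner.
const centered: FaceRect = { box: [[0.2, 0.2], [0.8, 0.8]], score: 0.9 };
const corner: FaceRect = { box: [[0.0, 0.0], [0.15, 0.15]], score: 0.9 };
```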