
How as3Kinect Works: Features and Challenges

Posted by imekinox, Mar 17 2011 02:00 PM

This article walks you through the features and development challenges of the as3Kinect wrapper for OpenKinect.

The as3Kinect project is an application that acts like a socket server; it reads data from a Kinect device and sends that data over the network to an ActionScript client. This application is known as the as3kinect server, or sometimes as the as3kinect wrapper.

To use it, you need a client application. This client needs to know which protocol the server uses when sending the data. To accomplish that, the as3Kinect project includes an ActionScript 3 library known as the "as3kinect client library." Flash developers can use this API to communicate with the server/device.

To recap the first article in this series, the diagram below shows the communication flow between these components:

[Diagram: communication flow between the Kinect device, the as3kinect server, and the ActionScript client]

During OpenKinect's early development stages, there was a little uncertainty about how far the wrapper could go. In addition, as this wrapper became part of the main repository, its development had to follow the OpenKinect guidelines and goals.

At a very high level, I'll try to explain what that means.

When you use your Kinect with your Xbox 360, you see a 3D representation of your body movements. To achieve this, the console generates a skeleton model, which it then applies to a 3D model, so that the model moves the same way you do. The Xbox generates this skeleton after processing the images and depth data sent by the Kinect; in other words, the skeleton is generated in software. But because the OpenKinect project is just a driver, the only thing you get from the Kinect is the raw images and depth data. To actually control or move anything using your body's movements, you need to mimic what the Xbox does in your application. It's up to you to analyze the raw, unprocessed data; the OpenKinect driver does not (and will not) perform this analysis at any stage of its development.

The first approach taken when building the wrapper was to send this unprocessed data directly to ActionScript. We could not add server-side processing of this data, because that's not the intention of the OpenKinect project. A wrapper should function only as a bridge, allowing the driver's capabilities to be used with other languages. But the data processing to build a model to drive applications must be performed in that language.

Therefore, the very first thing the as3kinect wrapper needed to do was duplicate OpenKinect's glview demo in Flash. This demo shows only the depth image and the RGB image.

Here is a video that demonstrates these initial results.


Even this simple demonstration presented some big challenges during the as3kinect server and client development process.

Server-Client Communication

The first challenge was to define exactly how the server and the client would communicate. The client needed to:

  • Process data as fast as possible.
  • Transmit different kinds of data between server and client, which required defining a protocol so that both sides knew which types of data were being sent and received.
  • Convert the raw bytes sent by the server into images on the client side.
  • Request data from the server explicitly, to avoid lost packets (remember, ActionScript is slower than C).


The following sections explain how these goals were met in more detail.

Processing Socket Data in ActionScript 3

When you run an SWF in standalone mode (at least on OS X) you don't get the same hardware acceleration you get in the IDE. In this case, we needed to transfer about 30 MB/sec per channel (depth and video), so processing this data had to be extremely fast. The client first needed to analyze the raw data quickly before converting it to a bitmap image for each frame received.

At the beginning, I used the WiiFlash socket class as a reference example. Unfortunately, this approach becomes far too slow when processing large chunks of data. WiiFlash avoids the problem because it only ever processes very small chunks, so speed never becomes an issue:

while ( socket.bytesAvailable > 0 )
{
   buffer.writeByte( socket.readByte() );
}


This code loops as long as bytes are available on the socket, extracting 1 byte per iteration with the socket.readByte() call; if 80 bytes are available, the loop runs 80 times as it fills the buffer. However, while it runs, socket.bytesAvailable keeps increasing, and that value is driven by the faster C code on the other end, which makes it highly probable that your client code will get stuck in this loop when transferring large packets. To avoid that bottleneck, the current implementation for reading data in the as3kinect socket class is shown below (you can find this code at https://github.com/i...kinectSocket.as).

private function onSocketData(event:ProgressEvent):void {

   //Wait for the socket to have data in its buffer
   if(_socket.bytesAvailable > 0) {

      //We need to determine the size of the incoming packet
      // (this is a protocol definition between the as3kinect server and client)
      if(_packet_size == 0) {

         //Tell Flash that the data transmission is little-endian
         _socket.endian = Endian.LITTLE_ENDIAN;

         // First byte header (so we know what kind of data we are receiving)
         _first_byte = _socket.readByte();

         // Second byte header (so we know what kind of data we are receiving)
         _second_byte = _socket.readByte();

         // We read the size of the packet we are receiving
         _packet_size = _socket.readInt();
      }

      // Process the data only once we have received at least
      // the expected number of bytes (the packet size)
      if(_socket.bytesAvailable >= _packet_size && _packet_size != 0){

         //Copy the whole packet into the buffer (note that if we read fewer
         //bytes than are available in the socket buffer, the difference
         //remains in the socket buffer)
         _socket.readBytes(_buffer, 0, _packet_size);

         //Tell flash that our buffer is in little endian
         _buffer.endian = Endian.LITTLE_ENDIAN;

         //Reset the buffer's read position to 0 (to be sure that
         //we later start reading from the beginning of the buffer)
         _buffer.position = 0;

         //Create our data object with its headers and buffer data
         _data_obj.first = _first_byte;
         _data_obj.second = _second_byte;
         _data_obj.buffer = _buffer;

         //set packet size to 0 again so we can process next package headers
         _packet_size = 0;

         // Tell classes using the socket connection we have a full packet 
         // and send it to them
         dispatchEvent(new as3kinectSocketEvent(as3kinectSocketEvent.ONDATA,
             _data_obj));
      }
   }
}


Notice that this implementation is more complex because there is no fixed packet size: you need to know the size of each packet before you can process it. That's because, at this point, the code uses a single socket to receive image, depth, and accelerometer data.

Using a Single Socket

As you saw in the socket data function above, each packet carries a six-byte header that indicates what type of data you are receiving and its length. Using a single socket, rather than one socket per data type, helps the Flash plug-in work better: because the plug-in is single-threaded, opening more than one socket simply adds unneeded overhead.

Here's a pseudocode overview of how the first two bytes route the data to the desired processing function:

/*
* dataReceived from socket (Protocol definition)
* Metadata comes in the first and second value of the data object
* first:
*   0 -> Camera data
*          second:
*           0 -> Depth ARGB received
*           1 -> Video ARGB received
*   1 -> Motor data
*   2 -> Microphone data
*   3 -> Server data
*          second:
*           0 -> Debug info received
*
*/
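For reference, this routing could be expressed in the server's C as a pair of enums and a dispatch helper. This is an illustrative sketch only; the constant and function names are assumptions, not identifiers from the as3kinect source:

```c
/* First header byte: data source (names are assumed, not from the project) */
enum as3k_first  { AS3K_CAMERA = 0, AS3K_MOTOR = 1, AS3K_MIC = 2, AS3K_SERVER = 3 };

/* Second header byte when the first byte is AS3K_CAMERA */
enum as3k_camera { AS3K_DEPTH = 0, AS3K_VIDEO = 1 };

/* Second header byte when the first byte is AS3K_SERVER */
enum as3k_server { AS3K_DEBUG = 0 };

/* Route a packet header to a human-readable label (illustration only) */
static const char *as3k_route(int first, int second)
{
    if (first == AS3K_CAMERA)
        return second == AS3K_DEPTH ? "depth" : "video";
    if (first == AS3K_MOTOR)  return "motor";
    if (first == AS3K_MIC)    return "microphone";
    if (first == AS3K_SERVER) return "debug";
    return "unknown";
}
```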


After those first two bytes, the last 4 bytes (4 bytes is the size of an integer in memory) contain the size of the packet. You may or may not need that size to process the information, but the socket class requires it to dispatch the correct bytes to the processing functions.

The same process is implemented on the server side. Here is the function that assembles the packet in C (you can find this method in freenect_network.c, inside the server folder of the ActionScript wrapper, here: https://github.com/i...enect_network.c).

// Send Message with two bytes as metadata and pkg len as the next 4 bytes
int freenect_network_sendMessage(int first, int second, 
   unsigned char *data, int len) {
   int n;

   //length of the header (6 bytes)
   int m_len = 1 + 1 + sizeof(int);

   //create a new memory block of the size of the data + 6 bytes
   unsigned char *msg = (unsigned char*) malloc(m_len + len);

   //copy the two header bytes and the packet length into the buffer
   //(copying 1 byte from an int assumes a little-endian host)
   memcpy(msg, &first, 1);
   memcpy(msg + 1, &second, 1);
   memcpy(msg + 2, &len, sizeof(int));

   //copy the payload after the 6-byte header
   memcpy(msg + m_len, data, len);

   //send the data (this works cross platform)
   #ifdef WIN32
   n = send(data_client_socket, (char*)msg, m_len + len, 0);
   #else
   n = write(data_child, msg, m_len + len);
   #endif

   //free the generated block of memory
   free((void*)msg);

   //return the response of the send/write command
   return n;
}
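For illustration, the receiving side of this framing is just the inverse operation. The following sketch is not part of the project's code (the function name is made up), and, like the sender above, it assumes the length field travels in the host's native byte order:

```c
#include <string.h>

/* Parse the 6-byte header produced by freenect_network_sendMessage:
 * byte 0 = first, byte 1 = second, bytes 2..5 = payload length.
 * Hypothetical helper, shown only to mirror the sender above. */
static void parse_header(const unsigned char *msg,
                         int *first, int *second, int *len)
{
    *first  = msg[0];
    *second = msg[1];
    memcpy(len, msg + 2, sizeof(int)); /* length was memcpy'd in host order */
}
```

A reader like this would run before the payload copy: once the 6 header bytes are in, it tells you how many more bytes to wait for.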


Converting Bytes to an Image

At this point, you have the raw bytes. Now what? You need to convert the data into a movie clip so you can view it at a rate of 30 frames per second.

Note: Bytes need to be sent in ARGB format from the server in order for this to work.

Here's how you do that:

public static function byteArrayToBitmapData(bytes:ByteArray, _canvas:BitmapData):void{
   _canvas.lock();
   _canvas.setPixels(new Rectangle(0,0, as3kinect.IMG_WIDTH, as3kinect.IMG_HEIGHT), bytes);
   _canvas.unlock();
}


In the preceding function, _canvas is a BitmapData object that you attach to a MovieClip in your application as follows (rgb_cam is a MovieClip on your stage, and _as3w is a reference to an as3kinectWrapper class instance):

   _canvas_video = _as3w.video.bitmap;
   _bmp_video = new Bitmap(_canvas_video);
   rgb_cam.addChild(_bmp_video);
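As the note above says, the server must deliver pixels already in ARGB order for setPixels to work. As a hedged sketch of what that means on the server side, here is one way an 11-bit Kinect depth sample could be packed into an opaque grayscale ARGB pixel. This mapping is an illustration only, not the as3kinect server's exact conversion:

```c
#include <stdint.h>

/* Map an 11-bit Kinect depth sample (0..2047) to an opaque grayscale
 * ARGB pixel: near values render bright, far values dark.
 * Illustrative only -- the real server's mapping may differ. */
static uint32_t depth_to_argb(uint16_t depth)
{
    uint8_t gray = (uint8_t)(255 - (depth * 255 / 2047));
    return 0xFF000000u             /* alpha: fully opaque */
         | ((uint32_t)gray << 16)  /* red   */
         | ((uint32_t)gray << 8)   /* green */
         |  (uint32_t)gray;        /* blue  */
}
```

Run over a whole 640x480 frame, this produces exactly the byte stream the byteArrayToBitmapData function above expects.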


Because ActionScript is slower than C, you need to make it ask for data only when it's ready to process it (as described in the first article in this series).

Therefore, you need to enable the client to talk to the server and enable the server to understand what the client needs.

The following code segments show how ActionScript makes a request for a depth frame in an application:

Variable Definitions

private var _as3w           :as3kinectWrapper;
private var _canvas_depth   :BitmapData;
private var _bmp_depth      :Bitmap;


Initialization

_as3w = new as3kinectWrapper();
//Add depth BitmapData to depth_cam MovieClip
_canvas_depth = _as3w.depth.bitmap;
_bmp_depth = new Bitmap(_canvas_depth);
depth_cam.addChild(_bmp_depth);         
_as3w.addEventListener(as3kinectWrapperEvent.ON_DEPTH, got_depth);


Functions

//This function is called when depth data arrives
private function got_depth(event:as3kinectWrapperEvent):void{
   //Convert Received ByteArray into BitmapData 
   as3kinectUtils.byteArrayToBitmapData(event.data, _canvas_depth);
}

//UPDATE METHOD (This is called for each frame)
private function update(event:Event){ 
   _as3w.depth.getBuffer();
}


This way the client can make requests to the server when it's ready to display a frame on the client side.

Here's the getBuffer function:

/*
 * Tell server to send the latest depth frame
 * Note: Lock the command while waiting for the data to avoid lag
 */
public function getBuffer():void {
   if(!_depth_busy){
      //lock the depth getBuffer method so we don't request data 
      // until we have received the data of the last request
      _depth_busy = true;
      _data.clear();
      //We generate a 6-byte command (the same header format) to send to the server
      _data.writeByte(as3kinect.CAMERA_ID);
      _data.writeByte(as3kinect.GET_DEPTH);
      _data.writeInt(0);
      //Send the command through our socket class
      if(_socket.sendCommand(_data) != as3kinect.SUCCESS){
         throw new Error('Data was not complete');
      }
   }
}


Then, server-side, catch this with the following code:

//Read data from the socket
freenect_network_read(buff, &len);
//If the command length is a multiple of 6
if(len > 0 && len % 6 == 0){
   //Get the number of commands received
   int max = len / 6;
   int i;
   //For each command received
   for(i = 0; i < max; i++){
      //Extract the 4-byte value that follows the two header bytes
      memcpy(&value, &buff[2 + (i*6)], sizeof(int));
      value = ntohl(value);
      //The BIG switch (communication protocol)
      switch(buff[0 + (i*6)]){
         case 0: //CAMERA
            switch(buff[1 + (i*6)]){
               case 0: //GET DEPTH
                  sendDepth();
                  break;
               //... remaining camera cases omitted
            }
            break;
         //... motor, microphone, and server cases omitted
      }
   }
}

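Putting the two directions together: the client's getBuffer writes one byte for the subsystem, one for the command, and a 4-byte value. A C sketch of that framing follows. The function is hypothetical, not part of the project; it uses htonl so that the server's ntohl above recovers the value on any host, whereas the AS3 client relies on the two ends agreeing on endianness:

```c
#include <string.h>
#include <arpa/inet.h> /* htonl/ntohl; on Windows this would be winsock2.h */

/* Build a 6-byte as3kinect command the way the AS3 client's getBuffer does:
 * one byte for the subsystem, one for the command, four for a value.
 * Hypothetical helper for illustration; the value is written in network
 * order so the server-side ntohl() shown above recovers it. */
static void build_command(unsigned char *cmd, int first, int second, int value)
{
    unsigned int nvalue = htonl((unsigned int)value);
    cmd[0] = (unsigned char)first;
    cmd[1] = (unsigned char)second;
    memcpy(cmd + 2, &nvalue, sizeof(nvalue));
}
```

A GET DEPTH request is then build_command(cmd, 0, 0, 0), which the server loop above routes to sendDepth().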

At this point, you've seen a low-level description of how as3Kinect works, as well as the major issues faced during its implementation, and how these were solved.

You can find all the code for this article under an Open Source License here: https://github.com/i...rs/actionscript.

2 Replies

Posted by agaved, Mar 29 2011 11:52 AM

Great post, thanks.
As far as I understood, as3Kinect gets the video stream from the Kinect.
Do you know how one could intercept more high-level events, like "user A raised an arm"?
Posted by NILESH VAGHELA, Jun 08 2012 01:06 PM

I downloaded the sources and tried to compile in FlashDevelop. I get 6 errors.

Error 1: console variable not found in test_3d.as
Error 2: depth_cam not found
Error 3: left and right variables not found.

Can you guide me on how to fix these errors?