Use a camera and the Vitis AI platform to make a robot dog that can recognize people and cats.

We will make a robot dog. We assume that it likes to be close to people but is afraid of cats.

Therefore, the robot dog needs to be able to recognize people and cats.

Today, neural networks can readily locate objects in an image and tell what type each one is.

We will use the Avnet Ultra96-V2 development board as the main controller of the robot dog. This platform is built around an FPGA chip that also includes Arm Cortex-A53 CPU cores. The A53 cores can run an operating system such as Ubuntu, while the FPGA accelerates the neural network computation. For this we will use the Xilinx Vitis AI stack.

The body parts of the robot dog are 3D printed. They come from:

The robot dog uses 12 servo motors. A traditional servo motor takes a PWM signal that sets its rotation angle. The Avnet Ultra96-V2 development board exposes 40 pins, so each motor could have its own pin. However, this requires synthesizing a logic circuit in the FPGA, and I ran into a problem at that step. So in the end, I used servo motors with a serial communication interface. This kind of motor requires a dedicated adapter board, which converts the serial port signal into a simplex UART signal. The serial protocol carries the rotation information for each motor.
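As a rough illustration of the PWM approach mentioned above, the sketch below maps a target angle to a pulse width and a duty cycle. The 500 to 2500 microsecond pulse range, the 180 degree travel, and the 20 ms period are assumptions typical of hobby servos, not values taken from this build, and the function names are hypothetical.

```python
def angle_to_pulse_us(angle_deg, min_us=500.0, max_us=2500.0, max_angle=180.0):
    """Linearly map a servo angle to a pulse width in microseconds.

    The pulse range and travel are assumed defaults; check the servo datasheet.
    """
    if not 0.0 <= angle_deg <= max_angle:
        raise ValueError("angle outside the servo's travel")
    return min_us + (max_us - min_us) * angle_deg / max_angle


def pulse_to_duty(pulse_us, period_us=20000.0):
    """Convert a pulse width to a duty cycle (0..1) for an assumed 50 Hz PWM."""
    return pulse_us / period_us
```

With these assumed defaults, the mid position of 90 degrees corresponds to a 1500 microsecond pulse. Generating 12 such signals in parallel is the part that needs dedicated logic in the FPGA.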

For the camera, we need a USB camera. Compared with a MIPI camera, it is easy to connect to the Avnet Ultra96-V2 development board. With Ubuntu and OpenCV, grabbing frames from the USB camera is very easy, even from a plain Python script.
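A minimal sketch of grabbing a frame with OpenCV in Python could look like the following. The camera index 0 and the helper names are assumptions for illustration; the actual index depends on how the board enumerates the camera.

```python
def open_camera(index=0):
    """Open a USB camera through OpenCV. Index 0 is an assumed default."""
    import cv2  # imported here so grab_frame can be tried without OpenCV installed

    capture = cv2.VideoCapture(index)
    if not capture.isOpened():
        raise RuntimeError(f"could not open camera {index}")
    return capture


def grab_frame(capture):
    """Read one frame; with OpenCV this is a BGR image stored as a NumPy array."""
    ok, frame = capture.read()
    if not ok:
        raise RuntimeError("camera returned no frame")
    return frame
```

On the board, calling grab_frame(open_camera()) should return one image that can be passed on to the neural network. Remember to call release() on the capture object when finished.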


The CPU of the Avnet Ultra96-V2 development board can run the Ubuntu operating system. With PetaLinux you can easily tailor the contents of the Linux system. In this design, I used Xilinx's PYNQ image. It provides an Ubuntu operating system with a graphical interface. It also contains precompiled bitstream files and example pre-trained neural network models.

Neural Networks

I used the dpu_yolo_v3 example included in the PYNQ image. It is a trimmed-down neural network model that can recognize 20 different object classes, and people and cats happen to be among them. (The class names and their index numbers are listed in img/voc_class.txt.)
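To show how the detections could drive the dog's behavior, here is a small sketch that turns a list of detections into a reaction. It assumes the detector output has already been decoded into (class_index, score) pairs, that the class file has one name per line, and that the relevant names are "person" and "cat". The function names, the 0.5 score threshold, and the rule that a cat takes priority over a person are all assumptions for illustration, not part of the dpu_yolo_v3 example.

```python
PERSON, CAT = "person", "cat"  # assumed spellings in the class file


def load_class_names(path):
    """Read one class name per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def choose_reaction(detections, class_names, min_score=0.5):
    """Pick a behavior from decoded detections.

    detections: iterable of (class_index, score) pairs (hypothetical format).
    Returns "retreat" if a cat is seen, "approach" if only a person is seen,
    and "idle" otherwise.
    """
    seen = set()
    for class_index, score in detections:
        name = class_names[class_index]
        if name in (PERSON, CAT) and score >= min_score:
            seen.add(name)
    if CAT in seen:  # assumed rule: fear of cats wins over liking people
        return "retreat"
    if PERSON in seen:
        return "approach"
    return "idle"
```

For example, with class_names = load_class_names("img/voc_class.txt"), the call choose_reaction(detections, class_names) can run once per frame.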

Motion control of robot dog

To control the robot dog's motors more conveniently, we need to model its legs. We want to tell the robot dog where each foot should land (relative to the body) and have a function automatically calculate the angle of each motor.
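One common way to write such a function is inverse kinematics for a three-joint leg (12 motors across 4 legs gives 3 per leg). The sketch below is a simplified model, not the code used in this project. It assumes a hip frame with x forward, y sideways, and z pointing down, no sideways offset between the hip joints, and 0.10 m upper and lower leg segments. All names and dimensions are hypothetical.

```python
import math


def leg_ik(x, y, z, upper=0.10, lower=0.10):
    """Return (roll, hip, knee) in radians for a foot target in the hip frame.

    x is forward, y sideways, z downward (assumed convention, in metres).
    knee = 0 means a fully straight leg.
    """
    depth = math.hypot(y, z)      # hip-to-foot distance seen from the front
    reach = math.hypot(x, depth)  # straight-line hip-to-foot distance
    if reach > upper + lower or reach < abs(upper - lower):
        raise ValueError("foot target is out of reach")
    roll = math.atan2(y, z)       # sideways swing of the whole leg
    # Law of cosines gives the knee bend; clamp to guard against rounding.
    cos_knee = (reach**2 - upper**2 - lower**2) / (2 * upper * lower)
    knee = math.acos(max(-1.0, min(1.0, cos_knee)))
    hip = math.atan2(x, depth) - math.atan2(
        lower * math.sin(knee), upper + lower * math.cos(knee)
    )
    return roll, hip, knee


def leg_fk(roll, hip, knee, upper=0.10, lower=0.10):
    """Forward kinematics for the same model, used to check leg_ik."""
    depth = upper * math.cos(hip) + lower * math.cos(hip + knee)
    x = upper * math.sin(hip) + lower * math.sin(hip + knee)
    return x, depth * math.sin(roll), depth * math.cos(roll)
```

Feeding the angles from leg_ik back into leg_fk should reproduce the requested foot position, which is a handy sanity check. A real leg still needs per-servo zero offsets and direction signs, which depend on how the motors are mounted.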
