As software engineers, most of the time we feel like true magicians, making applications work by stitching together code snippets from different sources.
That was when we came across Paul McWhorter's video tutorial on "MediaPipe". When it comes to AI, he is one of the best teachers out there.
MediaPipe
MediaPipe offers open-source, cross-platform, customizable ML solutions for live and streaming media. In the video referred to above, he demonstrates how to track the movement of hands and fingers using "MediaPipe Hands", which uses machine learning (ML) to infer 21 3D landmarks of a hand from a single frame.
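Before getting to the project, here is a minimal sketch of what MediaPipe Hands reports for a single frame. It uses the same mp.solutions API as the full code below and assumes a webcam at index 0; the program later in this post wraps these calls in a small class.

import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
cam = cv2.VideoCapture(0)              # assumes the default webcam at index 0
ok, frame = cam.read()                 # grab one BGR frame
if ok:
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:   # None when no hand is detected
        for hand in results.multi_hand_landmarks:
            print(len(hand.landmark))  # 21 landmarks per hand; x and y are normalized to [0, 1]
cam.release()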
The Idea
The idea is to extend this work and make circles and a rectangle "magically" appear on the screen. To be precise: when both hands appear in front of the camera, a circle appears around each index fingertip. Bring the hands closer until the circles touch each other, and bingo! they merge into a single circle. Keep bringing the hands closer still and the circle turns into a rectangle.
If that sounds like fun, read on!
Steps
Figures are drawn based on the distance between the index fingertips.
The steps are:
1. Use MediaPipe to find both hands and all the fingers.
2. Get the x & y coordinates of the index fingertip (landmark 8) of each hand.
3. Calculate the Euclidean distance between these two fingertips and pick the figure accordingly (see the short sketch after this list):
· If the distance is greater than twice a preset radius (r), draw a circle of radius r centred on each fingertip.
· If the distance is between twice the radius and a preset rectangle threshold, draw a single circle that encloses both index fingertips.
· If the distance is smaller than the rectangle threshold, draw a rectangle with the index fingertips as diagonally opposite corners.
4. Use OpenCV to draw these figures.
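To make the thresholds concrete, here is a small sketch of that selection rule using the same default values as the full program below (circle radius 200 px, rectangle threshold 300 px). The name pick_figure is only illustrative; the actual program implements the same logic in select_figure.

CIR_RADIUS = 200   # preset circle radius r
RECT_POINT = 300   # distance below which the rectangle appears

def pick_figure(dist):
    if dist > CIR_RADIUS * 2:   # fingertips far apart -> two separate circles
        return "Circles"
    if dist > RECT_POINT:       # circles would overlap -> one merged circle
        return "MergedCircle"
    return "Rectangle"          # fingertips very close -> rectangle

print(pick_figure(500), pick_figure(350), pick_figure(250))  # Circles MergedCircle Rectangle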
Coding
The main libraries for this program are MediaPipe, OpenCV and NumPy. Install them with pip (for example, pip install mediapipe opencv-python numpy). Using a virtual environment is strongly recommended.
The complete code is available on this GitHub page:
"""
A fun project to make circles & rectangle 'magically' appear on the screen
Platform: Windows 10
Python Version: 3.10+
Major libraries: MediaPipe, OpenCV, NumPy
"""
import cv2
import numpy as np
import math
# Camera settings
DEFAULT_CAM = 0 # Built-in camera
USB_CAM = 1 # External camera connected via USB port
CAM_SELECTED = DEFAULT_CAM
CAM_WIDTH = 1280
CAM_HEIGHT = 720
CAM_FPS = 30
FLIP_CAMERA_FRAME_HORIZONTALLY = True
# mediapipe parameters
MAX_HANDS = 2
DETECTION_CONF = 0.5
TRACKING_CONF = 0.5
MODEL_COMPLEX = 1
HAND_1 = 0
HAND_2 = 1
INDEX_FINGER_TIP = 8
X_COORD = 0
Y_COORD = 1
FIGURES_LIST = ["Circles", "MergedCircle", "Rectangle"]
# Drawing parameters - for opencv
CIR_RADIUS = 200
CIR_COLOR = (255, 0, 0)
CIR_THICKNESS = 3
MERG_CIR_COLOR = (0, 255, 0)
MERG_CIR_THICKNESS = 3
RECT_POINT = 300
RECT_COLOR = (0, 0, 255)
RECT_THICKNESS = 3
class MpHands:
    import mediapipe as mp

    def __init__(self, max_hands=MAX_HANDS, det_conf=DETECTION_CONF, complexity=MODEL_COMPLEX, track_conf=TRACKING_CONF):
        """
        Inputs:-
            static_image_mode: Mode of input. If set to False, the solution treats the input images as a video stream.
            max_num_hands: Maximum number of hands to detect. Default to 2
            model_complexity: Complexity of the hand landmark model. 0 or 1.
                              Landmark accuracy as well as inference latency generally go up with the model complexity.
                              Default to 1.
            min_detection_confidence: Minimum confidence value ([0.0, 1.0]) from the hand detection model for the
                                      detection to be considered successful. Default to 0.5.
            min_tracking_confidence: Minimum confidence value ([0.0, 1.0]) from the landmark-tracking model for the
                                     hand landmarks to be considered tracked successfully.
                                     Ignored if static_image_mode is True. Default to 0.5.
        Output:-
            multi_hand_landmarks: Collection of detected/tracked hands, where each hand is represented as a
                                  list of 21 hand landmarks and each landmark is composed of x, y and z.
                                  x and y are normalized to [0.0, 1.0] by the image width and height respectively.
        """
        self.hands = self.mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=max_hands,
                                                   model_complexity=complexity, min_detection_confidence=det_conf,
                                                   min_tracking_confidence=track_conf)

    def marks(self, video_frame):
        """
        Aim: To get the X & Y coordinates of all the 21 landmarks of both hands
        :param video_frame: captured frame from the opencv. This is in BGR format.
        :return my_hands: Array of hands with 21 landmarks (X & Y) of each hand
                          [[(h1_x0,h1_y0), (h1_x1,h1_y1), ...(h1_x20,h1_y20)],
                           [(h2_x0,h2_y0), (h2_x1,h2_y1), ...(h2_x20,h2_y20)], ...]
        """
        my_hands = []
        frame_rgb = cv2.cvtColor(video_frame, cv2.COLOR_BGR2RGB)  # opencv works in BGR, while rest of the world in RGB
        multi_hand_landmarks = self.hands.process(frame_rgb).multi_hand_landmarks
        if multi_hand_landmarks:  # Do the following if we have detected/tracked hands
            # multi_hand_landmarks is an array of arrays. Each array contains the 21 landmarks of each hand
            for hand_landmarks in multi_hand_landmarks:  # Stepping through each hand
                my_hand = []
                for land_mark in hand_landmarks.landmark:  # Stepping through the 21 landmarks of each hand
                    # Each landmark has x, y & z coordinates. We are interested in x & y only.
                    # Since x & y are normalized, multiply them with camera width and height to get the actual values.
                    # Finally, convert the coordinates into integers for opencv
                    my_hand.append((int(land_mark.x * CAM_WIDTH), int(land_mark.y * CAM_HEIGHT)))
                my_hands.append(my_hand)
        return my_hands
def calc_euclidean_dist(p1, p2):
    """
    Aim: Get the shortest distance between two points (Euclidean distance).
    :param p1: point 1 with (x1, y1) coordinates
    :param p2: point 2 with (x2, y2) coordinates
    :return euc_dist: Euclidean distance
    """
    (p1_x, p1_y) = p1
    (p2_x, p2_y) = p2
    euc_dist = math.sqrt((p2_x - p1_x) ** 2 + (p2_y - p1_y) ** 2)
    return euc_dist
def select_figure(dist):
    """
    Aim: Select the figure to draw based on the distance
    :param dist: Distance between the index fingertips
    :return fig: selected figure
    """
    fig = FIGURES_LIST[0]
    if dist > CIR_RADIUS * 2:
        fig = FIGURES_LIST[0]
    if CIR_RADIUS * 2 >= dist > RECT_POINT:
        fig = FIGURES_LIST[1]
    if dist <= RECT_POINT:
        fig = FIGURES_LIST[2]
    return fig
# Camera configurations.
# Except 'CAM_SELECTED', the settings below are optional; they mainly help the webcam launch faster on Windows
cam = cv2.VideoCapture(CAM_SELECTED, cv2.CAP_DSHOW)  # CAP_DSHOW selects the DirectShow capture backend on Windows
cam.set(cv2.CAP_PROP_FRAME_WIDTH, CAM_WIDTH) # Set width of the frame
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, CAM_HEIGHT) # Set height of the frame
cam.set(cv2.CAP_PROP_FPS, CAM_FPS) # Set fps of the camera
cam.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*'MJPG')) # Set the codec as 'MJPG'
findHands = MpHands() # Create object
print("Press 'q' to quit")
while True:
    ignore, frame = cam.read()  # Read the frame from camera
    if FLIP_CAMERA_FRAME_HORIZONTALLY:  # Flip for a mirrored (selfie) view so on-screen movement matches your own
        frame = cv2.flip(frame, 1)
    handData = findHands.marks(frame)  # Get the locations of both hands & fingers
    handDataLength = len(handData)  # Get the number of hands in the frame
    if handDataLength == 2:  # We proceed only if there are two hands
        # The handData consists of 21 landmarks of each hand. We are interested in the tip of index fingers only.
        # Calculate the Euclidean distance between the index fingertips.
        indexTipsDist = calc_euclidean_dist(handData[HAND_1][INDEX_FINGER_TIP], handData[HAND_2][INDEX_FINGER_TIP])
        figure = select_figure(indexTipsDist)  # Based on the distance, select the figure to appear on screen
        match figure:
            case "Circles":
                for hand in handData:  # Draw circles with index fingertips as centers
                    circleCenter = hand[INDEX_FINGER_TIP]
                    cv2.circle(frame, circleCenter, CIR_RADIUS, CIR_COLOR, CIR_THICKNESS)
            case "MergedCircle":
                # Draw a circle which encloses our fingertips at min level,
                # so that our fingertips will be on the edge of the circle
                point1 = handData[HAND_1][INDEX_FINGER_TIP]
                point2 = handData[HAND_2][INDEX_FINGER_TIP]
                (x, y), radius = cv2.minEnclosingCircle(np.array([point1, point2]))  # points should be passed as a single numpy array
                mergedCircleCenter = (int(x), int(y))  # opencv wants integer values
                mergedCircleRadius = int(radius)
                cv2.circle(frame, mergedCircleCenter, mergedCircleRadius, MERG_CIR_COLOR, MERG_CIR_THICKNESS)
            case "Rectangle":  # Draw a rectangle with our index fingertips as diagonally opposite corners.
                point1 = (handData[HAND_1][INDEX_FINGER_TIP][X_COORD], handData[HAND_1][INDEX_FINGER_TIP][Y_COORD])
                point2 = (handData[HAND_2][INDEX_FINGER_TIP][X_COORD], handData[HAND_2][INDEX_FINGER_TIP][Y_COORD])
                cv2.rectangle(frame, point1, point2, RECT_COLOR, RECT_THICKNESS)
    else:
        print("Show both hands for the magic to happen or Press 'q' to quit")
    cv2.imshow('Magic Frame', frame)  # Display the frame
    cv2.moveWindow('Magic Frame', 0, 0)  # Move the frame to the top-left corner of the monitor
    if cv2.waitKey(1) & 0xff == ord('q'):  # wait for letter 'q' to exit.
        print("Quitting program")
        break
cam.release()  # release the camera
cv2.destroyAllWindows()  # close all frame windows
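A side note on compatibility: the match/case block requires Python 3.10 or newer. On older interpreters it can presumably be replaced by a plain if/elif chain along these lines (a drop-in sketch that reuses the names defined in the program above):

if figure == "Circles":
    for hand in handData:  # a circle centred on each index fingertip
        cv2.circle(frame, hand[INDEX_FINGER_TIP], CIR_RADIUS, CIR_COLOR, CIR_THICKNESS)
elif figure == "MergedCircle":
    point1 = handData[HAND_1][INDEX_FINGER_TIP]
    point2 = handData[HAND_2][INDEX_FINGER_TIP]
    (x, y), radius = cv2.minEnclosingCircle(np.array([point1, point2]))
    cv2.circle(frame, (int(x), int(y)), int(radius), MERG_CIR_COLOR, MERG_CIR_THICKNESS)
elif figure == "Rectangle":
    point1 = handData[HAND_1][INDEX_FINGER_TIP]
    point2 = handData[HAND_2][INDEX_FINGER_TIP]
    cv2.rectangle(frame, point1, point2, RECT_COLOR, RECT_THICKNESS)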
Comments are used throughout the code to express the intent, and it should be largely self-explanatory, so the code is not walked through line by line here, to keep this from becoming a lengthy post. PS: code compatible with Python 3.9 or lower has also been uploaded. Here is a demo of it working.