A set of computer vision and artificial intelligence algorithms for robotics and self-driving cars.
This project supports the Race.OSSDC.org WebRTC-based platform, which allows extensive and quick testing of computer vision and neural network algorithms against live (real-life or simulated) or streamed (not live) videos (from YouTube or other datasets).
To contribute, follow the approach in the video_processing files to add your own algorithm and create a PR to integrate it into this project; a minimal module skeleton is sketched below.
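For orientation, here is a minimal, hypothetical skeleton of such a module. The hook names (init_model, process_frame) are illustrative assumptions, not the project's actual interface; check an existing video_processing_*.py for the exact signatures the dispatcher expects.

```python
# Hypothetical video_processing module skeleton; the actual hook names and
# signatures may differ - see an existing video_processing_*.py for the real ones.
import cv2

def init_model():
    # Load weights / allocate resources once, before streaming starts.
    return None

def process_frame(model, frame):
    # frame: BGR image (numpy ndarray) received from the WebRTC stream.
    # Return the processed frame to be streamed back to the conference.
    cv2.putText(frame, "my-algorithm", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    return frame
```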
OSSDC VisionAI Demo Reel - run the algorithms in Google Colab
(Gaze estimation video can be found here)
(Pedestrian re-identification video can be found here)
(SSD object detection video can be found here)
(MiDaS mono-depth person demo video can be found here)
(MiDaS mono-depth night walk demo video can be found here)
(MiDaS mono-depth objects demo video can be found here)
Datasets and pretrained models are available in the https://github.com/OSSDC/OSSDC-VisionAI-Datasets project.
- pip install opencv-python # required for all video processors
- pip install opencv-contrib-python # required for video_processing_opencv
- pip install aiortc aiohttp websockets python-engineio==3.14.2 python-socketio[client]==4.6.0 # required for WebRTC
- pip install dlib # required for face_landmarks
- pip install torch torchvision # required for PyTorch-based video processors
- pip install tensorflow-gpu # required for TensorFlow-based video processors
- pip install youtube-dl # required for YouTube streaming sources
- Download and install the alpha version of the VisionAI Android app from here:
Prerequisite steps, every time before running the Python video processing scripts
- Run the VisionAI Android app, set up the room name and password, and start the WebRTC conference
- Update the room info in signaling_race.py (every time the room name or password is modified in the VisionAI Android app); a hypothetical example follows below
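A hypothetical example of what those room settings might look like; the variable names here are illustrative assumptions, not necessarily the identifiers actually used in signaling_race.py.

```python
# Hypothetical room settings in signaling_race.py; the actual variable
# names in the file may differ.
ROOM_NAME = "my-test-room"     # must match the room name set in the VisionAI Android app
ROOM_PASSWORD = "my-password"  # must match the room password set in the app
```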
SegFormer semantic segmentation with transformers demo
- Install SegFormer https://github.com/NVlabs/SegFormer - see install steps in video_processing_SegFormer.py or OSSDC_VisionAI_demo_reel.ipynb notebook
- Run the SegFormer video processor on the video stream from the VisionAI Android app (a rough per-frame inference sketch follows this section)
- python race-ossdc-org_webrtc_processing.py -t SegFormer.b3-512-ade --room {your_room_name}
- demo-reel.sh {your_room_name} (enable SegFormer line)
- Demo video: SegFormer - semantic segmentation with transformers using the OSSDC VisionAI platform https://www.youtube.com/watch?v=3ws-irF4dEQ
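Per-frame SegFormer inference might look roughly like this, assuming the mmsegmentation-style API that the SegFormer repo is built on; the config and checkpoint paths are placeholders for the files you download.

```python
# Rough sketch of per-frame SegFormer inference, assuming the
# mmsegmentation-style API the SegFormer repo builds on.
# Config and checkpoint paths below are placeholders.
from mmseg.apis import init_segmentor, inference_segmentor

model = init_segmentor('local_configs/segformer/B3/segformer.b3.512x512.ade.160k.py',
                       'segformer.b3.512x512.ade.160k.pth', device='cuda:0')

def process_frame(frame):  # frame: BGR numpy array
    result = inference_segmentor(model, frame)             # per-pixel class ids
    return model.show_result(frame, result, opacity=0.5)   # blended overlay
```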
GANsNRoses demo
- Install GANsNRoses https://github.com/mchong6/GANsNRoses - see install steps in video_processing_GANsNRoses.py or OSSDC_VisionAI_demo_reel.ipynb notebook
- Run the GANsNRoses video processor on the video stream from the VisionAI Android app
- python race-ossdc-org_webrtc_processing.py -t GANsNRoses --room {your_room_name}
- demo-reel.sh {your_room_name} (enable GANsNRoses line)
- Demo video: Have fun with GANsNRoses - using the OSSDC VisionAI real-time video processing platform https://www.youtube.com/watch?v=YZTzjk_qh4w
DepthAI (OAK-D) stereo smart camera Side-By-Side 3D streaming demo
- Install the latest DepthAI API from https://github.com/luxonis/depthai-python
- Run the DepthAI video processor on the stereo or RGB video stream from the OAK-D camera and stream it to the VisionAI Android app (a rough capture sketch follows this section)
- python race-ossdc-org_webrtc_processing.py -t depthai.sbs --room {your_room_name}
- demo-reel.sh {your_room_name} (enable depthai.sbs line)
- python race-ossdc-org_webrtc_processing.py -t depthai.rgb --room {your_room_name}
- demo-reel.sh {your_room_name} (enable depthai.rgb line)
- Demo video: Live 3D video streamed over the internet from a DepthAI OAK-D with OSSDC VisionAI https://www.youtube.com/watch?v=28awrl5MipQ (use a VR headset to see the 3D depth)
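A minimal sketch of grabbing the two mono streams from an OAK-D with the DepthAI API and composing a side-by-side frame; the stream names are arbitrary, and the actual processor in this repo may build its pipeline differently.

```python
# Grab left/right mono frames from an OAK-D and compose a side-by-side
# 3D frame; stream names are arbitrary choices for this sketch.
import depthai as dai
import numpy as np

pipeline = dai.Pipeline()
for socket, name in [(dai.CameraBoardSocket.LEFT, "left"),
                     (dai.CameraBoardSocket.RIGHT, "right")]:
    cam = pipeline.create(dai.node.MonoCamera)
    cam.setBoardSocket(socket)
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName(name)
    cam.out.link(xout.input)

with dai.Device(pipeline) as device:
    left_q = device.getOutputQueue("left", maxSize=4, blocking=False)
    right_q = device.getOutputQueue("right", maxSize=4, blocking=False)
    while True:
        left = left_q.get().getCvFrame()
        right = right_q.get().getCvFrame()
        sbs = np.hstack((left, right))  # side-by-side frame for 3D viewing
```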
Detectron2 demo
- Install Detectron2 - see install steps in video_processing_detectron2.py or OSSDC_VisionAI_demo_reel.ipynb notebook
- Run the Detectron2 video processor on the video stream from the VisionAI Android app (a rough per-frame inference sketch follows this section)
- python race-ossdc-org_webrtc_processing.py -t detectron2 --room {your_room_name}
- demo-reel.sh {your_room_name} (enable detectron2 line)
- Demo video: TBD
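Per-frame Detectron2 inference might look roughly like this; the Mask R-CNN model-zoo config below is an illustrative choice, not necessarily the one the repo's processor uses.

```python
# Rough sketch of per-frame Detectron2 inference with a model-zoo config;
# the model choice here is illustrative.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
predictor = DefaultPredictor(cfg)

def process_frame(frame):  # frame: BGR numpy array
    outputs = predictor(frame)
    v = Visualizer(frame[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]))
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    return out.get_image()[:, :, ::-1]  # back to BGR for streaming
```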
DeepMind NFNets demo
- Install DeepMind NFNets - see install steps in video_processing_deepmind.py or OSSDC_VisionAI_demo_reel.ipynb notebook
- Run the DeepMind NFNets video processor on the video stream from the VisionAI Android app
- python race-ossdc-org_webrtc_processing.py -t deepmind.nfnets --room {your_room_name}
- demo-reel.sh {your_room_name} (enable deepmind.nfnets line)
- Demo sample images: https://www.linkedin.com/feed/update/urn:li:activity:6766007580679557120?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A6766007580679557120%2C6768387554418016256%29
MediaPipe Holistic demo
- Install MediaPipe - see install steps in video_processing_mediapipe.py or OSSDC_VisionAI_demo_reel.ipynb notebook
- Run the MediaPipe Holistic video processor on the video stream from the VisionAI Android app (a rough per-frame sketch follows this section)
- python race-ossdc-org_webrtc_processing.py -t mediapipe.holistic --room {your_room_name}
- demo-reel.sh {your_room_name} (enable mediapipe.holistic line)
- Demo video: MediaPipe Holistic demo - Isn't this fun?! MediaPipe Holistic neural net model processed in real time on Google Cloud https://www.youtube.com/watch?v=0l9Bb5IC86E
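Per-frame holistic processing might look roughly like this, following MediaPipe's documented Python solution API (the repo's processor may draw more, e.g. the face mesh, which is also available on the results object).

```python
# Rough per-frame MediaPipe Holistic sketch: pose and hand landmarks drawn
# onto the frame; face landmarks are also available on the results object.
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils
holistic = mp_holistic.Holistic(min_detection_confidence=0.5,
                                min_tracking_confidence=0.5)

def process_frame(frame):  # frame: BGR numpy array
    results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    for landmarks, connections in [
            (results.pose_landmarks, mp_holistic.POSE_CONNECTIONS),
            (results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS),
            (results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)]:
        if landmarks:
            mp_drawing.draw_landmarks(frame, landmarks, connections)
    return frame
```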
OAK-D gaze estimation demo, the processing is done on the Luxonis OAK-D camera vision processing unit https://store.opencv.ai/products/oak-d
- Install OAK-D DepthAI - see install steps in video_processing_oakd.py
- Run the OAK-D video processor on the video stream from the VisionAI Android app
- python race-ossdc-org_webrtc_processing.py -t oakd.gaze
- Demo video: Gaze estimation demo with processing done on the Luxonis OAK-D camera processor (processing at 10 FPS on 486 x 1062 video, streamed at 30 FPS)
OAK-D people re-identification demo, the processing is done on the Luxonis OAK-D camera vision processing unit https://store.opencv.ai/products/oak-d
- Run the VisionAI Android app, set up the room, and start the WebRTC conference
- Install OAK-D DepthAI - see install steps in video_processing_oakd.py
- Run the OAK-D video processor on the video stream from the VisionAI Android app
- python race-ossdc-org_webrtc_processing.py -t oakd.pre
- Demo video: People re-identification demo with processing done on the Luxonis OAK-D camera processor (processing at 9 FPS on 486 x 1062 video, streamed at 30 FPS)
OAK-D age and gender recognition demo, the processing is done on the Luxonis OAK-D camera vision processing unit https://store.opencv.ai/products/oak-d
- Install OAK-D DepthAI - see install steps in video_processing_oakd.py
- Run the OAK-D video processor on the video stream from the VisionAI Android app
- python race-ossdc-org_webrtc_processing.py -t oakd.age-gen
- Demo video: Upcoming
MiDaS mono depth, processing is done on an Nvidia GPU
- Run the VisionAI Android app, set up the room, and start the WebRTC conference
- Install MiDaS - see install steps in video_processing_midas.py
- Run the MiDaS video processor on the video stream from the VisionAI Android app (a rough per-frame sketch follows this section)
- python race-ossdc-org_webrtc_processing.py -t midas
- Demo videos: Mono depth over WebRTC using the Race.OSSDC.org platform https://www.youtube.com/watch?v=6a6bqJiZuaM
- OSSDC VisionAI MiDaS Mono Depth - night demo
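Per-frame MiDaS inference might look roughly like this, following the torch.hub usage documented in the intel-isl/MiDaS README; the colormap choice is an illustrative assumption.

```python
# Rough per-frame MiDaS sketch via torch.hub, following the intel-isl/MiDaS
# README; the entry points "MiDaS" and "transforms" are that repo's hub API.
import cv2
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS").to(device).eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").default_transform

def process_frame(frame):  # frame: BGR numpy array
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        pred = midas(transform(rgb).to(device))
        depth = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=rgb.shape[:2],
            mode="bicubic", align_corners=False).squeeze()
    d = depth.cpu().numpy()
    d = cv2.normalize(d, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
    return cv2.applyColorMap(d, cv2.COLORMAP_INFERNO)  # inverse-depth heat map
```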
DLIB face landmarks, processing is done on CPU
- Install DLIB and the face landmarks pretrained model - see install steps in video_processing_face_landmarks.py
- Run the DLIB face landmarks video processor on the video stream from the VisionAI Android app (a rough per-frame sketch follows this section)
- python race-ossdc-org_webrtc_processing.py -t face_landmarks
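Per-frame landmark detection might look roughly like this, using dlib's standard 68-point predictor; the .dat path is a placeholder for the downloaded model file.

```python
# Rough per-frame dlib face-landmark sketch with the standard 68-point model;
# the .dat path below is a placeholder.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def process_frame(frame):  # frame: BGR numpy array
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        for i in range(68):
            p = shape.part(i)
            cv2.circle(frame, (p.x, p.y), 2, (0, 255, 0), -1)
    return frame
```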
OpenCV edges detection, processing is done on CPU
- Run the OpenCV edges video processor on the video stream from the VisionAI Android app (a rough per-frame sketch follows)
- python race-ossdc-org_webrtc_processing.py -t opencv.edges
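A minimal sketch of the kind of Canny edge detection this processor performs; the thresholds are illustrative.

```python
# Minimal Canny edge-detection sketch; thresholds are illustrative choices.
import cv2

def process_frame(frame):  # frame: BGR numpy array
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)  # low/high hysteresis thresholds
    return cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)  # back to 3 channels for streaming
```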