
Multiple ROI boxes or non-square shape? #91

Open
geftactics opened this issue Feb 4, 2021 · 21 comments

Comments

@geftactics commented Feb 4, 2021

Love the integration! Thanks for your work!

When my cameras detect motion, I run a snapshot image through this and generate a notification if we see a person.

Is there a way to apply multiple ROI boxes in order to build an area, or maybe an exclusion box? My path area is an odd shape, and I want to exclude the public path next to my house, as it causes people alerts that I don't want to worry about!

I'm guessing that you probably just pass these on to the AWS API? :/

[attached image: camera snapshot]

geftactics changed the title from "Multiple bounding boxes or non-square shape?" to "Multiple ROI boxes or non-square shape?" on Feb 4, 2021
@robmarkcole (Owner)

I have a couple of related issues on deepstack.

The challenge is to make it not too complicated to configure. If you can make any suggestions about what you want and how it could be configured, that would be very helpful.

@geftactics (Author)

I did think about polygons etc. to build a more complex zone, but as you suggest that would make configuration complex and hard to set up/test/debug. If we only allowed multiple ROI boxes, it would still be a bit of a pain to define complex/non-rectangular zones.

If the focus is on simplicity, how about having the option to supply an exclusion mask? This would be a transparent PNG with the same dimensions as the source image. In this file we black out the areas to exclude, then overlay it on top of the source image before passing it on to Rekognition. For the image your addon saves out with the boxes etc. drawn on, it could still use the original source image rather than the one actually sent to Rekognition.

This method would allow very complex zones to be applied easily.

@robmarkcole (Owner)

Can you provide a working example, either a PR or just some working Python code?

Your suggestion would provide simple config but then require users to use an extra tool to generate the mask. I am open to the idea of a Hassio addon which could be used for that. But then again, an addon could also be used to generate the config even if it was just some complicated text.

@geftactics (Author) commented Feb 7, 2021

I did try to do it against the actual codebase, but couldn't quite fathom it... Here's a basic example in isolation to show the concept working though:

from PIL import Image, UnidentifiedImageError

config_maskfile = 'mask.png'

camera = Image.open('camera.jpg')
image_for_rekognition = camera.copy()

if config_maskfile:
    try:
        # Overlay the transparent mask so the blacked-out areas are hidden
        mask = Image.open(config_maskfile)
        image_for_rekognition.paste(mask, (0, 0), mask)
    except (FileNotFoundError, UnidentifiedImageError):
        print('ERROR: Could not open a valid mask file')

# Use this for saving locally and adding boxes
camera.show()

# Send this for processing by Rekognition
image_for_rekognition.show()

[attached images: camera.jpg and mask.png]

@robmarkcole (Owner)

How is the mask created?

@geftactics (Author)

A transparent PNG file - you can use any image editor: GIMP, Photoshop, etc. I'm sure most people with the level of technical ability to be running Home Assistant with custom addons, AWS keys etc. will have no problem doing this?

@robmarkcole (Owner)

Probably true, but I want to roll this back into HA at some point and simplify as much as possible. One issue with applying a mask before processing is that any object on the border risks not being detected, since it will be cropped. Any comments on this? My current approach runs detection on the full frame, then uses the center of the object to decide whether it's inside the ROI or not.
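For illustration, a minimal sketch of that center-point check, assuming a rectangular ROI and detection boxes both given as (x_min, y_min, x_max, y_max) in pixel coordinates (names here are hypothetical, not the integration's actual code):

# Hypothetical helper: detection runs on the full frame first, then only the
# object's centre point is tested against the ROI, so objects straddling the
# ROI edge are still detected.
def center_in_roi(box, roi):
    """box and roi are (x_min, y_min, x_max, y_max) in pixel coordinates."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return roi[0] <= cx <= roi[2] and roi[1] <= cy <= roi[3]

roi = (300, 0, 1250, 1070)
person_box = (250, 400, 420, 900)      # centre (335, 650) lies inside the ROI
print(center_in_roi(person_box, roi))  # True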

@geftactics (Author)

Ah, that is a good point - I might have a look at how we can specify polygon coordinates then

@geftactics (Author) commented Feb 7, 2021

Building a polygon and then using Shapely to test whether a point lies within the polygon is fairly effortless. Here I build a simple 4-point polygon for my ROI (try using my camera image above as camera.jpg), and I've added two test points, one inside and one outside. We're loading a list of coordinates from YAML, then converting them to a list of tuples:

import yaml
from PIL import Image, ImageDraw
from shapely.geometry import Point, Polygon

with open('test.yaml') as file:
    config = yaml.safe_load(file)
    roi_points = [tuple(p) for p in config['roi_points']]

# Draw the ROI outline on the camera image for a visual check
camera = Image.open('camera.jpg')
draw = ImageDraw.Draw(camera)
draw.polygon(roi_points, outline='LightGreen')
camera.show()

poly = Polygon(roi_points)

test_point_1 = Point(100, 100)  # Outside our polygon
test_point_2 = Point(400, 400)  # Inside our polygon

print(test_point_1.within(poly))  # False
print(test_point_2.within(poly))  # True

This loads the coordinates from a YAML file like so:

roi_points:
  - [300, 0]
  - [440, 0]
  - [1250, 1070]
  - [400, 1070]

@robmarkcole (Owner)

@geftactics I like this suggestion, will do.
To assist with config, my plan is to create a Streamlit app that will allow the user to draw all the required ROIs (polygons will require this feature), then generate the required YAML which can be pasted into their config. This app can be hosted for free as well.

@robmarkcole (Owner)

@geftactics do you know of any other integrations (preferably official) that use config involving an array?

@geftactics (Author) commented Feb 9, 2021

I think many do, just not with YAML formatted this way. Not sure if it helps, but my code above will also work with YAML of the following format:

roi_points:
  - 
    - 300
    - 0
  -
    - 440
    - 0
  -
    - 1250
    - 1070
  -
    - 400
    - 1070

@robmarkcole (Owner)

Based on #92 I have chosen not to use Shapely. Also, given the convoluted config that appears to be necessary, and the fact that this is incompatible with config flow, this whole approach of using ROIs needs to be considered carefully. I am beginning to think that what is required is a separate, dedicated integration/tool for monitoring the object_detected events.
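For reference, the point-in-polygon test can also be done without Shapely. A minimal ray-casting sketch (illustrative only, not necessarily the approach taken after #92):

def point_in_polygon(x, y, polygon):
    """Ray-casting test: polygon is a list of (x, y) vertices; returns True if (x, y) is inside."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Toggle on every polygon edge crossed by a horizontal ray going right from the point
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

roi_points = [(300, 0), (440, 0), (1250, 1070), (400, 1070)]
print(point_in_polygon(100, 100, roi_points))  # False - outside the ROI
print(point_in_polygon(400, 400, roi_points))  # True - inside the ROI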

@geftactics (Author) commented Feb 12, 2021

I thought about the options above a bit more, and have one more idea to throw into the mix...

  • We use a mask image as per the example above, but do not crop/mask the original image before sending it to AWS Rekognition.
  • The image is sent for detection as it is now; for each object we look at the center point.
  • The equivalent center point on the mask image is then read using PIL's Image.getpixel().
  • If that pixel is of a specific (configurable?) colour, we consider the object within our ROI (so white, in my example).

Essentially the user supplies a mask image with a coloured zone as the ROI; if the center point falls on a pixel of that colour, it's valid.

Simple config, very flexible/multiple ROI zones - it can be implemented with PIL easily (a sketch follows below).
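A rough sketch of that pixel-lookup idea with PIL; the file name and ROI colour below are just examples:

from PIL import Image

ROI_COLOUR = (255, 255, 255)  # e.g. the zone painted white in the mask
mask = Image.open('mask.png').convert('RGB')

def center_in_roi(mask, center, roi_colour=ROI_COLOUR):
    """Return True if the object's centre point lands on a pixel of the ROI colour."""
    return mask.getpixel(center) == roi_colour

# Centre points in pixel coordinates, e.g. taken from the detection bounding boxes
print(center_in_roi(mask, (400, 400)))  # True if (400, 400) sits inside the painted zone
print(center_in_roi(mask, (100, 100)))  # False if (100, 100) falls outside it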

@robmarkcole (Owner)

That is an interesting suggestion. A person could literally use a Paint application to draw whatever regions they want, and we would not need to do any kind of complex calculation to determine whether an object is inside the region - just look up a pixel value :-) As mentioned in the above comment, I think this justifies a standalone integration, which could then be used by any image processing integration that outputs object locations.

@geftactics (Author)

We do need to allow the user to reference the colour they have used in the YAML. For example, we can't always rely on black alone, as some users may naturally have black areas in their camera image. Ideally they would pick a contrasting colour that is not likely to appear in the frame.

Hopefully having it as a separate integration would still allow me to use it in my automations!

I do: motion alert -> send to Rekognition -> if person count > 1, send alert.

@robmarkcole (Owner)

I think for the time being I will do:

  • allow multiple ROIs, named
  • add 4-point polygons

Combining these two, a person can create a wide range of ROIs.
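A hypothetical YAML sketch of what that combination could look like (key names are illustrative only, not a final schema):

roi:                 # illustrative key name only
  driveway:          # a named ROI defined by a 4-point polygon
    - [300, 0]
    - [440, 0]
    - [1250, 1070]
    - [400, 1070]
  front_door:        # a second named ROI, here a simple rectangle
    - [0, 0]
    - [200, 0]
    - [200, 300]
    - [0, 300]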

@geftactics (Author)

Works for me!! :)

@robmarkcole (Owner)

After more thought, I think the best approach is to apply the mask to the image before processing it.

@geftactics (Author)

If you are open to the masking idea, maybe the method posted above on 12th Feb would yield the most accurate results, since objects cropped by the mask are otherwise at risk of being missed.

@robmarkcole (Owner)

The binary mask should contain 0/1 pixel values only; this is then applied to the image before processing. The config will then simply be the path to the mask file.
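A minimal sketch of applying such a mask with PIL, assuming mask.png is a single-channel image of the same size as the camera frame, with 0 marking excluded pixels (file names are illustrative):

from PIL import Image

camera = Image.open('camera.jpg')
mask = Image.open('mask.png').convert('L')   # 0 = excluded, non-zero = keep

# Black out the excluded pixels before the frame is sent for processing
black = Image.new('RGB', camera.size)
masked = Image.composite(camera, black, mask.point(lambda v: 255 if v else 0))
masked.save('camera_masked.jpg')             # this is what would be sent to Rekognition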
