To detect an object, I compute the normalized correlation coefficient (NCC) between a template and each candidate window, and accept the window with the largest coefficient as a true detection if it exceeds a threshold. To estimate motion, I take the difference between the coordinates of the previous true detection and the current one, and apply a threshold to decide whether the motion is horizontal or vertical.
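As a rough illustration of the detection and motion steps described above, here is a minimal NumPy sketch. It is not my actual code: the function names (`ncc`, `detect`, `classify_motion`), the brute-force window search, the default threshold value, and the `min_shift` parameter are all assumptions made for the example, and the horizontal/vertical rule (compare the absolute x and y shifts) is just one plausible way to apply a threshold.

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized correlation coefficient of two same-size arrays."""
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    # A flat patch or template has no variance; treat it as "no match".
    return 0.0 if denom == 0 else float((p * t).sum() / denom)

def detect(image, template, threshold=0.9):
    """Return (row, col, score) of the best window, or None if below threshold."""
    th, tw = template.shape
    best = (-1.0, 0, 0)
    # Exhaustive search over every window position (slow, but simple).
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best[0]:
                best = (score, r, c)
    score, r, c = best
    return (r, c, score) if score >= threshold else None

def classify_motion(prev, curr, min_shift=2):
    """Label the shift between two (row, col) detections.

    min_shift is a hypothetical threshold in pixels: smaller shifts
    are reported as "none".
    """
    dy = curr[0] - prev[0]
    dx = curr[1] - prev[1]
    if max(abs(dx), abs(dy)) < min_shift:
        return "none"
    return "horizontal" if abs(dx) >= abs(dy) else "vertical"
```

In use, `detect` would be called once per frame on a grayscale image, and `classify_motion` on the coordinates of two consecutive successful detections.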
- Using a static template with no pyramid detects my feature quite reliably. I set the threshold to 0.9 so that only strong matches count as true detections, because I would rather accept a lower true positive rate than a high false positive rate.
- Next I tried a dynamic template: whenever I get a true detection, I update the template with it. The advantage is a higher true positive rate. The main weakness is that the template drifts slowly.
- Then I used pyramids to improve the results. The pyramid levels are downsampled versions of the original template. The main advantage is that I can detect features that appear smaller than the original template, so I can still detect the feature when I move backward (see results below).
- Another weakness is that I compute the NCC on grayscale images, so regions of a different color but similar grayscale values produce false positives. To address this, I am trying to compute the NCC on the Cb and Cr channels separately and claim a true detection only if both scores are above their thresholds. I have not completely finished this part.
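
The template pyramid and the CbCr check from the list above could be sketched roughly as follows. This is only an illustration under assumptions: the pyramid is built by 2x2 block averaging (other downsampling filters would also work), the chroma conversion uses the full-range BT.601 coefficients as in JPEG, and all function names and the `levels` and `threshold` parameters are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(float)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def template_pyramid(template, levels=3):
    """Return [full size, half size, quarter size, ...] templates."""
    pyramid = [template.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def rgb_to_cbcr(rgb):
    """Cb and Cr planes from an (H, W, 3) RGB array with values in 0..255.

    Assumes the full-range BT.601 conversion (the one used by JPEG).
    """
    r, g, b = (rgb[..., i].astype(float) for i in range(3))
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def ncc(a, b):
    """Zero-mean normalized correlation coefficient of two same-size arrays."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def chroma_match(patch_rgb, template_rgb, threshold=0.9):
    """True only if both the Cb score and the Cr score reach the threshold."""
    p_cb, p_cr = rgb_to_cbcr(patch_rgb)
    t_cb, t_cr = rgb_to_cbcr(template_rgb)
    return ncc(p_cb, t_cb) >= threshold and ncc(p_cr, t_cr) >= threshold
```

Each pyramid level would be matched against the frame in the same way as the original template, and `chroma_match` would be applied to a candidate window as an extra check after the grayscale NCC passes.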