Tracking using Lucas Kanade Optical Flow, shows weird behavior, points are jumping - python-3.x

My goal is to implement a method, that tracks persons in a single camera. For that, I'm using Scaled Yolov4 to detect persons in the scene, then I generate points inside of their bounding boxes using cv2.goodFeaturesToTrack, and track them using Lucas-Kanade Optical Flow cv2.calcOpticalFlowPyrLK.
the problem is, sometimes the points make huge jumps, and I can't tell why. The following video shows the problem I'm facing, specifically, on second 0:02, the green dots jumps in a weird manner which makes my method detects that person as a new person.
https://www.veed.io/view/37f98715-40c5-4c07-aa97-8c2242d7806c?sharingWidget=true
my question is, is it a limitation on LK optical flow, or I'm doing something wrong? And is there a recommended Optical Flow method for tracking, or an example implementation for Single Camera Multi Person Tracking using Optical Flow? because I couldn't find much literature or codes about it.

Related

How do I select walls along a particular axis, among all the other walls, in Revit using the Revit API?

I want to change the height of all walls but the length of walls only in a particular axis, for instance, along the x-axis.
Consecutively, could you also tell how I could alter the similar dimensions for a house? Where there are connected walls?
I see nothing in this code that means it does not work.
However, it seems to me that it does not make much sense.
One would seldom constrain all wall heights to be user defined to a certain value; instead, in most Revit models, walls are constrained to reach from a bottom level to a top level. Then, if the height of all walls needs to be modified, you would modify the elevation of the top level only.
The logic of the code guarantees that the wall location line will only be modified if the newWallLine equals XYZ.BasisX. This may never be the case, since the line is a Line object and the vector an XYZ.
I would recommend researching exactly what you wish you achieve and how to do so manually in the end user interface before addressing the task programmatically.
In general, if a feature is not available in the Revit product manually through the user interface, then the Revit API will not provide it either.
You should therefore research the optimal workflow and best practices to address your task at hand manually through the user interface first.
To do so, please discuss and analyse it with an experienced application engineer, product usage expert, or product support.
Once you have got that part sorted out, it is time to step up into the programming environment.
I hope this clarifies.

Consistency for MRTK2's IMixedRealityPointerHandler

The raw position and rotation data of the IMixedRealityPointerHandler output isn't consistent across all users with the same hand gesture-- what strategies can we use to make this more consistent over various users? Does the device train to a specific eye calibration more accurately over time i.e. does more time in the device = better eye/hand tracking for that user?
The raw position and rotation data of the IMixedRealityPointerHandler output isn't consistent across all users with the same hand gesture
I’m not sure my understanding is correct, it looks like what you are talking about is, HoloLens lacks data consistency in the recognition of the same gesture by different users.
Actually, when users making any gestures on HoloLens, they need to keep his hands within the "Gesture Frame", in a range that the gesture-sensing cameras can see appropriately.
For this reason, the user needs to be trained in this area of recognition both for the success of the action and for their comfort.
For best results, three things mentioned in this document are essential to improve the consistency of gesture recognition:
Users need to be aware of Gesture Frame's existence.
Notifying users when their gestures are nearing/breaking the gesture frame boundaries within an application.
Consequences of breaking the gesture frame boundaries should be minimized.
Besides, we highly recommend that running Calibration each time a different person uses the device(please navigate to Setting->Utilities->Calibration).

Detecting damaged car parts

I am trying to build a system that on providing an image of a car can assess the damage percentage of it and also find out which parts are damaged in the car.
Is there any possible way to do this using Python and open-cv or tensorflow ?
The GitHub repositories I found that were relevant to my work are these
https://github.com/VakhoQ/damage-car-detector/tree/master/DamageCarDetector
https://github.com/neokt/car-damage-detective
But what they provide is a qualitative output( like they say the car damage is high or low), I wanted to print out a quantitative output( percentage of damage ) along with the individual part names which are damaged
Is this possible ?
If so please help me out.
Thank you.
To extend the good answers given by #yves-daoust: It is not a trivial task and you should not try to do it at once with one single approach.
You should question yourself how a human with a comparable task, i.e. say an expert who reviews these cars after a leasing contract, proceeds with this. Then you have to formulate requirements and also restrictions for your system.
For instance, an expert first checks for any visual occurences and rates these, then they may check technical issues which may well be hidden from optical sensors (i.e. if the car is drivable, driving a round and estimate if the engine is running smoothly, the steering geometry is aligned (i.e. if the car manages to stay in line), if there are any minor vibrations which should not be there and so on) and they may also apply force (trying to manually shake the wheels to check if the bearings are ok).
If you define your measurement system as restricted to just a normal camera sensor, you are somewhat limited within to what extend your system is able to deliver.
If you just want to spot cosmetic damages, i.e. classification of scratches in paint and rims, I'd say a state of the art machine vision application should be able to help you to some extent:
First you'd need to detect the scratches. Bear in mind that visibility of scratches, especially in the field with changing conditions (sunlight) may be a very hard to impossible task for a cheap sensor. I.e. to cope with reflections a system might need to make use of polarizing filters, special effect paints may interfere with your optical system in a way you are not able to spot anything.
Secondly, after you detect the position and dimension of these scratches in the camera coordinates, you need to transform them into real world coordinates for getting to know the real dimensions of these scratches. It would also be of great use to know the exact location of the scratch on the car (which would require a digital twin of the car - which is not to be trivially done anymore).
After determining the extent of the scratch and its position on the car, you need to apply a cost model. Because some car parts are easily fixable, say a scratch in the bumper, just respray the bumper, but scratch in the C-Pillar easily is a repaint for the whole back quarter if it should not be noticeable anymore.
Same goes with bigger scratches / cracks: The optical detection model needs to be able to distinguish between scratches and cracks (which is very hard to do, just by looking at it) and then the cost model can infer the cost i.e. if a bumper needs just respray or needs complete replacement (because it is cracked and not just scratched). This cost model may seem to be easy but bear in mind this needs to be adopted to every car you "scan". Because one cheap damage for the one car body might be a very hard to fix damage for a different car body. I'd say this might even be harder than to spot the inital scratches because you'd need to obtain the construction plans/repair part lists (the repair handbooks / repair part lists are mostly accessible if you are a registered mechanic but they might cost licensing fees) of any vehicle you want to quote.
You see, this is a very complex problem which is composed of multiple hard sub-problems. The easiest or probably the best way to do this would be to do a bottom up approach, i.e. starting with a simple "scratch detector" which just spots scratches in paint. Then go from there and you easily see what is possible and what is not

Detecting Handedness from Device Use

Is there any body of evidence that we could reference to help determine whether a person is using a device (smartphone/tablet) with their left hand or right hand?
My hunch is that you may be able to use accelerometer data to detect a slight tilt, perhaps only while the user is manipulating some sort of on screen input.
The answer I'm looking for would state something like, "research shows that 90% of right handed users that utilize an input mechanism tilt their phone an average of 5° while inputting data, while 90% of left handed users utilizing an input mechanism have their phone tilted an average of -5°".
Having this data, one would be able to read accelerometer data and be able to make informed decisions regarding placement of on screen items that might otherwise be in the way for left handed users or right handed users.
You can definitely do this but if it were me, I'd try a less complicated approach. First you need to recognize that not any specific approach will yield 100% accurate results - they will be guesses but hopefully highly probable ones. With that said, I'd explore the simple-to-capture data points of basic touch events. You can leverage these data points and pull x/y axis on start/end touch:
touchStart: Triggers when the user makes contact with the touch
surface and creates a touch point inside the element the event is
bound to.
touchEnd: Triggers when the user removes a touch point from the
surface.
Here's one way to do it - it could be reasoned that if a user is left handed, they will use their left thumb to scroll up/down on the page. Now, based on the way the thumb rotates, swiping up will naturally cause the arch of the swipe to move outwards. In the case of touch events, if the touchStart X is greater than touchEnd X, you could deduce they are left handed. The opposite could be true with a right handed person - for a swipe up, if the touchStart X is less than touchEnd X, you could deduce they are right handed. See here:
Here's one reference on getting started with touch events. Good luck!
http://www.javascriptkit.com/javatutors/touchevents.shtml
There are multiple approaches and papers discussing this topic. However, most of them are written between 2012-2016. After doing some research myself I came across a fairly new article that makes use of deep learning.
What sparked my interest is the fact that they do not rely on a swipe direction, speed or position but rather on the capacitive image each finger creates during a touch.
Highly recommend reading the full paper: http://huyle.de/wp-content/papercite-data/pdf/le2019investigating.pdf
Whats even better, the data set together with Python 3.6 scripts to preprocess the data as well as train and test the model described in the paper are released under the MIT license. They also provide the trained models and the software to
run the models on Android.
Git repo: https://github.com/interactionlab/CapFingerId

Really Basic Graphics in C# 2.0 Tutorials

I work for a ticketing agency and we print out tickets on our own ticket printer. I have been straight coding the ticket designs and storing the templates in a database. If we need a new field adding to a ticket I manually add it and use the arcane co-ordinate system to estimate where the fields should go and how much the other fields need to move by to accomodate new info.
We always planned to make this system automate with a simple (I stress the word simple) graphical editor. Basically we don't forsee tickets changing radically in shape any time soon, we have one size of ticket and the ticket printer firmware is super simple because it's more of an industrial machine, it has about 10 fonts and some really basic sizing interactions.
I need to make this editor display a rectangle of the dimensions by pixel of the tickets (can even be actual size) and have a resizable grid which can toggle between superimposition and invisibility on top of the ticket rectangle and represented by dots rather than lines.
Then I want to be able to represent fields by drawing rectangles filled with the letter "x" that show the maximum size of the field (to prevent overlaps). These fields should be selectable, draggable and droppable in a snap to grid fashion.
I've worked out the maths of it but I have no idea how to draw rectangles and then draw grids in layers and then put further rectangles full of 'x'es on top of those. I also don't really know much about changing drawn positions in accordance with mouse events. It's simply not something I've ever had to do.
All the tutorials I've seen so far presume that you already know a lot about using the draw objects and are seeking to extend a basic knowledge of these things. I just need pointing in the direction of a good tutorial in manipulating floating objects in a picturebox in the first place.
Any ideas?
For those of you in need of a guide to this unusual (at least those of us with a BIS background) field I would heartily endorse:
https://web.archive.org/web/20141230145656/http://bobpowell.net/faqmain.aspx
I am now happily drawing graphical interfaces and getting them to respond to control inputs with not too much hassle.

Resources