How to extract the outline of an object using TensorFlow?

The object detection example application on TensorFlow is great - https://www.tensorflow.org/lite/models/object_detection/overview
However, I don't need to identify an object; I just need to know that there are a few objects in the scene and their exact outlines. I want the ability to kind of smart-select an object in an image.
The object detection API provides output like rectangular bounding boxes around each detected object.
What I really want is to know that there are two objects in the scene (it doesn't matter what they are) and the pixel positions of the outlines of those two objects, outlines that shrink-wrap them instead of a rectangular bound.

Related

Background images in one class object detection

When training a single-class object detector in TensorFlow, I am trying to pass in images where no signal object exists, so that the model doesn't learn that every image contains at least one instance of that class. E.g. if my signal were cats, I'd want to pass pictures of other animals/landscapes as background; this could also reduce false positives.
I can see that a class id (0) is reserved in the object detection API for background, but I am unsure how to code this into the TFRecords for my background images - the class could be 0, but what would the bounding box coords be? Or do I need a simpler classifier on top of this model to detect whether there is a signal in the image at all, prior to detecting its position?
The latter approach of a simple classifier makes sense; I don't think there is a way to do the first part. You can also use a check on the confidence score, in addition to checking whether the object is present.
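As a minimal sketch of that confidence check in C#, assuming the detector's output has already been copied into a simple result type (the Detection class and the 0.5 threshold here are only illustrative, not part of any TensorFlow API):
using System.Collections.Generic;
using System.Linq;

// Illustrative container for one detection result.
public class Detection
{
    public float Score;     // confidence in [0, 1]
    public float[] Box;     // [yMin, xMin, yMax, xMax], normalized coordinates
}

// Keep only detections whose confidence is above the chosen threshold;
// an empty result means "treat this image as background / no signal".
public static List<Detection> FilterByConfidence(IEnumerable<Detection> detections, float threshold = 0.5f)
{
    return detections.Where(d => d.Score >= threshold).ToList();
}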

Best approach to read, optimize (polygon crunch) and display 3D model in C#/WPF

I'm creating a tool that converts hi-poly meshes to low-poly meshes and I have some best practice questions on how I want to approach some of the problems.
I have some experience with C++ and DirectX but I prefer to use C#/WPF to create this tool, I'm also hoping that C# has some rich libraries for opening, displaying and saving 3d models. This brings me to my first question:
Best approach for reading, viewing and saving 3d models
To display 3D models in my WPF application, I'm thinking about using the Helix 3D toolkit.
To read vertex data from my 3D models I'm going to write my own .OBJ reader, because I'll have to optimize the vertices and write everything back out.
Best approach for optimizing the 3d model
For optimization things will get tricky, especially when dealing with tons of vertices and tons of changes. I guess I'll keep it simple at the start and try to detect whether an edge lies on the same slope as its adjacent edges; if it does, I'll remove that redundant edge and retriangulate.
In later stages I also want to create LODs to simplify the model by doing the opposite of what a turbosmooth modifier does in Max (inverse interpolation). I have no real clue how to start on this right now but I'll look around online and experiment a little.
And at last I have to save the model, and make sure everything still works.
For viewing 3D objects you can also consider the Ab3d.PowerToys library - it is not free, but greatly simplifies work with WPF 3D and also comes with many samples.
The OBJ file format is good because it is very commonly used and has a very simple structure that is easy to read and write. But it does not support object hierarchies, transformations, animations, bones, etc. If you need any of those, then you will need to use some other data format.
I do not have any experience with optimizing hi-poly meshes, so I cannot give you any advice there. I can only say that you may also consider combining meshes that use the same material into one mesh - this can reduce the number of draw calls and improve performance.
My main advice is on how to write your code to make it perform better in WPF 3D. Because you will need to check and compare many vertices, you need to avoid getting data from the MeshGeometry3D.Positions and MeshGeometry3D.TriangleIndices collections - accessing a single value from those collections is very slow (you may check the .Net source and see how many lines of code are behind each get).
Therefore I would recommend that you use your own mesh structure with presized Lists (List<Point3D> for Positions and List<int> for TriangleIndices). In my observations, Lists of structs are faster than simple arrays of structs, but the lists must be presized - their size needs to be set in the constructor. This way you can access the data much faster. When an extra boost is needed, you may also use unsafe blocks with pointers. You may also add other data to your mesh classes - for example the adjacent edges you mentioned.
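A minimal sketch of such a mesh class (the OptimizedMesh name and the constructor parameters are only illustrative):
using System.Collections.Generic;
using System.Windows.Media.Media3D;

// Holds the same data as MeshGeometry3D, but in plain presized lists
// that are fast to read and modify.
public class OptimizedMesh
{
    public List<Point3D> Positions;
    public List<int> TriangleIndices;

    public OptimizedMesh(int positionCount, int triangleIndexCount)
    {
        // Presize the lists so no reallocation happens while filling them.
        Positions = new List<Point3D>(positionCount);
        TriangleIndices = new List<int>(triangleIndexCount);
    }
}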
Once you have your positions and triangle indices set, you can create the WPF's MeshGeometry3D object with the following code:
var wpfMesh = new MeshGeometry3D()
{
    Positions = new Point3DCollection(optimizedPositions),
    TriangleIndices = new Int32Collection(optimizedTriangleIndices)
};
This is faster than adding each Point3D to the Positions collection one by one.
Because you will not change that instance of wpfMesh (for each change you will create a new MeshGeometry3D), you can freeze it - call Freeze() on it. This allows WPF to optimize the meshes (combine them into vertex buffers) to reduce the number of draw calls. What is more, after you freeze a MeshGeometry3D (or any other WPF object), you can pass it from one thread to another. This means that you can parallelize your work and create the MeshGeometry3D objects in worker threads and then pass them to UI thread as frozen objects.
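A rough sketch of that worker-thread pattern in a WPF app (BuildOptimizedMesh is a placeholder for your own mesh-building code; the key calls are Freeze() and Dispatcher.BeginInvoke):
// Build and freeze the mesh on a worker thread, then hand it to the UI thread.
Task.Run(() =>
{
    MeshGeometry3D wpfMesh = BuildOptimizedMesh();   // placeholder for your own code
    wpfMesh.Freeze();                                // frozen objects may cross threads

    Application.Current.Dispatcher.BeginInvoke(new Action(() =>
    {
        parentGeometryModel3D.Geometry = wpfMesh;    // assign on the UI thread
    }));
});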
The same applies to changing the Positions (and other data) in a MeshGeometry3D object. It is faster to copy the existing positions to an array or List, change the data there and then recreate the Positions collection from your array, than to change each individual position. Before making any change to the MeshGeometry3D you also need to disconnect it from the parent GeometryModel3D to prevent triggering many change events. This is done with the following:
var mesh = parentGeometryModel3D.Geometry; // Save MeshGeometry3D to mesh
parentGeometryModel3D.Geometry = null; // Disconnect
// modify the mesh here ...
parentGeometryModel3D.Geometry = mesh; // Connect the mesh back
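The "modify the mesh here" step could, as a rough sketch, look like this (MovePosition stands in for whatever change you actually make to each point):
// mesh was saved as Geometry3D above, so cast it to reach Positions.
var meshGeometry = (MeshGeometry3D)mesh;

// Copy the positions out once, change them in the array, then recreate the collection.
var positions = new Point3D[meshGeometry.Positions.Count];
meshGeometry.Positions.CopyTo(positions, 0);

for (int i = 0; i < positions.Length; i++)
    positions[i] = MovePosition(positions[i]);   // MovePosition is a placeholder

meshGeometry.Positions = new Point3DCollection(positions);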

How can I instance a model for collision-detection?

I've just started some work in C# (using XNA) where I want to check collision between two objects using their models' BoundingSpheres. Well, rather the meshes' BoundingSpheres, for more detailed detection between the objects.
The trick is that the objects share the same model reference, and since they both use that reference I am unwilling to manipulate the root bone's transform. Both objects that want to check collision have their own matrix, of course.
I've run out of ideas on how to do it, so I could use some help with it. (This is also not homework, just saying.)
I've looked at MSDN's example of instancing a model for rendering, but that wouldn't help with my issue (as far as I know).
Any tip is appreciated!
I remembered that BoundingSpheres are structs, so it is easier to copy those than the model itself. So I use the object's matrix (containing its position etc.) to transform the list of BoundingSpheres I get from the model. That way they end up at the position they would be at if I had moved the model.
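A minimal sketch of that idea, assuming each game object exposes its own world matrix (the GetWorldBoundingSpheres, worldA and worldB names are only illustrative):
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

// Copy the model's per-mesh BoundingSpheres and move them into an object's world space.
static BoundingSphere[] GetWorldBoundingSpheres(Model model, Matrix world)
{
    var spheres = new BoundingSphere[model.Meshes.Count];
    for (int i = 0; i < model.Meshes.Count; i++)
        spheres[i] = model.Meshes[i].BoundingSphere.Transform(world);   // struct copy + transform
    return spheres;
}

// Usage: both objects share the same Model but have their own world matrices.
// var spheresA = GetWorldBoundingSpheres(sharedModel, worldA);
// var spheresB = GetWorldBoundingSpheres(sharedModel, worldB);
// bool hit = spheresA.Any(a => spheresB.Any(b => a.Intersects(b)));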

How to move and rotate group of polygons in xna?

In OpenGL we could create some polygons and connect them as a group with the function
pushMatrix(), and then we could rotate them and move them as one object.
Is there a way to do this with XNA? If I have 3 polygons and I want to rotate and move them all together as a group, how can I do that?
EDIT:
I am using primitive shapes to build a skeleton of a basketball player. The game will only be a shoot-out game to the basket, which means the player will only have to move his arm. I need full control over the arm parts, and in order to do that I need to move the arm, which is built from primitive shapes, as one coordinated group. I've tried implementing a MatrixStack for performing matrix transformations, but with no success. Any suggestions?
I will answer this in basic terms, as I can't quite glean from your question how well versed you are with XNA or graphics development in general. I'm not even sure where your problem is; is it the code, the structure, or how XNA works compared to OpenGL?
The short answer is that there is no matrix stack built in.
What you do in OpenGL and XNA/DX is the very same thing when working with matrices. What you do with pushMatrix is actually only preserving the matrix (transformation) state on a stack for convenience.
Connecting objects as a group is merely semantics, you don't actually connect them as a group in any real way. What you're doing is setting a render state which is used by the GPU to transform and draw vertices for every draw call thereafter until that state is once again changed. This can be done in XNA/DX in the same way as in OpenGL.
Depending on what you're using to draw your objects, there are different ways of applying transformations. From your description I'm guessing you're using DrawPrimitives (or something like that) on the GraphicsDevice object, but whichever you're using, it'll use whatever transformation has been previously applied, normally on the Effect. The simplest of these is the BasicEffect, which has three members you'd be interested in:
World
View
Projection
If you use the BasicEffect, you merely apply your transform using a matrix in the World member. Anything that you draw after having applied your transforms to your current effect will use those transforms. If you're using a custom Effect, you do something quite like it except for how you set the matrix on the effect (using the parameters collection). Have a look at:
http://msdn.microsoft.com/en-us/library/bb203926(v=xnagamestudio.40).aspx
http://msdn.microsoft.com/en-us/library/bb203872(v=xnagamestudio.40).aspx
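As a rough sketch of both cases (the viewMatrix and projectionMatrix variables and the "World" parameter name are assumptions; the parameter name must match whatever your .fx file declares):
// With BasicEffect: set the matrices directly on its members.
basicEffect.World = armMatrix;            // world transform for the group
basicEffect.View = viewMatrix;
basicEffect.Projection = projectionMatrix;

// With a custom Effect: set the matrix through the parameters collection.
customEffect.Parameters["World"].SetValue(armMatrix);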
If what you're after is an actual transform stack, you'll have to implement one yourself, although this is quite simple. Something like this:
// A simple do-it-yourself transform stack.
Stack<Matrix> matrixStack = new Stack<Matrix>();
...
// Push the group's transform before drawing its parts.
matrixStack.Push(armMatrix);
...
// Use the top of the stack as the world transform for everything drawn below.
basicEffect.World = matrixStack.Peek();
foreach (EffectPass pass in basicEffect.CurrentTechnique.Passes)
{
    pass.Apply();
    graphics.GraphicsDevice.DrawPrimitives(...);
}
...
// Done with this group; restore the previous transform state.
matrixStack.Pop();

How to Create a Model from Scratch

I am working on generating terrain for our project, something that will be contained in the Model class so that I can draw it, but a new class would be alright too, since I may need to look inside for specific data often; then I would just need the basic functions to work with the Game class.
Anyway, I have a fair amount of knowledge of the XNA framework, but I'm stuck because of how convoluted it is about this kind of thing. My problem is that I can't just make a Model; I can't instantiate that class or anything. I have what I believe is the proper data to form a model's geometry, which is all I need right now, and I may want it textured later.
I don't know where to go from here.
In XNA you usually use Content.Load to have the content pipeline read in a file and parse it, but I want to avoid that because I want my terrain generated. I can compute an array of vertex data and indices for the triangles I want to make up a mesh, but so far my attempts to instantiate an object like Model, or the classes it contains, have failed.
If there is some factory class I can use to build it, I have no idea what that is, so if someone else can point me in the right direction there and give me a rough outline on how to build a model, that would help.
If that's not the answer, maybe I need to do something completely different, either centered on using Content.Load or not, but basically I don't want my terrain sitting in a file, consistent between executions; I want to control the mesh data on load, randomize it, etc.
So how can I get a model generated completely programmatically, to show up on the screen, and still have its data exposed?
Model and its associated classes (e.g. ModelMesh) are convenience classes. They are not the only way to draw models. It is expected that sometimes, particularly when doing something "special", you will have to re-implement them entirely, using the same low-level methods that Model uses.
Here's the quick version of what you should do:
First of all, at load time, create a VertexBuffer and an IndexBuffer and use SetData on each to fill each with the appropriate data.
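A sketch of that load-time setup, assuming your generated terrain uses VertexPositionNormalTexture and that the vertices and indices arrays come from your own terrain-generation code:
// Create GPU buffers sized to the generated terrain data and upload it once.
VertexBuffer myVertexBuffer = new VertexBuffer(
    GraphicsDevice, typeof(VertexPositionNormalTexture), vertices.Length, BufferUsage.WriteOnly);
myVertexBuffer.SetData(vertices);

IndexBuffer myIndexBuffer = new IndexBuffer(
    GraphicsDevice, IndexElementSize.ThirtyTwoBits, indices.Length, BufferUsage.WriteOnly);
myIndexBuffer.SetData(indices);   // indices is an int[]; use SixteenBits for a short[]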
Then, at draw time, do this:
GraphicsDevice.SetVertexBuffer(myVertexBuffer);
GraphicsDevice.Indices = myIndexBuffer;
// Set up your effect. Use a BasicEffect here, if you don't have something else.
myEffect.CurrentTechnique.Passes[0].Apply();
GraphicsDevice.Textures[0] = myTexture; // From Content.Load<Texture2D>("...")
GraphicsDevice.DrawIndexedPrimitives(...);
