Should I keep images at original size for a better classification? - tensorflow

I'm building an image classification model with keras.
I have images of several dimensions, the smallest being 400x400.
Let's suppose that every image is a square, so proportions should be safe: is it bad to scale them to, say, 64x64 to make processing faster?
Or, in order to achieve best quality, is it better to keep them as close as possible to their original size?
I would say that scaling them down is good, as it helps avoid overfitting, but I'd like to hear your opinion.

You are correct that scaling images down helps avoid overfitting. However, the answer depends on your objective: do you need faster processing or higher-quality inputs? Images that are already as small as 400x400 hold limited detail, so the more aggressively you downscale them, the more of that detail you give up.
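If it helps, here is a minimal sketch (assuming TensorFlow 2.x; the "data/train" directory and the sizes are placeholders, not anything from the question) of loading the images at a reduced resolution so you can compare a 64x64 run against a larger one:

```python
# Minimal sketch: load images at a chosen resolution so different sizes can be compared.
# "data/train" is a hypothetical directory of class-labelled images.
import tensorflow as tf

IMG_SIZE = 64  # try 64, 128, 224, ... and compare validation accuracy and training time

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",
    image_size=(IMG_SIZE, IMG_SIZE),  # every image is resized on load
    batch_size=32,
)

# Resizing an already-loaded tensor works too:
# small = tf.image.resize(image, (IMG_SIZE, IMG_SIZE), method="bilinear")
```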

Related

Scaling Image to multiple sizes for Deep Zoom

Let's assume I have a bitmap with a square aspect and a width of 2048 pixels.
In order to create the set of files needed by Silverlight's DeepZoomImageTileSource, I need to scale this bitmap to 1024, then 512, then 256, and so on down to a 1-pixel image.
There are two, I suspect naive, approaches:
1. For each size required, scale the original full-size image down to that size. However, it seems excessive to scale the full image down to the very small sizes.
2. Having scaled from one level to the next, discard the original image and use each successive scaled image as the source of the next smaller one. However, I suspect this would produce images in the 256-64 range with poorer fidelity than option 1.
Note that unlike the Deep Zoom Composer, this tool is expected to work on demand, so it needs to complete in a reasonable timeframe (30 seconds tops). On the plus side, I'm only creating a single multiscale image, not a pyramid of multiple high-res images.
I am outside my comfort zone here; any graphics experts got any advice? Am I wrong about point 2? Is point 1 reasonably performant and am I worrying about nothing? Option 3?
About 1: This seems the best way. And it is not that excessive if you don't reload the source image each time.
About 2: Yes, you would lose (some) quality by scaling in steps.
I think 30 sec should be sufficient to scale a picture (a few times). Any optimization would be in the area of caching the results.
What you are essentially trying to do is create mipmaps (see http://en.wikipedia.org/wiki/Mipmap). If you start out with a power-of-2 square image, then scaling the image down to half its size and using that scaled-down image to halve the size again should give the same result as taking the original image and scaling it down by a factor of 4.
Each pixel in the half-sized image will be the average of 4 pixels in the original image. Each pixel in the quarter-sized image will be the average of 16 pixels. It doesn't matter whether you take the average of 16 pixels directly or the average of 4 pixels which were each the average of 4 other pixels.
So I'd say you'd be fine with successively scaling images down as you mentioned in option 2. If you want to be sure then try both ways and compare the images.
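The question is about .NET/Silverlight, but purely to illustrate option 2 (the successive-halving mipmap chain described above), here is a rough sketch in Python with Pillow; the file names are placeholders:

```python
# Illustrative only: build the 1024, 512, 256, ... levels by halving the previous
# level each time. Image.BOX averages 2x2 blocks, matching the mipmap argument above.
from PIL import Image

def build_pyramid(path):
    img = Image.open(path)                        # e.g. a 2048x2048 source bitmap
    size = img.width // 2
    while size >= 1:
        img = img.resize((size, size), Image.BOX)  # scale the previous level, not the original
        img.save(f"level_{size}.png")
        size //= 2

build_pyramid("source_2048.png")                  # hypothetical file name
```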
I realize this question is old, and maybe there's a reason you haven't done this, but if you get Microsoft's free Deep Zoom Composer, it comes with a DLL for making Deep Zooms, "DeepZoomTools.dll". It will create Deep Zoom compositions of a single image, or composites of many images. (The class ImageCreator does the heavy lifting of resizing images).
You should investigate the licensing implications if you're developing a commercial application, but reusing code is always better than writing it yourself.

Mnist pixel borders

I'm currently playing with TensorFlow and the MNIST code. The MNIST dataset from Yann LeCun contains 20x20 pixel images which were centered in a 28x28 image by computing the center of mass of the pixels. The result is a border of at least 4 pixels, which improves the analysis. I've searched and read a lot about MNIST, but I cannot find why 4 pixels were used.
I'm working with 100x100 pixel images with a 5-pixel border, but I have no idea whether this is enough. I could try changing the border size and comparing results, but that would take ages. Knowing and applying good practices seems better to me. So how do I choose the best border size?
In my experience it's not common practice to use borders at all outside of MNIST. If you're trying to recognize objects within images (rather than digits), you should just supply the whole image, possibly with some random cropping or other distortions to help the learning process. Best practices for other tasks vary with the domain, but they generally come from fairly common-sense intuitions about the inputs the model is likely to encounter in production.
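For what it's worth, here is a rough sketch (NumPy only; the helper name and the clamping are my own, not the official MNIST code) of the recentering the question describes: the digit is placed on a larger blank canvas so its center of mass lands in the middle, and the canvas size controls the border width:

```python
# Sketch of MNIST-style recentering: place a small digit image on a larger blank
# canvas, shifted so its center of mass sits in the middle of the canvas.
import numpy as np

def center_on_canvas(digit, canvas_size=28):
    """digit: 2D float array (e.g. 20x20). Returns a canvas_size x canvas_size array."""
    h, w = digit.shape
    ys, xs = np.nonzero(digit)
    if len(ys) == 0:                                 # blank image: just pad symmetrically
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    else:
        cy = np.average(ys, weights=digit[ys, xs])   # center of mass of the ink
        cx = np.average(xs, weights=digit[ys, xs])
    top = int(round(canvas_size / 2 - cy))
    left = int(round(canvas_size / 2 - cx))
    top = max(0, min(canvas_size - h, top))          # keep the digit inside the canvas
    left = max(0, min(canvas_size - w, left))
    canvas = np.zeros((canvas_size, canvas_size), dtype=digit.dtype)
    canvas[top:top + h, left:left + w] = digit
    return canvas
```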

Scaling an image up in Corona SDK without it becoming fuzzy

I am working on a classic RPG that requires a pixelated style of graphics. I want to do this by making a small image and scaling it up. However, when I do this, it gets fuzzy. Is there any way to scale it while keeping a crisp edge for every pixel, or do I just need to make a bigger image?
You cannot scale an image up and expect it to stay crisp if it wasn't made at a big enough resolution in the first place. In your case you would have to make a bigger image and scale it down to produce the small one.
If you don't use the large image all the time, though, you should consider keeping two versions of the same image (one small, one large) for optimization's sake.
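If you do end up generating the larger asset from the small pixel-art source, one option (outside Corona itself, purely as a sketch; file names are placeholders) is to pre-scale it with nearest-neighbour resampling, which duplicates pixels instead of blending them:

```python
# Sketch: pre-scale a small pixel-art sprite to a larger asset without blurring.
# Nearest-neighbour resampling turns each source pixel into a solid NxN block.
from PIL import Image

sprite = Image.open("sprite_16x16.png")        # hypothetical small source image
big = sprite.resize((sprite.width * 8, sprite.height * 8), Image.NEAREST)
big.save("sprite_128x128.png")                 # ship this pre-scaled version
```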

How to convert a small image into a big image without affecting the resolution in Photoshop? [closed]

I am using Photoshop CS6.
I have images in small sizes (3.5mm x 3.5mm).
I enlarge them to 10cm x 8cm.
Then the image quality drops.
So how can I enlarge the images without affecting the resolution?
Bicubic Smoother does not satisfy me.
Is there any way to resize images to higher resolutions without losing detail?
If you enlarge an image by a factor of 6:1 (as in this case), you will have an image missing 5/6 of its information, which needs to be "filled in" with constructed information by mathematical means. In most cases interpolation (bicubic or otherwise) is used.
Unfortunately this will never result in anything sharp and high quality due to the nature of interpolating (basically averaging the constructed color points between the actual pixels). The picture will appear blurry no matter what you try to do in a case like this.
You can always throw a sharpening convolution on it, but the result will never be ideal.
For example, let's say I have a 2x1 pixel image: one black pixel next to one white pixel.
If I now want to enlarge this image using interpolation, new points between the black and the white pixel need to be reconstructed. As there is no way of knowing what these points would have looked like (they never existed in the image in the first place), we have to guess by averaging the black and white points.
This results in a run of gray values that makes the image look blurry.
More complex interpolation algorithms can make a better guess by using more points, taking a Bézier-like approach to the non-existing points and so forth, but it will always be a guess at best.
Now, this example uses 2:1 enlarging. You can probably imagine by now how a 6:1 scale will look.
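The 2x1 example can be reproduced in a few lines (NumPy, purely illustrative): linear interpolation invents gray values between the two original pixels, whereas nearest-neighbour only repeats them:

```python
# Enlarging a [black, white] pair: interpolation constructs grays that never
# existed in the source, which is exactly what makes the result look blurry.
import numpy as np

row = np.array([0.0, 1.0])                 # one black pixel, one white pixel
x_new = np.linspace(0, 1, 4)               # enlarge the row to 4 samples

print(np.interp(x_new, [0, 1], row))       # [0.0, 0.333..., 0.666..., 1.0] -> grays
print(row[np.round(x_new).astype(int)])    # [0.0, 0.0, 1.0, 1.0] -> blocky, but no grays
```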
It is impossible this way; you will lose quality because your image and Photoshop are pixel-based.
You can convert your picture to a vector using software like CorelDRAW.

How to identify/check blur image?

How can we identify whether a given image is blurred, and how blurred it is (as a percentage), in C#? Is there any API available for that, or any algorithm that could help?
Thanks!
You could perform a 2D FFT and search the frequency coefficients for values over a certain threshold (to eliminate false positives from rounding/edge errors). A blurred image will never have high-frequency coefficients (large X/Y values in frequency space).
If you want to compare against a specific blurring algorithm, run a single pixel through a 2D FFT and check further images to see if they have frequency components outside the range of that reference FFT. This means you can use the same approach regardless of which type of blurring algorithm is used (box blur, Gaussian, etc.).
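The question asks for C#, but the approach is language-agnostic; here is a rough NumPy sketch of the idea (the cutoff and threshold are assumptions to tune on known-sharp and known-blurry samples):

```python
# Rough sketch: measure how much spectral energy the image has at high spatial
# frequencies. Blurred images have very little, sharp images noticeably more.
import numpy as np

def high_freq_ratio(gray, cutoff=0.25):
    """gray: 2D array. Returns the fraction of spectral energy above `cutoff`
    (expressed as a fraction of the maximum frequency)."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # normalised distance of each coefficient from the centre of the spectrum
    r = np.hypot((yy - h / 2) / (h / 2), (xx - w / 2) / (w / 2)) / np.sqrt(2)
    return spectrum[r > cutoff].sum() / spectrum.sum()

# Usage (threshold is a placeholder):
# is_blurry = high_freq_ratio(image_array) < 0.01
```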
For a very specific problem (finding blurred photos shot of an ancient book), I set up this script, based on ImageMagick:
https://gist.github.com/888239
Given a blurred bitmap alone, you probably can't.
Given the original bitmap and the blurred bitmap you could compare the two pixel by pixel and use a simple difference to tell you how much it is blurred.
Given that I'm guessing you don't have the original image, it might be worth looking at performing some kind of edge detection on the blurred image.
This paper suggests a method using the Haar wavelet transform but, as other posters have said, this is a fairly complex subject.
I am posting this answer 8-9 years after asking the question. At the time, I resolved this problem by applying a blur to the image and then comparing it with the original.
The idea is that when we apply a blur to a non-blurry image and compare the two, the difference between the images is very high. But when we apply a blur to an already blurry image, the difference is only around 10%.
In this way we resolved our problem of identifying blurred images, and the results were quite good.
Results were published in following conference paper:
Creating digital life stories through activity recognition with image filtering (2010)
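A rough sketch of that blur-and-compare idea (OpenCV's Python bindings here rather than C#; the kernel size and threshold are my own placeholders, not the paper's exact values):

```python
# Sketch of the blur-and-compare approach: blur the input and measure how much it
# changes. Sharp images change a lot; already-blurry images change very little.
import cv2
import numpy as np

def looks_blurry(gray, threshold=5.0):
    """gray: single-channel uint8 image. Returns True if it is probably blurred."""
    reblurred = cv2.GaussianBlur(gray, (9, 9), 0)
    diff = np.abs(gray.astype(np.float32) - reblurred.astype(np.float32))
    return float(diff.mean()) < threshold    # small change => image was already blurry
```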
Err... do you have the original image? What you're asking is not a simple thing to do, if it's even possible.
This is just a bit of a random thought, but you might be able to do it by Fourier transforming both the "blurred" and the original image and seeing whether you can get a very similar frequency profile by progressively low-pass filtering the original image in the frequency domain. Testing for "similarity" would be fairly complex in itself, though.
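As a sketch of that thought (NumPy, purely illustrative, with a deliberately crude spectrum comparison): low-pass the original at progressively lower cutoffs and pick the cutoff whose spectrum best matches the blurred image's.

```python
# Illustrative only: find the ideal low-pass cutoff that makes the original's
# spectrum most similar to the blurred image's. A lower best-matching cutoff
# suggests a stronger blur.
import numpy as np

def estimate_blur_cutoff(original, blurred, cutoffs=np.linspace(0.05, 1.0, 20)):
    f_orig = np.abs(np.fft.fftshift(np.fft.fft2(original)))
    f_blur = np.abs(np.fft.fftshift(np.fft.fft2(blurred)))
    h, w = original.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot((yy - h / 2) / (h / 2), (xx - w / 2) / (w / 2)) / np.sqrt(2)
    best_cutoff, best_err = None, np.inf
    for c in cutoffs:
        filtered = np.where(r <= c, f_orig, 0.0)   # ideal low-pass in frequency space
        err = np.mean((filtered - f_blur) ** 2)
        if err < best_err:
            best_cutoff, best_err = c, err
    return best_cutoff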
