discrete of continuous attributes in data mining - attributes

I need help understanding
"discrete of continuous attributes in data mining".
I searched for the subject and found nothing useful.
Thank you.


Google Vision API similar files

I am preparing a solution for grouping documents using the Google Vision API. I would like to group documents by something like a document template.
If I first scan an invoice from one company and a few days later scan another invoice from the same company, can I check whether they are similar?
This is not something that can be done by default with the Vision API.
You could use the Vision API to detect text using OCR and see whether this gives enough information to do the clustering.
Otherwise you'd have to make a custom implementation. You could train a neural network to do classification, or use a simpler, "dumber" solution, depending on how your input is structured.
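As a rough illustration of the OCR idea above, here is a minimal sketch that compares the text of two documents by the overlap of their word sets (Jaccard similarity). The sample strings stand in for OCR output and are made up; the helper names `tokens` and `jaccard` are my own, and this is only one possible "dumb" similarity measure, not something the Vision API provides.

```python
import re

def tokens(text):
    """Lowercased alphabetic tokens; digits are dropped so amounts and dates do not dominate."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Share of distinct words that two texts have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

# Hypothetical OCR results for three scanned documents.
invoice_a = "ACME Corp Invoice No 1001 Date 2020-01-05 Total due 150.00 EUR"
invoice_b = "ACME Corp Invoice No 1042 Date 2020-01-19 Total due 99.50 EUR"
letter = "Dear customer, thank you for contacting our support team"

same_template = jaccard(invoice_a, invoice_b)
different = jaccard(invoice_a, letter)
```

Documents whose score exceeds some threshold (which you would have to tune on your own data) could then be put in the same group.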

How to do sensor fusion?

Suppose that I have two measurements of the same variable from two different sensors. I'd like to know whether there is a way to fuse this information and obtain a single measurement that describes the whole system (both sensors) as well as possible.
I know the Bar-Shalom-Campo sensor fusion model, but I'd like to know whether there is any model that doesn't adopt the classical Gaussian assumption, so that the sensor fusion can deal with bad data/gross errors.
Thank you.
For sensor fusion, you can use a Kalman filter. There are a few tutorials and research papers available on the extended Kalman filter used for sensor fusion.
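To make the Kalman suggestion concrete, here is a minimal sketch of a single scalar Kalman measurement update, which for two static measurements reduces to inverse-variance weighting. The function name `fuse` and the example numbers are my own, and the sketch assumes you know each sensor's noise variance. Note that it still relies on the Gaussian assumption, so it does not by itself handle gross errors.

```python
def fuse(z1, var1, z2, var2):
    """Fuse two measurements of the same quantity by inverse-variance weighting."""
    # Treat z1 as the prior and z2 as the new measurement.
    gain = var1 / (var1 + var2)
    fused = z1 + gain * (z2 - z1)
    fused_var = (1.0 - gain) * var1
    return fused, fused_var

# Hypothetical readings: sensor 1 is noisier (variance 4.0) than sensor 2 (variance 1.0).
estimate, variance = fuse(10.0, 4.0, 12.0, 1.0)
```

The fused estimate lies closer to the more precise sensor, and the fused variance is smaller than either input variance.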

Linear program model formulation

I'm having a hard time formulating a model that will later be implemented as a linear program (Octave, glpk function).
The problem appears to be quite simple but I fail to translate it into the mathematical notation.
I have warehouses, each of them accumulates certain amounts of material.
This material needs to be transported to other locations called processing facilities.
A processing facility can either exist at the same location as a warehouse or not at all.
The model would have to tell me which warehouses should have a processing facility with regards to cost.
I have a distance matrix between all warehouses and the transportation cost per mile per ton of material.
Placing a processing facility has a price as well.
The problem I'm having is how to incorporate facility placement and transportation price into the model in such a way that the model tells me where the processing should be done.
I've followed this example.
But I'm getting the feeling that my problem is multivariate and should be solved differently.
Most parts are quite similar to the example, but you will have to introduce 0/1 variables p_j to indicate whether you have a processing facility at location j. As a result you'll no longer have a plain LP but a MIP, which your solver should be able to handle.
You'll have to add conditions like x_ij <= p_j*M, with some big M that's greater than the total amount of goods that might possibly be transported, so that you can only transport materials to locations that have a processing facility. Likewise, you'll add terms c_j*p_j to your cost function to cover the placement cost.
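Written out in full, the model might look like the sketch below. The symbols x_ij, p_j, c_j and M are the ones used above; s_i (tons accumulated at warehouse i), d_ij (distance from i to j) and t (transportation cost per mile per ton) are names I am assuming for the data described in the question.

```latex
\begin{align*}
\min_{x,\,p}\quad & \sum_{i}\sum_{j} t\, d_{ij}\, x_{ij} \;+\; \sum_{j} c_j\, p_j \\
\text{s.t.}\quad & \sum_{j} x_{ij} = s_i && \text{for all } i \\
& x_{ij} \le M\, p_j && \text{for all } i, j \\
& x_{ij} \ge 0, \quad p_j \in \{0, 1\} && \text{for all } i, j
\end{align*}
```

Here x_ij is the amount shipped from warehouse i to a facility at location j, and d_jj = 0, so material processed where it is stored costs nothing to transport. Since no more than s_i can ever leave warehouse i, using s_i in place of M in the second constraint is a valid and tighter choice.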

Gender Detection by audio

I've been searching everywhere for some form of gender detection based on the frequency data of an audio file. I've had no luck finding a program that can do that, or even anything that can output audio data so that I can write a basic program to read and manipulate it to determine the gender of the speaker.
Do any of you know where I can find something to help me with this?
To reiterate, I basically want a program that, when a person talks into a microphone, reports the gender of the speaker with a fair amount of precision. My full plan is to also have a speech-to-text feature, so the program will write out what the speaker said and give some extremely basic demographics on the speaker.
*Preferably with a common scripting language that's cross-platform or supported on Linux.
You're going to want to look into formant detection and linear predictive coding. Here's a paper that has some signal flow diagrams that could be ported over to scipy/numpy.
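For orientation, here is a minimal pure-Python sketch of linear predictive coding using the autocorrelation method with the Levinson-Durbin recursion. It is not taken from the paper; the function names and the synthetic test signal are my own. Formant estimation would additionally require finding the roots of the resulting prediction polynomial, which is not shown here.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]

def lpc(x, order):
    """Predictor coefficients a[1..order] such that x[n] is approximated by sum(a[k] * x[n-k])."""
    r = autocorr(x, order)
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

# Synthetic signal where each sample is 0.9 times the previous one.
signal = [0.9 ** n for n in range(200)]
coeffs = lpc(signal, 2)
```

On this synthetic signal the first coefficient comes out very close to 0.9 and the second close to 0, as expected for a first-order decay. On real speech you would apply this to short windowed frames rather than the whole recording.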
This is an old question, but if someone is still interested in gender detection from audio: you can do this by extracting MFCC (Mel-frequency cepstral coefficient) features and modeling them with a GMM (Gaussian mixture model).
One can follow this tutorial, which implements this approach and evaluates it on a gender-labeled subset extracted from Google's AudioSet.

Speaker Recognition

How could I differentiate between two people speaking? For example, if someone says "hello" and then another person says "hello", what kind of signature should I be looking for in the audio data? Periodicity?
Thanks a lot to anyone who can answer this!
The solution to this problem lies in Digital Signal Processing (DSP). Speaker recognition is a complex problem that brings computing and communication engineering together. Most speaker identification techniques combine signal processing with machine learning (training on a speaker database and then identifying speakers using the training data). An outline of an algorithm that may be followed:
Record the audio in raw format. This serves as the digital signal that needs to be processed.
Apply some pre-processing routines to the captured signal. These routines could be simple signal normalization, or filtering the signal to remove noise (using band-pass filters for the normal frequency range of the human voice; a band-pass filter can in turn be created by combining a low-pass and a high-pass filter).
Once it is fairly certain that the captured signal is largely free of noise, the feature extraction phase begins. Some known techniques for extracting voice features are Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), and simple FFT features.
Now, there are two phases: training and testing.
First, the system needs to be trained on the voice features of different speakers before it is capable of distinguishing between them. To ensure that the features are calculated correctly, it is recommended that several (>10) voice samples be collected from the speakers for training purposes.
Training can be done using different techniques, such as neural networks or distance-based classification, to find the differences in the features of voices from different speakers.
In the testing phase, the training data is used to find the voice feature set that lies at the lowest distance from the signal being tested. Different distances, such as Euclidean or Chebyshev distance, might be used to calculate this proximity.
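The training and testing phases above can be sketched as a nearest-centroid classifier. This is only an illustration of the distance-based variant: the feature vectors are made-up three-dimensional numbers standing in for real MFCC or LPC features, and the function names `train` and `identify` are my own.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def train(samples):
    """Map each speaker to the mean of that speaker's training feature vectors."""
    model = {}
    for speaker, vectors in samples.items():
        count = len(vectors)
        model[speaker] = [sum(column) / count for column in zip(*vectors)]
    return model

def identify(model, features, distance=euclidean):
    """Return the speaker whose mean feature vector is closest to the given features."""
    return min(model, key=lambda speaker: distance(model[speaker], features))

# Hypothetical feature vectors; real ones would come from the feature extraction step.
training = {
    "alice": [[1.0, 2.0, 0.5], [1.2, 1.8, 0.4], [0.9, 2.1, 0.6]],
    "bob": [[3.0, 0.5, 1.5], [3.2, 0.4, 1.7], [2.9, 0.6, 1.4]],
}
model = train(training)
guess_euclid = identify(model, [1.1, 1.9, 0.5])
guess_cheby = identify(model, [3.1, 0.5, 1.6], distance=chebyshev)
```

With real data you would use far more samples per speaker, as recommended above, and probably higher-dimensional features.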
There are two open source implementations that enable speaker identification: ALIZE (http://mistral.univ-avignon.fr/index_en.html) and MARF (http://marf.sourceforge.net/).
I know it's a bit late to answer this question, but I hope someone finds it useful.
This is an extremely hard problem, even for experts in speech and signal processing. This page has much more information: http://en.wikipedia.org/wiki/Speaker_recognition
And some suggested technology starting points:
The various technologies used to process and store voice prints include frequency estimation, hidden Markov models, Gaussian mixture models, pattern matching algorithms, neural networks, matrix representation, Vector Quantization and decision trees. Some systems also use "anti-speaker" techniques, such as cohort models, and world models.
Having only two people to differentiate, especially if they are uttering the same word or phrase, will make this much easier. I suggest starting with something simple, and only adding complexity as needed.
To begin, I'd try sample counts of the digital waveform, binned by time and magnitude, or (if you have the software functionality handy) an FFT of the entire utterance. I'd consider a basic modeling process first, too, such as a linear discriminant (or whatever you already have available).
Another way to go is to use an array of microphones and differentiate between the positions and directions of the vocal sources. I consider this to be an easier approach, since the position calculation is much less complicated than separating different speakers from a mono or stereo source.
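A basic building block for the microphone-array approach is estimating the time difference of arrival between two microphones. Below is a minimal sketch that finds the delay by brute-force cross-correlation; the signals, the function name `best_lag` and the 16 kHz sample rate are assumptions for illustration. Turning delays into actual positions additionally requires the microphone geometry, which is not covered here.

```python
def best_lag(sig_a, sig_b, max_lag):
    """Lag in samples of sig_b relative to sig_a that maximizes their cross-correlation."""
    def correlation(lag):
        total = 0.0
        for n, value in enumerate(sig_a):
            m = n + lag
            if 0 <= m < len(sig_b):
                total += value * sig_b[m]
        return total
    return max(range(-max_lag, max_lag + 1), key=correlation)

# The same short pulse reaches microphone B three samples later than microphone A.
pulse = [0.0, 1.0, 3.0, 1.0, 0.0]
mic_a = [0.0] * 5 + pulse + [0.0] * 10
mic_b = [0.0] * 8 + pulse + [0.0] * 7

lag = best_lag(mic_a, mic_b, max_lag=6)
sample_rate = 16000  # assumed sampling rate in Hz
delay_seconds = lag / sample_rate
```

A positive lag means the sound reached microphone A first, so the speaker is closer to A than to B.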