Wednesday, April 15, 2009
Discretization for naive Bayes
When using naive Bayes, quantitative attributes are usually discretized (binned). Two simple discretization schemes are to partition the data into k intervals of equal width, or into k intervals each containing an equal number of data points. There are also supervised methods, like the one used in the Weka discretizer, which recursively selects cut-points based on information gain (just as in some decision tree construction algorithms). In the paper "Discretization for naive-Bayes learning: managing discretization bias and variance" (MLJ, Jan 09), the authors find that increasing the number of intervals reduces the bias of the learned NB classifier, while the number of data points within each interval (the "interval frequency") must be large enough to reduce its variance. Therefore, for a fixed amount of training data, there is a trade-off between interval number and interval frequency. Based on this observation they propose two novel discretization methods: "proportional discretization" requires the interval number and the interval frequency to be equal, while "fixed frequency discretization" simply fixes the interval frequency at a sufficiently large value. Their experiments showed that these two methods outperform existing discretization techniques; a sketch of both schemes follows below.
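To make the two schemes concrete, here is a minimal sketch in Python (not the authors' code), built on equal-frequency binning with numpy; the value 30 in fixed_frequency_discretization is just an illustrative stand-in for "sufficiently large", not necessarily the paper's choice.

import numpy as np

def equal_frequency_bins(values, n_bins):
    # Cut-points so each interval holds roughly the same number of points.
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    return np.quantile(values, qs)

def proportional_discretization(values):
    # Interval number == interval frequency, i.e. both roughly sqrt(n).
    n_bins = max(1, int(np.sqrt(len(values))))
    return equal_frequency_bins(values, n_bins)

def fixed_frequency_discretization(values, freq=30):
    # Fix the interval frequency; the bin count then grows with the data.
    n_bins = max(1, len(values) // freq)
    return equal_frequency_bins(values, n_bins)

# Discretize a quantitative attribute before handing it to naive Bayes:
x = np.random.randn(1000)
x_binned = np.digitize(x, proportional_discretization(x))

Note why proportional discretization makes sense: if the interval number t equals the interval frequency s, then t*s = n forces both to grow as sqrt(n), so bias and variance shrink together as more training data arrives.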
Thursday, April 9, 2009
More on Indian buffet process
I had lunch today at an Indian buffet for the first time. Since I knew nothing about Indian food, I decided to follow the Indian buffet process, as described in my last post. That is, I chose dishes with probability proportional to their popularity, but also picked some random new dishes. The result was good. That said, I still don't think I have enough courage to test the Chinese restaurant process in a Chinese restaurant :)
Back to mathematics. Remember that the Chinese restaurant process produces samples according to a distribution that is itself drawn from a Dirichlet process, and that the stick-breaking construction is another way to sample from a Dirichlet process. It turns out that both have counterparts for the Indian buffet process (IBP). The paper "Hierarchical beta processes and the Indian buffet process" (AISTATS 07) shows that the beta process plays for the IBP the role the Dirichlet process plays for the CRP, and the paper "Stick-breaking Construction for the Indian Buffet Process" (also AISTATS 07) gives a stick-breaking construction for the IBP.
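As an illustration, here is a minimal, truncated sketch of the stick-breaking idea for the IBP (assuming the standard one-parameter IBP with concentration alpha; n_features is a truncation level I introduce here, and none of this is the papers' code). Feature probabilities mu_k = nu_1 * ... * nu_k with nu_l ~ Beta(alpha, 1) decay toward zero, and each entry of the binary feature matrix is an independent Bernoulli(mu_k) draw.

import numpy as np

def ibp_stick_breaking(n_customers, alpha, n_features, rng=None):
    # Truncated stick-breaking sample of an IBP feature matrix.
    rng = np.random.default_rng(rng)
    nu = rng.beta(alpha, 1.0, size=n_features)
    mu = np.cumprod(nu)  # decreasing feature probabilities mu_1 > mu_2 > ...
    # Each column k holds n_customers independent Bernoulli(mu_k) draws.
    Z = rng.random((n_customers, n_features)) < mu
    return Z.astype(int)

Z = ibp_stick_breaking(n_customers=10, alpha=2.0, n_features=50, rng=0)

The truncation is an approximation: the exact construction keeps breaking sticks indefinitely, but since each nu_l has mean alpha/(alpha+1), the mu_k decay geometrically in expectation and a modest truncation level loses very little.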