clj-duckling.ml.naivebayes
classify
(classify classifier bag-of-feats)
Tries to find the most likely class by computing a score for each class,
given the bag of features.
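As a rough illustration of per-class scoring, the sketch below sums log-probabilities over a bag of features and picks the best class. It is not the library's implementation: the model shape (`:log-prior`, `:log-likelihoods`, `:unseen`), the helper names `score-class` and `most-likely-class`, and the toy numbers are all assumptions made for this example.

```clojure
;; Hypothetical model shape, assumed for illustration only:
;; {class {:log-prior number
;;         :log-likelihoods {feature number}
;;         :unseen number}}   ; fallback log-probability for unknown features

(defn score-class
  "Log-score of one class for a bag of features (a map of feature -> count)."
  [{:keys [log-prior log-likelihoods unseen]} bag-of-feats]
  (reduce-kv (fn [acc feat n]
               (+ acc (* n (get log-likelihoods feat unseen))))
             log-prior
             bag-of-feats))

(defn most-likely-class
  "Returns [class score] for the highest-scoring class in a non-empty model."
  [model bag-of-feats]
  (apply max-key second
         (for [[cls info] model]
           [cls (score-class info bag-of-feats)])))

;; Toy model with made-up probabilities.
(def toy-model
  {:spam {:log-prior       (Math/log 0.5)
          :log-likelihoods {:buy (Math/log 0.6) :hello (Math/log 0.1)}
          :unseen          (Math/log 0.01)}
   :ham  {:log-prior       (Math/log 0.5)
          :log-likelihoods {:buy (Math/log 0.1) :hello (Math/log 0.6)}
          :unseen          (Math/log 0.01)}})

(first (most-likely-class toy-model {:buy 2}))
```

Working in log space turns the product of probabilities into a sum, which avoids numeric underflow when a bag contains many features.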
count-words
(count-words text)
datum-class
(datum-class datum)
datum-features
(datum-features datum)
top10-classes
(top10-classes classifier)
train-classifier
(train-classifier dataset)
Returns a Naive Bayes classifier.
Accepts a dataset of the following form:
[[{:feat1 1 :feat2 4 :feat3 6} <class1>]
[{:feat2 5 :feat3 1 :feat6 9} <class2>]]
First, counts every occurrence of each feature for each class.
Then, aggregates these counts into probabilities.
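The two training steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind `train-classifier`: the helper names `count-features` and `counts->probabilities`, the output shape, the class keywords, and the add-one (Laplace) smoothing are all choices made for this example.

```clojure
;; Step 1: count every occurrence of each feature for each class.
(defn count-features
  "Returns {class {:n examples-seen :feat-counts {feature total-count}}}."
  [dataset]
  (reduce (fn [acc [feats cls]]
            (-> acc
                (update-in [cls :n] (fnil inc 0))
                (update-in [cls :feat-counts] #(merge-with + % feats))))
          {}
          dataset))

;; Step 2: aggregate the counts into probabilities.
;; Add-one smoothing is an assumption here; it keeps unseen
;; feature/class pairs from getting a probability of zero.
(defn counts->probabilities
  "Returns {class {:prior p :likelihoods {feature p}}}."
  [counts]
  (let [total-examples (reduce + (map :n (vals counts)))
        vocab          (into #{} (mapcat (comp keys :feat-counts)) (vals counts))]
    (into {}
          (for [[cls {:keys [n feat-counts]}] counts
                :let [denom (+ (reduce + (vals feat-counts)) (count vocab))]]
            [cls {:prior       (/ n total-examples)
                  :likelihoods (into {}
                                     (for [f vocab]
                                       [f (/ (inc (get feat-counts f 0)) denom)]))}]))))

;; Same shape as the dataset above, with hypothetical class keywords.
(def example-dataset
  [[{:feat1 1 :feat2 4 :feat3 6} :class-a]
   [{:feat2 5 :feat3 1 :feat6 9} :class-b]])

(def example-model
  (counts->probabilities (count-features example-dataset)))
```

With integer counts, Clojure's `/` returns exact ratios, so the likelihoods within each class sum to exactly 1 in this sketch.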