F-Score
MLMetrics.f_score - Function

f_score(targets, outputs, [encoding], [avgmode], [beta = 1]) -> Float64

Compute the F-score for the outputs given the targets. The F-score is a measure for assessing the quality of a binary predictor that considers both recall and precision. Which value(s) denote "positive" or "negative" depends on the given (or inferred) encoding.
If encoding is omitted, the appropriate MLLabelUtils.LabelEncoding will be inferred from the types and/or values of targets and outputs. Note that omitting the encoding can cause performance penalties, which may include a lack of return-type inference.
The return value of the function depends on the number of labels in the given encoding and on the specified avgmode. In case an avgmode other than :none is specified, or the encoding is binary (i.e. it has exactly 2 labels), a single number is returned. Otherwise, the function will compute a separate result for each individual label, where that label is treated as "positive" and the other labels are treated as "negative". These results are then returned as a single dictionary with an entry for each label.
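To illustrate the one-versus-rest behavior described above, here is a minimal sketch of how such a per-label F-score could be computed. The function name fscore_per_label is hypothetical and this is illustrative only, not the actual MLMetrics implementation:

# Illustrative sketch (not MLMetrics internals): compute a one-vs-rest
# F-score per label, treating each label in turn as "positive" and all
# other labels as "negative".
function fscore_per_label(targets, outputs; beta = 1)
    result = Dict{eltype(targets),Float64}()
    for label in unique(targets)
        tp = count(i -> targets[i] == label && outputs[i] == label, eachindex(targets))
        fp = count(i -> targets[i] != label && outputs[i] == label, eachindex(targets))
        fn = count(i -> targets[i] == label && outputs[i] != label, eachindex(targets))
        prec  = tp + fp == 0 ? 0.0 : tp / (tp + fp)
        rec   = tp + fn == 0 ? 0.0 : tp / (tp + fn)
        denom = beta^2 * prec + rec
        # Guard the degenerate case (no true positives at all) by
        # reporting 0 rather than NaN.
        result[label] = denom == 0 ? 0.0 : (1 + beta^2) * prec * rec / denom
    end
    return result
end

fscore_per_label([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c])
# Dict(:a => 0.666..., :b => 0.0, :c => 0.8), matching the f_score example below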
Arguments
- targets::AbstractArray: The array of ground truths $\mathbf{y}$.
- outputs::AbstractArray: The array of predicted outputs $\mathbf{\hat{y}}$.
- encoding: Optional. Specifies the possible values in targets and outputs and their interpretation (e.g. what constitutes a positive or negative label, how many labels exist, etc.). It can either be an object from the namespace LabelEnc, or a vector of labels.
- avgmode: Optional keyword argument. Specifies if and how class-specific results should be aggregated. This is mainly useful if there are more than two classes. Typical values are :none (default), :micro for micro averaging, or :macro for macro averaging. It is also possible to specify avgmode as a type-stable positional argument using an object from the AvgMode namespace.
- beta::Number: Optional keyword argument. Used to balance the importance of recall vs precision. The default beta = 1 corresponds to the harmonic mean of precision and recall. A value of beta > 1 weighs recall higher than precision, while a value of beta < 1 weighs recall lower than precision (see the formula after this list).
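For reference, beta enters through the standard $F_\beta$ definition, the weighted harmonic mean of precision and recall:

$$F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\beta^2 \cdot \mathrm{precision} + \mathrm{recall}}$$

With precision = 1.0 and recall = 2/3 as in the examples below, this yields 0.8 for beta = 1, 0.714... for beta = 2, and 0.909... for beta = 0.5.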
See also
accuracy, positive_predictive_value (aka "precision"), true_positive_rate (aka "recall" or "sensitivity")
Examples
julia> recall([1,0,0,1,1], [1,0,0,0,1])
0.6666666666666666
julia> precision_score([1,0,0,1,1], [1,0,0,0,1])
1.0
julia> f_score([1,0,0,1,1], [1,0,0,0,1])
0.8
julia> f_score([1,0,0,1,1], [1,0,0,0,1], beta = 2)
0.7142857142857143
julia> f_score([1,0,0,1,1], [1,0,0,0,1], beta = 0.5)
0.9090909090909091
julia> f_score([1,0,0,1,1], [1,-1,-1,-1,1], LabelEnc.FuzzyBinary())
0.8
julia> f_score([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c]) # avgmode=:none
Dict{Symbol,Float64} with 3 entries:
:a => 0.666667
:b => 0.0
:c => 0.8
julia> f_score([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c], avgmode=:micro)
0.6

MLMetrics.f1_score - Function

f1_score(targets, outputs, [encoding], [avgmode])

Same as f_score, but with beta fixed to 1.
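As a sanity check on the avgmode=:micro example above: micro averaging is conventionally computed by pooling the true positives, false positives, and false negatives over all labels before forming a single score. The following micro_f1 function is an assumed sketch of that convention, not MLMetrics internals:

# Assumed sketch of conventional micro averaging (not MLMetrics internals):
# pool the per-label counts, then compute one F-score from the pooled counts.
function micro_f1(targets, outputs)
    tp = fp = fn = 0
    for label in union(targets, outputs)
        tp += count(i -> targets[i] == label && outputs[i] == label, eachindex(targets))
        fp += count(i -> targets[i] != label && outputs[i] == label, eachindex(targets))
        fn += count(i -> targets[i] == label && outputs[i] != label, eachindex(targets))
    end
    prec = tp / (tp + fp)
    rec  = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)
end

micro_f1([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c])  # 0.6, matching avgmode=:micro above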