F-Score
MLMetrics.f_score - Function

f_score(targets, outputs, [encoding], [avgmode], [beta = 1]) -> Float64

Compute the F-score for the outputs given the targets. The F-score is a measure for assessing the quality of a binary predictor that considers both recall and precision. Which value(s) denote "positive" or "negative" depends on the given (or inferred) encoding.
If encoding is omitted, the appropriate MLLabelUtils.LabelEncoding will be inferred from the types and/or values of targets and outputs. Note that omitting the encoding can cause performance penalties, which may include a lack of return-type inference.
The return value of the function depends on the number of labels in the given encoding and on the specified avgmode. In case an avgmode other than :none is specified, or the encoding is binary (i.e. it has exactly 2 labels), a single number is returned. Otherwise, the function will compute a separate result for each individual label, where that label is treated as "positive" and the other labels are treated as "negative". These results are then returned as a single dictionary with an entry for each label.
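To illustrate the one-versus-rest behavior described above, here is a minimal sketch of how such a per-label F-score could be computed. The function name fscore_per_label is hypothetical and this is illustrative only, not the actual MLMetrics implementation:

# Illustrative sketch (not MLMetrics internals): compute a one-vs-rest
# F-score per label, treating each label in turn as "positive" and all
# other labels as "negative".
function fscore_per_label(targets, outputs; beta = 1)
    result = Dict{eltype(targets),Float64}()
    for label in unique(targets)
        tp = count(i -> targets[i] == label && outputs[i] == label, eachindex(targets))
        fp = count(i -> targets[i] != label && outputs[i] == label, eachindex(targets))
        fn = count(i -> targets[i] == label && outputs[i] != label, eachindex(targets))
        prec  = tp + fp == 0 ? 0.0 : tp / (tp + fp)
        rec   = tp + fn == 0 ? 0.0 : tp / (tp + fn)
        denom = beta^2 * prec + rec
        # Guard the degenerate case (no true positives at all) by
        # reporting 0 rather than NaN.
        result[label] = denom == 0 ? 0.0 : (1 + beta^2) * prec * rec / denom
    end
    return result
end

fscore_per_label([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c])
# Dict(:a => 0.666..., :b => 0.0, :c => 0.8), matching the f_score example below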
Arguments
- targets::AbstractArray: The array of ground truths $\mathbf{y}$.
- outputs::AbstractArray: The array of predicted outputs $\mathbf{\hat{y}}$.
- encoding: Optional. Specifies the possible values in targets and outputs and their interpretation (e.g. what constitutes a positive or negative label, how many labels exist, etc.). It can either be an object from the namespace LabelEnc, or a vector of labels.
- avgmode: Optional keyword argument. Specifies if and how class-specific results should be aggregated. This is mainly useful if there are more than two classes. Typical values are :none (default), :micro for micro averaging, or :macro for macro averaging. It is also possible to specify avgmode as a type-stable positional argument using an object from the AvgMode namespace.
- beta::Number: Optional keyword argument. Used to balance the importance of recall vs precision. The default beta = 1 corresponds to the harmonic mean of precision and recall. A value of beta > 1 weighs recall higher than precision, while a value of beta < 1 weighs recall lower than precision (see the formula after this list).
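For reference, beta enters through the standard $F_\beta$ definition, the weighted harmonic mean of precision and recall:

$$F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\beta^2 \cdot \mathrm{precision} + \mathrm{recall}}$$

With precision = 1.0 and recall = 2/3 as in the examples below, this yields 0.8 for beta = 1, 0.714... for beta = 2, and 0.909... for beta = 0.5.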
See also
accuracy, positive_predictive_value (aka "precision"), true_positive_rate (aka "recall" or "sensitivity")
Examples
julia> recall([1,0,0,1,1], [1,0,0,0,1])
0.6666666666666666
julia> precision_score([1,0,0,1,1], [1,0,0,0,1])
1.0
julia> f_score([1,0,0,1,1], [1,0,0,0,1])
0.8
julia> f_score([1,0,0,1,1], [1,0,0,0,1], beta = 2)
0.7142857142857143
julia> f_score([1,0,0,1,1], [1,0,0,0,1], beta = 0.5)
0.9090909090909091
julia> f_score([1,0,0,1,1], [1,-1,-1,-1,1], LabelEnc.FuzzyBinary())
0.8
julia> f_score([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c]) # avgmode=:none
Dict{Symbol,Float64} with 3 entries:
:a => 0.666667
:b => 0.0
:c => 0.8
julia> f_score([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c], avgmode=:micro)
0.6

MLMetrics.f1_score - Function

f1_score(targets, outputs, [encoding], [avgmode])

Same as f_score, but with beta fixed to 1.
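As a sanity check on the avgmode=:micro example above: micro averaging is conventionally computed by pooling the true positives, false positives, and false negatives over all labels before forming a single score. The following micro_f1 function is an assumed sketch of that convention, not MLMetrics internals:

# Assumed sketch of conventional micro averaging (not MLMetrics internals):
# pool the per-label counts, then compute one F-score from the pooled counts.
function micro_f1(targets, outputs)
    tp = fp = fn = 0
    for label in union(targets, outputs)
        tp += count(i -> targets[i] == label && outputs[i] == label, eachindex(targets))
        fp += count(i -> targets[i] != label && outputs[i] == label, eachindex(targets))
        fn += count(i -> targets[i] == label && outputs[i] != label, eachindex(targets))
    end
    prec = tp / (tp + fp)
    rec  = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)
end

micro_f1([:a,:b,:a,:c,:c], [:a,:c,:b,:c,:c])  # 0.6, matching avgmode=:micro above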