Core-methods

Segmentation and segment-based metrics

Similar (or equal) to frame-by-frame evaluation, but additionally reports the detailed error scores.

wardmetrics.core_methods.eval_segments(ground_truth_events, detected_events, evaluation_start=None, evaluation_end=None)

Segment-based evaluation (based on frame counts or segment lengths)

Computes and scores the segments, then returns how often each error type occurs across all segments in the dataset.

Parameters:
  • ground_truth_events (list of tuples (start, end) or lists [start, end]) – numeric values (e.g. frame number or posix timestamp) for ground truth events’ start and end times
  • detected_events (list of tuples (start, end) or lists [start, end]) – numeric values (e.g. frame number or posix timestamp) for detected events’ start and end times
  • evaluation_start (numeric value or None) – This should be the first segment’s start value. None indicates that the start of the first event should be used.
  • evaluation_end (numeric value or None) – This should be the last segment’s end value. None indicates that the end of the last event should be used.
Returns:

  • twoset_results (dictionary) – results for the 2SET metrics as a dictionary
  • segments_with_detailed_categories (list of tuples) – list of detected segments including standard and detailed score categories
  • segment_counts (dictionary) – frame counts/length of segments for each category
  • normed_segment_counts (dictionary) – same as segment_counts, but normalized
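
To make the idea of segments concrete, here is a minimal standard-library sketch. It is not the implementation behind eval_segments: it only assigns the four standard labels (TP, FP, FN, TN) and omits the detailed error categories. The names sketch_segments, length_per_label and normalize are hypothetical, and the default evaluation window (earliest start to latest end over both event lists) is an assumption.

```python
from collections import defaultdict


def sketch_segments(ground_truth_events, detected_events,
                    evaluation_start=None, evaluation_end=None):
    """Cut the timeline at every event boundary and label each piece.

    Illustrative only: uses the four standard labels, not the detailed
    error categories that eval_segments reports.
    """
    all_events = list(ground_truth_events) + list(detected_events)
    if not all_events:
        return []
    # Assumption: the defaults span from the earliest start to the
    # latest end over both event lists.
    if evaluation_start is None:
        evaluation_start = min(ev[0] for ev in all_events)
    if evaluation_end is None:
        evaluation_end = max(ev[1] for ev in all_events)

    # Every event boundary strictly inside the window is a cut point.
    cuts = {evaluation_start, evaluation_end}
    for ev in all_events:
        for boundary in (ev[0], ev[1]):
            if evaluation_start < boundary < evaluation_end:
                cuts.add(boundary)
    boundaries = sorted(cuts)

    def covered(events, seg_start, seg_end):
        # A segment never straddles an event boundary, so it is either
        # entirely inside an event or entirely outside it.
        return any(ev[0] <= seg_start and seg_end <= ev[1] for ev in events)

    labels = {(True, True): "TP", (True, False): "FN",
              (False, True): "FP", (False, False): "TN"}
    segments = []
    for seg_start, seg_end in zip(boundaries, boundaries[1:]):
        key = (covered(ground_truth_events, seg_start, seg_end),
               covered(detected_events, seg_start, seg_end))
        segments.append((seg_start, seg_end, labels[key]))
    return segments


def length_per_label(segments):
    """Sum the segment lengths for each label."""
    totals = defaultdict(int)
    for seg_start, seg_end, label in segments:
        totals[label] += seg_end - seg_start
    return dict(totals)


def normalize(totals):
    """Divide each total by the overall length (assumed normalization)."""
    overall = sum(totals.values())
    if not overall:
        return {}
    return {label: value / overall for label, value in totals.items()}


ground_truth = [(1, 3), (6, 8)]
detections = [(2, 4), (7, 9)]
segments = sketch_segments(ground_truth, detections)
totals = length_per_label(segments)
shares = normalize(totals)
```

The events are passed as (start, end) tuples, matching the parameter description above. How the library itself normalizes its counts is not stated here, so dividing by the total evaluated length is only one plausible reading.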

Event-based metrics

wardmetrics.core_methods.eval_events(ground_truth_events, detected_events, evaluation_start=None, evaluation_end=None)

Event-based evaluation

Assigns scores to each ground truth and detection event and calculates statistics

Parameters:
  • ground_truth_events (list of tuples (start, end) or lists [start, end]) – numeric values (e.g. frame number or posix timestamp) for ground truth events’ start and end times
  • detected_events (list of tuples (start, end) or lists [start, end]) – numeric values (e.g. frame number or posix timestamp) for detected events’ start and end times
  • evaluation_start (numeric value or None) – This should be the first segment’s start value. None indicates that the start of the first event should be used.
  • evaluation_end (numeric value or None) – This should be the last segment’s end value. None indicates that the end of the last event should be used.
Returns:

  • gt_scores (list) – score label for each ground truth event
  • detection_scores (list) – score label for each detected event
  • detailed_score_statistics (dictionary) – total number of events for each score category
  • standard_score_statistics (dictionary) – precision and recall values (normal and weighted with event length) based on standard event scores (TP, FP, TN, FN)
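
As a rough illustration of event-level precision and recall, the sketch below counts an event as correct when it overlaps at least one event from the other list. This is a deliberate simplification and not the scoring that eval_events performs, which assigns detailed score labels per event. The function name sketch_event_scores, the result keys, and the length weighting (each event contributes its duration instead of a count of one) are all assumptions made for this example.

```python
def sketch_event_scores(ground_truth_events, detected_events):
    """Overlap-based precision and recall, plain and length-weighted.

    Illustrative only: an event is treated as correct if it overlaps
    any event in the other list.
    """
    def overlaps(a, b):
        # Two intervals overlap when each starts before the other ends.
        return a[0] < b[1] and b[0] < a[1]

    def length(ev):
        return ev[1] - ev[0]

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    gt_hit = [any(overlaps(gt, det) for det in detected_events)
              for gt in ground_truth_events]
    det_hit = [any(overlaps(det, gt) for gt in ground_truth_events)
               for det in detected_events]

    # Length-weighted variants: sum durations of the correct events.
    gt_hit_length = sum(length(ev) for ev, hit
                        in zip(ground_truth_events, gt_hit) if hit)
    det_hit_length = sum(length(ev) for ev, hit
                         in zip(detected_events, det_hit) if hit)

    return {
        "precision": ratio(sum(det_hit), len(det_hit)),
        "recall": ratio(sum(gt_hit), len(gt_hit)),
        "precision_weighted": ratio(
            det_hit_length, sum(length(ev) for ev in detected_events)),
        "recall_weighted": ratio(
            gt_hit_length, sum(length(ev) for ev in ground_truth_events)),
    }


ground_truth = [(1, 3), (6, 8), (10, 14)]
detections = [(2, 4), (20, 21)]
scores = sketch_event_scores(ground_truth, detections)
```

With these inputs, one of three ground truth events is matched and one of two detections is correct. Because the unmatched ground truth events are longer, the weighted recall comes out lower than the plain recall.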