## Abstract

The Support Vector Machine (SVM) is a powerful paradigm that has proven extremely useful for classifying high-dimensional objects. In principle, SVM trains scoring classifiers, that is, classifiers that output a prediction score; however, it can also be adapted to produce probability-type outputs through the Venn-Abers framework. This yields valuable information on the label distribution for each test object. The procedure, however, is restricted to very small datasets because of its inherent computational complexity.

We circumvent this limitation by borrowing results from the field of computational geometry. Specifically, we make use of the concept of a coreset: a small summary of the data, constructed by discretising the feature space into enclosing balls so that each ball is represented by a single point.
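To make the idea of an enclosing-ball summary concrete, here is a minimal sketch in plain Python (3.8+ for `math.dist`). It is not the construction used in the paper: the greedy covering rule, the fixed `radius`, the choice to keep balls label-pure, and the function name `ball_cover_coreset` are all assumptions made for illustration.

```python
import math


def ball_cover_coreset(points, labels, radius):
    """Greedy ball cover (illustrative assumption, not the paper's algorithm).

    Each not-yet-covered point becomes the centre of a ball of the given
    radius; every uncovered point with the same label inside that ball is
    absorbed into it. Returns a list of (centre, label, weight) triples,
    where weight is the number of original points the centre represents.
    """
    covered = [False] * len(points)
    coreset = []
    for i, (centre, label) in enumerate(zip(points, labels)):
        if covered[i]:
            continue
        weight = 0
        # Points before index i are already covered, so scan from i onwards.
        for j in range(i, len(points)):
            same_label = labels[j] == label
            if not covered[j] and same_label and math.dist(centre, points[j]) <= radius:
                covered[j] = True
                weight += 1
        coreset.append((centre, label, weight))
    return coreset


# Hypothetical toy data: two tight clusters plus one stray point.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.05)]
toy_labels = [0, 0, 1, 1, 1]
summary = ball_cover_coreset(toy_points, toy_labels, radius=0.5)
```

A classifier could then be trained on the ball centres alone, optionally using the weights as sample weights, so the training cost depends on the number of balls rather than on the number of original points.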

Our results indicate that training Venn-Abers predictors on enclosing balls gives an average 8-fold speed-up over the regular Venn-Abers approach while largely retaining probability calibration. These encouraging results suggest that well-calibrated probabilistic outputs for kernel SVM remain attainable even for large-scale datasets.
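For readers unfamiliar with the calibration step, the following is a minimal sketch of an inductive Venn-Abers predictor in plain Python. It assumes a held-out calibration set of (score, label) pairs with binary labels, which the abstract does not describe; the helper names `isotonic_fit` and `venn_abers` are hypothetical. The test score is added to the calibration set once with label 0 and once with label 1, an isotonic regression is fitted each time, and the two fitted values at the test score form the probability pair.

```python
def isotonic_fit(pairs):
    """Non-decreasing least-squares fit via pool-adjacent-violators.

    pairs: iterable of (score, label) with labels in {0, 1}.
    Returns a dict mapping each distinct score to its fitted value.
    """
    # Pool tied scores first so each distinct score forms one block.
    totals = {}
    for score, label in pairs:
        label_sum, count = totals.get(score, (0.0, 0))
        totals[score] = (label_sum + label, count + 1)

    blocks = []  # each block: [label_sum, count, scores_in_block]
    for score in sorted(totals):
        label_sum, count = totals[score]
        blocks.append([label_sum, count, [score]])
        # Merge backwards while the block means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            last_sum, last_count, last_scores = blocks.pop()
            blocks[-1][0] += last_sum
            blocks[-1][1] += last_count
            blocks[-1][2].extend(last_scores)

    return {s: total / n for total, n, scores in blocks for s in scores}


def venn_abers(calibration, test_score):
    """Return (p0, p1), the lower and upper probability of label 1."""
    calibration = list(calibration)
    p0 = isotonic_fit(calibration + [(test_score, 0)])[test_score]
    p1 = isotonic_fit(calibration + [(test_score, 1)])[test_score]
    return p0, p1


# Hypothetical calibration scores (e.g. SVM decision values) and labels.
calibration_set = [(-2.0, 0), (-1.0, 0), (-0.5, 1), (0.3, 0), (1.0, 1), (2.0, 1)]
p0, p1 = venn_abers(calibration_set, 0.5)
```

The width of the interval between `p0` and `p1` gives a rough indication of how much the calibration data constrain the estimate at that score.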

Original language | English |
---|---|
Publication status | Published - Sep 2020 |
Event | 9th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2020), Italy. Duration: 9 Sep 2020 → 11 Sep 2020. https://cml.rhul.ac.uk/copa2020/ |

### Conference

Conference | 9th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2020) |
---|---|
Country/Territory | Italy |
Period | 9/09/20 → 11/09/20 |
Internet address | https://cml.rhul.ac.uk/copa2020/ |