Understanding Scikit-Learn's SVC: Decision Function and Predict
Scikit-Learn's SVC (Support Vector Classifier) is a powerful tool for classification tasks, particularly in situations where you have high-dimensional data or need to deal with non-linear decision boundaries. When using SVC, two commonly used methods are decision_function and predict. Understanding the differences between these methods and their appropriate use cases is essential for effectively leveraging SVC in your machine learning projects.
Introduction to Support Vector Classifier (SVC)
Support Vector Classifier (SVC) is a type of Support Vector Machine (SVM) used for classification tasks. SVM is a supervised learning model that finds the hyperplane which best separates the data points of different classes in a high-dimensional space. The main goal of SVM is to maximize the margin between the hyperplane and the nearest data points (support vectors) from any class.
Key Parameters:
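- kernel: The kernel function, which implicitly maps the data into a higher-dimensional space. Options include 'linear', 'poly', 'rbf' (the default), and 'sigmoid'.
- C: The regularization parameter that controls the trade-off between a wide margin and misclassification error. Larger values penalize misclassified training points more heavily; smaller values regularize more strongly.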
- gamma: The kernel coefficient for 'rbf', 'poly', and 'sigmoid' kernels.
- decision_function_shape: The shape of the array returned by decision_function for multi-class problems, either 'ovr' (one-vs-rest, the default) or 'ovo' (one-vs-one). SVC always trains one classifier per pair of classes internally; this parameter only changes how the decision values are reported.
  - One-vs-Rest (OVR): The decision function returns one decision value per class, giving an array of shape (n_samples, n_classes). These values are derived from the pairwise classifiers.
  - One-vs-One (OVO): The decision function returns one decision value per pair of classes, giving an array of shape (n_samples, n_classes * (n_classes - 1) / 2).
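The effect of decision_function_shape is easiest to see from the shape of the returned array. The following is a minimal sketch on hypothetical toy data from make_blobs (the dataset, sample counts, and variable names are assumptions for illustration, not part of the original example):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Hypothetical toy data: 4 classes in 2-D, for illustration only.
X, y = make_blobs(n_samples=200, centers=4, random_state=0)

ovr = SVC(kernel="linear", decision_function_shape="ovr").fit(X, y)
ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)

# 'ovr' reports one column per class: 4 columns.
ovr_shape = ovr.decision_function(X[:5]).shape
# 'ovo' reports one column per pair of classes: 4 * 3 / 2 = 6 columns.
ovo_shape = ovo.decision_function(X[:5]).shape

print("ovr shape:", ovr_shape)
print("ovo shape:", ovo_shape)
```

With four classes the two settings are easy to tell apart; with three classes both arrays happen to have three columns, which can hide the difference.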
Implementing SVC in Scikit-Learn
In scikit-learn, the SVC class implements Support Vector Classification. It supports both linear and non-linear classification through kernel functions, including linear, polynomial, radial basis function (RBF), and sigmoid.
from sklearn.svm import SVC
import numpy as np
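# Four 2-D training points with binary labels
X_train = np.array([[2, 1.5], [-2, -1], [-1, -1], [2, 1]])
y_train = np.array([0, 0, 1, 1])
# Unseen points to score and classify
X_test = np.array([[1, 5], [0.5, 0.5], [-2, 0.5]])
# Default settings use the RBF kernel
clf = SVC()
clf.fit(X_train, y_train)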
The decision_function Method
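The decision_function method in SVC computes a signed decision score for each sample. The sign tells you which side of the separating hyperplane the sample falls on, and the magnitude tells you how far from it the sample lies. With a linear kernel the score is proportional to the signed distance from the hyperplane (dividing by the norm of the weight vector gives the geometric distance); with non-linear kernels the hyperplane lives in the kernel-induced feature space.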
- A positive decision score indicates that the sample is on the positive side of the hyperplane (for a binary problem, the class stored in clf.classes_[1]).
- A negative score indicates the sample is on the negative side.
- The magnitude of the score indicates the confidence of the classification. It is an uncalibrated score, not a probability.
# Calculate decision function
decision_scores = clf.decision_function(X_test)
print("Decision Scores:", decision_scores)
Output:
Decision Scores: [-0.04274893 0.29143233 -0.13001369]
In this example, the decision scores show which side of the decision boundary each test point lies on and how far from it.
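For a linear kernel, the decision score can be reproduced by hand from the fitted coef_ and intercept_ attributes. The sketch below uses hypothetical, linearly separable toy points (an assumption for illustration); note that coef_ is only available when kernel='linear'.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data, linearly separable, for illustration only.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Scores as reported by scikit-learn.
scores = clf.decision_function(X)

# The same scores computed manually as w . x + b.
w = clf.coef_.ravel()
b = clf.intercept_[0]
manual = X @ w + b

# Dividing by ||w|| turns the score into a geometric signed distance.
distances = scores / np.linalg.norm(w)

print("scores match manual computation:", np.allclose(scores, manual))
print("signed distances:", distances)
```

This is why the scores are best read as "proportional to distance" rather than as distances themselves.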
The predict Method
The predict method in SVC assigns a class label to each sample in the input data based on the decision scores. It is the most commonly used method of a classification model.
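- The predict method assigns the class label corresponding to the side of the hyperplane on which the sample lies.
- For binary classification with labels 0 and 1, as in this example, it assigns the label 1 for positive scores and 0 for negative scores. In general, positive scores map to clf.classes_[1] and negative scores to clf.classes_[0].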
# Predict class labels
predictions = clf.predict(X_test)
print("Predictions:", predictions)
Output:
Predictions: [0 1 0]
In this example, the predict method assigns class labels based on the decision scores calculated earlier: each predicted label matches the sign of the corresponding score.
Relationship Between decision_function and predict
The decision_function and predict methods are closely related:
- Decision Function: Provides the raw signed score of each sample relative to the hyperplane, which can be used to gauge the confidence of the prediction.
- Predict: Uses the decision scores to assign class labels. For binary problems it essentially thresholds the decision scores at zero to determine class membership.
For binary classification, the relationship between the decision function and the predicted class labels is straightforward: if the decision value is positive, the predicted label is the positive class, and if it's negative, the predicted label is the negative class.
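The binary relationship can be checked directly by thresholding the scores at zero and comparing against predict. This is a minimal sketch on a hypothetical synthetic dataset (the data and variable names are assumptions); it ignores the edge case of a score that is exactly zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical synthetic binary dataset, for illustration only.
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

clf = SVC(kernel="rbf").fit(X, y)
scores = clf.decision_function(X)

# Positive score -> classes_[1], negative score -> classes_[0].
from_scores = clf.classes_[(scores > 0).astype(int)]

matches = np.array_equal(from_scores, clf.predict(X))
print("thresholded scores reproduce predict:", matches)
```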
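In multi-class classification with decision_function_shape='ovr' (the default), decision_function returns an array of shape (n_samples, n_classes), with one decision value per class for each sample. The predicted label is normally the class with the highest decision value. Because SVC trains pairwise classifiers internally, the two can differ in rare tied cases unless break_ties=True is set.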
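The multi-class case can be sketched the same way by taking the argmax over the per-class scores. The toy data below is hypothetical, and the agreement with predict is measured rather than assumed, since tied votes among the underlying pairwise classifiers can in principle be resolved differently.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Hypothetical toy data: 3 classes in 2-D, for illustration only.
X, y = make_blobs(n_samples=150, centers=3, random_state=0)

clf = SVC(kernel="linear", decision_function_shape="ovr").fit(X, y)

# One column of decision values per class.
scores = clf.decision_function(X)

# Pick the class with the highest decision value for each sample.
argmax_labels = clf.classes_[np.argmax(scores, axis=1)]

# Fraction of samples where the argmax label equals predict's label.
agreement = np.mean(argmax_labels == clf.predict(X))
print("scores shape:", scores.shape)
print("agreement with predict:", agreement)
```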
Building and Evaluating a Linear SVM Classifier
Let’s consider an example to see how predict and decision_function work in practice.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# Generate a synthetic binary classification dataset with two informative features
X, y = make_classification(n_samples=100, n_features=2, n_classes=2,
                           n_informative=2, n_redundant=0, n_repeated=0,
                           random_state=42)
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
# Initialize and train an SVC model
model = SVC(kernel='linear', C=1.0)
model.fit(X_train, y_train)
# Get predictions using predict
predictions = model.predict(X_test)
print("Predicted class labels:", predictions)
# Get decision values using decision_function
decision_values = model.decision_function(X_test)
print("Decision function values:", decision_values)
Output:
Predicted class labels: [0 1 1 0 1 0 0 0 1 0 1 0 1 0 0 1 1 1 0 0 0 0 1 0 1 0 1 1 1 0]
Decision function values: [-2.72823763 1.77748148 2.66391537 -1.83620483 3.16825904 -0.70569557
-1.97719989 -2.28432341 5.71133357 -0.13715254 3.72340245 -1.1473952
2.73935006 -2.49641636 -2.34220583 3.86929847 3.70492997 3.93555536
-1.67017578 -2.77000083 -2.34121054 -4.02344281 2.38762757 -1.91081964
2.27148796 -1.94514428 0.47794686 3.31117939 1.86256405 -2.7255542 ]
Output Explanation:
- The predictions array contains the class labels predicted by the model for each test instance.
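- The decision_values array contains the corresponding decision function values. Positive values correspond to predicted label 1 and negative values to label 0; for example, the first value, -2.72823763, matches the first predicted label, 0. Values close to zero, such as -0.13715254, lie near the decision boundary and are the least confident predictions.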
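Because the decision values are continuous, they can also be thresholded at a value other than zero or passed to ranking metrics. The sketch below rebuilds the same model as above so that it runs on its own; the threshold of 1.0 is a hypothetical choice for illustration, not a recommended setting.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same synthetic dataset and model as in the example above.
X, y = make_classification(n_samples=100, n_features=2, n_classes=2,
                           n_informative=2, n_redundant=0, n_repeated=0,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
model = SVC(kernel="linear", C=1.0).fit(X_train, y_train)

decision_values = model.decision_function(X_test)
default_predictions = model.predict(X_test)

# Hypothetical stricter threshold: require a score above 1.0 to predict label 1.
# The labels here are 0 and 1, so the boolean mask converts directly to labels.
threshold = 1.0
strict_predictions = (decision_values > threshold).astype(int)

# Ranking metrics such as ROC AUC accept the raw decision values.
auc = roc_auc_score(y_test, decision_values)

print("Positives at threshold 0:", int(default_predictions.sum()))
print("Positives at threshold 1.0:", int(strict_predictions.sum()))
print("ROC AUC:", auc)
```

Raising the threshold can only reduce the number of samples labeled 1, which trades recall for precision on the positive class.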
The choice of kernel can significantly impact the performance of SVC. The 'rbf' kernel is often a good default choice, but 'linear' or 'poly' might be more appropriate depending on the nature of the data.
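One way to compare kernels is cross-validation. The following is a minimal sketch on a hypothetical non-linear toy dataset (make_moons, an assumption for illustration); results on real data will depend on the data and on tuning C and gamma.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical non-linear toy data, for illustration only.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Mean 5-fold cross-validated accuracy for each kernel, default settings.
kernel_scores = {}
for kernel in ("linear", "poly", "rbf"):
    fold_scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    kernel_scores[kernel] = fold_scores.mean()
    print(f"{kernel}: {kernel_scores[kernel]:.3f}")
```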
Conclusion
Understanding the difference between predict and decision_function in Scikit-Learn's SVC is crucial for effectively utilizing the classifier. While predict is straightforward and commonly used for final classification tasks, decision_function provides deeper insights into the model’s decision-making process. It allows you to assess the confidence of predictions and make more informed decisions in applications such as threshold tuning, anomaly detection, and model evaluation.