Questions tagged [svm]
Support Vector Machine refers to "a set of related supervised learning methods that analyze data and recognize patterns, used for classification and regression analysis."
2,292 questions
3
votes
0
answers
18
views
Machine learning for importance check of three-way interaction in a longitudinal dataset
I want to know whether the interaction between the continuous variable A, the continuous variable B and the group (factor with 3 levels) is associated with a continuous outcome Y in a longitudinal setting....
-1
votes
1
answer
81
views
Categorical Dependent Variable
Repost:
Hello all, thank you so much for the response. Here I have provided some information.
a. This is clinical data with a sample size of around 859.
b. It has 11 columns as input features and ...
0
votes
0
answers
16
views
Youtube Spam Classifier - Different Methods yielding the same accuracy (94%)
(CONTEXT)
I'm currently doing a report project at my university to build a classifier model that classifies a comment as spam or ham (non-spam) using this data set, and then submit a prediction CSV ...
4
votes
1
answer
144
views
Why am I getting worse results when using a CNN for feature extraction and an SVM for classification?
I am using a 3D CNN for feature extraction and an SVM for classification, but I got worse results than when using the 3D CNN for both feature extraction and classification. Is that a normal thing?
2
votes
1
answer
20
views
Why do results from various experiments with different % of features selected through RFE for SVM-based classification yield inconsistent outcomes?
I now have a basic understanding of classifiers such as Random Forests, Gradient Boosted Trees, and Support Vector Machines. My tasks involve classifying layer stacks that consist of optical and radar ...
0
votes
0
answers
11
views
When running a Support Vector Machine, how do I formulate the linear transformation that flips the decision hyperplane in the non-augmented dimension?
We know that when running a support vector machine, we actually use the "kernel trick" to compute the decision hyperplane (boundary) as if we do so in the kernel-augmented dimension, but not ...
0
votes
0
answers
153
views
Prove matrix constructed based on gaussian RBF is PSD
I have a radial basis function $k(x, y) = \exp(-{(x-y)}^T M {(x-y)})$ where $M$ is a symmetric PSD matrix.
I know that $k(\cdot)$ is a kernel itself: Prove that multiplication with positive ...
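The excerpt is truncated; a standard sketch of this argument: since $M$ is symmetric PSD it factors as $M = L^\top L$, which reduces $k$ to an ordinary Gaussian RBF kernel in linearly transformed coordinates,
$$
k(x, y) = \exp\big(-(x-y)^\top L^\top L (x-y)\big) = \exp\big(-\|Lx - Ly\|^2\big),
$$
so the Gram matrix on any finite set of points equals the Gaussian RBF Gram matrix on the transformed points, which is known to be PSD.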
0
votes
0
answers
14
views
Accuracy for Permutation Test is very high
I am a bit confused as to what is happening with my classifier.
I have a dataset of ~220 features and about 4000 trials. Classes are perfectly balanced, and I'm doing a simple binary classification task ...
1
vote
0
answers
26
views
Some further explanation of Alex Smola's 1998 implementation of support vector regression
I am currently going through, and trying to implement the pseudo-code in Alex Smola's 1998 paper on support vector regression, particularly the one on sequential minimal optimization. (Section 4.6.3, ...
0
votes
0
answers
11
views
Averaging over labels instances for SVM classification
This is a hypothetical question about different ways to input training/testing data into an SVM model. I have 128 instances for each of two classes, which can be hierarchically grouped into 4 sets (i.e. ...
0
votes
0
answers
19
views
How to get the second stationary point condition corresponding to intercept when using the augmented weight vector and augmented design matrix in SVM?
Below is the formulation I got for SVM when using the equation of the classifier as $w \cdot x + b = 0$.
I want to know why I am not getting the second stationarity condition, i.e. the summation over i from 1 to n of (...
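The excerpt is cut off, but the condition it presumably refers to is the one obtained by differentiating the Lagrangian with respect to the intercept $b$:
$$
\frac{\partial \mathcal{L}}{\partial b} = -\sum_{i=1}^{n} \alpha_i y_i = 0 \quad\Longrightarrow\quad \sum_{i=1}^{n} \alpha_i y_i = 0.
$$
With the augmented formulation ($b$ folded into the weight vector by appending a constant feature to each $x_i$), $b$ is no longer a separate primal variable, so this condition does not appear as a distinct stationarity equation; it is absorbed into the gradient with respect to the augmented weight vector.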
0
votes
0
answers
35
views
Mean and Standard Deviation of accuracy for SVM model prediction
I am training an SVM model for binary classification. For this, I have split the train and test datasets in an 80:20 ratio. Then I standardized the training and test data separately and tuned the ...
1
vote
0
answers
24
views
Modeling for a data set that has a different number of factors for each row (not binomial) [closed]
The modeling issue I'm having is that the categorical variable for each row has a different number of factors. If I can reshape the data by products (a,b,c,.....~cost, hoursum, numPod, numDate), so that ...
2
votes
1
answer
142
views
How do I perform a permutation test on a machine learning model to obtain a p-value for its performance?
This is basically the same question as this previous post, but since there's no reply and I'm having a hard time finding answers, I'd like to ask it again.
I'm training a regression model (SVM ...
0
votes
0
answers
36
views
How to handle Data Normalization in case that a Logarithmic scale is required?
Let's say we wished to build a Regressor (e.g. a Support Vector Regressor) to predict the price of an asset, within a given time span from now on.
However, what if the historical data we have ...
1
vote
0
answers
40
views
Derivation of dual formulation of support vector regression
I'm trying to derive the dual formulation of epsilon-insensitive support vector regression. I think my derivation is correct, but I can't match it up to a result for the dual that I've seen given in ...
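The excerpt is truncated; for reference, the dual that is usually quoted (e.g. in Smola and Schölkopf's tutorial) for linear $\varepsilon$-insensitive SVR is
$$
\max_{\alpha, \alpha^*} \; -\frac{1}{2}\sum_{i,j}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, x_i^\top x_j \;-\; \varepsilon\sum_i(\alpha_i + \alpha_i^*) \;+\; \sum_i y_i(\alpha_i - \alpha_i^*)
$$
subject to $\sum_i(\alpha_i - \alpha_i^*) = 0$ and $0 \le \alpha_i, \alpha_i^* \le C$.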
0
votes
0
answers
28
views
I want to plot the decision boundaries of an SVM model with more than 2 variables
I understand that this is impossible to visualize, so I went in and PCA-transformed the variables. The problem is that I still need more than 2 principal components to get "good" ...
0
votes
0
answers
47
views
Applying PCA Before Training Multiple SVM Binary Classifiers To Reduce Data
I am working on a project which has a goal to determine if a new sample is part of Class A or Class A'. I need multiple of those classifiers. I will have an SVM to classify between:
ClassA - ClassA' ...
1
vote
1
answer
165
views
Non-linear kernel for classifying data points corresponding to two concentric circles [closed]
While doing self-study, I have seen an article on non-linearly separable problems, here. The images as given there are here, and here.
It deals with a common textbook problem, where the data points are in two ...
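The excerpt is truncated; a minimal, self-contained sketch of the textbook setup it describes (the synthetic data and parameter choices are mine, not the article's):

    import numpy as np
    from sklearn.datasets import make_circles
    from sklearn.svm import SVC

    # Two concentric circles: not linearly separable in (x1, x2),
    # but separable via a non-linear (e.g. RBF) kernel's implicit feature map.
    X, y = make_circles(n_samples=400, factor=0.4, noise=0.05, random_state=0)

    clf = SVC(kernel="rbf", gamma=2.0, C=1.0).fit(X, y)
    print("training accuracy:", clf.score(X, y))

    # The same idea with one explicit feature: the squared radius x1^2 + x2^2
    # already makes the two circles linearly separable in a single dimension.
    r2 = (X ** 2).sum(axis=1, keepdims=True)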
0
votes
1
answer
32
views
SVM Kernel to compare histograms as input vectors
In lecture 7 of CS229, Andrew Ng mentions at the very end a specific kernel that allows an SVM to "classify" how similar two histograms are, such as the demographics of two countries. He ...
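The lecture itself is not quoted here, but a kernel commonly used to compare histograms is the histogram intersection kernel $K(p, q) = \sum_i \min(p_i, q_i)$, which is a valid PSD kernel. A minimal sketch of plugging it into scikit-learn's SVC via a callable kernel (the data below are made up):

    import numpy as np
    from sklearn.svm import SVC

    def intersection_kernel(P, Q):
        # Histogram intersection kernel: K(p, q) = sum_i min(p_i, q_i).
        # P has shape (n, d), Q has shape (m, d); returns the (n, m) Gram matrix.
        return np.minimum(P[:, None, :], Q[None, :, :]).sum(axis=-1)

    rng = np.random.default_rng(0)
    X = rng.dirichlet(np.ones(10), size=100)   # 100 hypothetical histograms over 10 bins
    y = rng.integers(0, 2, size=100)           # hypothetical binary labels

    clf = SVC(kernel=intersection_kernel).fit(X, y)
    print(clf.predict(X[:5]))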
4
votes
5
answers
457
views
Is it valid to exhaustively test all possible combinations of features to find the best combination?
I have about 1000 labelled observations from about 50 subjects responding physiologically under different situations and am trying to classify the situation (usually into three classes of roughly ...
2
votes
1
answer
89
views
Is my understanding/approach to nested cross-validation, final model tuning correct?
I am training an SVM on limited training data with unbalanced classes.
Here are the things that I want to do:
1.) I want to make a statement of the generalizability ...
0
votes
0
answers
30
views
How is ROC AUC calculated for a Support Vector Machine?
My understanding is that a support vector machine (SVM) finds a hyperplane that separates two classes from each other. During training, there can be some amount of error allowed so that some classes ...
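The excerpt stops before the answer, but the usual recipe is: ROC AUC needs a continuous score rather than a hard label, and for an SVM the signed distance to the hyperplane (decision_function in scikit-learn) plays that role; the ROC curve is traced by sweeping a threshold over it. A minimal sketch with synthetic data:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.metrics import roc_auc_score

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    clf = SVC(kernel="rbf").fit(X_tr, y_tr)
    scores = clf.decision_function(X_te)          # continuous scores, not labels
    print("ROC AUC:", roc_auc_score(y_te, scores))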
0
votes
0
answers
43
views
Should I interpret the data as noise or not
I am tackling a classification problem with 3 classes. Here is what those classes look like on the first two principal axes.
I fine-tuned an SVM model and the best performance achievable was 50%. By ...
1
vote
1
answer
72
views
How does fitting data work in SVM using the Kernel Trick?
In SVM, I understand how to fit some data after transforming it into a higher dimension. (ex: $(X_1, X_2) \to (X_1, X_2, X_1^2, X_2^2, X_1X_2)$, which is a 2-dimensional to 5-dimensional transformation).
...
0
votes
0
answers
65
views
About the hinge loss and slack variables
I'll be denoting the $i$-th training example, target label and slack variable as $\mathbf{\vec x}^{(i)}$, $y^{(i)}$ and $\xi_i$ respectively.
Hinge Loss :
The hinge loss function in the context of ...
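The excerpt is cut off, but the usual link between the two notions is that, at the optimum of the soft-margin program, each slack variable equals the hinge loss of its training example:
$$
\xi_i = \max\big(0,\; 1 - y^{(i)}(\mathbf{\vec w}\cdot \mathbf{\vec x}^{(i)} + b)\big).
$$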
0
votes
0
answers
46
views
How is the SVM optimization objective derived from the hinge loss function?
The hinge loss function, in the context of SVMs, is given as:
$$
\mathcal{L}(\mathbf{\vec w}, b\,; \mathbf{\vec x}^{(i)}, y ^{(i)}) = \max(0, 1-y ^{(i)}(\mathbf{\vec w}\cdot \mathbf{\vec x}^{(i)} + b))...
3
votes
1
answer
41
views
How to determine one-class SVM's $r$ parameter after obtaining $\alpha$ from a QP solver?
I'm reading about one-class SVM in the wiki here: One-class SVM. One-class SVM attempts to learn $r$ and $c$ to fit a hypersphere to the dataset. The formula for assigning labels is:
$$sign(r^2 - ||\phi(x)...
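The excerpt is truncated; in the hypersphere formulation it refers to (SVDD-style), $r$ is usually recovered from the KKT conditions: any support vector $x_s$ with $0 < \alpha_s < C$ lies exactly on the sphere, so with centre $c = \sum_i \alpha_i \phi(x_i)$,
$$
r^2 = \|\phi(x_s) - c\|^2 = k(x_s, x_s) - 2\sum_i \alpha_i k(x_i, x_s) + \sum_{i,j}\alpha_i \alpha_j k(x_i, x_j).
$$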
0
votes
0
answers
32
views
Schölkopf one-class linear SVM equation: why is subtracting ρ from 1/2 ||w||² the same as maximizing the distance?
In the one-class linear SVM, the equation is:
$\min_{w, \rho} \frac{1}{2} \|w\|^2- \rho + C\sum_{i=1}^{n} \xi_i$
subject to:
$\begin{align*}
& w \cdot x_i \geq \rho - \xi_i, \\
& \xi_i \geq 0,...
0
votes
1
answer
87
views
Learning Curve to Know Underfitting or Overfitting
I want to know whether the model I am using tends to overfit or underfit. I am using SVM and Random Forest algorithms. How do I figure that out?
0
votes
0
answers
10
views
Introducing bias via combining probability outputs from multiple models
I am working on a classification task, where I am trying to estimate the probability that a patient may not die.
I did use a Survival Analysis approach at first, but the results seemed unintuitive and ...
0
votes
0
answers
23
views
Can I find the explicit feature map that generates exponent of a kernel?
Let's say I have a kernel $K$, and another kernel of the form :
$$
K' = e^K
$$
Now, I know how to prove that $K'$ is a kernel; I can do it using the Taylor expansion of $e^x$ around $0$.
But let's say I want ...
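The excerpt stops here, but the same Taylor-expansion argument also yields an explicit (infinite-dimensional) feature map: if $K(x, y) = \langle \phi(x), \phi(y)\rangle$, then
$$
e^{K(x,y)} = \sum_{m=0}^{\infty} \frac{K(x,y)^m}{m!} = \Big\langle \bigoplus_{m=0}^{\infty} \frac{\phi(x)^{\otimes m}}{\sqrt{m!}},\; \bigoplus_{m=0}^{\infty} \frac{\phi(y)^{\otimes m}}{\sqrt{m!}} \Big\rangle,
$$
so the direct sum of scaled tensor powers of $\phi$ is a feature map for $K' = e^K$.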
1
vote
0
answers
43
views
Support Vector Classifiers for Overlapping Classes
I am currently studying support vector classifiers (SVC), more specifically, the solution to the Lagrangian (Wolfe) dual function with the help of the book "The Elements of Statistical Learning" ...
2
votes
2
answers
109
views
How is the Representer theorem used in the derivation of the SVM dual form?
This is the primal form of the SVM hypothesis:
$$
h _{\mathbf{\vec w}, b}(\mathbf{\vec x}^{(i)}) = \mathbf{\vec w}\cdot \mathbf{\vec x}^{(i)} + b
$$
The Representer theorem as formulated here ...
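The excerpt is truncated, but the usual argument is: the Representer theorem guarantees that the optimal weight vector lies in the span of the training inputs, $\mathbf{\vec w} = \sum_i \beta_i \mathbf{\vec x}^{(i)}$, and substituting this into the primal hypothesis gives
$$
h(\mathbf{\vec x}) = \sum_i \beta_i\, \mathbf{\vec x}^{(i)} \cdot \mathbf{\vec x} + b = \sum_i \beta_i\, k(\mathbf{\vec x}^{(i)}, \mathbf{\vec x}) + b,
$$
which is the form that lets the optimization problem be written purely in terms of inner products (kernels), i.e. the dual.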
3
votes
1
answer
126
views
Why is the regularization term multiplied by the error term in the cost function of SVM?
The cost function of the Optimal Margin Classifier (non-kernelized SVM) is given as:
$$
J(\mathbf{\vec w}, b) = \frac{1}{2}\|\mathbf{\vec w}\|_{2}^{2} + C \sum_{i=1}^{n}\max(0, 1-y ^{(i)}(\mathbf{\vec ...
3
votes
1
answer
251
views
Scenario where minimizing 0-1 loss is different than minimizing hinge loss
Suppose we're using linear predictors. I'm trying to conceptually understand how minimizing hinge loss and 0-1 loss aren't necessarily the same. For instance I was told that one can choose a set of ...
0
votes
0
answers
58
views
How to use random kitchen sinks for $\sigma \neq 1$?
The RBF kernel is given by
$$
k(x,y) = \exp\left(-\frac{\| x - y \|_2^2}{2 \sigma^2}\right)
$$
where $\sigma$ is the length-scale parameter. I want to use the random kitchen sinks method to create a ...
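The excerpt is truncated; a minimal sketch of random Fourier features ("random kitchen sinks") for a general length-scale $\sigma$ (the function name and sizes are my choices): for the RBF kernel above, the spectral density is $\mathcal{N}(0, \sigma^{-2} I)$, so the only change versus $\sigma = 1$ is drawing the frequencies with standard deviation $1/\sigma$ (equivalently, rescaling the inputs by $1/\sigma$).

    import numpy as np

    def random_fourier_features(X, n_features=500, sigma=1.0, seed=0):
        # Approximates k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) via z(x) . z(y).
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = rng.normal(scale=1.0 / sigma, size=(d, n_features))  # omega ~ N(0, I / sigma^2)
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)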
0
votes
0
answers
17
views
Linear SVM vs Decision Stumps for AdaBoost
I have heard that AdaBoost can use a linear SVM as a weak classifier. I wonder why decision stumps are often used with AdaBoost instead? Both are binary classifiers.
In my opinion, linear SVM seems to be a ...
3
votes
2
answers
166
views
Support Vector Machine - Hinge loss
What does it mean that 'The SVM hinge loss estimates the mode of the posterior class probabilities' (Elements of Statistical Learning, p. 427)?
The decision function f(x) assigns to the positive class(+...
1
vote
1
answer
42
views
Availability of Linear Grouping Algorithms to Linearly Cluster Datasets
I have been trying to cluster a scatter plot that has a triangular shape; ideally, the proper clustering should have a linear form, as shown below:
I tried using Spectral Clustering:
and ...
1
vote
0
answers
30
views
Feature selection before ML (RF and SVM)
I am new to machine learning and have to work with big data (lots of OTUs along with clinical) which I will input into 2 different machine learning models (RF and SVM) that will be used for prediction ...
2
votes
1
answer
41
views
Interpreting the formula for Riemannian metric tensor
In Improving support vector machine classifiers by modifying kernel functions, the authors defined the Riemannian metric tensor for a kernel as follows:
$$
\begin{align}
g(\vec{x}) &= \text{det}|g_{ij}...
5
votes
2
answers
3k
views
Support Vector Regression vs. Linear Regression
I am new to ML and I am learning the different algorithms one can use to perform regression. Keep in mind that I have a strong mathematical background, but I am new to the ML field.
So I understand ...
1
vote
1
answer
35
views
An extremely simple classification problem leads to an intractable SVM program
In the popular textbook Mathematics for Machine Learning, creating an SVM requires solving:
$\text{min}_{w,b} \dfrac{1}{2}\|w\|^2$
subject to $y_n (w^T x_n + b) \geq 1$, for all $n = 1, \ldots, N$
Ok, ...
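The excerpt is truncated, but one common way this program becomes "intractable" on a very simple problem is plain infeasibility: if the two classes are not linearly separable, no $(w, b)$ satisfies all the constraints. A minimal sketch for checking this directly (cvxpy and the toy data are my choices, not the textbook's):

    import cvxpy as cp
    import numpy as np

    # Hypothetical 1-D data whose labels alternate along the line,
    # so the hard-margin constraints cannot all hold at once.
    X = np.array([[0.0], [1.0], [0.5], [1.5]])
    y = np.array([1.0, 1.0, -1.0, -1.0])

    w = cp.Variable(1)
    b = cp.Variable()
    constraints = [cp.multiply(y, X @ w + b) >= 1]
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)), constraints)
    prob.solve()
    print(prob.status)   # reports infeasibility for non-separable data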
2
votes
0
answers
117
views
Textbook Recommendation other than ESL [duplicate]
My current background is as follows: (core subjects only)
Math: Linear Algebra, Analysis, (half of) Measure Theory
Stats: Mathematical Statistics, Regression Analysis, Multivariate Analysis
"...
1
vote
0
answers
36
views
Convexity of multi-class hinge loss
The empirical risk of a multi-class hinge loss is given by
$$L(\Theta,(x,y)) = \max_{j \neq y} \Big[1+ \sum_{i=1}^{d} x_i(\Theta_{ij} - \Theta_{iy}) \Big]_{+} $$
where $x \in \mathbb{R}^{d}$ is a ...
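The excerpt is cut off; the standard convexity argument for a loss of this form: for fixed $(x, y)$, each term $1 + \sum_{i=1}^{d} x_i(\Theta_{ij} - \Theta_{iy})$ is affine in $\Theta$, the positive part $[\cdot]_+$ of an affine function is convex, and a pointwise maximum of convex functions is convex, so $L(\Theta, (x, y))$ is convex in $\Theta$.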
2
votes
0
answers
32
views
Implement Nesterov's acceleration for SVM
I am trying to implement Nesterov's accelerated gradient descent for SVM. The objective function I need to minimize is $$\frac{1}{2}\lVert Au-Bv\rVert_2^2$$ with constraints $\sum_{i}u_i=\sum_{j}v_j=1$...
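The excerpt is truncated; a minimal sketch of accelerated projected gradient on this objective, with Euclidean projection onto each probability simplex (the step size, momentum schedule, and projection routine are my assumptions, not the asker's code):

    import numpy as np

    def project_simplex(v):
        # Euclidean projection onto {u : u >= 0, sum(u) = 1} (sort-based method).
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def accelerated_projected_gradient(A, B, steps=1000):
        # min over u, v in probability simplices of 0.5 * ||A u - B v||^2
        m, n = A.shape[1], B.shape[1]
        L = np.linalg.norm(np.hstack([A, -B]), 2) ** 2   # Lipschitz constant of the gradient
        step = 1.0 / L
        u, v = np.ones(m) / m, np.ones(n) / n
        u_prev, v_prev = u.copy(), v.copy()
        for t in range(1, steps + 1):
            beta = (t - 1.0) / (t + 2.0)                 # Nesterov-style momentum
            yu, yv = u + beta * (u - u_prev), v + beta * (v - v_prev)
            r = A @ yu - B @ yv                          # residual at the lookahead point
            u_prev, v_prev = u, v
            u = project_simplex(yu - step * (A.T @ r))
            v = project_simplex(yv - step * (-B.T @ r))
        return u, v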
1
vote
0
answers
26
views
How to forecast changepoints from Gas Concentration Data?
So I'm trying to predict when gas concentrations change from sensor conductivity readings over a day. The gases randomly change concentrations around every 80-120 seconds and are kept constant between ...
0
votes
0
answers
69
views
Is $\ell_1$ regularization not compatible with SVM?
In the notes of Andrew Ng's CS229 Machine Learning course, it is mentioned:
The $\ell_2$ norm regularization is
much more commonly used with kernel methods because $\ell_1$ regularization is
...
2
votes
2
answers
343
views
What method should be used if the clusters contain different classes?
Assume that you have $N$ clusters. Each cluster has multiple classes. So we know the class ID for every major cluster, but not the class ID for the data points inside the major clusters.
Each ...