Sklearn SVM - how to get a list of the wrong predictions?












      I am not an expert user. I know how to obtain the confusion matrix, but I would also like a list of the rows that were misclassified, so that I can study them after classification.



      On Stack Overflow I found the question "Can I get a list of wrong predictions in SVM score function in scikit-learn", but I am not sure I understood everything.



      Here is some example code:



      # importing necessary libraries
      from sklearn import datasets
      from sklearn.metrics import confusion_matrix
      from sklearn.model_selection import train_test_split

      # loading the iris dataset
      iris = datasets.load_iris()

      # X -> features, y -> label
      X = iris.data
      y = iris.target

      # dividing X, y into train and test data
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 0)

      # training a linear SVM classifier
      from sklearn.svm import SVC
      svm_model_linear = SVC(kernel = 'linear', C = 1).fit(X_train, y_train)
      svm_predictions = svm_model_linear.predict(X_test)

      # model accuracy for X_test
      accuracy = svm_model_linear.score(X_test, y_test)

      # creating a confusion matrix
      cm = confusion_matrix(y_test, svm_predictions)


      To iterate through the rows and to find the wrong ones, the proposed solution is:



      predictions = clf.predict(inputs)
      for input, prediction, label in zip(inputs, predictions, labels):
          if prediction != label:
              print(input, 'has been classified as ', prediction, 'and should be ', label)


      I didn't understand what "input"/"inputs" are. If I adapt this to my own code, like this:



      for input, prediction, label in zip(X_test, svm_predictions, y_test):
          if prediction != label:
              print(input, 'has been classified as ', prediction, 'and should be ', label)


      I obtain:



      [6.  2.7 5.1 1.6] has been classified as  2 and should be  1


      Is row 6 the wrong row? What are the numbers after the "6."? I am asking because I am using the same code on a larger data set, so I would like to be sure that I am doing the right thing.
      Unfortunately I can't post the other data set, but the problem there is that I obtained something like this:



        (0, 253)  0.5339655767137572
      (0, 601) 0.27665553856928027
      (0, 1107) 0.7989633757962163 has been classified as 7 and should be 3
      (0, 885) 0.3034934766501018
      (0, 1295) 0.6432561790864061
      (0, 1871) 0.7029318585026516 has been classified as 7 and should be 6
      (0, 1020) 1.0 has been classified as 3 and should be 8


      When I count the lines of this last output, I get twice the number of rows in the test set, so I am not sure that I am looking at exactly the list of wrong predictions.
      I hope I have been clear enough.







      scikit-learn svm multiclass-classification
















      asked Sep 6 '18 at 17:23 by Jurafsky






















          3 Answers
            Welcome to SE:DataScience.



            Here [6. 2.7 5.1 1.6] is the feature vector of the input instance that was classified wrongly. It is one row of your feature matrix X = iris.data, so the "6." is the first feature value, not a row number.



            The message means: your SVM used the feature vector [6. 2.7 5.1 1.6] to predict a label and predicted label=2, whereas the ground truth is label=1.



            If you want to print the indices of the rows that are classified wrongly, you can use



            # row_index is the position of the row within X_test
            for row_index, (input, prediction, label) in enumerate(zip(X_test, svm_predictions, y_test)):
                if prediction != label:
                    print('Row', row_index, 'has been classified as ', prediction, 'and should be ', label)
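
The same reading probably applies to the sparse output in the question. Assuming the larger X_test is a SciPy sparse matrix (that data set was not posted, so this is a guess), printing one row emits one "(row, column) value" line per non-zero feature. The printed line count can therefore exceed the number of test rows even though each row is visited only once. A minimal sketch with made-up data that reports a single line per misclassified row instead:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up sparse test set (hypothetical values): rows hold 2, 1 and 3 non-zero features.
X_test = csr_matrix(np.array([
    [0.5, 0.0, 0.3, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.2, 0.4, 0.0, 0.9],
]))
y_test = np.array([3, 8, 6])            # true labels (made up)
svm_predictions = np.array([7, 8, 7])   # stand-in for clf.predict(X_test)

# Positions in the test set where prediction and label disagree.
wrong_rows = np.flatnonzero(svm_predictions != y_test)

for i in wrong_rows:
    row = X_test[i]  # a 1 x n_features sparse matrix
    print(f"test row {i}: classified as {svm_predictions[i]}, should be {y_test[i]} "
          f"({row.nnz} non-zero features in columns {row.indices.tolist()})")
```

Counting len(wrong_rows) rather than printed lines gives the number of wrong predictions directly.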
















            answered Sep 6 '18 at 17:38 by user12075












            • Are enumerate and zip built-in functions of Python, or do they come from sklearn?
              – Jurafsky, Sep 8 '18 at 5:22

            • @Jurafsky Yes, they are Python built-ins.
              – user12075, Sep 8 '18 at 14:51


















            Welcome.



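            In addition to what user12075 mentioned, you could pass an index array through the split, so that each test row keeps its position in the original data set:



            import numpy as np

            indices = np.arange(y.shape[0])
            X_train, X_test, y_train, y_test, idx_train, idx_test = train_test_split(
                X, y, indices, stratify=y, test_size=0.3, random_state=42)


            Then, after refitting the model and recomputing svm_predictions on this new X_test,



            # idx_test holds, for each test row, its row number in the original X
            for input, prediction, label in zip(idx_test, svm_predictions, y_test):
                if prediction != label:
                    print(input, 'has been classified as ', prediction, 'and should be ', label)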
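
As a self-contained sketch of the idea above, the script below repeats the question's iris setup, carries the original row numbers through the split, and lists the misclassified rows by their position in the original X. The names wrong_mask and wrong_original_rows are introduced here for illustration only. The two assertions at the end are optional sanity checks: the carried indices should point back at the same rows, and the number of listed errors should equal the off-diagonal total of the confusion matrix.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Carry each row's original position through the split.
indices = np.arange(y.shape[0])
X_train, X_test, y_train, y_test, idx_train, idx_test = train_test_split(
    X, y, indices, stratify=y, test_size=0.3, random_state=42)

svm_model_linear = SVC(kernel='linear', C=1).fit(X_train, y_train)
svm_predictions = svm_model_linear.predict(X_test)

# Original row numbers of the misclassified test samples.
wrong_mask = svm_predictions != y_test
wrong_original_rows = idx_test[wrong_mask]

for row, prediction, label in zip(wrong_original_rows,
                                  svm_predictions[wrong_mask],
                                  y_test[wrong_mask]):
    print('Original row', row, X[row], 'classified as', prediction, 'should be', label)

# Sanity checks: indices map back to the same rows, and the error count
# matches the off-diagonal entries of the confusion matrix.
cm = confusion_matrix(y_test, svm_predictions)
assert np.array_equal(X[idx_test], X_test)
assert wrong_mask.sum() == cm.sum() - np.trace(cm)
```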
















            answered Sep 6 '18 at 17:56 by ebrahimi












            • What does "stratify" do? And what are the main differences between your code and user12075's?
              – Jurafsky, Sep 8 '18 at 5:22

            • stratify=y makes the split preserve the class proportions of y, so each class is represented in the same ratio in the train and test sets. In my code, input (the row index) refers to the row's position in the original data set, whereas in his/her code it refers to the row's position in the test set. You could run both and compare the results. There are different ways to do what you want.
              – ebrahimi, Sep 8 '18 at 9:04
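
To see the effect of stratify in isolation, here is a small sketch that splits the iris labels twice with the same seed, once without and once with stratify=y, and counts the classes in each test set. The placeholder feature array and the variable names are illustrative only. Iris has 50 samples per class, so with stratification each class makes up one third of the 45 test samples; without it the counts need not be equal.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

y = datasets.load_iris().target          # 50 samples of each of 3 classes
X = np.arange(len(y)).reshape(-1, 1)     # placeholder features; only the labels matter here

# Same seed and test size, differing only in the stratify argument.
_, _, _, y_test_plain = train_test_split(X, y, test_size=0.3, random_state=42)
_, _, _, y_test_strat = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)

print('without stratify:', np.bincount(y_test_plain, minlength=3))
print('with stratify:   ', np.bincount(y_test_strat, minlength=3))
```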




















            • $begingroup$
              What does "stratify" do? and what are the main changes among your code and the one of user12075?
              $endgroup$
              – Jurafsky
              Sep 8 '18 at 5:22












            • $begingroup$
              Stratify used to ensure the different classes would be equally split into train and test set. In my code, input or the index of row refers to the index of row in the original data set. However, in his/her code, it refers to the index of row in the test set. You could run both code and see the result. There are different ways to do what you want.
              $endgroup$
              – ebrahimi
              Sep 8 '18 at 9:04


















            $begingroup$
            What does "stratify" do? and what are the main changes among your code and the one of user12075?
            $endgroup$
            – Jurafsky
            Sep 8 '18 at 5:22






            $begingroup$
            What does "stratify" do? and what are the main changes among your code and the one of user12075?
            $endgroup$
            – Jurafsky
            Sep 8 '18 at 5:22














            $begingroup$
            Stratify used to ensure the different classes would be equally split into train and test set. In my code, input or the index of row refers to the index of row in the original data set. However, in his/her code, it refers to the index of row in the test set. You could run both code and see the result. There are different ways to do what you want.
            $endgroup$
            – ebrahimi
            Sep 8 '18 at 9:04






            $begingroup$
            Stratify used to ensure the different classes would be equally split into train and test set. In my code, input or the index of row refers to the index of row in the original data set. However, in his/her code, it refers to the index of row in the test set. You could run both code and see the result. There are different ways to do what you want.
            $endgroup$
            – ebrahimi
            Sep 8 '18 at 9:04













            1












            $begingroup$

            The following method works for all kinds of classification problem.



            Use list comprehension to find all indices of wrong prediction:



            indices = [i for i in range(len(y_test)) if y_test[i] != y_pred[i]]


            wrong predictions will then be:



            wrong_predictions = test_dataframe.iloc[indices,:]


            You can also make indices a new column of wrong_predictions, it would be convenient to compare :)












            $endgroup$


























                answered 2 days ago by Chris
                edited 2 days ago by ebrahimi

































