Accuracy differs between MATLAB and scikit-learn for a decision tree

Is it possible for the accuracy on the same data set to differ between MATLAB and Python code run in a Jupyter notebook?

I first ran the data set in MATLAB and got 96% accuracy with the decision tree method. I then ran the same data set in a Jupyter notebook using Python code and got 53% accuracy for C4.5 (decision tree) with k-fold cross-validation.

I don't understand why the same data set and the same method give such different accuracy.

My Python code is given below:

import pandas as pd
import numpy as np
from sklearn import tree
from sklearn.model_selection import KFold, cross_val_score

# Load the data set and preview the first rows
train = pd.read_csv('E://New.csv')
train.head()

# define X and y
feature_cols = [
    'Past',
    'Family_History',
    'Current',
    'current or previous workplace',
    'diagnosed with a mental health condition by a medical professional?',
    'do you feel that it interferes with your work when being treated effectively?',
    'Gender',
]
X = train[feature_cols]

# y is a single column (a vector), selected by its column name
y = train['Diagonised condition']

# 10-fold cross-validation without shuffling
kfold = KFold(n_splits=10, random_state=None)
model = tree.DecisionTreeClassifier(criterion='gini')

# Mean and standard deviation of the per-fold accuracy, in percent
results = cross_val_score(model, X, y, cv=kfold, scoring='accuracy')
result = results.mean() * 100
std = results.std() * 100
print(result)

edited 17 mins ago by Brian Spiering
asked Jan 23 at 15:37 by IS2057

Please post the MATLAB code so it can be compared to the Python code.
– Brian Spiering, Jan 23 at 16:12

In MATLAB I use the classification app (decision tree), load my data set, and then calculate the accuracy.
– IS2057, Jan 23 at 18:05

Are you sure that all other parameters for your decision tree are the same?
– Majid Mortazavi, Jan 24 at 6:23

@MajidMortazavi, yes, I am sure. I use the same data set and the same parameters.
– IS2057, Jan 24 at 6:49

python scikit-learn decision-trees accuracy matlab

1 Answer

It is hard to make a direct comparison between a white box implementation (scikit-learn) and a black box implementation (MATLAB).

One guess is that they use different algorithms. scikit-learn uses an optimized version of the CART algorithm; MATLAB may use ID3, C4.5, or something else. Another guess is that the two implementations use different hyperparameters (e.g., different splitting criteria, maximum depth, minimum node size, ...).
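
To get a feel for how much the hyperparameters alone can move the score, here is a minimal sketch that cross-validates the same scikit-learn tree under a few different settings. The data is synthetic (generated with `make_classification`), and the names `X_demo`, `y_demo`, `settings`, and `scores` are illustrative assumptions, not part of the question; substitute your own `X` and `y`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; replace with the real X and y).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=7, n_informative=4, random_state=0
)

# Same fold layout as in the question: 10 folds, no shuffling.
cv = KFold(n_splits=10)

# A few hyperparameter combinations to compare (illustrative choices).
settings = {
    "gini, unlimited depth": dict(criterion="gini"),
    "entropy, unlimited depth": dict(criterion="entropy"),
    "gini, max_depth=3": dict(criterion="gini", max_depth=3),
    "gini, min_samples_leaf=10": dict(criterion="gini", min_samples_leaf=10),
}

scores = {}
for name, params in settings.items():
    # Fix random_state so tie-breaking between equally good splits is repeatable.
    clf = DecisionTreeClassifier(random_state=0, **params)
    fold_acc = cross_val_score(clf, X_demo, y_demo, cv=cv, scoring="accuracy")
    scores[name] = fold_acc.mean()
    print(f"{name}: {scores[name]:.3f}")
```

If the scores on your own data vary widely across these settings, the defaults of the two tools are a plausible source of the gap, and it is worth writing every hyperparameter out explicitly on both sides.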

Since decision trees are white-box models, you can examine their internal structure. Plot both trained trees, then compare how each one makes its splits and how many splits it makes.
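
On the scikit-learn side, the inspection could look roughly like the sketch below. It fits a tree on synthetic stand-in data (an assumption, as are the names `X_demo`, `y_demo`, and `feature_names`) and reports the depth, the number of leaves, and the number of splits, then prints the top of the tree as text.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (an assumption; replace with the real X and y).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=7, n_informative=4, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X_demo.shape[1])]

clf = DecisionTreeClassifier(criterion="gini", random_state=0)
clf.fit(X_demo, y_demo)

# Size of the fitted tree: every non-leaf node corresponds to one split.
depth = clf.get_depth()
n_leaves = clf.get_n_leaves()
n_splits = clf.tree_.node_count - n_leaves
print(f"depth={depth}, leaves={n_leaves}, splits={n_splits}")

# Text rendering of the first levels: which feature and threshold each split uses.
print(export_text(clf, feature_names=feature_names, max_depth=2))
```

Comparing these numbers and the first few splits with the tree MATLAB produces should show quickly whether the two tools are building comparable models.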

edited 2 days ago
answered Jan 24 at 18:17 by Brian Spiering

Yes, I understand it. Thanks for answering.
– IS2057, 8 hours ago
