I received this error message: Found input variables with inconsistent numbers of samples: [15573, 15600]












Screenshot of code | Data source

I received that error when running my code, and I realised that the issue lies with my X_train and y_train.

The linked data source provides my data (test.csv and train.csv).

X_train.shape gave me (15573,) and y_train.shape gave me (15600,).

How do I reshape this?












  • You can't simply reshape them; you have to figure out why it's happening, because X_train and y_train should have the same number of samples by default.
    – Aditya
    yesterday

  • @Aditya Strangely, my X_test and y_test tally... Is there a particular reason why it isn't the same for y_train and X_train?
    – Renae
    yesterday

  • Check whether you are dropping some rows. Otherwise, if the rows match one-to-one, just reduce the X_train.
    – Aditya
    yesterday

  • @Aditya Got it, will give that a shot! To reduce the X_train, I entered this code: y_train_new = y_train[1:17644], but it showed me an error. The aim is to make it the same length as y_train.
    – Renae
    yesterday

  • 17644 isn't the length of y_train; it's 15600. But be careful: you shouldn't rely on tricks like this, because in a normal scenario this shouldn't happen at all. If you share your preprocessing, folks can help find what's causing it, unless the data is like that by default.
    – Aditya
    yesterday
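
For anyone following the comments above, here is a minimal sketch of how one might check which rows went missing instead of slicing by position. The toy DataFrame is hypothetical and only stands in for the real train.csv; the column names 'text' and 'label' are assumptions.

```python
import pandas as pd

# Hypothetical toy data standing in for the real train.csv
df = pd.DataFrame({
    "text": ["a", None, "c", "d", None],
    "label": [0, 1, 0, 1, 1],
})

# Dropping nulls separately: X loses two rows, y loses none
X_train = df["text"].dropna()
y_train = df["label"].dropna()
print(X_train.shape, y_train.shape)

# Index labels that are still in y_train but no longer in X_train
missing = y_train.index.difference(X_train.index)
print(missing.tolist())

# Align y to the surviving rows by index label, not by position
y_aligned = y_train.loc[X_train.index]
print(len(X_train) == len(y_aligned))
```

Selecting by index label keeps each text paired with its own label, which a positional slice such as y_train[1:17644] would not guarantee.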


















Tags: python






asked yesterday by Renae




1 Answer

You are likely getting this error because you remove rows containing nulls from X_train and y_train independently of each other. y_train probably has few or no nulls, while X_train probably has some. When a row is removed from X_train but the same row is not removed from y_train, the two fall out of sync and end up with different lengths. Instead, you should remove nulls before you separate X and y.

Before this:

    X = df['text']
    y = df['label']

Do this:

    df.dropna(inplace=True)

And remove this:

    X_train.dropna(inplace=True)
    X_test.dropna(inplace=True)
    y_train.dropna(inplace=True)
    y_test.dropna(inplace=True)
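
Putting the pieces of the answer together, a rough end-to-end sketch could look like the following. The toy DataFrame and the pandas-only split are assumptions made for illustration; the original code may well have used a library splitter instead.

```python
import pandas as pd

# Hypothetical toy data; 'text' and 'label' follow the column names in the answer
df = pd.DataFrame({
    "text": ["good", None, "bad", "fine", None, "poor"],
    "label": [1, 0, 0, 1, 1, 0],
})

# Drop incomplete rows once, on the whole frame, so text and label stay paired
df = df.dropna(subset=["text", "label"])

X = df["text"]
y = df["label"]

# Simple pandas-only split (an assumption, used here to avoid extra dependencies)
X_train = X.sample(frac=0.75, random_state=0)
X_test = X.drop(X_train.index)
y_train = y.loc[X_train.index]
y_test = y.loc[X_test.index]

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

Because the nulls are removed before X and y are separated, the train and test pieces have matching lengths and no further dropna calls are needed.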













  • That worked! Thank you :)
    – Renae
    13 hours ago












answered yesterday by Simon Larsson











