I received this error message: Found input variables with inconsistent numbers of samples: [15573, 15600]
$begingroup$
Data source
I received that error when running my code, and I realised that the issue lies with my X_train and y_train.
The link above is the source of my data (test.csv and train.csv).
X_train.shape gave me (15573,) and y_train.shape gave me (15600,).
How do I reshape this?
python
$endgroup$
$begingroup$
You can't reshape them; you have to figure out why it's happening, because y_train and X_train should have the same number of rows by default.
$endgroup$
– Aditya
yesterday
$begingroup$
@Aditya Strangely, my X_test and y_test tally. Is there a particular reason why it isn't the same for y_train and X_train?
$endgroup$
– Renae
yesterday
$begingroup$
Check whether you are dropping some rows. Otherwise, if they match one-to-one, just reduce the X_train.
$endgroup$
– Aditya
yesterday
$begingroup$
@Aditya Got it, will give that a shot! To reduce the X_train, I ran this code: y_train_new=y_train[1:17644], but it gave me an error. The aim is to make it the same as that of y_train.
$endgroup$
– Renae
yesterday
$begingroup$
17644 isn't the length of y_train; it's 15600. But be careful: you shouldn't be using such tricks, because in a normal scenario this shouldn't happen. If you can share your preprocessing, folks can then help find what's causing it, unless the data is like that by default.
$endgroup$
– Aditya
yesterday
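One way to act on the suggestion above is to compare the index labels of the two objects. The snippet below is only a minimal sketch: the tiny `X_train` and `y_train` are hypothetical stand-ins, and it assumes both are pandas Series that still carry their original index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real training data.
X_train = pd.Series(["good", np.nan, "bad", "fine"], name="text")
y_train = pd.Series([1, 0, 0, 1], name="label")

# Dropping nulls from X_train alone shortens it but leaves y_train untouched.
X_train = X_train.dropna()

print(len(X_train), len(y_train))

# Index labels present in y_train but missing from X_train are the dropped rows.
missing_rows = y_train.index.difference(X_train.index)
print(list(missing_rows))
```

If `missing_rows` is non-empty, some preprocessing step removed rows from one object only.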
asked yesterday by Renae (edited yesterday)
1 Answer
$begingroup$
You likely get this problem because you remove rows containing nulls from X_train and y_train independently of each other. y_train probably has few or no nulls, while X_train probably has some. So when a row is removed from X_train and the same row is not removed from y_train, your data goes out of sync and the two end up with different lengths. Instead, you should remove nulls before you separate X and y.
Before this:
X = df['text']
y = df['label']
Do this:
df.dropna(inplace=True)
And remove this:
X_train.dropna(inplace=True)
X_test.dropna(inplace=True)
y_train.dropna(inplace=True)
y_test.dropna(inplace=True)
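Put together, the order of operations could look like the sketch below. It is illustrative only: the small DataFrame is made up, the split is assumed to be done with scikit-learn's `train_test_split` (the question does not show how it was split), and `subset=` is an optional addition that limits the null check to the two columns used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data; the column names 'text' and 'label' follow the lines above.
df = pd.DataFrame({
    "text": ["good", np.nan, "bad", "fine", "great", np.nan, "poor", "okay"],
    "label": [1, 0, 0, 1, 1, 0, 0, 1],
})

# Drop incomplete rows once, while text and label still share a row.
df = df.dropna(subset=["text", "label"])

X = df["text"]
y = df["label"]

# Assumed split step; X and y are split together, so they stay aligned.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

Because the nulls are removed before X and y are separated, each training row keeps its matching label and the lengths agree.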
$endgroup$
$begingroup$
That worked! Thank you :)
$endgroup$
– Renae
13 hours ago
answered yesterday by Simon Larsson (edited yesterday)