Find suitable locations using Machine Learning

Just for fun, I am currently trying to find suitable locations for new stores. What I have done so far is to take the sites of current stores and assign surrounding variables to them. These features include, for example, point-of-interest density, population density, and region popularity. In total I have 9,000 points with 100 dimensions each. 1,000 of these points already contain stores; the remaining 8,000 do not.

In the next step I want to perform dimensionality reduction using PCA. However, I am not sure how to proceed afterwards. Should I try to cluster the points? Or how can I "predict" which of the points are suitable candidates for new stores? Maybe using some kind of skip-gram model?

Hoping to get some advice :)

Cheers,
Tom

asked 2 days ago

Lossa

New contributor

machine-learning classification prediction

1 Answer

Are you sure PCA is the correct way to go?
This is an analytical problem, and being able to interpret the results is very important.

How about looking at the correlation between the number of stores and nearby features? Find out what makes a good location and which features matter most. You could run forward or backward selection, for example, or use another model or feature-selection technique.
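As a rough illustration of the correlation idea, here is a minimal sketch that ranks features by the absolute Pearson correlation with a 0/1 "store present" label. The data is synthetic and every name in it (`rank_features_by_correlation`, the choice of feature 3 as the driver) is an assumption made up for the example, not something from the question's real dataset.

```python
import numpy as np

def rank_features_by_correlation(X, y):
    """Sort feature indices by |Pearson correlation| with a 0/1 label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # center each feature
    yc = y - y.mean()                # center the label
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant features have zero variance; report a correlation of 0 for them.
    corr = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(corr))
    return order, corr

# Synthetic stand-in for the 9,000 x 100 table described in the question.
rng = np.random.default_rng(0)
n_sites, n_features, n_stores = 9000, 100, 1000
X = rng.normal(size=(n_sites, n_features))

# Assumption for the demo: feature 3 drives where stores end up.
latent = 2.0 * X[:, 3] + rng.normal(size=n_sites)
y = np.zeros(n_sites)
y[np.argsort(-latent)[:n_stores]] = 1.0   # the 1,000 highest-scoring sites get a store

order, corr = rank_features_by_correlation(X, y)
for j in order[:5]:
    print(f"feature {j}: correlation with store label = {corr[j]:+.3f}")
```

A ranking like this only captures linear, one-feature-at-a-time relationships, so it is a starting point for exploration rather than a replacement for forward or backward selection.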

It's not a pure machine learning case you have here. It's a typical analytical data science problem.

If you still want to do classification, just train a model. You have POI features and some others, and you know whether there is a store or not :) I might not fully understand the problem here. Build a training set in which 50% of the locations have a store and 50% do not, train a classifier on it, and then classify the other areas.

I'd still start by visualizing and understanding the data, as I mentioned first. That step is very underrated, and it is the way to start solving most problems.
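The balanced-training idea could be sketched roughly as follows. This is only an illustration on synthetic data: the logistic regression is a plain gradient-descent version written in NumPy, and the helper names (`balanced_indices`, `fit_logistic`, `sigmoid`) and the two "driver" features are assumptions invented for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the 9,000 x 100 site table from the question.
n_sites, n_features, n_stores = 9000, 100, 1000
X = rng.normal(size=(n_sites, n_features))
# Assumption for the demo: features 0 and 1 influence store placement.
latent = 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n_sites)
y = np.zeros(n_sites)
y[np.argsort(-latent)[:n_stores]] = 1.0

def balanced_indices(y, rng):
    """Pick equally many store (1) and non-store (0) rows, in random order."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    n = min(len(pos), len(neg))
    idx = np.concatenate([rng.choice(pos, size=n, replace=False),
                          rng.choice(neg, size=n, replace=False)])
    rng.shuffle(idx)
    return idx

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=300):
    """Logistic regression fitted by batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        err = sigmoid(X @ w + b) - y      # gradient of the log-loss w.r.t. the logit
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

# 50% store / 50% non-store training set.
idx = balanced_indices(y, rng)

# Standardize with statistics from the training rows only.
mu, sd = X[idx].mean(axis=0), X[idx].std(axis=0)
Xs = (X - mu) / sd

w, b = fit_logistic(Xs[idx], y[idx])
proba = sigmoid(Xs @ w + b)

# Score the non-store sites that were not used for training and list the best ones.
rest = np.setdiff1d(np.flatnonzero(y == 0), idx)
top_candidates = rest[np.argsort(-proba[rest])[:10]]
print("top candidate sites:", top_candidates)
```

Because the model is trained on a 50/50 sample, the scores are better read as a ranking of candidate sites than as calibrated probabilities.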

Hope that gave you some hints,

Cheers

answered 2 days ago

Carl Rynegardh


Hi Carl, I mean I can still interpret PCA using a correlation analysis between the principal components and the original variables, right? This should help to get an idea of what the data looks like. Still, it would be a nice idea to use the analytical solution to validate the classification result. Thx for your help!
– Lossa
2 days ago

You can look at how much each feature contributes to the principal components. I would not call that correlation analysis, but maybe that is something you can do. Be aware that these contributions are not always very interpretable.
– Carl Rynegardh
2 days ago
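To make the two comments above concrete, here is a small sketch that computes both quantities: the loadings (how much each feature contributes to a component) and the correlation between each original feature and each component. It uses PCA via SVD on standardized data; the feature names and the synthetic data are hypothetical, and the function name `pca_with_correlations` is made up for this example.

```python
import numpy as np

def pca_with_correlations(X, n_components):
    """PCA via SVD on standardized data.

    Returns (components, explained_variance_ratio, feature_pc_corr), where
    feature_pc_corr[k, j] is the correlation of feature j with component k.
    Assumes no feature is constant (std > 0).
    """
    n = X.shape[0]
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    components = Vt[:n_components]                 # rows are unit-length loading vectors
    ratio = (S ** 2 / (S ** 2).sum())[:n_components]
    # For standardized features: corr(x_j, PC_k) = loading_kj * S_k / sqrt(n)
    feature_pc_corr = components * (S[:n_components] / np.sqrt(n))[:, None]
    return components, ratio, feature_pc_corr

# Hypothetical feature names and synthetic data for illustration only.
names = ["poi_density", "population_density", "region_popularity", "noise_a", "noise_b"]
rng = np.random.default_rng(1)
base = rng.normal(size=500)
X = np.column_stack([
    base + 0.3 * rng.normal(size=500),   # poi_density, tied to a shared factor
    base + 0.3 * rng.normal(size=500),   # population_density, tied to the same factor
    rng.normal(size=500),
    rng.normal(size=500),
    rng.normal(size=500),
])

components, ratio, feature_pc_corr = pca_with_correlations(X, n_components=2)
for k in range(2):
    top = np.argsort(-np.abs(components[k]))[:2]
    print(f"PC{k + 1} ({ratio[k]:.0%} of variance):",
          ", ".join(f"{names[j]} (loading {components[k, j]:+.2f}, "
                    f"corr {feature_pc_corr[k, j]:+.2f})" for j in top))
```

When a component mixes many features with similar weights, neither the loadings nor the correlations give a clean story, which is the interpretability caveat raised above.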

