Crowd Detector

(Built Using Five Nodes on AWS)

In this project I have built an app that helps users avoid wasting time in crowds. Imagine if we had data about people's locations. It would be nice if we could use that data to detect clusters of population. The app takes a user's location of interest in San Francisco and a search radius, separated by a comma. The result shows the clusters of people on a map.

Here are the details of how I approached this problem:

Data Collection: Since such data is not available to me, I engineered it. I took data from Yelp, which contains the locations of restaurants in San Francisco, and performed a random walk to create more points around the restaurants.
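
To illustrate the random walk step, here is a minimal sketch in Python. It is not the project's actual generator: the function name `random_walk_points`, the `step_deg` step size, and the seed coordinates are all assumptions made for this example.

```python
import random

def random_walk_points(lat, lon, n_points, step_deg=0.0005, seed=None):
    """Generate n_points synthetic locations by a random walk from (lat, lon).

    step_deg is a hypothetical maximum step per axis, in degrees.
    """
    rng = random.Random(seed)
    points = []
    cur_lat, cur_lon = lat, lon
    for _ in range(n_points):
        # Each step moves at most step_deg along each axis.
        cur_lat += rng.uniform(-step_deg, step_deg)
        cur_lon += rng.uniform(-step_deg, step_deg)
        points.append((cur_lat, cur_lon))
    return points

# Illustrative seed point in San Francisco (not taken from the Yelp data).
restaurant = (37.7749, -122.4194)
people = random_walk_points(*restaurant, n_points=5, seed=42)
for lat, lon in people:
    print(f"{lat:.5f},{lon:.5f}")
```

Each generated point stays close to the previous one, so the synthetic people form a loose cloud around the restaurant they started from.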
Here is what my pipeline looks like:

I use streaming k-means in Spark Streaming. Two indices are created in Elasticsearch. One contains data about people's locations and is updated every day because of limited storage. The other index contains the locations of 1,000,000 people and is updated every 3 minutes.
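
The idea behind streaming k-means can be sketched in plain Python: each incoming batch of points nudges the existing centers toward the batch means, with older data optionally discounted. This is a simplified illustration, not Spark's implementation and not the project's code. The names `assign`, `update_centers`, and `decay`, and the way weights are discounted, are assumptions of this sketch.

```python
def assign(point, centers):
    """Return the index of the center nearest to point (squared Euclidean)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )

def update_centers(centers, weights, batch, decay=1.0):
    """Return new (centers, weights) after absorbing one batch of points.

    decay=1.0 weighs all history equally; decay=0.0 uses only the latest batch.
    """
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in batch:
        i = assign(point, centers)
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += point[d]

    new_centers, new_weights = [], []
    for i, center in enumerate(centers):
        old = weights[i] * decay          # discounted weight of past points
        total = old + counts[i]
        if total == 0:
            # No history and no new points: keep the center where it is.
            new_centers.append(list(center))
            new_weights.append(0.0)
            continue
        new_centers.append(
            [(center[d] * old + sums[i][d]) / total for d in range(dim)]
        )
        new_weights.append(total)
    return new_centers, new_weights

centers = [[0.0, 0.0], [10.0, 10.0]]
weights = [1.0, 1.0]
batch = [(2.0, 0.0), (10.0, 12.0)]
centers, weights = update_centers(centers, weights, batch)
print(centers, weights)
```

A smaller `decay` makes the centers follow recent movement more quickly, which matters when the crowd map is refreshed every few minutes.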

Some engineering challenges:

Tuning Kafka, Spark Streaming and Elasticsearch to update the map as quickly as possible. In particular, batch intervals have to be tuned carefully to avoid situations where the map has no points.
Choosing k.
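
One common way to compare candidate values of k is to compute the clustering cost (the sum of squared distances from each point to its nearest center) for each k and look for the point where adding clusters stops helping much. Whether this project chose k that way is not stated here. The sketch below only shows the cost computation, and the name `clustering_cost` and the sample points are made up for illustration.

```python
def clustering_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers)
        for point in points
    )

# Two well-separated groups of illustrative points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

cost_k1 = clustering_cost(points, [(5.0, 1.0)])
cost_k2 = clustering_cost(points, [(0.0, 1.0), (10.0, 1.0)])
print(cost_k1, cost_k2)
```

Here the cost drops sharply from one center to two, which is the kind of change one would look for when comparing values of k.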
