Home


Date of Inspection Violations


FS Recommender

DBA                           Pred_SCORE
’CESCA                        13
’ESSEN                        11
’RITAS                        13
’WICHCRAFT                    13
#1 Chinese Restaurant         12
#1 GARDEN CHINESE RESTAURANT  13
#1 NATURAL JUICE BAR          13
#1 SABOR LATINO RESTAURANT    4
$1 PIZZA                      13
$1.25 PIZZA                   13
% SHAO BIN ZHENG              12
& PIZZA                       11
&PIZZA                        13

© 2020. The Data Incubator. All Rights Reserved.

Top Recommendations

Explore


Time Series Date of Inspection Violations

DBAs with Most Violations


Cuisine Type

DBAs

DBA                                    No_of_Violations
DUNKIN’                                3070
SUBWAY                                 2319
STARBUCKS                              1668
MCDONALD’S                             1536
KENNEDY FRIED CHICKEN                  1196
CROWN FRIED CHICKEN                    973
DUNKIN’, BASKIN ROBBINS                833
BURGER KING                            747
POPEYES                                721
GOLDEN KRUST CARIBBEAN BAKERY & GRILL  677
CHIPOTLE MEXICAN GRILL                 604
LE PAIN QUOTIDIEN                      511
DOMINO’S                               461
NA                                     418
KFC                                    401
CHECKERS                               373
WENDY’S                                333
PRET A MANGER                          327
VIVI BUBBLE TEA                        312
JUST SALAD                             294
JOE & THE JUICE                        288
BREAD & BUTTER                         270
BLUESTONE LANE                         258
TEXAS CHICKEN & BURGERS                232
BAREBURGER                             230


Top Cuisines

Boro

Pie

Street


NYC Street Map


Top Streets

Top 25 Streets Table

STREET               No_of_Violations
BROADWAY             11494
3 AVENUE             8838
2 AVENUE             6833
5 AVENUE             6823
8 AVENUE             5635
1 AVENUE             5275
7 AVENUE             4708
LEXINGTON AVENUE     4461
AMSTERDAM AVENUE     4455
9 AVENUE             4346
FLATBUSH AVENUE      3131
NOSTRAND AVENUE      2957
FULTON STREET        2837
4 AVENUE             2811
GRAND STREET         2586
86 STREET            2031
MADISON AVENUE       2009
WESTCHESTER AVENUE   1950
BEDFORD AVENUE       1927
CONEY ISLAND AVENUE  1913
CHURCH AVENUE        1889
10 AVENUE            1774
6 AVENUE             1768
COLUMBUS AVENUE      1764
FOREST AVENUE        1750


Zip Code


Zip Code


Top Zip Codes

Top 25 Table

ZIPCODE  No_of_Violations
10003    10221
10019    9212
10013    8494
10002    8028
10036    7875
10001    6891
10016    6745
11220    6709
10022    6384
10012    6327
10011    6274
11201    5720
10014    5630
10018    5249
11211    5169
10017    5060
11215    4912
10009    4392
11209    4265
11217    3731
11237    3673
10025    3626
10010    3513
11238    3433
10029    3338


Model


UBCF & IBCF Collaborative Recommendation Models

The Food Score Recommender system was built with two collaborative filtering models, UBCF and IBCF. This section describes both approaches and reports validation metrics for each.

UBCF: user-based collaborative filtering. Recommendations come from the ratings of users similar to the target user.

IBCF: item-based collaborative filtering. Recommendations come from items similar to those the target user has already rated.

Recommender Method Type

Sparse Matrix Object

This is the sparse rating matrix object used for model development. It contains only DBAs that passed the filter: no critical flags, an A grade, and a score below 20.

[1] "realRatingMatrix"
attr(,"package")
[1] "recommenderlab"

Recommender Method

This step builds a model using the POPULAR method on the first 100 data points and issues top-N recommendations (TopN = 5) for a new DBA.

[1] "POPULAR"

Recommender Ratings Conversion

The number of ratings from the POPULAR method.

[1] 11268

Prediction Ratings
$`'CESCA`
[1] 2.802913

$`'ESSEN`
[1] 5.678246

$`'RITAS`
[1] 2.802913

$`'WICHCRAFT`
[1] 4.802913

$`#1 Chinese Restaurant`
[1] 3.148508

$`#1 GARDEN CHINESE RESTAURANT`
[1] 2.802913

$`#1 NATURAL JUICE BAR`
[1] 4.136246

$`#1 SABOR LATINO RESTAURANT`
[1] 2.672806

$`$1 PIZZA`
[1] 2.802913

$`$1.25 PIZZA`
[1] 4.802913

$`% SHAO BIN ZHENG`
[1] 3.981842

$`& PIZZA`
[1] 3.678246

$`&PIZZA`
[1] 2.802913

Predicted Item Labels
 [1] "0"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "17" "21"

Predicted DBA Scores
$`'CESCA`
[1] 13

$`'ESSEN`
[1] 11

$`'RITAS`
[1] 13

$`'WICHCRAFT`
[1] 13

$`#1 Chinese Restaurant`
[1] 12

$`#1 GARDEN CHINESE RESTAURANT`
[1] 13

$`#1 NATURAL JUICE BAR`
[1] 13

$`#1 SABOR LATINO RESTAURANT`
[1] 4

$`$1 PIZZA`
[1] 13

$`$1.25 PIZZA`
[1] 13

$`% SHAO BIN ZHENG`
[1] 12

$`& PIZZA`
[1] 11

$`&PIZZA`
[1] 13

IBCF Collaborative Model

Collaborative Model Creation

The model is an IBCF (Item-Based Collaborative Filtering) model trained on 5,000 ratings. IBCF is similar to UBCF (User-Based Collaborative Filtering) but is less memory-intensive, because it does not need to hold the entire user database in local memory.

Essentially, the model internally computes the cosine similarity between rating vectors (item vectors for IBCF, user vectors for UBCF), which in R is as simple as:

crossprod(a,b)/sqrt(crossprod(a)*crossprod(b))
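
As a minimal sketch of how that one-liner behaves, the base R snippet below wraps it in a function and applies it to a few made-up rating vectors. The function name cosine_sim and the vectors a, b, and d are hypothetical and chosen only for illustration; this is not the internal recommenderlab code.

```r
# Cosine similarity between two rating vectors, using the one-liner above.
cosine_sim <- function(a, b) {
  # crossprod() returns a 1x1 matrix; drop() reduces it to a plain number.
  drop(crossprod(a, b) / sqrt(crossprod(a) * crossprod(b)))
}

# Hypothetical rating vectors (made-up values, not from the dashboard data).
a <- c(2, 0, 4, 6)
b <- c(4, 0, 8, 12)  # same direction as a, twice the magnitude
d <- c(0, 5, 0, 0)   # no non-zero positions shared with a

sim_ab <- cosine_sim(a, b)  # vectors pointing the same way give 1
sim_ad <- cosine_sim(a, d)  # vectors with no overlap give 0
print(c(sim_ab = sim_ab, sim_ad = sim_ad))
```

Cosine similarity depends only on the direction of the vectors, so scaling every rating by the same factor leaves the similarity unchanged.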


Model Development

The IBCF collaborative model reads 5,000 ratings and then predicts ratings for the second-indexed DBA.

IBCF Model Results

    RMSE      MSE      MAE 
1.970775 3.883953 1.811243 
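
For readers unfamiliar with these three metrics, the following base R sketch shows how they are conventionally defined. The helper name accuracy and the two small vectors are hypothetical; the dashboard itself relies on library functions to compute the reported values.

```r
# Conventional definitions of the three error metrics reported above.
accuracy <- function(predicted, actual) {
  err <- predicted - actual
  mse <- mean(err^2)                 # mean squared error
  c(RMSE = sqrt(mse),                # root mean squared error
    MSE  = mse,
    MAE  = mean(abs(err)))           # mean absolute error
}

# Hypothetical held-out ratings and predictions (made-up values).
actual    <- c(2, 4, 0, 6)
predicted <- c(3, 2, 1, 6)

metrics <- accuracy(predicted, actual)
print(metrics)
```

Because MSE is the square of RMSE, the reported figures can be sanity-checked: 1.970775 squared is approximately 3.883953, consistent with the table.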

User-User Collaborative Filtering

In user-based collaborative filtering (UBCF), the procedure is to first find other users that are similar to a given user, then find the items those users rated most highly. Those items are then recommended to the given user.
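
The procedure can be sketched in base R as a similarity-weighted average of neighbors' ratings. This is a simplified illustration under stated assumptions: the rating matrix, the user and item names, and the helpers cosine_sim and predict_ubcf are all hypothetical, and the sketch omits steps a library implementation may apply, such as rating normalization and limiting the number of neighbors.

```r
# Hypothetical ratings: rows are users, columns are items, NA = not rated.
ratings <- rbind(
  u1 = c(5, 4, NA, 1),
  u2 = c(5, 5, 4, 1),
  u3 = c(1, 2, 5, 5)
)
colnames(ratings) <- c("A", "B", "C", "D")

# Cosine similarity computed over the items both users have rated.
cosine_sim <- function(a, b) {
  ok <- !is.na(a) & !is.na(b)
  sum(a[ok] * b[ok]) / sqrt(sum(a[ok]^2) * sum(b[ok]^2))
}

# Predict a user's rating for an item as the similarity-weighted
# average of the ratings given by the other users who rated that item.
predict_ubcf <- function(ratings, user, item) {
  others <- setdiff(rownames(ratings), user)
  others <- others[!is.na(ratings[others, item])]
  sims <- sapply(others, function(o) cosine_sim(ratings[user, ], ratings[o, ]))
  sum(sims * ratings[others, item]) / sum(sims)
}

pred <- predict_ubcf(ratings, "u1", "C")
print(pred)
```

In this toy data u1 rates items much like u2 and unlike u3, so the prediction for item C lands closer to u2's rating of 4 than to u3's rating of 5.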

Recommendation Frequency Rating Table
Rating frequency
vector_ratings Freq
0 191167
2 17713
4 11040
6 3147
8 1214
10 484
12 196
14 95
16 72
18 31
20 24
22 22
24 18
26 12
28 8
30 8
32 11
34 8
36 6
38 11
40 2
42 5
44 6
46 3
48 3
50 5
52 2
54 2
58 2
60 2
64 3
66 2
72 1
74 1
78 3
82 2
86 1
88 2
94 1
96 2
100 2
106 1
108 1
110 1
112 2
114 1
116 1
130 3
132 1
140 1
142 1
144 1
152 2
154 1
156 2
160 1
168 2
174 1
182 1
190 2
194 3
196 1
208 1
238 1
252 1
258 1
280 1
310 1
Ratings Dimensions
[1] 5425   12

Splitting Data

To split the data into training and test sets, we can use the evaluationScheme() function in recommenderlab. It extends generic data-splitting methods with several parameters specific to recommender systems. As the output below shows, one parameter (given) specifies how many items to use for each user, and another (goodRating) specifies the minimum value that counts as a good rating.

Evaluation scheme with 1 items given
Method: 'split' with 1 run(s).
Training set proportion: 0.800
Good ratings: >=1.000000
Data set: 5425 x 12 rating matrix of class 'realRatingMatrix' with 19345 ratings.
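
The summary above is consistent with a call along the lines of evaluationScheme(data, method = "split", train = 0.8, given = 1, goodRating = 1), although the exact call is not shown here. To make the idea concrete, the base R sketch below imitates such a split by hand on a small made-up matrix. It is a conceptual illustration only, not recommenderlab's implementation, and all names and values in it are hypothetical.

```r
set.seed(42)  # arbitrary seed so the illustration is reproducible

# Hypothetical rating matrix: 10 users x 12 items, NA = not rated.
ratings <- matrix(sample(c(NA, 2, 4, 6), 120, replace = TRUE), nrow = 10,
                  dimnames = list(paste0("user", 1:10), paste0("item", 1:12)))

train_prop <- 0.8  # mirrors "Training set proportion: 0.800"
given      <- 1    # mirrors "1 items given"

# Randomly assign 80% of the users to training and the rest to testing.
train_idx <- sample(nrow(ratings), size = floor(train_prop * nrow(ratings)))
train <- ratings[train_idx, , drop = FALSE]
test  <- ratings[-train_idx, , drop = FALSE]

# For each test user, keep `given` rated items as "known" input and hold
# out the remaining ratings as "unknown" for scoring the predictions.
known   <- test
unknown <- test
for (u in rownames(test)) {
  rated <- which(!is.na(test[u, ]))
  keep  <- rated[sample.int(length(rated), given)]
  known[u, setdiff(rated, keep)] <- NA
  unknown[u, keep] <- NA
}

print(dim(train))
print(rowSums(!is.na(known)))
```

A model would then be fitted on train, fed the known ratings of each test user, and scored against the unknown ratings.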

We now build a UBCF model using the default parameters of the Recommender() function, and use it to predict using the test portion of the data set. We use library functions to evaluate accuracy of the prediction by comparing against values in the data set. Performance metrics for the UBCF model are displayed.

UBCF Model Results

                                          RMSE      MSE       MAE
'ESSEN                                1.986125 3.944691 1.5111111
#1 Chinese Restaurant                 2.128673 4.531250 2.1250000
1 OR 8                                2.173067 4.722222 2.1666667
104-01 FOSTER AVENUE COFFEE SHOP(UPS) 1.166424 1.360544 0.7619048
118 KITCHEN                           1.056531 1.116259 0.8714286
16TH AVENUE GLATT                     0.728869 0.531250 0.6250000


About


Introduction

The NYC Health Department discovered 10 foodborne illness outbreaks since 2012 using Yelp reviews.

—Columbia University


Abstract

Foodborne illness outbreaks and food poisonings are becoming increasingly frequent. There are over 3,000 deaths a year in the USA due to food poisoning. To combat this trend, a web app was created that focuses on NYC inspection violations of food establishments. Food Score is a dashboard designed to recommend the cleanest restaurants in New York City by borough.




The Data Incubator

Food Score was designed and created by Kyle W. Brown as a Capstone project for the Data Incubator Summer ’20 Cohort.



The Data Incubator is a data science education company that offers an intensive, immersive, 8-week, full-time bootcamp for those with advanced STEM degrees. It also provides placement services and corporate data science training, including training for Fortune 500 clients. The program has four campuses: New York City, Washington DC, San Francisco, and Online.

  • According to Venture Beat, the program had over 1000 applicants from over 80 universities in its first round and accepted just under 3% of all applicants. The program was selected by Business Insider as one of 15 competitive programs in the world with more competitive admissions than Harvard.

  • Accepting and training only the best STEM postgraduates gives hiring partners confidence that they are getting the best when working with us. The Data Incubator teaches the most in-demand skills and the best open-source tools, preparing students to jump right in and make a difference.

Link to apply:


Data

NYC Restaurant Inspection Violations

The data used for Food Score came exclusively from the NYC OpenData website, specifically the restaurant inspection results from August 2014 to the present.



The NYC Inspection Violations can be found here:


Problem

The following points put food safety and foodborne illness outbreaks into perspective and show why they are a problem.

  • Food safety is a concern in NYC, as the number of violations has risen significantly, by an average of 28% over the past 3 years.

  • Across the USA there are 3,000 deaths a year due to food poisoning, and foodborne illness outbreaks are common in NYC.

  • New York City has one of the highest concentrations of restaurants in the world, with 27,000.

  • Tourist revenue accounted for $44 billion in 2017.


Value Statement

Food safety is the central focus of the platform, whether for consumers, tourists, or food and drink establishment revenue. Food Score’s value statement is designed to drive value through tourist revenue and a customer-centric approach.

  • The main feature of Food Score is a recommendation system that recommends establishments with no critical flags, scores of less than 20, and only A grades.

  • In 2018, 65 million people traveled to NYC, accounting for $9 billion spent in food and drink establishments.

  • The main deliverable is a dashboard that integrates the recommender model, produces a list of the cleanest restaurants in New York City by borough, and maps them.


NYC Competition

  • SALT lets you view a restaurant’s menu, make a reservation through OpenTable when applicable, and even allows you to request an Uber to any of your saved locations.

  • ChefsFeed capitalizes on the credibility and clout of leading professional chefs to help New Yorkers discover new spots in a social media-type network.

  • PopCity allows users to map any food photos they find on social media outlets like Instagram or on the PopCity discovery channel. Using Instagram’s photo copy link feature, you can immediately import a post to your PopCity map.


End-Users

The end-user for Food Score is anyone looking to grab a bite to eat, whether it’s close to work, a hotel, shopping, or, most importantly, Broadway in New York City. The value to end-users lies in knowing the cleanest and safest restaurants and their locations, whether by building, street, borough, or cuisine type.

More specifically, end-users are:

  • Restaurateurs
  • Foodies
  • Bloggers
  • General Consumers

Summary

Food Score was created to recommend the cleanest restaurants in New York City by borough. The recommendation system uses recommenderlab’s UBCF and IBCF collaborative models. The model considers only establishments with an A grade, a score lower than 20, and no critical flags. There is a vibrant market for Food Score in NYC, where competitors such as ChefsFeed, PopCity, and SALT offer similar applications.

Food Score provides a new approach to finding clean restaurants in NYC, and could have an impact by:

  • Influencing the NYC food scene.
  • Reducing the likelihood of the next foodborne illness outbreak.
  • Increasing tourist spending in restaurants; a 1% increase would account for $90 million in revenue.

Food safety, and raising awareness of it, is the core value and mission of Food Score. By recommending the cleanest restaurants in NYC, we hope to cut down on food poisoning deaths and foodborne illness outbreaks.



Contact

Kyle W. Brown

Food Score was designed and created by Kyle W. Brown as a Capstone project for the Data Incubator Summer ’20 Cohort.



kylewbrown@worldcapitalis.com | hackerrank.com/kylewbrown | github.com/kyle-w-brown | rpubs.com/kylewbrown

Kyle W. Brown holds an M.S. in Data Science & Business Analytics with a concentration in Advanced Computing from Wayne State University. He is an advanced technology researcher, entrepreneur, author, and national leadership award recipient whose research interests include accelerator hardware for datacenters, automotive embedded systems, and particle physics. Outside of research, Kyle’s hobbies are volunteering for the Detroit Economic Club as a Young Leader, reading classical literature, and medicinal chemistry.

Achievements

  • Chairperson SAE World Congress Experience 2019 (WCX19) as well as presenter; presenter at SAE WCX Digital Summit 2020.
  • 2005 National Leadership Award.
  • 2005 Appointed Honorary Chairman to the Business Advisory Council (BAC).
  • Distinguished Alumni presenter, Engineering Technology. Wayne State University College of Engineering Hall of Fame Ceremony 2018. (3 presenters)

Maximizing Shareholder Value




WorldCapital Integrated Solutions, LLC.

Integrating the World with Innovative Solutions

(800) 846-8693 | worldcapitalis.com | info@worldcapitalis.com | github.com/WorldCapital | rpubs.com/WorldCapital

WorldCapital Integrated Solutions, LLC. (WCIS) is an investment banking firm focusing on economic development in emerging markets. We specialize in raising capital, corporate valuations, advanced technology, and research, and we also assist clients and the community with business development, investment strategies, and open source software solutions.

Specializing in solutions for:

  • Advanced Technology
  • Risk Management
  • Open Source Software
  • Businesses
