Hi! My name is Nikhil Sharma and I am an aspiring data scientist. I am in my fourth year at UCLA, where I study Statistics.

On campus, I am President of Bruin Sports Analytics. Last year, as Blog Chair, I oversaw all operations related to
the Data Journalism team, including editing all articles, advising writers on their analyses and writing pieces of my own. As
President, I manage all club operations by supervising the data journalism, research and consulting teams.

This past summer, I was the Advanced Analytics & Data Science Intern at L'Oréal USA, in New York, NY. While interning, I undertook various projects to help optimize marketing strategies, evaluate marketing costs and predict daily online revenues
for brands such as L'Oréal Paris, Maybelline New York, Matrix and Kiehl's. I achieved these goals through tree-based and regression modeling.

I am also currently working remotely with Dr. Shane Jensen, Associate Professor of Statistics at The Wharton School, on a
research project approaching the Ewing Theory in the NBA through statistical modeling. The Ewing Theory is the idea that
when a star athlete leaves a team, the team gets better.

Last summer, I was a Decision Sciences/Business Insights Intern at Communications Media, Inc. in Philadelphia, PA.
As an intern, I was mainly tasked with analyzing and predicting behavior of healthcare professionals for advertising
purposes through NLP and regression models.

In my free time, I enjoy watching NBA games and listening to hip-hop. I occasionally dabble in writing as well; I was once a
Contributor for the Arts & Entertainment section of UCLA's main newspaper, The Daily Bruin.

Feel free to contact me by email or on LinkedIn.

Constructing My All-Time NBA Lineup Through K-Means Clustering
November 2018, January 2019
Article, Part 1 Article, Part 2 Code

In the first part of this article, I created my best all-time NBA starting lineup with the help of k-means clustering and random
forest concepts in R. I made the visualizations in R using ggplot2 and Plotly. I revisited the project in part two, where I
addressed some criticisms of the first part and considered different methods to choose lineups.

American Statistical Association DataFest at UCLA, 2019 Finalist
April 2019
Presentation Code

I participated in the 2019 ASA DataFest at UCLA, where 75+ teams from universities across Southern California
were challenged to analyze data from the Canada women's national rugby union team with the goal of redefining athlete
fatigue. We redefined fatigue using k-means clustering on subjective player wellness scores, then examined how
fatigue under our definition affected on-field results. The judges selected us to advance to the finalist round.

Russell Westbrook: Point Guard or Shooting Guard?
December 2018
Article Code

For this article, a fellow Bruin Sports Analytics member and I built a logistic regression model in R that classified guards as
point guards or shooting guards. Then, we ran Russell Westbrook's stats through the model and thoroughly examined and
explained our findings on his status as a guard in the NBA. We made the visualizations in R using ggplot2.

Analyzing Denzel Curry’s Lyrics Through Text Mining Methods
May 2019
Article Code

With a friend, I performed sentiment analysis in R on the lyrics of Denzel Curry, a rapper from South Florida. Our analysis
and visualizations led to some fascinating insights about how the lyrical mood of his most recent album “TABOO” informed
its structure.

Predicting NBA Players’ Positions Using Machine Learning Methods
March 2019
Write-Up Code

I did this project with two classmates in the "Python and its Applications" course offered by the PIC department at UCLA. The
goal of this project was to construct a model that classified player positions in the NBA. I completed the entire
random forest/decision tree and analysis/conclusion portions and contributed to the data cleaning. All analysis and
visualizations were done in Python using scikit-learn and matplotlib.

The 3-Point Revolution in the NBA
September 2018
GIF Code

I created this visualization using the gganimate package in R. It shows how dramatically NBA teams have increased their
3-point shooting throughout the decades. You might recognize this GIF from the front page of this website!

On Mark Twain and Jay-Z
May 2016
TEDx Talk

I delivered this TEDx Talk during my senior year of high school. In it, I discussed the importance of basketball analytics in
relation to a research project I had done earlier that year analyzing NBA salaries.

Why LeBron James Left Cleveland
June 2018

I made this dashboard when I was first learning how to use Tableau at my internship at CMI. These visuals clearly show how
much LeBron had to do in the 2018 Playoffs to carry Cleveland to the Finals, and thus, the reason he left.

Blake Griffin: Sharpshooter or Bricklayer?
March 2018

I analyzed how Blake Griffin has trended towards becoming a perimeter shooter as his career has progressed. I made the
main spider chart using the fmsb package in R, and the other charts using Plotly's online client.

An In-Depth Comparison of the 06-07 Phoenix Suns and 17-18 Houston Rockets
March 2018

Two fellow BSA members and I compared and contrasted the 2006-07 Phoenix Suns and 2017-18 Houston Rockets, as both
teams were coached by Mike D'Antoni and had some fascinating similarities and differences. We made the visualizations in
Plotly's online client.

Jay-Z’s “4:44” marks rap mogul’s mature, reflective return to form
October 2017

Moving away from sports, I wrote this article when I was a contributor for the Daily Bruin. In it, I talked about how Jay-Z had
been stumbling musically for a few years until the release of his 2017 album "4:44".

For other statistical projects, check out my GitHub. For my Daily Bruin articles, check out my author page.

Analyzing the Ewing Theory (In Progress)
June 2018 - Present

I started this research project last summer and am continuing it remotely with Dr. Shane Jensen, Associate Professor of
Statistics at The Wharton School. We are using random forest, decision tree, and logistic regression models to assess the
validity of the Ewing Theory in the NBA (the idea that when a star athlete leaves a team, the team gets better).

Education Statistics
May 2019 - June 2019
Code Presentation

I served as Project Manager on a Spring 2019 project for DataRes, UCLA’s premier student-run data
science organization. Using a Kaggle dataset of global education statistics, we created a
presentation for DataRes's Poster Day highlighting the statistical differences in education between high- and low-income countries. Our work was featured in the Daily Bruin!

Examining the Effects of Exercise on Human Endorphin Levels Through Hypothesis Testing
May 2019 - June 2019
Code Paper

I wrote this final paper with a friend for my "Design and Analysis of Experiments" course taken in Spring 2019. We collected
data through an online "Island" and designed a Latin Square experiment to test the effects of different exercises on
endorphin secretion.

Understanding Methods of Measuring Vocabulary Ability Through Classical Test Theory, Factor Analysis and Item Response Theory
October 2018 - December 2018
Code Paper

I wrote this final paper for my "Measurement and its Applications" course taken in Fall 2018. It focuses on using Classical
Test Theory, Factor Analysis and Item Response Theory concepts to shorten a scale intended to measure vocabulary
strength. It was an interesting departure from what I normally do in terms of statistical studies.

Using a Dashboard to Analyze NBA Salaries
April 2016

I completed this project in my senior year of high school for my Research in Science class. Though it is quite rudimentary, I
would credit this project with thrusting me into the worlds of sports analytics and data science. This project also served as
the backbone for my TEDx Talk.