Credits to: Ye Joo Park, Sandip Sonawane from the University of Illinois Urbana-Champaign

<aside> 📌 The main goal of this data project is to examine the demographics of Starbucks app users and the effects of the loyalty offers that have previously been sent to them.


The datasets used in this project come directly from Starbucks (they were originally provided to Udacity).

  1. Transcript: a list of all purchases (transactions) and events related to loyalty offers
  2. Profile: demographic data for each customer in the Rewards app; fields that a customer has not provided appear as np.nan

Load pandas, NumPy, and datasets

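import pandas as pd
import numpy as np
# Load the two CSV files (replace '<>' with the path to each file)
df_transcript = pd.read_csv('<>')
df_profiles = pd.read_csv('<>')
# Keep untouched copies so the raw data can be restored after cleaning
df_transcript_backup = df_transcript.copy()
df_profiles_backup = df_profiles.copy()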

Visualize the transcript and profiles datasets
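
One quick way to inspect both tables is to print their shape, first rows, and column types. The sketch below uses small stand-in DataFrames rather than the real files; the names `toy_transcript` and `toy_profiles` and every column name (`person`, `event`, `time`, `id`, `gender`, `age`, `income`) are assumptions for illustration and may not match the actual datasets.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real datasets; column names are assumed.
toy_transcript = pd.DataFrame({
    "person": ["a1", "a1", "b2"],
    "event": ["offer received", "transaction", "offer viewed"],
    "time": [0, 6, 12],
})
toy_profiles = pd.DataFrame({
    "id": ["a1", "b2"],
    "gender": ["F", np.nan],
    "age": [55, 40],
    "income": [72000.0, np.nan],
})

# Print the size, the first rows, and the column types of each table.
for name, df in [("transcript", toy_transcript), ("profiles", toy_profiles)]:
    print(f"{name}: {df.shape[0]} rows x {df.shape[1]} columns")
    print(df.head())
    print(df.dtypes)
```

The same loop should work on `df_transcript` and `df_profiles` once they are loaded.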



Find the unique event types

unique_events = df_transcript['event'].unique()
print(f'Event types: {unique_events}')

Event types: ['offer received' 'offer viewed' 'transaction' 'offer completed']
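
To get the number of event types and how often each one occurs, `nunique()` and `value_counts()` are a natural follow-up to `unique()`. This is a minimal sketch on a made-up event log (`toy_events` is hypothetical); only the four event names come from the output above.

```python
import pandas as pd

# Hypothetical event log that reuses the four event types printed above.
toy_events = pd.DataFrame({
    "event": [
        "offer received", "offer viewed", "transaction",
        "transaction", "offer completed", "offer received",
    ]
})

# nunique() returns the number of distinct event types;
# value_counts() returns the number of rows for each type.
n_event_types = toy_events["event"].nunique()
event_counts = toy_events["event"].value_counts()

print(f"Number of event types: {n_event_types}")
print(event_counts)
```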

Cleaning the datasets

# Count the rows, columns, and missing gender values in the profiles data
num_rows = df_profiles.shape[0]
num_cols = df_profiles.shape[1]
num_missing = df_profiles['gender'].isna().sum()

# Keep only the rows where gender is present
df_profiles = df_profiles[df_profiles['gender'].notna()]
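
The counts computed in this step are never printed, so it can help to report them before filtering. The sketch below does that on a small stand-in table; `toy_profiles`, its values, and the `income` column are assumptions for illustration, not the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical profiles table; values and the income column are assumed.
toy_profiles = pd.DataFrame({
    "id": ["a1", "b2", "c3", "d4"],
    "gender": ["F", np.nan, "M", np.nan],
    "income": [72000.0, np.nan, 51000.0, np.nan],
})

# Share of profiles with no gender recorded.
num_rows = toy_profiles.shape[0]
num_missing = toy_profiles["gender"].isna().sum()
pct_missing = 100 * num_missing / num_rows
print(f"{num_missing} of {num_rows} profiles ({pct_missing:.1f}%) have no gender")

# Missing values per column, to see whether other fields are affected too.
missing_per_column = toy_profiles.isna().sum()
print(missing_per_column)

# Same filter as above: keep only rows where gender is present.
cleaned = toy_profiles[toy_profiles["gender"].notna()]
print(f"Rows after cleaning: {len(cleaned)}")
```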

Merge the profiles dataset into the transcript dataset
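
A merge of this kind might look like the sketch below. It assumes the transcript identifies customers in a `person` column and the profiles table in an `id` column; both key names, the toy data, and the choice of a left join are assumptions and may differ from what the project actually used.

```python
import pandas as pd

# Hypothetical tables; the key columns "person" and "id" are assumed names.
toy_transcript = pd.DataFrame({
    "person": ["a1", "a1", "b2", "c3"],
    "event": ["offer received", "transaction", "transaction", "offer viewed"],
})
toy_profiles = pd.DataFrame({
    "id": ["a1", "b2"],
    "gender": ["F", "M"],
    "age": [55, 40],
})

# A left join keeps every transcript row; customers with no matching
# profile (here "c3") get NaN in the demographic columns.
merged = toy_transcript.merge(
    toy_profiles, how="left", left_on="person", right_on="id"
).drop(columns="id")

print(merged)
```

Passing `how="inner"` instead would drop events from customers whose profiles were removed during cleaning, which may be preferable when every row needs demographic data.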