Credits to: Ye Joo Park, Sandip Sonawane from the University of Illinois Urbana-Champaign
<aside> 📌 In this data project, the main goal is to dig deep into the demographics of Starbucks App users and examine the effects of various loyalty offers that have been employed previously.
</aside>
The datasets used in this project come directly from Starbucks (they were originally provided to Udacity).
import pandas as pd
import numpy as np
df_transcript = pd.read_csv('https://github.com/bdi475/datasets/raw/main/starbucks-rewards/transcript.v2.csv.gz')
df_profiles = pd.read_csv('https://github.com/bdi475/datasets/raw/main/starbucks-rewards/profile.csv')
df_transcript_backup = df_transcript.copy()
df_profiles_backup = df_profiles.copy()
df_transcript.tail(10)
df_profiles.head(10)
Preview of the transcript and profiles datasets:
unique_events = df_transcript['event'].unique()
print(f'Event types: {unique_events}')
Event types: ['offer received' 'offer viewed' 'transaction' 'offer completed']
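Beyond listing the distinct event types, it can help to see how often each one occurs. The sketch below is only an illustration on a made-up series (`toy_events` and its values are hypothetical, not taken from the Starbucks data); the same `value_counts()` call could be applied to `df_transcript['event']`.

```python
import pandas as pd

# Hypothetical toy data standing in for df_transcript['event']
toy_events = pd.Series(
    ['offer received', 'offer viewed', 'transaction',
     'transaction', 'offer completed', 'transaction'],
    name='event',
)

# Count occurrences of each event type, most frequent first
event_counts = toy_events.value_counts()
print(event_counts)
```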
# Check the number of missing values in the profiles data
num_rows = df_profiles.shape[0]
num_cols = df_profiles.shape[1]
num_missing = df_profiles['gender'].isna().sum()
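The counts above are assigned but not displayed. As a minimal sketch of how they might be reported, the example below repeats the same calls on a small made-up frame (`toy_profiles`, its values, and `pct_missing` are assumptions for illustration, not part of the original notebook).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for df_profiles
toy_profiles = pd.DataFrame({
    'gender': ['F', np.nan, 'M', np.nan],
    'age': [55, 40, 68, 33],
})

# shape is a (rows, columns) tuple, so both values can be unpacked at once
num_rows, num_cols = toy_profiles.shape

# isna() marks missing entries; summing the booleans counts them
num_missing = toy_profiles['gender'].isna().sum()
pct_missing = num_missing / num_rows * 100

print(f'{num_missing} of {num_rows} rows ({pct_missing:.1f}%) have no gender')
```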
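# Remove profiles with a missing gender
df_profiles = df_profiles[df_profiles['gender'].notna()]
# Drop unused columns from the transactions data
# (df_transactions is not defined in this excerpt; it is presumably the 'transaction' rows of df_transcript)
df_transactions.drop(columns=['event', 'time', 'offer_id'], inplace=True)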
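One possible way to build a transactions table and drop the unused columns in a single step is sketched below. It rests on assumptions: that the transactions are the rows whose `event` is `'transaction'`, and that the toy frame's `person` and `amount` columns resemble the real data. Returning a new frame from `drop` instead of using `inplace=True` avoids modifying a filtered slice of the original.

```python
import pandas as pd

# Hypothetical toy data standing in for df_transcript
toy_transcript = pd.DataFrame({
    'person': ['a', 'a', 'b'],
    'event': ['offer received', 'transaction', 'transaction'],
    'time': [0, 6, 12],
    'offer_id': ['o1', None, None],
    'amount': [None, 4.5, 9.0],
})

# Keep only transaction rows (assumed definition of the transactions table),
# then drop the columns that are no longer needed
is_transaction = toy_transcript['event'] == 'transaction'
toy_transactions = (
    toy_transcript[is_transaction]
    .drop(columns=['event', 'time', 'offer_id'])
    .reset_index(drop=True)
)

print(toy_transactions)
```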