The aim of the project is to study the effectiveness of phone operators of the virtual telephony service CallMeMaybe.
To define ineffective operators we will study the following features:
We will also study these features for every tariff plan the company has (A, B, C) and define ineffective operators for each of these plans.
We will test whether the difference in operator performance between tariff plans is statistically significant.
Finally, we will draw basic conclusions and develop recommendations on how to improve operator performance.
Part 1. Data preprocessing
Reading the data base
Checking and changing the data types
Checking for missing values
Checking for duplicated data
Conclusion
Part 2. Exploratory data analysis (EDA)
Correlation matrix
- missed incoming calls
- waiting time
- outgoing calls
- internal calls
Ineffective Operators
Studying the features for different tariff plans
Parameters of efficiency by tariff plans
List of ineffective operators by tariff plan
Conclusion
Part 3. Hypotheses testing
Mann-Whitney U Test
T-test
Part 4. Overall conclusion
Tableau dashboard
Presentation
Sources
import pandas as pd
from IPython.display import display
import plotly.express as px
from plotly import graph_objects as go
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
from numpy import median
from scipy import stats as st
import seaborn as sns
import math
from plotly.subplots import make_subplots
#adjustments of the path for the Practicum platform:
#df=pd.read_csv('/telecom_clients_us.csv')
df_cl = pd.read_csv('./telecom_clients_us.csv')
display(df_cl.head())
#adjustments of the path for the Practicum platform:
#df=pd.read_csv('/telecom_dataset_us.csv')
df_dataset = pd.read_csv('./telecom_dataset_us.csv')
display(df_dataset.head())
df_cl.info(memory_usage='deep')
#changing data type
df_cl['date_start'] = pd.to_datetime(df_cl['date_start'])
df_cl.info(memory_usage='deep')
df_dataset.info(memory_usage='deep')
#'date' column
#changing data type
df_dataset['date'] = pd.to_datetime(df_dataset['date'], format=('%Y-%m-%d %H:%M:%S'))
#getting rid of time zones
df_dataset['date']= df_dataset['date'].dt.tz_localize(None)
df_dataset['internal'] = df_dataset['internal'].astype(bool)
df_dataset.info(memory_usage='deep')
Changing the data type of the columns (object to datetime and object to boolean) has significantly reduced memory usage: from 58.0 KB to 35.8 KB and from 7.3 MB to 4.2 MB for the two data sets. After dealing with the missing values we will also change the type of the 'operator_id' column.
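The saving comes from replacing Python-object columns with fixed-width dtypes. A minimal sketch on toy data (hypothetical values, not the real dataset) illustrating the effect:

```python
import pandas as pd

# Toy frame mimicking the structure of df_dataset (hypothetical values)
toy = pd.DataFrame({
    'date': ['2019-08-01 00:00:00'] * 1000,   # stored as object (strings)
    'internal': ['True', 'False'] * 500,      # stored as object (strings)
})
before = toy.memory_usage(deep=True).sum()

toy['date'] = pd.to_datetime(toy['date'])     # object -> datetime64[ns], 8 bytes per value
toy['internal'] = toy['internal'] == 'True'   # object -> bool, 1 byte per value

after = toy.memory_usage(deep=True).sum()
print(after < before)  # True: fixed-width dtypes take far less memory than object columns
```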
# checking for missing values in the data set
df_dataset.isnull().sum()
df_dataset['operator_id'].value_counts()
df_dataset['user_id'].nunique()
df_cl['tariff_plan'].value_counts()
df_cl.isnull().sum()
df_dataset.shape
df_dataset.groupby(['date']).size().plot(
    kind='bar',
    grid=True,
    title='Number of records per day',
    figsize=(25,5)
)
plt.show()
We can see a clear weekly pattern: the amount of data collected on Saturdays and Sundays is significantly smaller than on other days of the week.
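The weekly pattern can be made explicit by aggregating by day of week rather than by date. A minimal sketch on toy data (hypothetical dates and counts, not the real dataset):

```python
import pandas as pd

# Hypothetical mini-sample: more records on a weekday (Thursday) than on the weekend
sample = pd.DataFrame({'date': pd.to_datetime(
    ['2019-08-01'] * 8 +   # Thursday
    ['2019-08-03'] * 3     # Saturday
)})

# Aggregating by day name makes the weekday/weekend gap explicit
by_weekday = sample.groupby(sample['date'].dt.day_name()).size()
print(by_weekday['Thursday'] > by_weekday['Saturday'])  # True
```

The same groupby on the real df_dataset would confirm the Saturday/Sunday dip numerically.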
Studying missing values in 'operator_id'.
df_dataset.groupby(['user_id'])['operator_id'].nunique().reset_index().sort_values(by='operator_id', ascending=False)
According to the table, users can have from 0 to 50 different operators.
users_operators = df_dataset.groupby(['user_id'])['operator_id'].nunique().reset_index()\
.groupby(['operator_id'])['user_id'].nunique().reset_index()
users_operators.columns = ['operators_cnt', 'users_cnt']
users_operators.head(3)
users_operators.plot(x='operators_cnt', y='users_cnt', kind='bar', grid=True, figsize=(20,5), cmap='Set2', \
title='Number of unique users per operators')
plt.show()
17 unique users are assigned to no operator;
1 operator has 107 unique users;
2 operators have 63 unique users;
50 operators have 1 unique user.
#Calculating the 95th and 99th percentiles for the number of users.
np.percentile(users_operators['users_cnt'], [95, 99])
Only 5% of operators have more than 60 unique users; only 1% of operators have more than 97 unique users.
#calculating percentage of missing values:
df_dataset.isnull().sum()/len(df_dataset)*100
The percentage of missing values in 'operator_id' is 15%. As one user can have more than 1 operator (the number of operators per user varies from 1 to 50), we cannot use the transform() method grouping by users to fill the missing values. We will have to drop this part of the data, as the missing values cannot be restored.
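The ambiguity can be seen on a toy example (hypothetical IDs, not from the real dataset): when a user's calls are handled by several operators, a group-wise fill has no single correct value to choose.

```python
import pandas as pd
import numpy as np

# Hypothetical mini-sample: one user handled by two operators, plus a row with a missing ID
calls = pd.DataFrame({
    'user_id': [1, 1, 1],
    'operator_id': [900001.0, 900002.0, np.nan],
})

# A transform-based fill would have to pick one of the group's operators arbitrarily:
candidates = calls.groupby('user_id')['operator_id'].nunique().iloc[0]
print(candidates)  # 2 -> no unambiguous value to restore, so such rows are dropped
```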
df_dataset = df_dataset.dropna(axis=0, subset=['operator_id'])
df_dataset.isnull().sum()/len(df_dataset)*100
df_dataset['internal'].value_counts()
As we now have only a minor share of missing values in the 'internal' column (<2% of the data set), we can drop them and proceed with the data.
df_dataset = df_dataset.dropna(axis=0, subset=['internal'])
Now we can change the data type
# changing type of the 'operator_id' column:
df_dataset['operator_id']= df_dataset['operator_id'].astype(int)
df_dataset.sample()
print('There are', df_dataset['operator_id'].nunique(), 'unique operators in the amended data set.')
print('There are', df_dataset['user_id'].nunique(), 'unique users in the amended data set.')
#Checking data for zeros:
for i in df_dataset.columns:
    print(i, len(df_dataset[df_dataset[i]==0]))
#Checking data for zeros:
for i in df_cl.columns:
    print(i, len(df_cl[df_cl[i]==0]))
We have a lot of zeros for internal calls and missed calls, as well as for call duration. In the second data set there are no zeros at all.
#Checking for duplicated data
duplicated1=df_dataset.duplicated().sum()
print('There are', duplicated1, 'duplicates in the general data set.')
#Checking for duplicated data
duplicated2=df_cl.duplicated().sum()
print('There are', duplicated2, 'duplicates in the additional data set (telecom_clients).')
#checking what kind of duplicates we have
duplicated_rows = df_dataset[df_dataset.duplicated()]
duplicated_rows.sample(15)
#checking the dates when the data was duplicated:
duplicated_rows['date'].dt.date.unique()
print('General data represents data for the period from', df_dataset['date'].min(), 'till', df_dataset['date'].max(), \
'.\nThe duplicated rows came across the data set from the period', duplicated_rows['date'].min(), 'till', \
duplicated_rows['date'].max(), '.' )
It means that there are duplicates throughout the whole period. Presumably that might have happened due to some server problems that continuously produced duplicated data.
#checking the operators whose data was duplicated:
duplicated_rows['operator_id'].value_counts()
#checking the scope of the duplicated data across columns:
for i in duplicated_rows:
    print(i, ":", duplicated_rows[i].nunique())
df_dataset.query('operator_id==891410')
So far we cannot see any pattern in the duplicates. They cover different events, different dates (114 days from 05.08.2019 till 21.09.2019), and different operators (775 unique operators have duplicates in their data).
# length before dropping duplicates
length1= len(df_dataset)
# dropping duplicate values
df_dataset.drop_duplicates(keep=False,inplace=True)
# length after removing duplicates
length2=len(df_dataset)
# printing all data frame lengths
print('Length before dropping duplicates:', length1, '\nLength after dropping duplicates:', length2)
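Note that drop_duplicates(keep=False) removes every row that occurs more than once, including the first occurrence, while keep='first' retains one copy of each distinct row. A minimal illustration:

```python
import pandas as pd

toy = pd.DataFrame({'a': [1, 1, 2]})
print(len(toy.drop_duplicates(keep=False)))    # 1 -- both copies of the duplicated row are dropped
print(len(toy.drop_duplicates(keep='first')))  # 2 -- one copy of each distinct row is kept
```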
print('Final data includes: \nunique users:', df_dataset['user_id'].nunique(),\
'\nunique operators:', df_dataset['operator_id'].nunique())
The data for analysis is split between 2 tables: telecom_dataset_us and telecom_clients. The first table is the main one. It contains:
As one user can have more than 1 operator (the number of operators per user varies from 1 to 50), we could not use the transform() method grouping by users to fill the missing values. We had no other option but to drop this part of the data, as the missing values could not be restored.
As only a minor share of values in the 'internal' column was missing (<2% of the data set), we could drop those rows without skewing the data.
We have determined that there are 4179 duplicates in the general data set. Studying the duplicated rows, we have not found any pattern: there are duplicates throughout the whole period. Presumably this might have happened due to server problems that continuously produced duplicated data.
All in all during preprocessing we have:
#statistical summary for numerical variables
df_dataset.drop(['user_id', 'operator_id'], axis = 1).describe().round(2)
df_dataset['direction'].value_counts()
#defining correlation between the features
corr = df_dataset.drop(['user_id', 'operator_id'] , axis = 1).corr()
plt.figure(figsize=(10,8))
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, annot=True,
cmap=sns.diverging_palette(220, 10, as_cmap=True))
plt.show()
The matrix shows that we have 2 pairs of intercorrelated features: call_duration with total_call_duration and calls_count with total_call_duration. We are going to replace the call_duration and total_call_duration features with the average waiting time. This way we will get a set of 4 uncorrelated features and will set them as the criteria of effectiveness.
We will define ineffective operators according to the 1st parameter: the number of missed calls.
missed_calls = df_dataset.query('direction=="in"').groupby('operator_id')['is_missed_call'].sum().reset_index().sort_values(by='is_missed_call', ascending=False)
missed_calls['is_missed_call'] = missed_calls['is_missed_call'].astype(int)
missed_calls
plt.figure(figsize=(15,3))
missed_calls['is_missed_call'].plot.hist(density=True, bins= 27, color='#cce1e1', log=True)
plt.title('Number of missed calls')
plt.show()
# box-and-whisker plot
plt.figure(figsize=(15,3))
ax = sns.boxplot(x="is_missed_call", data=missed_calls, color='#cce1e1')
ax.set(xlabel='number of missed calls')
plt.show()
#Calculating the 95th and 99th percentiles for the number of missed calls (to identify outliers).
np.percentile(missed_calls['is_missed_call'], [95, 99])
missed_calls['is_missed_call'].describe()
#operators with missed incoming calls:
operators_with_missed = missed_calls.query('is_missed_call>0')['operator_id']
operators_with_missed.values.tolist()
operators_with_missed.shape
In general, 739 operators have from 0 to 27 missed calls during the period. The middle 50% of them have from 0 to 1 missed call.
Only 5% of operators have over 4 missed calls, and only 1% of operators have over 11 missed calls.
509 out of the 739 operators have no missed incoming calls.
We have received a list of 230 operator IDs that had missed calls during this period.
waiting_calls = df_dataset.query('direction=="in"')
#and is_missed_call==True')
waiting_calls.head()
waiting_calls_operators = df_dataset.query('direction=="in"').groupby('operator_id')\
.agg({'calls_count': 'sum', 'call_duration': 'sum', 'total_call_duration': 'sum'})\
.reset_index().sort_values(by='calls_count', ascending=False)
waiting_calls_operators.sample()
waiting_calls_operators['waiting_time'] = waiting_calls_operators['total_call_duration'] - waiting_calls_operators['call_duration']
waiting_calls_operators['avg_waiting_time'] = waiting_calls_operators['waiting_time']/waiting_calls_operators['calls_count']
waiting_calls_operators.sample()
waiting_calls_operators['avg_waiting_time'].describe()
plt.figure(figsize=(15,3))
waiting_calls_operators['avg_waiting_time'].plot.hist(density=True, bins= 50, color='#cce1e1', log=True)
plt.title('Distribution of average waiting time among operators')
plt.xlabel('Waiting time, sec.')
plt.show()
The distribution of average waiting time is skewed, with a long tail to the right. There are lots of outliers, but most of the distribution lies within 1 minute. Average waiting time varies greatly, from 1 sec to 115 sec. The median value is 14 sec with a standard deviation of 12. The middle 50% of operators keep callers waiting, on average, from 9 to 22 seconds.
We will define how successful every operator is in relation to the mean of his/her average waiting time. To do that we will calculate the z-score for the distribution, a numerical measurement that describes a value's relationship to the mean of a group of values. A z-score of 0 indicates that an operator's average waiting time is identical to the mean.
#calculating z-score for average waiting time to examine the distribution
waiting_calls_operators['waiting_time_z']=(waiting_calls_operators['avg_waiting_time']-waiting_calls_operators['avg_waiting_time'].mean())/waiting_calls_operators['avg_waiting_time'].std()
waiting_calls_operators['operator_id'] = waiting_calls_operators['operator_id'].astype(str)
waiting_calls_operators.head(3)
Now we will plot part of the distribution (operators with z-score close to 0) to illustrate the calculations.
#distinguishing z-score by colors: green for negative values (shorter waiting time), red for positive ones (longer waiting time)
waiting_calls_operators_z = waiting_calls_operators.query('-0.05 <=waiting_time_z<=0.08')\
.sort_values(by='avg_waiting_time', ascending=False).reset_index(drop=True)
waiting_calls_operators_z['colors'] = ['teal' if x<0 else 'red' for x in waiting_calls_operators_z['waiting_time_z']]
waiting_calls_operators_z['colors']
plt.figure(figsize=(14,10))
plt.hlines(y=waiting_calls_operators_z.operator_id, xmin=0, xmax=waiting_calls_operators_z.waiting_time_z, colors=waiting_calls_operators_z.colors, alpha=0.4, linewidth=10)
plt.title('Operators average waiting time relation to mean value of the waiting time \n(z-score)', fontsize=18)
plt.xlabel('z-score', fontsize=18)
plt.ylabel('Operators IDs', fontsize=16)
plt.show()
Now we can create a list of the least successful operator IDs, those with the longest average waiting time (positive z-score):
operators_max_waiting = waiting_calls_operators.query('waiting_time_z>=0')['operator_id']
operators_max_waiting = operators_max_waiting.astype(int).values.tolist()
operators_max_waiting
len(operators_max_waiting)
This way we have received 270 operator IDs with a maximum average waiting time (greater than or equal to the mean).
A small number of outgoing calls is a sign of an ineffective operator.
outgoing_calls = df_dataset.query('direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index().sort_values(by='calls_count', ascending=False)
outgoing_calls.sample()
plt.figure(figsize=(15,3))
outgoing_calls['calls_count'].plot.hist(density=True, bins= 50, color='#cce1e1', log=True)
plt.title('Number of outgoing calls')
plt.xlabel('number of calls')
plt.show()
# box-and-whisker plot
plt.figure(figsize=(15,3))
ax = sns.boxplot(x="calls_count", data=outgoing_calls, color='#cce1e1')
ax.set(xlabel='number of outgoing calls')
plt.show()
outgoing_calls['calls_count'].describe()
#Calculating the 95th and 99th percentiles for the number of outgoing calls (to identify outliers).
np.percentile(outgoing_calls['calls_count'], [95, 99])
In general, 873 operators have from 1 to 53301 outgoing calls during the period. The middle 50% of them have from 10 to 533 outgoing calls, with a median value of 84 calls.
Only 5% of operators have over 2153 outgoing calls, and only 1% of operators have over 4732 outgoing calls.
We will define how successful every operator is in relation to the mean number of calls performed. To do that we will calculate the z-score for the distribution, a numerical measurement that describes a value's relationship to the mean of a group of values. A z-score of 0 indicates that an operator's number of calls is identical to the mean.
#calculating z-score for the number of outgoing calls to examine the distribution
outgoing_calls['calls_count_z']=(outgoing_calls['calls_count']-outgoing_calls['calls_count'].mean())/outgoing_calls['calls_count'].std()
outgoing_calls['operator_id'] = outgoing_calls['operator_id'].astype(str)
outgoing_calls.head(3)
Now we will plot part of the distribution (operators with z-score close to 0) to illustrate the calculations.
#distinguishing z-score by colors: red for negative values, green for positive ones
outgoing_calls_z = outgoing_calls.query('-0.03 <=calls_count_z<=0.05').sort_values(by='calls_count', ascending=False).reset_index(drop=True)
outgoing_calls_z['colors'] = ['red' if x<0 else 'teal' for x in outgoing_calls_z['calls_count_z']]
outgoing_calls_z['colors']
plt.figure(figsize=(14,10))
plt.hlines(y=outgoing_calls_z.operator_id, xmin=0, xmax=outgoing_calls_z.calls_count_z, colors=outgoing_calls_z.colors, alpha=0.4, linewidth=10)
plt.title('Operators calls number relation to mean value of the call number \n(z-score)', fontsize=18)
plt.xlabel('z-score', fontsize=18)
plt.ylabel('Operators IDs', fontsize=16)
plt.show()
Now we can create a list of the least effective operator IDs, those that make the smallest number of outgoing calls (negative z-score):
operators_min_calls = outgoing_calls.query('calls_count_z<=0')['operator_id']
operators_min_calls = operators_min_calls.astype(int).values.tolist()
operators_min_calls
len(operators_min_calls)
This way we have received 674 operator IDs that make fewer outgoing calls than the average.
The fewer internal calls an operator makes, the more effective he or she is considered to be.
internal_calls = df_dataset.query('internal==True and direction=="out"').groupby('operator_id')['calls_count']\
.sum().reset_index().sort_values(by='calls_count', ascending=False)
internal_calls.head()
plt.figure(figsize=(15,3))
internal_calls['calls_count'].plot.hist(density=True, bins= 50, color='#cce1e1', log=True)
plt.title('Number of internal calls')
plt.show()
# box-and-whisker plot
plt.figure(figsize=(15,3))
ax = sns.boxplot(x="calls_count", data=internal_calls, color='#cce1e1')
ax.set(xlabel='number of internal calls')
plt.show()
internal_calls['calls_count'].describe()
#Calculating the 95th and 99th percentiles for the number of internal calls (to identify outliers).
np.percentile(internal_calls['calls_count'], [95, 99])
In general, 486 operators had from 1 to 393 internal outgoing calls during the period. The middle 50% of them had from 2 to 18 internal calls, with a median of 5 calls.
Only 5% of operators had over 135 internal calls, and 1% of operators had over 280 internal calls.
We will define how effective every operator is in relation to his/her number of internal calls. To do that we will calculate the z-score for the distribution, a numerical measurement that describes a value's relationship to the mean of a group of values. A z-score of 0 indicates that an operator's number of internal calls is identical to the mean.
#calculating z-score for the number of internal calls to examine the distribution
internal_calls['internal_calls_z']=(internal_calls['calls_count']-internal_calls['calls_count'].mean())/internal_calls['calls_count'].std()
internal_calls['operator_id'] = internal_calls['operator_id'].astype(str)
internal_calls.head(3)
Now we will plot part of the distribution (operators with z-score close to 0) to illustrate the calculations.
#distinguishing z-score by colors: green for negative values (fewer internal calls), red for positive ones (more internal calls)
internal_calls_z = internal_calls.query('-0.15 <=internal_calls_z<=0.2')\
.sort_values(by='internal_calls_z', ascending=False).reset_index(drop=True)
internal_calls_z['colors'] = ['teal' if x<0 else 'red' for x in internal_calls_z['internal_calls_z']]
internal_calls_z['colors']
plt.figure(figsize=(14,10))
plt.hlines(y=internal_calls_z.operator_id, xmin=0, xmax=internal_calls_z.internal_calls_z, colors=internal_calls_z.colors, alpha=0.4, linewidth=10)
plt.title('Operators internal calls relation to mean value of the internal calls \n(z-score)', fontsize=18)
plt.xlabel('z-score', fontsize=18)
plt.ylabel('Operators IDs', fontsize=16)
plt.show()
Now we can create a list of the least effective operator IDs, those with the biggest number of internal outgoing calls (positive z-score):
operators_max_internal = internal_calls.query('internal_calls_z>=0')['operator_id']
operators_max_internal = operators_max_internal.astype(int).values.tolist()
operators_max_internal
len(operators_max_internal)
This way we have received 99 operator IDs with a maximum number of internal calls (greater than or equal to the mean).
All in all, we have 4 lists of operator IDs that perform ineffectively in 4 spheres:
In the next step we will group these IDs into 3 groups.
Defining operators that performed poorly in all 4 spheres (their IDs got into each of the 4 lists)
#Operators that are in all 4 lists:
worst_operators = set(operators_with_missed) & set(operators_max_waiting)\
& set(operators_min_calls) &set(operators_max_internal)
print('The number of least effective operators:', len(worst_operators))
#IDs
display(worst_operators)
There are 17 least effective operators: they have missed calls, their waiting time is above average, their number of outgoing calls is below average, and they make a lot of calls to their colleagues.
Generally, such performance might be caused by a lack of experience and qualification.
These operators probably need additional training if they have already worked for the company long enough. If they are newcomers, supervision and professional guidance might help improve their performance.
Defining operators that performed poorly in 2 out of 4 spheres (their IDs got into 2 lists). We think that the most important criteria of effectiveness are the number of outgoing calls and the waiting time.
We suppose that missed calls do not define the effectiveness of an operator much, because operators tend not to miss calls at all: the middle 50% of them have from 0 to 1 missed call, and only 5% have over 4 missed calls. This cannot be crucial for the current analysis.
As for internal calls, they can also be omitted at this stage. There are only 99 operators who make them often, and most of them could be newcomers.
We will define operators that got into these two lists of ineffectiveness (number of outgoing calls and waiting time):
#Operators that are in 2 lists:
bad_operators = set(operators_max_waiting) & set(operators_min_calls)
print('The number of operators of lower effectiveness:', len(bad_operators))
Defining operators that performed poorly in 1 of 4 spheres (their IDs got into 1 of 4 lists)
#Operators that are at least in one of the lists:
#joining the 4 lists
avg_effective_operators = list(set(operators_with_missed) | set(operators_max_waiting) | set(operators_min_calls) |set(operators_max_internal))
len(avg_effective_operators)
display(avg_effective_operators)
#all_operators= df_dataset.groupby('operator_id')['is_missed_call'].sum().reset_index().sort_values(by='is_missed_call', ascending=False)
#missed_calls['is_missed_call'] = missed_calls['is_missed_call'].astype(int)
#missed_calls
#list of all operators:
all_operators = df_dataset['operator_id'].unique()
all_operators = all_operators.tolist()
#operators that never got to any of the lists:
good_operators = list(set(all_operators) - set(avg_effective_operators))
good_operators
len(good_operators)
There are 225 operators that did not get into any of the lists; they can be considered the most effective ones.
#Creating a db for visualisation of results:
data_op=[{'avg_effective_operators': len(avg_effective_operators), 'bad_operators':len(bad_operators), \
'worst_operators': len(worst_operators)
}]
operators_funnel = pd.DataFrame(data_op).T.reset_index()
operators_funnel.columns=['operators_groups', 'operators_number']
operators_funnel
#funnel chart of the operator groups:
fig = go.Figure(go.Funnel(y = operators_funnel.operators_groups, x = operators_funnel.operators_number,
opacity = 0.85, marker = {"color": ['#333160', '#5b5895', '#9467bd', '#3ac7c3' , '#72B7B2', '#c6f7e7', '#C9FBE5']}))
fig.show()
Combining the 4 lists of operators with the least effective performance in one of the spheres, we have received 3 groups:
#general db (merged)
df = df_dataset.merge(df_cl, on='user_id')
df.head(3)
#Share of missed calls per tariff
tariffs_missed_calls = df.groupby('tariff_plan')['is_missed_call']\
.mean().reset_index().sort_values(by='is_missed_call', ascending=False)\
.rename({'is_missed_call': 'missed_calls_share'}, axis=1)
tariffs_missed_calls
#Share of internal calls per tariff
tariffs_internal_calls = df.groupby('tariff_plan')['internal'].mean().reset_index()
tariffs_internal_calls
#db for graphical interpretation of the features distribution
df_graphs = df
#db ajustment for graphical interpretation
df_graphs['internal'] = df_graphs['internal'].replace(to_replace=True, value ="internal").replace(to_replace=False, value ="external")
df_graphs['is_missed_call'] = df_graphs['is_missed_call'].replace(to_replace=True, value ="missed_call").replace(to_replace=False, value ="answered")
df_graphs.insert(0, "count", 1)
df_graphs.head()
#plotting features distribution
parameters_hist = ['tariff_plan', 'direction', 'internal', 'is_missed_call']
parameters_dis = ['date_start']
for x in parameters_hist:
    plt.figure(figsize=(18,5))
    ax = sns.barplot(x=x, y="count", data=df_graphs, hue='tariff_plan', estimator=sum, palette="mako")
    for p in ax.patches:
        ax.annotate(format(p.get_height(), '.0f'),
                    (p.get_x() + p.get_width() / 2., p.get_height()),
                    ha='center', va='center',
                    xytext=(0, 9),
                    textcoords='offset points')
    plt.title('Distribution of {}'.format(x))
    plt.show()
for x in parameters_dis:
    plt.figure(figsize=(18,5))
    sns.kdeplot(data=df_graphs, x=x, hue="tariff_plan", alpha=0.6, palette="mako", fill=True)
    plt.show()
df_graphs
#distribution of calls_count
def logx_kde(x, y, xmin, xmax, ymin, ymax): #a function that applies the logarithm to the x-data
    x = np.log10(x)
    # Perform the kernel density estimate
    xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
    positions = np.vstack([xx.ravel(), yy.ravel()])
    values = np.vstack([x, y])
    kernel = st.gaussian_kde(values)
    f = np.reshape(kernel(positions).T, xx.shape)
    return np.power(10, xx), yy, f
plt.figure(figsize=(18,5))
ax=sns.kdeplot(data=df_graphs, x='calls_count', hue="tariff_plan", hue_order=["A", "C", "B"], alpha = 0.6 , palette="mako", fill=True)
ax.set_xscale('log')
# box-and-whisker plot
plt.figure(figsize=(18,3))
ax = sns.boxplot(x="calls_count", y='tariff_plan', data=df_graphs, color='#cce1e1')
ax.set(xlabel='number of calls')
plt.show()
#distribution of call_duration
plt.figure(figsize=(18,5))
ax=sns.kdeplot(data=df_graphs, x='call_duration', hue="tariff_plan", hue_order=["A", "C", "B"], alpha = 0.6 , palette="mako", fill=True)
ax.set_xscale('log')
# box-and-whisker plot
plt.figure(figsize=(18,3))
ax = sns.boxplot(x="call_duration", y='tariff_plan', data=df_graphs, color='#cce1e1')
ax.set(xlabel='call duration, sec.')
plt.show()
#distribution of total_call_duration
plt.figure(figsize=(18,5))
ax=sns.kdeplot(data=df_graphs, x='total_call_duration', hue="tariff_plan", hue_order=["A", "C", "B"], alpha = 0.6 , palette="mako", fill=True)
ax.set_xscale('log')
# box-and-whisker plot
plt.figure(figsize=(18,3))
ax = sns.boxplot(x="total_call_duration", y='tariff_plan', data=df_graphs, color='#cce1e1')
ax.set(xlabel='total call duration, sec.')
plt.show()
There are 3 types of tariff plan. The biggest number of calls was made on tariff plan C (13293 calls), slightly fewer on plan B (13024), and the smallest number on plan A (10995).
Users have used all 3 plans since the beginning of August. The biggest number of users registered for plan B from 01/08/2019 till 18/08/2019; the peak of registrations for plan C was during 15/08/2019-01/09/2019;
the peak of registrations for plan A was the shortest: from 15/10/2019 till 20/11/2019.
The biggest number of outgoing calls is made on tariff B; the smallest is on plan A. The biggest number and the biggest share of incoming calls are made on plan C.
The operators on plan B have the lowest share of internal calls (11%) and operators on plan A have the highest share (16%). This might be a sign of higher professionalism of those who work with plan B and, vice versa, a lower professional level of operators on plan A.
Plan C has the biggest number of answered calls, plan A the lowest. Plan A has the highest share of missed calls (35%), plan C the lowest (32%).
For all tariff plans the share of missed calls is rather high: every third call gets no answer. The phone company might consider employing additional operators to get more calls answered and lower the waiting time.
Another option may be outsourcing operators during peak periods, e.g. when the company runs marketing campaigns and expects a large number of registrations and newcomers.
Calls count. Operators on plan B make the biggest number of calls, those on plan C the fewest. A and C have long tails to the right and lots of outliers.
Call duration. Call duration looks standard for all tariffs, but tariff plan A has a very long tail to the right with a great number of outliers.
Total call duration. The distribution of this feature looks similar to the call_duration distribution, but the average duration for tariff A is higher.
#creating a separate db
df_tariffs = df
#substituting words by numbers for chart
df_tariffs['is_missed_call'] = np.where(df_tariffs['is_missed_call'] == "missed_call", 1, 0).astype('int', copy=False)
#adding parameter of waiting time:
df_tariffs['avg_waiting_time'] = round(((df_tariffs['total_call_duration'] - df_tariffs['call_duration']) /df_tariffs['calls_count']), 2)
df_tariffs.head()
df_A = df_tariffs.query('tariff_plan == "A"')
df_B = df_tariffs.query('tariff_plan == "B"')
df_C = df_tariffs.query('tariff_plan == "C"')
df_list = [df_A, df_B, df_C]
df_A.name = 'df_A'
df_B.name = 'df_B'
df_C.name = 'df_C'
for i in df_list:
    print(i.name, i.shape[0])
#List of operators with missed calls for tariff A
missed_calls_A = df_A.query('direction=="in"').groupby('operator_id')['is_missed_call'].sum().reset_index()
missed_calls_A['is_missed_call'] = missed_calls_A['is_missed_call'].astype(int)
operators_with_missed_A = missed_calls_A.query('is_missed_call>0')['operator_id']
operators_with_missed_A.values.tolist()
print('Number of operators with missed calls for tariff A:', len(operators_with_missed_A))
#List of operators with missed calls for tariff B
missed_calls_B = df_B.query('direction=="in"').groupby('operator_id')['is_missed_call'].sum().reset_index().sort_values(by='is_missed_call', ascending=False)
missed_calls_B['is_missed_call'] = missed_calls_B['is_missed_call'].astype(int)
operators_with_missed_B = missed_calls_B.query('is_missed_call>0')['operator_id'].values.tolist()
print('Number of operators with missed calls for tariff B:', len(operators_with_missed_B))
#List of operators with missed calls for tariff C
missed_calls_C = df_C.query('direction=="in"').groupby('operator_id')['is_missed_call'].sum().reset_index().sort_values(by='is_missed_call', ascending=False)
missed_calls_C['is_missed_call'] = missed_calls_C['is_missed_call'].astype(int)
operators_with_missed_C = missed_calls_C.query('is_missed_call>0')['operator_id']
operators_with_missed_C.values.tolist()
print('Number of operators with missed calls for tariff C:', len(operators_with_missed_C))
#A
waiting_calls_operators_A = df_A.query('direction=="in"').groupby('operator_id').agg({'avg_waiting_time': 'sum'}).reset_index()
#calculating z-score
waiting_calls_operators_A['waiting_time_z']=(waiting_calls_operators_A['avg_waiting_time']-waiting_calls_operators_A['avg_waiting_time'].mean())/waiting_calls_operators_A['avg_waiting_time'].std()
#picking up the operators
operators_max_waiting_A = waiting_calls_operators_A.query('waiting_time_z>=0')['operator_id'].values.tolist()
print('Number of operators with long waiting time for tariff A:', len(operators_max_waiting_A))
#B
waiting_calls_operators_B = df_B.query('direction=="in"').groupby('operator_id').agg({'avg_waiting_time': 'sum'}).reset_index()
#calculating z-score
waiting_calls_operators_B['waiting_time_z']=(waiting_calls_operators_B['avg_waiting_time']\
-waiting_calls_operators_B['avg_waiting_time'].mean())\
/waiting_calls_operators_B['avg_waiting_time'].std()
#picking up the operators
operators_max_waiting_B = waiting_calls_operators_B.query('waiting_time_z>=0')['operator_id'].values.tolist()
print('Number of operators with long waiting time for tariff B:', len(operators_max_waiting_B))
#C
waiting_calls_operators_C = df_C.query('direction=="in"').groupby('operator_id').agg({'avg_waiting_time': 'sum'}).reset_index()
#calculating z-score
waiting_calls_operators_C['waiting_time_z']=(waiting_calls_operators_C['avg_waiting_time']\
-waiting_calls_operators_C['avg_waiting_time'].mean())\
/waiting_calls_operators_C['avg_waiting_time'].std()
#picking up the operators
operators_max_waiting_C = waiting_calls_operators_C.query('waiting_time_z>=0')['operator_id'].values.tolist()
print('Number of operators with long waiting time for tariff C:', len(operators_max_waiting_C))
#A
outgoing_calls_A = df_A.query('direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index()
#z-score
outgoing_calls_A['calls_count_z']=(outgoing_calls_A['calls_count']-outgoing_calls_A['calls_count'].mean())\
/outgoing_calls_A['calls_count'].std()
operators_min_calls_A = outgoing_calls_A.query('calls_count_z<=0')['operator_id'].values.tolist()
print('Number of operators with small number of outgoing calls for tariff A:', len(operators_min_calls_A))
#B
outgoing_calls_B = df_B.query('direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index()
#z-score
outgoing_calls_B['calls_count_z']=(outgoing_calls_B['calls_count']-outgoing_calls_B['calls_count'].mean())\
/outgoing_calls_B['calls_count'].std()
operators_min_calls_B = outgoing_calls_B.query('calls_count_z<=0')['operator_id'].values.tolist()
print('Number of operators with small number of outgoing calls for tariff B:', len(operators_min_calls_B))
#C
outgoing_calls_C = df_C.query('direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index()
#z-score
outgoing_calls_C['calls_count_z']=(outgoing_calls_C['calls_count']-outgoing_calls_C['calls_count'].mean())\
/outgoing_calls_C['calls_count'].std()
operators_min_calls_C = outgoing_calls_C.query('calls_count_z<=0')['operator_id'].values.tolist()
print('Number of operators with small number of outgoing calls for tariff C:', len(operators_min_calls_C))
#A
internal_calls_A = df_A.query('internal=="internal" and direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index()
#z-score
internal_calls_A['internal_calls_z']=(internal_calls_A['calls_count']-internal_calls_A['calls_count'].mean())\
/internal_calls_A['calls_count'].std()
operators_max_internal_A = internal_calls_A.query('internal_calls_z>=0')['operator_id'].values.tolist()
print('Number of operators with big number of outgoing internal calls for tariff A:', len(operators_max_internal_A))
#B
internal_calls_B = df_B.query('internal=="internal" and direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index()
#z-score
internal_calls_B['internal_calls_z']=(internal_calls_B['calls_count']-internal_calls_B['calls_count'].mean())\
/internal_calls_B['calls_count'].std()
operators_max_internal_B = internal_calls_B.query('internal_calls_z>=0')['operator_id'].values.tolist()
print('Number of operators with big number of outgoing internal calls for tariff B:', len(operators_max_internal_B))
#C
internal_calls_C = df_C.query('internal=="internal" and direction=="out"').groupby('operator_id')['calls_count'].sum().reset_index()
#z-score
internal_calls_C['internal_calls_z']=(internal_calls_C['calls_count']-internal_calls_C['calls_count'].mean())\
/internal_calls_C['calls_count'].std()
operators_max_internal_C = internal_calls_C.query('internal_calls_z>=0')['operator_id'].values.tolist()
print('Number of operators with big number of outgoing internal calls for tariff C:', len(operators_max_internal_C))
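The six near-identical cells above all follow one pattern: aggregate a metric per operator, compute z-scores, and flag the operators on one side of the mean. A minimal sketch of a reusable helper that could replace them (the function name and the demo frame are ours, not from the project):

```python
import pandas as pd

def flag_by_zscore(df, filter_expr, value_col, above=True):
    """Aggregate value_col per operator for rows matching filter_expr,
    compute z-scores, and return operators with z >= 0 (above=True)
    or z <= 0 (above=False)."""
    grouped = (df.query(filter_expr)
                 .groupby('operator_id')[value_col]
                 .sum()
                 .reset_index())
    z = (grouped[value_col] - grouped[value_col].mean()) / grouped[value_col].std()
    mask = z >= 0 if above else z <= 0
    return grouped.loc[mask, 'operator_id'].tolist()

# tiny demo frame (hypothetical data)
demo = pd.DataFrame({'operator_id': [1, 2, 3],
                     'direction': ['out', 'out', 'out'],
                     'calls_count': [10, 20, 60]})
print(flag_by_zscore(demo, 'direction=="out"', 'calls_count', above=False))  # [1, 2]
```

With the project's frames, `flag_by_zscore(df_A, 'direction=="out"', 'calls_count', above=False)` would reproduce `operators_min_calls_A`, and `above=True` with the internal-calls filter would reproduce the internal-calls lists.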
#Operators that are in all 4 lists:
worst_operators_A = set(operators_with_missed_A) & set(operators_max_waiting_A)\
& set(operators_min_calls_A) &set(operators_max_internal_A)
print('The number of least effective operators for tariff A:', len(worst_operators_A))
#Operators that are in 2 lists:
bad_operators_A = set(operators_max_waiting_A) & set(operators_min_calls_A)
print('The number of operators of lower effectiveness:', len(bad_operators_A))
#Operators that are at least in one of the lists:
avg_effective_operators_A = list(set(operators_with_missed_A) | set(operators_max_waiting_A) \
| set(operators_min_calls_A) |set(operators_max_internal_A))
print('The number of not very effective operators for tariff A:', len(avg_effective_operators_A))
#Operators that are in all 4 lists:
worst_operators_B = set(operators_with_missed_B) & set(operators_max_waiting_B)\
& set(operators_min_calls_B) &set(operators_max_internal_B)
print('The number of least effective operators for tariff B:', len(worst_operators_B))
#Operators that are in 2 lists:
bad_operators_B = set(operators_max_waiting_B) & set(operators_min_calls_B)
print('The number of operators of lower effectiveness:', len(bad_operators_B))
#Operators that are at least in one of the lists:
avg_effective_operators_B = list(set(operators_with_missed_B) | set(operators_max_waiting_B) \
| set(operators_min_calls_B) |set(operators_max_internal_B))
print('The number of not very effective operators for tariff B:', len(avg_effective_operators_B))
#Operators that are in all 4 lists:
worst_operators_C = set(operators_with_missed_C) & set(operators_max_waiting_C)\
& set(operators_min_calls_C) &set(operators_max_internal_C)
print('The number of least effective operators for tariff C:', len(worst_operators_C))
#Operators that are in 2 lists:
bad_operators_C = set(operators_max_waiting_C) & set(operators_min_calls_C)
print('The number of operators of lower effectiveness:', len(bad_operators_C))
#Operators that are at least in one of the lists:
avg_effective_operators_C = list(set(operators_with_missed_C) | set(operators_max_waiting_C) \
| set(operators_min_calls_C) |set(operators_max_internal_C))
print('The number of not very effective operators for tariff C:', len(avg_effective_operators_C))
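The three combination cells above can be folded into one helper; a sketch (the function name is ours) that takes the four flag lists and returns the three severity groups used in this section:

```python
def classify_operators(missed, waiting, min_calls, internal):
    """Combine the four flag lists into the three groups used above."""
    sets = [set(missed), set(waiting), set(min_calls), set(internal)]
    worst = set.intersection(*sets)   # present in all four lists
    bad = sets[1] & sets[2]           # long waits and few outgoing calls
    flagged = set.union(*sets)        # present in at least one list
    return {'worst': worst, 'bad': bad, 'flagged': flagged}

# demo with hypothetical operator ids
groups = classify_operators([1, 2], [2, 3], [2, 4], [2, 5])
print(groups['worst'])  # {2}
```

Calling it once per tariff with the corresponding lists would replace the three near-duplicate cells.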
#Creating a DataFrame for visualisation of the results:
#A
data_op_tariff_A=[{'avg_effective_operators': len(avg_effective_operators_A), 'bad_operators':len(bad_operators_A), \
'worst_operators': len(worst_operators_A)
}]
operators_funnel_tariff_A = pd.DataFrame(data_op_tariff_A).T.reset_index()
operators_funnel_tariff_A.columns=['operators_groups', 'number']
operators_funnel_tariff_A.insert(0, "tariff", "A")
#B
data_op_tariff_B=[{'avg_effective_operators': len(avg_effective_operators_B), 'bad_operators':len(bad_operators_B), \
'worst_operators': len(worst_operators_B)
}]
operators_funnel_tariff_B = pd.DataFrame(data_op_tariff_B).T.reset_index()
operators_funnel_tariff_B.columns=['operators_groups', 'number']
operators_funnel_tariff_B.insert(0, "tariff", "B")
#C
data_op_tariff_C=[{'avg_effective_operators': len(avg_effective_operators_C), 'bad_operators':len(bad_operators_C), \
'worst_operators': len(worst_operators_C)
}]
operators_funnel_tariff_C = pd.DataFrame(data_op_tariff_C).T.reset_index()
operators_funnel_tariff_C.columns=['operators_groups', 'number']
operators_funnel_tariff_C.insert(0, "tariff", "C")
frames = [operators_funnel_tariff_A, operators_funnel_tariff_B, operators_funnel_tariff_C]
operators_funnel_tariffs = pd.concat(frames)
operators_funnel_tariffs
#funnel chart of the 3 operator groups by tariff:
fig = go.Figure(go.Funnel(y = operators_funnel_tariffs.operators_groups, x = operators_funnel_tariffs.number,
opacity = 0.85, marker = {"color": ['#333160', '#5b5895', '#9467bd', '#3ac7c3' , '#72B7B2', '#c6f7e7', '#C9FBE5', 'teal', 'mediumturquoise', 'steelblue']}))
fig.show()
In the course of this work we have defined 3 groups of operators with different degrees of inefficiency, both overall and for every tariff plan:
The biggest share of ineffective operators (those who appear in all 4 lists) belongs to tariff plan A, and the smallest to tariff plan C.
On average, operators working with tariff plan A demonstrate less professional work than their colleagues. The data shows a large number of new users of tariff plan A joining the company in the middle of October. Probably the company had to employ new operators for tariff A or has not arranged enough training for the existing ones.
We would recommend the following:
We are going to test the following hypothesis: the number of calls (incoming and outgoing) depends on the tariff plan of the call.
Stating the hypotheses:
H0: the average number of calls per operator is equal for all tariff plans
H1: the average number of calls per operator differs between tariff plans
#Getting rid of outliers
#defining outliers with 3-sigma method for all tariff plans
#as we are calculating std for a sample, ddof is set to 1
std_score_A = np.std(df_A['calls_count'], ddof=1)
three_sigma_score_A_lower = round((df_A['calls_count'].mean() - std_score_A*3),2)
#we do not use the lower fence as it would give a negative number of calls.
three_sigma_score_A_upper = round((df_A['calls_count'].mean() + std_score_A*3),2)
std_score_B = np.std(df_B['calls_count'], ddof=1)
three_sigma_score_B_lower = round((df_B['calls_count'].mean() - std_score_B*3),2)
#we do not use the lower fence as it would give a negative number of calls.
three_sigma_score_B_upper = round((df_B['calls_count'].mean() + std_score_B*3),2)
std_score_C = np.std(df_C['calls_count'], ddof=1)
three_sigma_score_C_lower = round((df_C['calls_count'].mean() - std_score_C*3),2)
#we do not use the lower fence as it would give a negative number of calls.
three_sigma_score_C_upper = round((df_C['calls_count'].mean() + std_score_C*3),2)
print('99.7% of calls A are from 0 to ', three_sigma_score_A_upper,\
'. \n99.7% of calls B are from 0 to ', three_sigma_score_B_upper,'.'\
'\n99.7% of calls C are from 0 to ', three_sigma_score_C_upper,'.')
#setting df without outliers
df_A_no_outliers = df_A.query('calls_count<=@three_sigma_score_A_upper')
df_B_no_outliers = df_B.query('calls_count<=@three_sigma_score_B_upper')
df_C_no_outliers = df_C.query('calls_count<=@three_sigma_score_C_upper')
#defining the variance for the 3 tariffs to decide whether we can consider them equal for the t-test.
variance_A = np.var(df_A_no_outliers['calls_count'])
variance_B = np.var(df_B_no_outliers['calls_count'])
variance_C = np.var(df_C_no_outliers['calls_count'])
print('Variance for A calls sample is ', variance_A,\
'\nVariance for B calls sample is ', variance_B,\
'\nVariance for C calls sample is ', variance_C,)
Variances vary greatly.
Testing whether the distribution is normal for the 3 groups
sample_A = df_A_no_outliers['calls_count'].values.tolist()
sample_B = df_B_no_outliers['calls_count'].values.tolist()
sample_C = df_C_no_outliers['calls_count'].values.tolist()
samples_list = [sample_A, sample_B, sample_C]
def check_normality(sample, alpha):
    results = st.shapiro(sample)
    p_value = results[1]  # the second value in the results (index 1) is the p-value
    print('p-value: ', p_value)
    if p_value < alpha:
        print("Null hypothesis rejected: the distribution is not normal")
    else:
        print("Failed to reject the null hypothesis: the distribution seems to be normal")
for i in samples_list:
    check_normality(i, alpha=0.05)
The calculations show that the distribution is not normal for any of the samples.
Consequently we should choose a non-parametric test for testing the hypothesis (one that doesn't assume the data fits a specific distribution). Non-parametric tests include the Wilcoxon signed-rank test, the Mann-Whitney U test, and the Kruskal-Wallis test.
We are going to assess the statistical significance of the difference between the groups using the Mann-Whitney U test (the non-parametric counterpart of the Student's t-test), because we have independent samples with quantitative variables and the data doesn't follow a normal distribution.
As we are going to run 3 comparisons (A/B, A/C, B/C), we will adjust the significance level accordingly, using the Šidák method. Normally alpha is set to 0.05, a commonly accepted level in industry.
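Since there are three groups, the Kruskal-Wallis test mentioned above could also serve as an omnibus check before the pairwise comparisons; a sketch on synthetic stand-ins for the three call-count samples (the data here is ours, not the project's):

```python
import numpy as np
from scipy import stats as st

rng = np.random.default_rng(0)
# synthetic stand-ins: one shifted group, two similar groups
a = rng.poisson(30, 300)
b = rng.poisson(20, 300)
c = rng.poisson(20, 300)

h_stat, p = st.kruskal(a, b, c)
print(f'Kruskal-Wallis p-value: {p:.4f}')
# a small p only tells us that at least one group differs;
# pairwise tests are still needed to locate which pairs differ
```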
Calculating alpha with The Šidák method (guarantees family-wise error rate < ɑ).
The required significance level will be found as follows:
alpha_sidak = 1 - (1 - 0.05)**(1/3) #we have 3 groups, which gives us 3 pairwise comparisons (A/B, A/C, B/C)
round(alpha_sidak,4)
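As a cross-check (not part of the original analysis), the simpler Bonferroni correction gives almost the same threshold, which suggests the choice of correction method barely matters here:

```python
alpha = 0.05
m = 3  # pairwise comparisons: A/B, A/C, B/C
alpha_sidak = 1 - (1 - alpha) ** (1 / m)   # Šidák: exact family-wise control
alpha_bonferroni = alpha / m               # Bonferroni: slightly more conservative
print(round(alpha_sidak, 4))       # 0.017
print(round(alpha_bonferroni, 4))  # 0.0167
```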
from scipy.stats import mannwhitneyu
#forming 3 groups:
callsByOperators_A = df_A_no_outliers.groupby('operator_id', as_index=False).agg({'calls_count' : 'sum'})
callsByOperators_B = df_B_no_outliers.groupby('operator_id', as_index=False).agg({'calls_count' : 'sum'})
callsByOperators_C = df_C_no_outliers.groupby('operator_id', as_index=False).agg({'calls_count' : 'sum'})
outliersA = int(np.percentile(callsByOperators_A['calls_count'], [99]))
outliersA
outliersB = int(np.percentile(callsByOperators_B['calls_count'], [99]))
outliersB
outliersC = int(np.percentile(callsByOperators_C['calls_count'], [99]))
outliersC
#arranging the data in the right order for the Mann-Whitney U test
sample_op_A = callsByOperators_A.query('calls_count < @outliersA')['calls_count'].values.tolist() #values for A sample
sample_op_B = callsByOperators_B.query('calls_count < @outliersB')['calls_count'].values.tolist() #values for B sample
sample_op_C= callsByOperators_C.query('calls_count < @outliersC')['calls_count'].values.tolist() #values for C sample
#function for testing hypothesis
def check_hypothesis(group1, group2, alpha):
    p_value = st.mannwhitneyu(group1, group2)[1]
    print('p-value: {0:.3f}'.format(p_value))
    if p_value < alpha:
        print('We reject the null hypothesis for the two groups')
    else:
        print("We can't reject the null hypothesis for the two groups")
#Comparison of number of calls for operators, tariff plans A and B
check_hypothesis(sample_op_A,sample_op_B, alpha_sidak)
#Comparison of number of calls for operators, tariff plans A and C
check_hypothesis(sample_op_A,sample_op_C, alpha_sidak)
#Comparison of number of calls for operators, tariff plans B and C
check_hypothesis(sample_op_B,sample_op_C, alpha_sidak)
The data provides sufficient evidence, given the significance level we selected (alpha_sidak ≈ 1.7%), to reject the null hypothesis for groups A and B, and for groups A and C. Therefore we can conclude that the average number of calls made and received by operators on tariff A differs from tariff B (μA != μB) and from tariff C (μA != μC).
At the same time, the data does not provide sufficient evidence, given the significance level of 1.7%, to reject the null hypothesis for groups B and C. Therefore we cannot conclude that the average numbers of calls for tariffs B and C differ (μB = μC).
We will plot the charts to visualize the calculations.
fig = plt.figure(figsize= (10, 10))
ax = fig.add_subplot(111)
ax.set_title("Box Plot of calls by tariff plans", fontsize= 20)
data = [callsByOperators_A['calls_count'],
callsByOperators_B['calls_count'],
callsByOperators_C['calls_count']]
ax.boxplot(data,
labels= ['A', 'B', 'C'],
showmeans= True)
plt.xlabel("Tariff plan")
plt.ylabel("Number of calls by operators")
plt.show()
The graphical check of variance homogeneity supports the statistical findings:
the difference in number of calls is much larger for the pairs A/B and A/C;
there is no statistically significant difference between groups B and C.
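A more formal check of variance homogeneity than eyeballing the box plots could be Levene's test, which is robust to non-normality; a sketch with synthetic stand-ins for two samples with clearly different spreads (the data here is ours):

```python
import numpy as np
from scipy import stats as st

rng = np.random.default_rng(42)
# synthetic stand-ins with deliberately different spreads
group1 = rng.normal(100, 10, 200)
group2 = rng.normal(100, 30, 200)

stat, p = st.levene(group1, group2)
print(f'Levene p-value: {p:.4f}')
if p < 0.05:
    print('Variances differ significantly')
```

With the project's data this would be `st.levene(sample_op_A, sample_op_B, sample_op_C)`.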
The t-test is invalid for small samples from non-normal distributions, but it is valid for large samples from non-normal distributions. Thus we are going to run the t-test on our samples, as they are large enough (n > 50).
We will test the difference for every group in general to prove the previous calculations with t-test.
#forming a table
samplesA_B_C = df_tariffs.pivot_table(index='date', values='calls_count', columns='tariff_plan', aggfunc='sum').reset_index().fillna(0)
samplesA_B_C.columns = ['date', 'A', 'B', 'C']
samplesA_B_C.head()
Defining the outlier thresholds (the number of calls covering 99% of each sample)
outliersA_t = int(np.percentile(samplesA_B_C['A'], [99]))
outliersA_t
outliersB_t = int(np.percentile(samplesA_B_C['B'], [99]))
outliersB_t
outliersC_t = int(np.percentile(samplesA_B_C['C'], [99]))
outliersC_t
Forming samples
sample_A_t = samplesA_B_C.query('A < @outliersA_t')['A'].values.tolist()
sample_B_t = samplesA_B_C.query('B < @outliersB_t')['B'].values.tolist()
sample_C_t = samplesA_B_C.query('C < @outliersC_t')['C'].values.tolist()
Checking variance
#defining the variance for the 3 tariffs to decide whether we can consider them equal for the t-test.
variance_A_t = np.var(sample_A_t)
variance_B_t = np.var(sample_B_t)
variance_C_t = np.var(sample_C_t)
print('Variance for A sample is ', variance_A_t.round(1),\
'\nVariance for B sample is ', variance_B_t.round(1),\
'\nVariance for C sample is ', variance_C_t.round(1))
Variances vary greatly.
Testing hypotheses
results_A_B = st.ttest_ind(
sample_A_t,
sample_B_t, equal_var = False)
results_A_C = st.ttest_ind(
sample_A_t,
sample_C_t, equal_var = False)
results_C_B = st.ttest_ind(
sample_C_t,
sample_B_t, equal_var = False)
#we are going to use alpha_sidak here as there were 3 comparisons made
print('p-value to compare the groups A and B: ', results_A_B.pvalue)
print('p-value to compare the groups A and C: ', results_A_C.pvalue)
print('p-value to compare the groups C and B: ', results_C_B.pvalue)
if results_A_B.pvalue < alpha_sidak:
    print("Null hypothesis rejected for groups A and B")
else:
    print("Null hypothesis not rejected for groups A and B")
if results_A_C.pvalue < alpha_sidak:
    print("Null hypothesis rejected for groups A and C")
else:
    print("Null hypothesis not rejected for groups A and C")
if results_C_B.pvalue < alpha_sidak:
    print("Null hypothesis rejected for groups C and B")
else:
    print("Null hypothesis not rejected for groups C and B")
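The three if/else blocks above could be folded into a loop; a minimal sketch (the function name is ours):

```python
def report_pairwise(pvalues, labels, alpha):
    """Print a rejected / not-rejected verdict for each pairwise comparison."""
    verdicts = {}
    for p, label in zip(pvalues, labels):
        verdicts[label] = p < alpha
        action = 'rejected' if p < alpha else 'not rejected'
        print(f'Null hypothesis {action} for groups {label} (p={p:.4f})')
    return verdicts

# demo with hypothetical p-values
print(report_pairwise([0.001, 0.5], ['A/B', 'B/C'], alpha=0.017))
```

With the project's results this would be called as `report_pairwise([results_A_B.pvalue, results_A_C.pvalue, results_C_B.pvalue], ['A/B', 'A/C', 'C/B'], alpha_sidak)`.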
fig = plt.figure(figsize= (8, 8))
ax = fig.add_subplot(111)
ax.set_title("Box Plot of calls by tariff plans", fontsize= 20)
data = [samplesA_B_C['A'],
samplesA_B_C['B'],
samplesA_B_C['C']]
ax.boxplot(data,
labels= ['A', 'B', 'C'],
showmeans= True)
plt.xlabel("Tariff plan")
plt.ylabel("Number of calls")
plt.show()
Both tests showed that we cannot reject the null hypothesis for groups B and C. Therefore we can conclude that the average numbers of calls for tariff plans B and C do not differ (μB = μC).
But we can reject it for the comparisons A/B and A/C: the data provides sufficient evidence, given the significance level (1.7%), to reject the null hypothesis.
Therefore we can conclude that the average number of calls for tariff plan A differs from tariff plans B and C (μA != μB and μA != μC).
Consequently, the difference in operators' performance is not due to chance and is statistically significant. The work of operators on tariff A differs from the work of operators on the other tariffs.
In the course of this work we have studied data on the call company's clients and operators:
Having studied the given features (number of missed incoming calls, waiting time for incoming calls (average speed of answer), number of outgoing calls, number of internal calls) we generated 4 lists of operators that performed worse than their colleagues on average, using the z-score method to define the threshold. Next we created 3 groups of operators by combining the 4 lists:
Likewise, we created such groups for every tariff plan. Analyzing operator performance across tariff plans, we found that the share of operators that performed badly is the highest for tariff A and the lowest for tariff C. Tariff A has the smallest number of outgoing calls, the smallest number of answered calls and the biggest share of internal calls. Generally, operators working on plan A demonstrate a lower professional level than their colleagues working on the other tariff plans.
We have tested the hypothesis that the number of calls depends on the tariff plan of the call. The calculations show that, at the given significance level (1.7%), there is a statistically significant difference in the number of calls for tariff A, and there is no statistical difference between the numbers of calls for tariffs B and C. Supposedly, due to the large number of new users of tariff A that joined the company in the middle of October, the company has not allocated enough operators and has not provided enough training.
The results of the study can be used further for labeling, supervised learning and developing a model for detecting ineffective operators. This would automate the process of identifying ineffective operators in the future.
General recommendations:
- additional training for operators working on tariff A (more insights into tariff A specifics);
- additional motivation for operators working on tariff A (bonus system, incentives, etc.);
- monitoring the performance of operators in the 'operators of average effectiveness' group;
- additional training for those in the 'operators of lower effectiveness' group;
- additional training and supervised work for those in the 'least effective operators' group;
- rewards and incentives for those in the 'good operators' group.
To conclude, we would recommend shifting away from defining operators' effectiveness through goals such as handling a certain number of calls within a designated period of time. This puts a lot of pressure on operators and does not lead to client satisfaction in the long run. Today's effectiveness might rather be measured by issue-resolution rate and retention rate. Resolving issues and satisfying customers can be better motivation for operators than reaching a large number of calls.