This is an original article. Do not repost it without the author's permission, and credit the source if you do.

1. Read the customer churn data

import pandas as pd

df = pd.read_csv("customer_churn.csv", header=0, index_col=0)
print(df.head())

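header=0 means the first row of the file is used as the header row (the column names).
index_col=0 means the first column of the file is used as the row index.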
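To make the effect of the two parameters concrete, here is a minimal sketch that reads a tiny in-memory CSV instead of customer_churn.csv. The three rows and the "id" column name are made up for illustration; only read_csv, header, and index_col come from the text above.

```python
import io

import pandas as pd

# Hypothetical three-row CSV standing in for customer_churn.csv.
csv_text = """id,state,account_length,churn
1,KS,128,no
2,OH,107,no
3,NJ,137,yes
"""

# header=0: the first line supplies the column names.
# index_col=0: the first column ("id") becomes the row index.
demo = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

print(demo.columns.tolist())  # ['state', 'account_length', 'churn']
print(demo.index.tolist())    # [1, 2, 3]
```

Because the first column is consumed as the index, it no longer counts as a data column when selecting columns by position.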

  state  account_length  ... number_customer_service_calls churn
  KS             128  ...                             1    no
  OH             107  ...                             1    no
  NJ             137  ...                             0    no
  OH              84  ...                             2    no
  OK              75  ...                             3    no

2. Data preprocessing

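# Drop the first three columns
df = df.iloc[:, 3:]
# Encode the yes/no columns as 1/0 (a binary encoding rather than true one-hot)
cat_var = ["international_plan", "voice_mail_plan", "churn"]
for var in cat_var:
    df[var] = df[var].map(lambda e: 1 if e == 'yes' else 0)
print(df.head())
# The last column (churn) is the target; all other columns are features
y = df.iloc[:, -1]
X = df.iloc[:, :-1]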

For iloc usage, see: link.
For map usage, see: link.
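As a quick illustration of iloc and map on their own, the sketch below applies both to a small made-up frame. The data values are hypothetical, and the dict-based map is an alternative to the lambda used in the preprocessing code, not the author's version.

```python
import pandas as pd

# Hypothetical frame with the same kind of columns as the churn data.
demo = pd.DataFrame({
    "state": ["KS", "OH"],
    "account_length": [128, 107],
    "area_code": [415, 415],
    "international_plan": ["no", "yes"],
    "churn": ["no", "yes"],
})

# iloc selects by integer position: all rows, columns from position 3 onward.
demo = demo.iloc[:, 3:]

# map with a dict translates each value; a value missing from the dict
# becomes NaN, which makes unexpected labels easy to spot.
yes_no = {"yes": 1, "no": 0}
for col in ["international_plan", "churn"]:
    demo[col] = demo[col].map(yes_no)

print(demo.columns.tolist())   # ['international_plan', 'churn']
print(demo["churn"].tolist())  # [0, 1]
```

The lambda version silently turns every value other than 'yes' into 0, while the dict version leaves a visible NaN for anything unexpected. Which behavior is preferable depends on how clean the input is.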

   international_plan  voice_mail_plan  ...  number_customer_service_calls  churn
                 0                1  ...                              1      0
                 0                1  ...                              1      0
                 0                0  ...                              0      0
                 1                0  ...                              2      0
                 1                0  ...                              3      0

3. Build a classification model with a decision tree

from sklearn import tree

clf = tree.DecisionTreeClassifier(max_depth=5)
clf.fit(X, y)
tree.export_graphviz(clf, out_file='tree.dot')

See [Python Basics] Lesson 27: Classification Models (Decision Trees).
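The tree.dot file written by export_graphviz needs Graphviz to render (typically something like dot -Tpng tree.dot -o tree.png, assuming Graphviz is installed). As a lighter alternative, the sketch below prints the learned rules as text with tree.export_text. It uses synthetic data from make_classification and made-up feature names f0 to f3, not the churn features.

```python
from sklearn import tree
from sklearn.datasets import make_classification

# Synthetic stand-in for X and y; the real churn data is not used here.
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

# A shallow tree keeps the printed rules short.
clf_demo = tree.DecisionTreeClassifier(max_depth=2, random_state=0)
clf_demo.fit(X_demo, y_demo)

# export_text returns the split rules as an indented string.
rules = tree.export_text(clf_demo, feature_names=["f0", "f1", "f2", "f3"])
print(rules)
```

With the churn model, the same call would presumably take clf and list(X.columns) as arguments.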

4. Inspect the classification results

import numpy as np

# Accuracy on the training data (the same rows the model was fit on)
acc = np.sum(y == clf.predict(X)) / len(y)
print(acc)  # 0.9525952595259526
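The accuracy above is computed on the same rows used for fitting, so it tends to be optimistic. A common way to get a more honest estimate is to hold out part of the data. The sketch below shows the pattern on synthetic data. The 70/30 split, the random seeds, and the variable names are illustrative assumptions, not part of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn features and labels.
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 30% of the rows; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo
)

clf_demo = DecisionTreeClassifier(max_depth=5, random_state=0)
clf_demo.fit(X_train, y_train)

# accuracy_score is equivalent to np.sum(y == pred) / len(y).
train_acc = accuracy_score(y_train, clf_demo.predict(X_train))
test_acc = accuracy_score(y_test, clf_demo.predict(X_test))
print(train_acc, test_acc)
```

A large gap between the two numbers is usually a sign of overfitting.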

5. Other methods

5.1. Using logistic regression

See [Python Basics] Lesson 28: Classification Models (Logistic Regression).

from sklearn.linear_model import LogisticRegression

clf2 = LogisticRegression()
clf2.fit(X, y)
# Accuracy on the training data
acc = np.sum(y == clf2.predict(X)) / len(y)
print(acc)  # 0.8622862286228623
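Logistic regression in scikit-learn is fit with an iterative solver, and on features with very different scales it may converge slowly or emit a convergence warning. One common remedy, sketched below on synthetic data, is to standardize the features in a pipeline. The max_iter value and the pipeline itself are assumptions added for illustration and are not part of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn features and labels.
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

# Scale each feature to zero mean and unit variance, then fit the classifier.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_demo, y_demo)

# score() returns the mean accuracy on the data passed in (training data here).
train_acc = pipe.score(X_demo, y_demo)
print(train_acc)
```

Whether scaling changes the churn result is not something this sketch can show; it only demonstrates the pattern.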

5.2. Using SVM

See [Python Basics] Lesson 29: Classification Models (SVM).

from sklearn.svm import SVC

model = SVC()
model.fit(X, y)
# Accuracy on the training data; a perfect score here may indicate overfitting
acc = np.sum(y == model.predict(X)) / len(y)
print(acc)  # 1.0
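A training accuracy of 1.0 is measured on the rows the model has already seen, so it may reflect memorization rather than real predictive power. To compare the three classifiers more fairly, one option is k-fold cross-validation. The sketch below runs it on synthetic data. The fold count, the seeds, and the max_iter setting are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn features and labels.
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}

# Each model is fit on 4 folds and scored on the remaining fold, 5 times over.
cv_means = {}
for name, estimator in models.items():
    scores = cross_val_score(estimator, X_demo, y_demo, cv=5)
    cv_means[name] = scores.mean()
    print(name, round(cv_means[name], 3))
```

On the churn data, the ranking from cross-validation could differ from the ranking by training accuracy shown above.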

6. Code location

Predicting customer churn with classification models

[Python Basics] Lesson 32: Predicting Customer Churn with Classification Models
