【Python基础】第三十四课:模型评估方法

留出法,train_test_split,交叉验证法,KFold,cross_val_score,留一法,LeaveOneOut

Posted by x-jeff on February 15, 2022

本文为原创文章,未经本人允许,禁止转载。转载请注明出处。

1.留出法

“留出法”详解请见:链接

👉引用数据与建立模型:

1
2
3
4
5
6
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data
y = iris.target

👉建立训练与测试数据集:

1
2
3
4
5
from sklearn.model_selection import train_test_split

train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.33, random_state=123)
clf = DecisionTreeClassifier()
clf.fit(train_X, train_y)

👉产生准确度:

1
2
3
4
from sklearn.metrics import accuracy_score

predicted = clf.predict(test_X)
accuracy_score(test_y, predicted)

👉建立混淆矩阵:

1
2
3
from sklearn.metrics import confusion_matrix

m = confusion_matrix(test_y, predicted)

2.交叉验证法

“交叉验证法”详解请见:链接

1
2
3
4
5
6
7
8
9
from sklearn.model_selection import KFold

kf = KFold(n_splits=10)
for train, test in kf.split(X):
    train_X, test_X, train_y, test_y = X[train], X[test], y[train], y[test]
    clf = DecisionTreeClassifier()
    clf.fit(train_X, train_y)
    predicted = clf.predict(test_X)
    print(accuracy_score(test_y, predicted))
1
2
3
4
5
6
7
8
9
10
1.0
1.0
1.0
0.9333333333333333
0.9333333333333333
0.8666666666666667
1.0
0.8666666666666667
0.8
1.0

另一种方法:

1
2
3
4
5
from sklearn.model_selection import cross_val_score

acc = cross_val_score(clf, X=iris.data, y=iris.target, cv=10)
print(acc)
print(acc.mean())
1
2
3
[1.         0.93333333 1.         0.93333333 0.93333333 0.86666667
 0.93333333 0.93333333 1.         1.        ]
0.9533333333333334

3.留一法

“留一法”详解请见:链接

1
2
3
4
5
6
7
8
9
10
11
from sklearn.model_selection import LeaveOneOut

res = []
loo = LeaveOneOut()
for train, test in loo.split(X):
    train_X, test_X, train_y, test_y = X[train], X[test], y[train], y[test]
    clf = DecisionTreeClassifier()
    clf.fit(train_X, train_y)
    predicted = clf.predict(test_X)
    res.extend((predicted == test_y).tolist())
print(sum(res)) #143

4.代码地址

  1. 模型评估方法