Sklearn (Cont.)
Use Sklearn in Python
Working with this library generally starts with splitting the dataset into training and test sets. Here is how you can split your data:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split( x, y, random_state=0 )
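As a minimal sketch of what the split produces, the example below uses a small made-up dataset (the arrays `x` and `y` here are hypothetical, not from the original text). It also sets `test_size` and `stratify` explicitly; both are optional parameters of `train_test_split`, shown only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, 2 features, balanced binary labels
x = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# test_size=0.25 is also the default; stratify=y keeps the class ratio
# similar in the training and test sets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0, stratify=y
)

print(x_train.shape, x_test.shape)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the test set on every run.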
Next, we need to preprocess the data so that it suits a machine learning model. This usually means scaling the features, which can be done with standardization or normalization. Below is scikit-learn's way of doing both:
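from sklearn.preprocessing import StandardScaler
# Fit on the training set only, then apply the same scaling to both sets
scaler = StandardScaler( ).fit( x_train )
x_train_std = scaler.transform( x_train )
x_test_std = scaler.transform( x_test )
from sklearn.preprocessing import Normalizer
# Normalizer rescales each sample (row) to unit norm
normalizer = Normalizer( ).fit( x_train )
x_train_norm = normalizer.transform( x_train )
x_test_norm = normalizer.transform( x_test )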
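The difference between the two approaches is easier to see on concrete numbers. The sketch below uses a tiny made-up array (an assumption for illustration only): `StandardScaler` works per feature (column), while `Normalizer` works per sample (row).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, Normalizer

# Hypothetical data: two features on very different scales
x_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
x_test = np.array([[2.0, 250.0]])

# Standardization: each column gets zero mean and unit variance,
# using statistics learned from the training set only
scaler = StandardScaler().fit(x_train)
x_train_std = scaler.transform(x_train)
x_test_std = scaler.transform(x_test)

# Normalization: each row is rescaled to unit L2 norm (the default norm)
x_train_norm = Normalizer().fit_transform(x_train)

col_means = x_train_std.mean(axis=0)
col_stds = x_train_std.std(axis=0)
row_norms = np.linalg.norm(x_train_norm, axis=1)
print(col_means, col_stds, row_norms)
```

Note that `transform` returns a new array rather than modifying its input, so the result has to be assigned to a variable to be used later.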
As the next step, we fit the model to the data. Below is how to train some of the most common machine learning algorithms:
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn import neighbors
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
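# The normalize parameter was removed in scikit-learn 1.2; scale the features beforehand (e.g. with StandardScaler) instead
lr = LinearRegression( )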
lr.fit( x_train, y_train )
knn = neighbors.KNeighborsClassifier( n_neighbors=5 )
knn.fit( x_train, y_train )
svc = SVC( kernel='linear' )
svc.fit( x_train, y_train )
k_means = KMeans( n_clusters=3, random_state=0 )
k_means.fit( x_train )
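# A float n_components keeps enough components to explain that fraction of the variance (here 95%)
pca = PCA( n_components=0.95 )
x_train_reduced = pca.fit_transform( x_train )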
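To show the steps working together, here is a minimal end-to-end sketch. It assumes the Iris dataset bundled with scikit-learn (`load_iris`), which is not part of the original text, and is meant as an illustration rather than a recommended setup. Supervised models are fitted with labels and then scored on the test set; the unsupervised ones (`KMeans`, `PCA`) are fitted on the features alone.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Assumed example dataset: 150 samples, 4 features, 3 classes
x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

# Scale with statistics learned from the training set only
scaler = StandardScaler().fit(x_train)
x_train_std = scaler.transform(x_train)
x_test_std = scaler.transform(x_test)

# Supervised: fit on labelled training data, evaluate on held-out data
knn = KNeighborsClassifier(n_neighbors=5).fit(x_train_std, y_train)
svc = SVC(kernel='linear').fit(x_train_std, y_train)
knn_accuracy = knn.score(x_test_std, y_test)
svc_accuracy = svc.score(x_test_std, y_test)

# Unsupervised: no labels are passed to fit
k_means = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x_train_std)
pca = PCA(n_components=0.95)
x_train_reduced = pca.fit_transform(x_train_std)

print(knn_accuracy, svc_accuracy, x_train_reduced.shape)
```

For classifiers, `score` returns the mean accuracy on the given data, so it is a quick way to compare models on the same test set.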