Sklearn (Cont.)


Use Sklearn in Python
Working with this library usually starts by splitting the dataset into training and test sets. Here is how you can split your data:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split( x, y, random_state=0 )
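As a minimal sketch of what the split produces, the example below uses a small made-up array (the toy `x` and `y` and the names `x_tr`, `x_te`, `y_tr`, `y_te` are assumptions for illustration, not part of the original data). It shows the default test share and an explicit `test_size` combined with `stratify`, which keeps class proportions similar in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, assumed for illustration: 20 samples, 2 features, balanced binary labels.
x = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# With no test_size or train_size given, 25% of the samples go to the test set.
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)
print(x_train.shape, x_test.shape)

# An explicit 80/20 split that preserves the class balance of y in both parts.
x_tr, x_te, y_tr, y_te = train_test_split(
    x, y, test_size=0.2, stratify=y, random_state=0
)
print(x_tr.shape, x_te.shape)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the same split on every run.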

Next, we need to preprocess the data before fitting a machine learning model. This usually means scaling the features, either by standardization (each feature is rescaled to zero mean and unit variance) or by normalization (each sample is rescaled to unit norm). Below is scikit-learn's way of preprocessing the data:

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler( ).fit( x_train )
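# transform( ) returns a new array, so keep the result
standardized_x_train = scaler.transform( x_train )
standardized_x_test = scaler.transform( x_test )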

from sklearn.preprocessing import Normalizer
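normalizer = Normalizer( ).fit( x_train )
# Normalizer rescales each sample (row) to unit norm; keep the returned arrays
normalized_x_train = normalizer.transform( x_train )
normalized_x_test = normalizer.transform( x_test )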
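The two transformers act along different axes, which is easy to miss. The following sketch uses a tiny made-up matrix (`x_demo` and the other variable names are assumptions for illustration) to check that `StandardScaler` works per feature (column) while `Normalizer` works per sample (row).

```python
import numpy as np
from sklearn.preprocessing import Normalizer, StandardScaler

# Toy matrix, assumed for illustration: 3 samples (rows), 2 features (columns).
x_demo = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 60.0]])

# Standardization: every column ends up with mean 0 and standard deviation 1.
standardized = StandardScaler().fit_transform(x_demo)
col_means = standardized.mean(axis=0)
col_stds = standardized.std(axis=0)

# Normalization: every row is divided by its own L2 norm, giving unit length.
normalized = Normalizer().fit_transform(x_demo)
row_norms = np.linalg.norm(normalized, axis=1)

print(col_means, col_stds)
print(row_norms)
```

Because the statistics of `StandardScaler` are learned from the data, it is fitted on the training set only and then applied to the test set; `Normalizer` has nothing to learn, since each row is rescaled independently.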

As the next step, we fit the model to the training data. Below is an implementation of training some of the most common machine learning algorithms:

from sklearn.linear_model import LinearRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn import neighbors
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

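# The normalize parameter was removed in scikit-learn 1.2; scale the features beforehand instead (e.g. with StandardScaler)
lr = LinearRegression( )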
lr.fit( x_train, y_train )

knn = neighbors.KNeighborsClassifier( n_neighbors=5 )
knn.fit( x_train, y_train )

svc = SVC( kernel='linear' )
svc.fit( x_train, y_train )

gnb = GaussianNB( )
gnb.fit( x_train, y_train )

k_means = KMeans( n_clusters=3, random_state=0 )
k_means.fit( x_train )

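# A float n_components keeps as many components as needed to explain 95% of the variance
pca = PCA( n_components=0.95 )
reduced_x_train = pca.fit_transform( x_train )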
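To make the effect of a fractional `n_components` concrete, here is a small sketch on synthetic data (the generated `x_demo`, the mixing matrix, and the noise level are assumptions for illustration). Five features are built from only two underlying signals, so PCA should need far fewer than five components to reach the 95% threshold.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data, assumed for illustration: 2 underlying signals,
# plus 3 extra features that are noisy linear mixes of those signals.
base = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 3))
noise = 0.01 * rng.normal(size=(100, 3))
x_demo = np.hstack([base, base @ mixing + noise])

# Keep the smallest number of components whose cumulative
# explained variance exceeds 95%.
pca = PCA(n_components=0.95)
x_reduced = pca.fit_transform(x_demo)

print(pca.n_components_)                    # number of components actually kept
print(pca.explained_variance_ratio_.sum())  # share of variance retained
```

After fitting, `n_components_` reports how many components were kept, and `explained_variance_ratio_` gives the share of variance carried by each of them.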