Scikit Learn Dataset
In scikit-learn, when I do X, Y = make_moons(500, noise=0.2) and then print X and Y, I see that they look like arrays with a bunch of entries, but with no commas?
I have data that I want to use instead of the scikit-learn moons dataset, but I don't understand what data type these scikit-learn datasets are or how I can make my own data follow that type.
Solution 1:[1]
The first one, X, is a 2D array:
array([[-6.72300890e-01, 7.40277997e-01],
[ 9.60230259e-02, 9.95379113e-01],
[ 3.20515776e-02, 9.99486216e-01],
[ 8.71318704e-01, 4.90717552e-01],
....
[ 1.61911895e-01, -4.55349012e-02]])
It contains the x and y coordinates of the points.
The second part of the tuple, y, is an array containing the labels (0 or 1 for binary classification).
array([0, 0, 0, 0, 1, ... ])
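To answer the data type question directly: both X and y are plain NumPy arrays (they print without commas because that is how NumPy formats arrays, unlike Python lists). To use your own data, you only need arrays of the same shape: features as an (n_samples, 2) float array and labels as an (n_samples,) integer array. A minimal sketch, with made-up values purely for illustration:
import numpy as np

# Your own features: one row per sample, one column per feature,
# matching the (n_samples, 2) shape returned by make_moons.
my_X = np.array([
    [0.5, 1.2],
    [1.1, -0.3],
    [0.2, 0.8],
    [1.4, 0.1],
])

# Your own labels: one integer (0 or 1) per sample, matching y.
my_y = np.array([0, 1, 0, 1])

print(type(my_X), my_X.shape)  # <class 'numpy.ndarray'> (4, 2)
print(type(my_y), my_y.shape)  # <class 'numpy.ndarray'> (4,)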
To use this data in a simple classification task, you could do the following:
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
# Create dataset
X, y = make_moons(500, noise=0.2)
# Split dataset in a train part and a test part
train_X, test_X, train_y, test_y = train_test_split(X, y)
# Create the Logistic Regression classifier
log_reg = LogisticRegression()
# Fit the logistic regression classifier
log_reg.fit(train_X, train_y)
# Use the trained model to predict on the train and test samples
train_y_pred = log_reg.predict(train_X)
test_y_pred = log_reg.predict(test_X)
# Print classification report on the training data
print(classification_report(train_y, train_y_pred))
# Print classification report on the test data
print(classification_report(test_y, test_y_pred))
The results are:
On training data:

              precision    recall  f1-score   support

           0       0.88      0.87      0.88       193
           1       0.86      0.88      0.87       182

    accuracy                           0.87       375
   macro avg       0.87      0.87      0.87       375
weighted avg       0.87      0.87      0.87       375

On test data:

              precision    recall  f1-score   support

           0       0.81      0.89      0.85        57
           1       0.90      0.82      0.86        68

    accuracy                           0.86       125
   macro avg       0.86      0.86      0.86       125
weighted avg       0.86      0.86      0.86       125
As we can see, the f1-score is not very different between the train and the test set, so the model is not overfitting.
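If you prefer to compare the scores programmatically instead of reading the two reports, a minimal sketch reusing the variables from the code above (the choice of a macro-averaged f1_score here is an assumption, not part of the original answer) could be:
from sklearn.metrics import f1_score

# Macro-averaged F1 on the training and test predictions computed above.
train_f1 = f1_score(train_y, train_y_pred, average="macro")
test_f1 = f1_score(test_y, test_y_pred, average="macro")

# A large gap between the two scores would point to overfitting.
print(f"train F1: {train_f1:.2f}, test F1: {test_f1:.2f}")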
Solution 2:[2]
Let's say you have something like this:
if (condition1) {
    // some code 1
    if (condition2) {
        // some code 2
        if (condition3) {
            // some code 3
        } else {
            return false;
        }
    } else {
        return false;
    }
} else {
    return false;
}
Since the function returns false each time a condition is false, you can test each condition's negation directly and return early when it is true:
if (!condition1) {
    return false;
}
// some code 1
if (!condition2) {
    return false;
}
// some code 2
if (!condition3) {
    return false;
}
// some code 3
This doesn't reduce the number of if statements, but it avoids the deep nesting and the else branches.
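For reference, the same guard-clause pattern works just as well in Python; here is a small hypothetical sketch (the specific checks are invented for illustration, not taken from the answer above):
def process(value):
    # Each failed check returns immediately instead of adding a nesting level.
    if value is None:
        return False
    # some code 1
    if not isinstance(value, (int, float)):
        return False
    # some code 2
    if value < 0:
        return False
    # some code 3
    return True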
Solution 3:[3]
You can also try the switch statement; in many situations it produces cleaner code.
<?php
if ($i == 0) {
    echo "i equals 0";
} elseif ($i == 1) {
    echo "i equals 1";
} elseif ($i == 2) {
    echo "i equals 2";
}

switch ($i) {
    case 0:
        echo "i equals 0";
        break;
    case 1:
        echo "i equals 1";
        break;
    case 2:
        echo "i equals 2";
        break;
}
?>
The switch statement also works with strings:
<?php
switch ($i) {
    case "apple":
        echo "i is apple";
        break;
    case "bar":
        echo "i is bar";
        break;
    case "cake":
        echo "i is cake";
        break;
}
?>
Good luck! :)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Benjamin Breton |
| Solution 2 | |
| Solution 3 | Maximo Migliari |
