Agriculture and Deep Learning: Improving Soil and Crop Yields

Republished by Plato

Overview

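For many Indians, agriculture is more than an occupation. It is a way of life: it supports their livelihoods and contributes substantially to the Indian economy. Soil contains clay, sand, and silt particles in varying proportions, and determining the soil type is important both for choosing suitable crops and for identifying weed growth. This article looks at the potential of deep learning in agriculture and at why soil types and weed detection matter in India.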

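Deep learning is an emerging technology that is finding use across many fields. It has been widely applied to smart agriculture at various scales, including field monitoring, field operations, robotics, prediction of soil, water, and climate conditions, and landscape-level monitoring of land and crop types. For soil, we can feed photographs into a deep learning architecture, let it learn how to detect features, and then use the trained architecture to classify the soil.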

In this blog, we discuss the importance of soil in agriculture and classify soil using machine learning and deep learning models.

Learning Objectives

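You will understand how important soil is in agriculture.
You will learn how machine learning algorithms classify soil types.
You will implement a deep learning model to classify soil types.
You will explore multi-stack ensemble learning as a way to improve prediction accuracy.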

This article was published as part of the Data Science Blog.

Role of Soil in Agriculture

Organic matter, minerals, gases, liquids, and other substances released by plants and animals form soil, which is the foundation of agriculture.

The Indian economy depends heavily on agriculture. Soil is essential for crops, and its fertility also encourages unwanted weeds.

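Moisture and temperature are physical variables that affect the formation of pores and particles in the soil, which in turn affects root growth, water infiltration, and the rate at which plants emerge.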

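Soil consists mainly of sand and clay particles. Of the commonly available soil particles, clay is the most abundant at the site studied. Clay particles are available at the surface because of the rich supply of nutrients there. Peat and loam are almost absent. Clay-type soil retains water between its particles.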

Dataset

Kaggle link

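Feature extraction is one of the main steps in building a good model. It is important to identify the features that a machine learning algorithm is likely to need. We use the mahotas library to extract Haralick features, which capture the spatial and texture information of an image.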
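Haralick features are statistics computed from a gray-level co-occurrence matrix (GLCM), which counts how often pairs of gray levels appear next to each other. The sketch below is a simplified, pure-Python illustration of that idea for a single offset and a single statistic (contrast). It is not what mahotas runs internally, and the names `glcm` and `contrast` are hypothetical helpers introduced only for this example.

```python
def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1  # pair seen left-to-right
                counts[b][a] += 1  # same pair counted in the reverse direction
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Contrast statistic: large when neighboring gray levels differ a lot."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Two toy 4x4 "textures" with two gray levels (0 and 1)
flat = [[0] * 4 for _ in range(4)]
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print("flat contrast:", contrast(glcm(flat, levels=2)))
print("checker contrast:", contrast(glcm(checker, levels=2)))
```

A uniform patch has zero contrast, while a checkerboard, where every horizontal neighbor differs, has the maximum contrast possible for two gray levels. Real soil photographs fall in between, which is what makes such texture statistics useful as features.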

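We use the skimage library to convert the image to grayscale and to extract Histogram of Oriented Gradients (HOG) features, which are useful for object detection. Finally, we concatenate the feature values into a single array and later use it in the machine learning and deep learning algorithms.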

import mahotas as mh
from skimage import color, feature, io
import numpy as np

# Function to extract features from an image
def extract_features(image_path):
    img = io.imread(image_path)
    gray_img = color.rgb2gray(img)  # Converting image to grayscale
    
    # Converting the grayscale image to integer type
    gray_img_int = (gray_img * 255).astype(np.uint8)
    
    # Extracting Haralick features using mahotas
    haralick_features = mh.features.haralick(gray_img_int).mean(axis=0)
    
    # Extracting Histogram of Gradients (HOG) features
    hog_features, _ = feature.hog(gray_img, visualize=True)
    
    # Printing the first few elements of each feature array
    print("Haralick Features:", haralick_features[:5])
    print("HOG Features:", hog_features[:5])
    
    # Concatenating the features into a single array
    all_features = np.concatenate((haralick_features, hog_features))
    
    return all_features

image_path = '/kaggle/input/soil-classification-dataset/Soil-Dataset/Yellow Soil/20.jpg'
features = extract_features(image_path)
print("Extracted Features:", features)
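One thing to keep in mind with this function is that the length of the HOG vector depends on the image size, so images of different sizes produce feature arrays of different lengths. The sketch below estimates that length by hand. It assumes the default settings of `skimage.feature.hog` (9 orientations, 8x8 pixels per cell, 3x3 cells per block, blocks sliding one cell at a time); `hog_feature_length` is a hypothetical helper, not part of skimage.

```python
def hog_feature_length(height, width, orientations=9,
                       pixels_per_cell=(8, 8), cells_per_block=(3, 3)):
    """Estimate the HOG descriptor length for an image of the given size."""
    # Number of whole cells that fit along each axis
    cells_r = height // pixels_per_cell[0]
    cells_c = width // pixels_per_cell[1]
    # Blocks slide over the cell grid one cell at a time
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1
    if blocks_r < 1 or blocks_c < 1:
        return 0  # image too small to hold a single block
    values_per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return blocks_r * blocks_c * values_per_block


for size in [(224, 224), (128, 64)]:
    print(size, "->", hog_feature_length(*size))
```

Under these assumptions a 224x224 image yields 54,756 HOG values, against only 13 Haralick values. Classical classifiers need every sample to have the same number of features, so the images either have to be resized to a common size before extraction or the HOG vector has to be cut to a fixed length.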

Machine Learning Algorithms for Soil Classification

Now let's build a machine learning model using the soil images from Kaggle.

First we import all the libraries and define the extract_features function, which reads an image, converts it to grayscale, and returns its features. After extracting the features of every image, we encode the labels with LabelEncoder.
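Label encoding maps each distinct class to an integer, with the classes taken in sorted order. The sketch below reproduces that mapping in plain Python for the five folder names used in this dataset. `fit_label_mapping` is a hypothetical helper written for illustration, not scikit-learn's implementation.

```python
def fit_label_mapping(labels):
    """Map each distinct label to an integer, in sorted order."""
    classes = sorted(set(labels))
    return {name: index for index, name in enumerate(classes)}


folder_names = ["Yellow Soil", "Peat Soil", "Black Soil",
                "Laterite Soil", "Cinder Soil", "Peat Soil"]
mapping = fit_label_mapping(folder_names)
encoded = [mapping[name] for name in folder_names]

print(mapping)
print(encoded)
```

Because the mapping follows sorted order, it agrees with the class_indices dictionary used in the listing that follows. In that listing the labels are already integers, so LabelEncoder leaves them unchanged.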

import os
import numpy as np
import mahotas as mh
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report
from skimage import color, feature, io

# Function to extract features from an image
def extract_features(image_path):
    img = io.imread(image_path)
    gray_img = color.rgb2gray(img)  # Converting image to grayscale
    gray_img_int = (gray_img * 255).astype(np.uint8)
    haralick_features = mh.features.haralick(gray_img_int).mean(axis=0)
    hog_features, _ = feature.hog(gray_img, visualize=True)
    hog_features_flat = hog_features.flatten()  # Flattening the HOG features
    # Keeping only as many HOG values as there are Haralick features (13),
    # so every image yields a feature vector of the same fixed length
    hog_features_flat = hog_features_flat[:haralick_features.shape[0]]
    return np.concatenate((haralick_features, hog_features_flat))

data_dir = "/kaggle/input/soil-classification-dataset/Soil-Dataset"

image_paths = []
labels = []

class_indices = {'Black Soil': 0, 'Cinder Soil': 1, 'Laterite Soil': 2, 
'Peat Soil': 3, 'Yellow Soil': 4}

for soil_class, class_index in class_indices.items():
    class_dir = os.path.join(data_dir, soil_class)
    class_images = [os.path.join(class_dir, image) for image in os.listdir(class_dir)]
    image_paths.extend(class_images)
    labels.extend([class_index] * len(class_images))

# Extracting features from images
X = [extract_features(image_path) for image_path in image_paths]

# Encoding labels
le = LabelEncoder()
y = le.fit_transform(labels)

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initializing and training a Random Forest Classifier
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X_train, y_train)

# Making predictions
y_pred_rf = rf_classifier.predict(X_test)

# Evaluating the Random Forest model
accuracy_rf = accuracy_score(y_test, y_pred_rf)
report_rf = classification_report(y_test, y_pred_rf)

print("Random Forest Classifier:")
print("Accuracy:", accuracy_rf)
print("Classification Report:\n", report_rf)
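To make the numbers in the classification report easier to read, here is a minimal sketch of how accuracy, precision, and recall are derived from true and predicted labels. The labels below are made up for illustration and are not results from the soil dataset; `per_class_report` is a hypothetical helper, not a scikit-learn function.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return overall accuracy and a {label: (precision, recall)} dictionary."""
    true_counts = Counter(y_true)   # how many samples truly belong to each class
    pred_counts = Counter(y_pred)   # how many samples were predicted as each class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(hits.values()) / len(y_true)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = hits[label] / pred_counts[label] if pred_counts[label] else 0.0
        recall = hits[label] / true_counts[label] if true_counts[label] else 0.0
        report[label] = (precision, recall)
    return accuracy, report


# Made-up labels for three classes
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]

accuracy_demo, report_demo = per_class_report(y_true_demo, y_pred_demo)
print("Accuracy:", accuracy_demo)
for label, (precision, recall) in report_demo.items():
    print(label, "precision:", precision, "recall:", recall)
```

Precision answers "of the images predicted as this soil type, how many really were?", while recall answers "of the images of this soil type, how many did the model find?". A class can score well on one and poorly on the other, which a single accuracy figure hides.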

Deep Neural Networks

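A deep neural network is built from computational units called neurons. Each neuron takes inputs and produces an output. Such networks are used to increase accuracy and make better predictions, whereas classical machine learning algorithms rely on an interpretation of the data, such as hand-crafted features, and make decisions on that basis.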

Now let's build a model defined with Keras's Sequential API. The model has a Conv2D convolutional layer, a MaxPooling2D layer, a Flatten layer, and Dense layers.

Finally, the model is compiled with the Adam optimizer and categorical cross-entropy loss.
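The size of the layer stack just described can be worked out by hand. The sketch below assumes the settings used in the listing that follows (224x224 RGB input, 32 filters of size 3x3 with no padding, 2x2 pooling, a 64-unit dense layer, and 5 output classes). The helper functions are hypothetical and exist only for this arithmetic.

```python
def conv2d_valid(shape, filters, kernel):
    """Output shape and parameter count of an unpadded, stride-1 convolution."""
    height, width, channels = shape
    out_shape = (height - kernel + 1, width - kernel + 1, filters)
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return out_shape, params


def max_pool(shape, pool):
    """Output shape of non-overlapping max pooling (no parameters)."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def dense_params(units_in, units_out):
    """Parameter count of a fully connected layer: weights + biases."""
    return units_in * units_out + units_out


conv_shape, conv_params = conv2d_valid((224, 224, 3), filters=32, kernel=3)
pool_shape = max_pool(conv_shape, pool=2)
flat_units = pool_shape[0] * pool_shape[1] * pool_shape[2]
hidden_params = dense_params(flat_units, 64)
output_params = dense_params(64, 5)
total_params = conv_params + hidden_params + output_params

print("After Conv2D:", conv_shape, "params:", conv_params)
print("After MaxPooling2D:", pool_shape)
print("After Flatten:", flat_units)
print("Dense(64) params:", hidden_params)
print("Dense(5) params:", output_params)
print("Total params:", total_params)
```

Almost all of the roughly 25 million parameters sit in the first dense layer, because Flatten hands it 394,272 inputs. Adding more convolution and pooling stages before Flatten is the usual way to shrink that number.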

import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory

data_dir = "/kaggle/input/soil-classification-dataset/Soil-Dataset"

# Setting up data generators
batch_size = 32
image_size = (224, 224)

# Using image_dataset_from_directory to load and preprocess the images
train_dataset = image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='categorical',
    validation_split=0.2,
    subset='training',
    seed=42,
    image_size=image_size,
    batch_size=batch_size,
)

validation_dataset = image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='categorical',
    validation_split=0.2,
    subset='validation',
    seed=42,
    image_size=image_size,
    batch_size=batch_size,
)

# Displaying the class indices
print("Class indices:", train_dataset.class_names)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(len(train_dataset.class_names), activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Training the model
epochs = 10
history = model.fit(train_dataset, epochs=epochs, validation_data=validation_dataset)

import numpy as np
from tensorflow.keras.preprocessing import image

# Function to load and preprocess an image for prediction
def load_and_preprocess_image(img_path):
    img = image.load_img(img_path, target_size=image_size)
    img_array = image.img_to_array(img)
    img_array = np.expand_dims(img_array, axis=0)
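    # No division by 255 here: the model above was trained on raw 0-255 pixel
    # values, so the prediction input must stay on the same scale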
    return img_array

image_path = '/kaggle/input/soil-classification-dataset/Soil-Dataset/Peat Soil/13.jpg'
new_image = load_and_preprocess_image(image_path)

# Making predictions
predictions = model.predict(new_image)
predicted_class = np.argmax(predictions[0])

# Getting the class label based on the class indices
class_labels = {0: 'Black Soil', 1: 'Cinder Soil', 2: 'Laterite Soil',
 3: 'Peat Soil', 4: 'Yellow Soil'}
predicted_label = class_labels[predicted_class]

# Displaying the prediction
print("Predicted Class:", predicted_class)
print("Predicted Label:", predicted_label)

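In the run shown, the predicted class is 0, i.e. Black Soil. The input image was taken from the Peat Soil folder (class 3), so this prediction does not match the folder label. A single prediction like this should not be read as evidence that the model classifies soil types correctly; the validation accuracy reported during training is the better guide.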
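The prediction step relies on two small pieces of math: the softmax output layer turns the network's raw scores into class probabilities, and np.argmax picks the most probable class. During training, categorical cross-entropy penalizes a low probability on the true class. The sketch below shows both in plain Python. The scores are made up, and the functions are simplified stand-ins for what Keras computes.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Made-up raw scores for the five soil classes
scores = [2.0, 0.5, 0.1, 3.0, 0.2]
probs = softmax(scores)
predicted_index = probs.index(max(probs))  # plain-Python equivalent of argmax

loss_if_peat = categorical_crossentropy([0, 0, 0, 1, 0], probs)
loss_if_black = categorical_crossentropy([1, 0, 0, 0, 0], probs)

print("Probabilities:", [round(p, 3) for p in probs])
print("Predicted index:", predicted_index)
print("Loss if the true class is 3:", round(loss_if_peat, 3))
print("Loss if the true class is 0:", round(loss_if_black, 3))
```

The loss is smaller when the true class is the one the model favors, which is the signal the optimizer uses to adjust the weights.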

Proposed Multi-Stack Ensemble Learning Model Architecture

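The StackingClassifier is initialized with base_classifiers as its base estimators and a logistic regression meta-classifier as final_estimator. It combines the outputs of the base classifiers to make the final prediction. After training and prediction, the accuracy is computed. The listing reuses the imports and the X_train/X_test split from the random forest listing above.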

base_classifiers = [
    ('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
    ('knn', KNeighborsClassifier(n_neighbors=5)),
    ('svm', SVC(kernel='rbf', C=1.0, probability=True)),
    ('nb', GaussianNB())
]

# Initializing the stacking classifier with a logistic regression meta-classifier
stacking_classifier = StackingClassifier(estimators=base_classifiers, 
final_estimator=LogisticRegression())

# Training the stacking classifier
stacking_classifier.fit(X_train, y_train)

# Making predictions with Stacking Classifier
y_pred_stacking = stacking_classifier.predict(X_test)

# Evaluating the Stacking Classifier model
accuracy_stacking = accuracy_score(y_test, y_pred_stacking)
report_stacking = classification_report(y_test, y_pred_stacking)

print("\nStacking Classifier:")
print("Accuracy:", accuracy_stacking)
print("Classification Report:\n", report_stacking)
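The key idea behind stacking is that the meta-classifier is trained on predictions the base models made for samples they did not see during fitting (out-of-fold predictions). The sketch below illustrates that mechanism in plain Python with two toy base learners and made-up 2-D points. It is a simplified illustration under those assumptions, not scikit-learn's implementation, which by default stacks class probabilities rather than hard labels. All names here are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_nearest_centroid(X, y):
    """Toy base learner 1: predict the class whose mean point is closest."""
    grouped = {}
    for row, label in zip(X, y):
        grouped.setdefault(label, []).append(row)
    centroids = {label: [sum(col) / len(rows) for col in zip(*rows)]
                 for label, rows in grouped.items()}
    return lambda row: min(centroids,
                           key=lambda label: squared_distance(row, centroids[label]))


def fit_one_nearest_neighbor(X, y):
    """Toy base learner 2: predict the label of the closest training point."""
    def predict(row):
        distances = [squared_distance(row, other) for other in X]
        return y[distances.index(min(distances))]
    return predict


def out_of_fold_meta_features(X, y, fitters, k=3):
    """One column per base learner, filled from models that never saw that row."""
    n = len(X)
    meta = [[None] * len(fitters) for _ in range(n)]
    folds = [list(range(start, n, k)) for start in range(k)]  # interleaved folds
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in range(n) if i not in held_out]
        train_y = [y[i] for i in range(n) if i not in held_out]
        for column, fit in enumerate(fitters):
            predict = fit(train_X, train_y)
            for i in fold:
                meta[i][column] = predict(X[i])
    return meta


# Made-up, well-separated 2-D points standing in for image features
X_demo = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
y_demo = [0, 0, 0, 1, 1, 1]

meta_features = out_of_fold_meta_features(
    X_demo, y_demo, [fit_nearest_centroid, fit_one_nearest_neighbor])
print(meta_features)
```

Each row of meta_features holds what the base learners said about one sample, and the meta-classifier (logistic regression in the listing above) is then trained on these rows together with the true labels. Using out-of-fold predictions keeps the meta-classifier from simply trusting base models that memorized their training data.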

Conclusion

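Soil is a key factor in producing a good crop, and it is important to know which soil type a particular crop needs. Classifying soil types therefore matters. Doing it manually is time-consuming, so using deep learning models makes the task easier. Many machine learning and deep learning models can be applied to this problem. Choosing the best one depends on the quality and quantity of data in the dataset and on the problem at hand. Another way to choose is to evaluate each candidate, which we can do by measuring how accurately it classifies the soil. Finally, we implemented a multi-stacking ensemble model, which combines several models to build a stronger one.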