Abstract
Coffee is a leading plantation commodity in the export sector and has high economic value. Quality is the most important factor affecting its selling price, so quality assessment is key to setting market prices and determining the export potential of coffee-producing countries. Coffee quality is divided into specialty, premium, and regular grades based on bean defects and taste test scores. Predicting coffee quality makes it possible to identify the best-quality coffee. This study compares the Random Forest and K-Nearest Neighbor (KNN) methods to determine which algorithm predicts coffee quality more effectively. Random Forest builds multiple decision trees and determines the predicted class by majority voting among them. KNN classifies a sample based on its distance to other samples, assigning it the class most common among its nearest neighbors. The coffee dataset is sourced from the Coffee Quality Institute (CQI) Database. Because the data are imbalanced, which results in a low recall value for the minority class, the SMOTE oversampling algorithm is used to improve classification performance. The advantage of oversampling over undersampling is that no information is lost from the data. The results show that Random Forest after SMOTE produced the best classification performance, with accuracy and recall values of 80.26% and 80.59%, respectively.
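
As an illustration of the two ideas the abstract relies on, the following minimal Python sketch shows SMOTE-style oversampling of a minority class followed by KNN majority voting. It is not the study's implementation: the function names `smote` and `knn_predict`, the two-feature toy data, the two-class setup, and the parameter values are all assumptions made for the example, and the data are invented rather than taken from the CQI Database.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples, each interpolated between a
    randomly chosen minority point and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbors of `base`, excluding `base` itself.
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy, made-up data (hypothetical, two features per sample).
majority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.8], [1.3, 1.0]]
minority = [[3.0, 3.0], [3.2, 2.9], [2.9, 3.1]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority))
X = majority + minority + new_points
y = ["regular"] * len(majority) + ["specialty"] * (len(minority) + len(new_points))

print(Counter(y))
print(knn_predict(X, y, [3.0, 2.95], k=3))
```

Because every synthetic point is an interpolation between two existing minority samples, oversampling adds data without discarding any majority samples, which is the advantage over undersampling noted above. In practice, a maintained library implementation would normally be preferred over a hand-written one.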