The photoelectric conversion efficiency (PCE) of perovskites remains below the Shockley-Queisser limit, despite their significant potential for solar cell applications. Current research therefore focuses on exploring multicomponent perovskite candidates, and in particular on applying machine learning to accelerate band gap screening. To identify high-performance perovskites efficiently, we used a data set of 1346 hybrid organic-inorganic perovskites and trained 11 machine learning models, including decision-tree-based models, convolutional neural networks (CNNs), and graph neural networks (GNNs). Four descriptors were used for high-throughput screening: the sine matrix, the Ewald sum matrix, atom-centered symmetry functions (ACSF), and the many-body tensor representation (MBTR). Among the decision-tree-based models, LightGBM and CatBoost slightly outperformed XGBoost, whereas random forest lagged behind. Among the CNN models, which used the same four descriptors, CustomCNN and VGG16 outperformed Xception, while EfficientNetV2B0 performed worst. When the sine matrix and the Ewald sum matrix served as adjacency matrices in the GNN models, GCSConv showed a considerable improvement over GATConv and a slight advantage over GCNConv. Notably, GCSConv outperformed the other models when used with the Ewald sum matrix. The best combination of descriptor and algorithm was MBTR + CustomCNN, with an R² of 0.94. Subsequently, three perovskites with suitable Heyd-Scuseria-Ernzerhof (HSE06) band gaps were selected for defect characterization. Among them, CH3C(NH2)2SnI3 performed better than C3H8NSnI3 and (CH3)2NH2SnI3 with respect to both vacancy and substitutional defects. This machine-learning-based high-throughput screening method establishes a robust foundation for selecting solar materials with excellent photoelectric properties.
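For readers unfamiliar with the metric used to rank the descriptor and model combinations, the following is a minimal sketch of how an R² score can be computed for predicted band gaps. It assumes that the reported R² denotes the standard coefficient of determination, 1 - SS_res / SS_tot. The function name `r2_score` and the sample values are hypothetical illustrations and are not taken from the study.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the standard definition; the study's exact
    evaluation protocol (train/test split, averaging) is not stated here.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = fmean(y_true)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all reference values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted band gaps in eV (not data from the study).
reference_gaps = [1.20, 1.55, 2.10, 2.80, 3.40]
predicted_gaps = [1.30, 1.50, 2.00, 2.90, 3.30]
score = r2_score(reference_gaps, predicted_gaps)
print(f"R^2 = {score:.3f}")
```

Under this definition, a score of 1 corresponds to perfect predictions and a score of 0 to a model that does no better than predicting the mean band gap, so a value of 0.94 means that about 94% of the variance in the reference band gaps is captured by the model.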