Objectives: To assess the level of agreement between experts in distinguishing septate from normal/arcuate uterus, using their subjective judgment when reviewing the coronal view of the uterus on three-dimensional ultrasound. A secondary aim was to determine the interobserver reliability and diagnostic test accuracy of three measurements suggested by recent guidelines, using as the reference standard the decision made most often by experts (Congenital Uterine Malformation by Experts (CUME)). Methods: Images of the coronal plane of the uterus from 100 women with suspected fundal internal indentation were anonymized and provided to 15 experts (five clinicians, five surgeons and five sonologists). They were instructed to indicate whether they believed the uterus to be normal/arcuate (defined as normal uterine morphology or a degree of distortion caused by internal indentation that is not clinically relevant) or septate (a clinically relevant degree of distortion caused by internal indentation). Two other observers independently measured indentation depth, indentation angle and indentation-to-wall-thickness (I:WT) ratio. Agreement between experts was assessed using kappa, and interobserver reliability was assessed using the concordance correlation coefficient (CCC). With the choice made most often by the experts (CUME) as the reference standard, diagnostic test accuracy was assessed using the area under the receiver–operating characteristics curve (AUC) and the best cut-off value was determined using Youden's index. Results: There was good agreement between all experts (kappa, 0.62).
There were 18 septate and 82 normal/arcuate uteri according to CUME. The European Society of Human Reproduction and Embryology (ESHRE)-European Society for Gynaecological Endoscopy (ESGE) criteria (I:WT ratio > 50%) defined 80 septate and 20 normal/arcuate uteri, while the American Society for Reproductive Medicine (ASRM) criteria defined five septate (depth > 15 mm and angle < 90°), 82 normal/arcuate (depth < 10 mm and angle > 90°) and 13 uteri that could not be classified (referred to as the gray-zone). The agreement between ESHRE-ESGE and CUME was 38% (kappa, 0.1); the agreement between ASRM criteria and CUME for septate was 87% (kappa, 0.39) and, when both septate and gray-zone uteri were considered as septate, it was 98% (kappa, 0.93). Among the three measurements, the interobserver reproducibility of indentation depth (CCC, 0.99; 95% CI, 0.98–0.99) was better than that of both indentation angle (CCC, 0.96; 95% CI, 0.94–0.97) and I:WT ratio (CCC, 0.92; 95% CI, 0.90–0.94). The diagnostic test accuracy of these three measurements using CUME as the reference standard was very good, with AUC between 0.96 and 1.00. The best cut-off values for these measurements to define septate uterus were: indentation depth ≥ 10 mm, indentation angle < 140° and I:WT ratio > 110%. Conclusions: The suggested ESHRE-ESGE cut-off value overestimates the prevalence of septate uterus, while that of ASRM underestimates it, leaving in the gray-zone most of the uteri that experts considered septate. We recommend considering indentation depth ≥ 10 mm as septate, since the measurement is simple and reliable and this criterion is in agreement with expert opinion. Copyright © 2017 ISUOG. Published by John Wiley & Sons Ltd.
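
The three classification rules compared above can be written out as a minimal Python sketch, which may help make the differences between the thresholds concrete. The `Indentation` container, the function names and the example measurements are hypothetical and introduced only for illustration; the thresholds are those quoted in this abstract, and the sketch is not a clinical tool or the authors' own software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indentation:
    """Hypothetical container for the three measurements named in the abstract."""
    depth_mm: float       # indentation depth, in millimetres
    angle_deg: float      # indentation angle, in degrees
    iwt_ratio_pct: float  # indentation-to-wall-thickness (I:WT) ratio, in percent


def classify_eshre_esge(m: Indentation) -> str:
    # As quoted in the abstract: I:WT ratio > 50% defines septate.
    return "septate" if m.iwt_ratio_pct > 50 else "normal/arcuate"


def classify_asrm(m: Indentation) -> str:
    # As quoted in the abstract: septate if depth > 15 mm and angle < 90 degrees;
    # normal/arcuate if depth < 10 mm and angle > 90 degrees;
    # anything else cannot be classified (gray-zone).
    if m.depth_mm > 15 and m.angle_deg < 90:
        return "septate"
    if m.depth_mm < 10 and m.angle_deg > 90:
        return "normal/arcuate"
    return "gray-zone"


def classify_depth_cutoff(m: Indentation) -> str:
    # Criterion recommended in the conclusions: indentation depth >= 10 mm.
    return "septate" if m.depth_mm >= 10 else "normal/arcuate"


# Made-up measurements for illustration only (not taken from the study).
example = Indentation(depth_mm=12.0, angle_deg=100.0, iwt_ratio_pct=120.0)
print(classify_eshre_esge(example))    # septate
print(classify_asrm(example))          # gray-zone
print(classify_depth_cutoff(example))  # septate
```

In this made-up case the depth lies between the two ASRM thresholds, so the ASRM rule returns gray-zone while the depth cut-off of 10 mm returns septate, which is the pattern the conclusions describe.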