Introduction

Intensive marine aquaculture in the Mediterranean has become one of the most important parts of the European Union primary production sector (HAPO 2023). This development is supported by continuous improvements in multiple aspects of the production cycle, e.g., fish handling and nutrition, fish health, and equipment (Føre et al. 2018; Li and Du 2022; Wu et al. 2022). A crucial requirement for the effective monitoring and management of production is knowledge, ideally at any time, of fish body weight (BW) within sea cages (Tonachella et al. 2022; Chatziantoniou et al. 2023). Frequent evaluation of fish BW together with feed intake allows estimation of fish growth performance and feed efficiency and thus supports decisions regarding, e.g., grading, feeding, and harvest (Tonachella et al. 2022; Chatziantoniou et al. 2023).

In the aquaculture practice of marine Mediterranean fish species, the BW of the fish inside a cage is estimated through scheduled manual weighing of several small fish groups (Tonachella et al. 2022; Shi et al. 2023). The procedure is labor-intensive and time-consuming and can be stressful to fish. Moreover, the estimate is prone to error since only a small sample of fish is weighed. Acknowledging the need for a simpler and more accurate procedure, as well as for a “noninvasive, objective and repeatable measurement” (Li et al. 2020), several intelligent and automatic systems for BW estimation, mainly based on computer vision, have been developed, while research interest in this field is continuously growing (e.g., Li et al. 2020; Li and Du 2022; Tonachella et al. 2022).

No underwater equipment can weigh the fish directly; instead, it can acquire images in which fish morphometric traits are visible and measurable. Thus, an important feature of any automatic weight-estimator system is the precise conversion of fish morphometric traits visible in the image to fish BW (Viazzi et al. 2015; Li et al. 2020; Holmes and Jeffres 2021; Li and Du 2022). The most common way to achieve this goal is to establish accurate regression models that predict fish BW from one or more traits. The weight-length relationship has been widely used for predictions of fish BW from total, standard, or fork length (e.g., gilthead seabream Sparus aurata, meagre Argyrosomus regius, red porgy Pagrus pagrus, Navarro et al. 2016; red tilapia Oreochromis niloticus, Jongjaraunsuk and Taparhudee 2022; gilthead seabream, Tonachella et al. 2022). However, available data suggest that the use of other morphometric data (e.g., surface area, body height) or the use of multiple traits can provide more accurate predictions (Alaskan salmon species Oncorhynchus sp., Balaban et al. 2010a; Alaskan pollock Theragra chalcogramma, Balaban et al. 2010b; rainbow trout Oncorhynchus mykiss, Gümüş and Balaban 2010; herring Clupea harengus, Mathiassen et al. 2011; European seabass Dicentrarchus labrax larvae, de Verdal et al. 2014, Azevedo et al. 2023; Jade perch Scortum barcoo, Viazzi et al. 2015; Asian seabass Lates calcarifer, Konovalov et al. 2018, 2019; Nile tilapia, Fernandes et al. 2020; Taparhudee and Jongjaraunsuk 2023; European catfish Silurus glanis and African catfish Clarias gariepinus, Gümüş et al. 2021; Chinook salmon Oncorhynchus tshawytscha, Holmes and Jeffres 2021; Australasian snapper Chrysophrys auratus, Yang et al. 2021; scaled and mirror carp Cyprinus carpio, Gümüş et al. 2023).

Given the importance of gilthead seabream Sparus aurata as a major reared species of intensive Mediterranean aquaculture, the objective of the present study was to establish the regression models needed to predict BW from traits measurable in images. The present data covered a BW range from 50 to 1000 g, in order to satisfy the needs of the main part of the production cycle in sea cages.

Materials and methods

Fish

A total of 3312 gilthead seabream were used in this study. Within the range of the targeted body weight (BW), i.e., 50–1000 g, special care was taken to have a minimum of 50 observations per 25-g weight class (a total of 38 classes; in 3 out of 38 classes, the number of observations is 39–45 due to unsuccessful camera focus; see “Measurements”) (Table 1). The main bulk of samples (from 175 up to 1000 g) was obtained in a commercial packaging plant. In brief, harvested fish were slaughtered on site in 1 m³ bins with ice-seawater slurry and were then transported to the packaging plant in refrigerator trucks. Upon arrival, fish were distributed according to their weight class (automatic sorting) and packaged (whole, ungutted, ventral side upwards), with flaked ice, in polystyrene boxes at 1–2 °C. The present measurements were performed on fish that had not reached full rigor mortis and not later than 12 h from slaughter. Fish of lower BW (from 50 up to 175 g) were obtained from a commercial fish farm. Although fish with BW ≤ 175 g were found among harvested fish in the packaging plant, they were not considered representative of robust, intensively growing fish. When the fish of a sea cage reach mean commercial weight and harvest begins, fish of much lower and much larger BW than the commercial size are also present, are harvested, and reach the packaging plant. However, the much smaller fish (i.e., <175 g) are fish with low growth rates and/or fish that did not manage to feed well, and thus they were not suitable for inclusion in the present study. Instead, fish of 50–175 g still growing in sea cages were considered much more appropriate. On the farm, fish were netted from their cage and anesthetized in buckets. After weighing and photographing, fish recovered in buckets with clean sea water and were returned to their cage.
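The weight-class layout described above (25-g classes spanning 50–1000 g, i.e., 38 classes, each with a target of at least 50 observations) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the handling of class limits (left-closed classes, with a fish of exactly 1000 g placed in the last class) is an assumption, since the text does not specify it.

```python
from collections import Counter

CLASS_WIDTH_G = 25
BW_MIN_G, BW_MAX_G = 50, 1000
N_CLASSES = (BW_MAX_G - BW_MIN_G) // CLASS_WIDTH_G  # 38 classes


def weight_class(bw_g):
    """Return the 0-based weight-class index, or None outside 50-1000 g.

    Assumption: classes are left-closed (50 <= BW < 75, ...), and a fish
    of exactly 1000 g is placed in the last class.
    """
    if not BW_MIN_G <= bw_g <= BW_MAX_G:
        return None
    index = int((bw_g - BW_MIN_G) // CLASS_WIDTH_G)
    return min(index, N_CLASSES - 1)


def underfilled_classes(weights_g, minimum=50):
    """List the class indices holding fewer than `minimum` observations."""
    counts = Counter(
        c for c in (weight_class(w) for w in weights_g) if c is not None
    )
    return [c for c in range(N_CLASSES) if counts[c] < minimum]
```

Running `underfilled_classes` on the recorded weights during sampling would indicate which classes still need additional fish.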

Table 1 Frequency tabulation of body weight for gilthead seabream used in the present study

Measurements

Each fish was individually weighed (scale precision 0.2 g for fish > 175 g and 0.1 g for fish < 175 g), after wiping off surplus water, and photographed. A camera (Canon, EOS M50) was mounted on a tripod over the scale and took lateral pictures of each fish. A ruler placed next to the fish was included in each photo to be used as a reference during measurement of the morphometric traits. The measurements were performed using image analysis software (Image-Pro Plus, v. 6.0). The software was set so that once the necessary landmarks were marked in the photo, the desired lengths (mm, Fig. 1), previously calibrated with reference to the ruler, were exported to an Excel file. More precisely, the morphometric traits measured were total body length (TBL), fork body length (FBL), standard body length (SBL), body height (BH), head length (HL), and eye diameter (ED). Also, for each fish, the contour of its body (without fins), head, and eye was used to measure body area (BA), head area (HA), and eye area (EA) (one side, mm²).
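The calibration and measurement steps described above can be illustrated with a minimal sketch. It assumes that landmarks and contours are available as pixel coordinates and that the ruler provides a segment of known length; the internal procedure of Image-Pro Plus is not described in the text, so the function names and the use of the shoelace formula for the contour area are assumptions made purely for illustration.

```python
import math


def mm_per_pixel(ruler_px, ruler_mm):
    """Scale factor derived from a ruler segment of known length in the image."""
    return ruler_mm / ruler_px


def distance_mm(p, q, scale):
    """Distance between two landmarks (pixel coordinates), converted to mm."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) * scale


def polygon_area_px(contour):
    """Shoelace formula: area (px^2) enclosed by an ordered list of (x, y) points."""
    twice_area = 0.0
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]  # wrap around to close the contour
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def area_mm2(contour, scale):
    """Contour area converted to mm^2 (areas scale with the square of the factor)."""
    return polygon_area_px(contour) * scale ** 2
```

For example, if a 100-mm ruler segment spans 400 pixels, the scale is 0.25 mm per pixel; a length of 800 pixels then corresponds to 200 mm, and an 800 × 320 pixel rectangle to 16,000 mm².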

Fig. 1 Morphometric traits measured. 1–2: total body length (TBL, mm); 1–3: fork body length (FBL, mm); 1–4: standard body length (SBL, mm); 5–6: body height (BH, mm); 1–7: head length (HL, mm); 8–9: eye diameter (ED, mm)

Data analysis

All statistical analyses were carried out using Stata 18 software (Stata Corp, College Station, TX, USA). Data analysis was performed on the whole data set (i.e., 50–1000 g) and on three subgroups (Subgroup 1, S1: 50–100 g; Subgroup 2, S2: 100–500 g; Subgroup 3, S3: 500–1000 g). The weight ranges of the chosen subgroups correspond to specific weight classes of importance to gilthead seabream intensive rearing. In particular, S1 has the fastest growing fish, while growth rate decreases as fish grow from S1 to S3; S2 includes fish that reach the commercial size, and S3 includes larger fish which are usually directed to special markets (e.g., restaurants), filleted, or selected as possible parents in selective breeding programs. The data were analyzed both as a whole and in the three subgroups in order to investigate whether more accurate BW prediction models can be obtained when the BW range is more limited.
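The partition of the data into the whole set and the three subgroups can be sketched as below. The subgroup limits are those given above; because adjacent ranges share a limit, the assignment of a fish weighing exactly 100 g or 500 g to the lower subgroup is an assumption, as are the function names and the record layout.

```python
# Subgroup limits in grams, as stated in the text.
SUBGROUPS = (("S1", 50, 100), ("S2", 100, 500), ("S3", 500, 1000))


def assign_subgroup(bw_g):
    """Return 'S1', 'S2' or 'S3', or None if BW is outside 50-1000 g.

    Assumption: a fish lying exactly on a shared limit (100 g or 500 g)
    is assigned to the lower subgroup, because the first match wins.
    """
    for name, low, high in SUBGROUPS:
        if low <= bw_g <= high:
            return name
    return None


def split_by_subgroup(records):
    """Group records whose first element is BW (g); 'ALL' holds the whole set."""
    groups = {"ALL": [], "S1": [], "S2": [], "S3": []}
    for record in records:
        name = assign_subgroup(record[0])
        if name is not None:
            groups["ALL"].append(record)
            groups[name].append(record)
    return groups
```

Each of the four resulting groups can then be passed through the same correlation and regression steps.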

For the whole data set and for each of the above-mentioned subgroups, Spearman rank correlations between BW and all morphometric traits measured, as well as the ratios BA/SBL, BA/BH, BA/HA, BA/EA, and SBL/BH, were calculated. Correlation coefficients were assessed according to Asuero et al. (2006) and Ratner (2009); special emphasis was given to identifying a very high association of BW with a given morphometric trait (i.e., r > 0.95). Based on the correlation analysis results, and taking into consideration that the fins may not be easily distinguished underwater while they contribute much to area but little to fish BW, the regression models examined to predict BW focused on SBL, BH, BA, BA/SBL, and BA/BH.
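For readers unfamiliar with the Spearman rank correlation used above, a minimal sketch is given here: observations are replaced by their ranks (ties receive the average rank), and the Pearson correlation of the two rank vectors is computed. The analysis in this study was run in Stata; the Python functions below are hypothetical helpers for illustration only.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks are used, any strictly increasing relationship between BW and a trait, linear or not, yields a coefficient of 1.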

Simple, multiple linear, and 2nd-order polynomial regression models were applied to the data obtained. In the case of simple regression, for all parameters examined, the power equation [Y = aX^b; Y: BW, X: morphometric traits or ratios, ln(a): intercept, b: slope] resulted in the highest coefficient of determination (R²) values, so other models are not included in the present study. In the case of multiple linear regression (Y = C0 + C1 X1 + C2 X2 + … + Ci Xi; Y: BW; C0 – Ci: constants; X1 – Xi: morphometric traits or ratios), stepwise regression with backward selection was applied. The analysis was performed including only the basic morphometric traits (i.e., SBL, BH, BA), or only the ratios (i.e., BA/SBL, BA/BH), or all data together (i.e., SBL, BH, BA, BA/SBL, BA/BH). In the tables, the final chosen model of each case is presented. Finally, each morphometric trait or ratio was also fitted to a 2nd-order polynomial regression model (Y = C0 + C1 X + C2 X^2; Y: BW; C0 – C2: constants; X: morphometric traits or ratios). When the P value of the 2nd-order term of the polynomial was greater than 0.05, the term was removed, and the order of the model was reduced to one.
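The power equation is commonly fitted by ordinary least squares after taking logarithms, since Y = aX^b is equivalent to ln(Y) = ln(a) + b ln(X), which matches the description of ln(a) as intercept and b as slope. A minimal sketch of such a fit follows; whether Stata estimated the model in exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def fit_power(x, y):
    """Fit y = a * x**b by least squares on ln(y) = ln(a) + b * ln(x).

    Returns (ln_a, b): intercept and slope of the log-log line.
    All x and y values must be strictly positive.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x, mean_y = sum(log_x) / n, sum(log_y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    b = sxy / sxx
    ln_a = mean_y - b * mean_x
    return ln_a, b


def predict_power(ln_a, b, x):
    """Back-transform the log-log line to the original scale."""
    return math.exp(ln_a + b * math.log(x))
```

Note that predictions are obtained by exponentiating the whole right-hand side, i.e., exp(ln(a) + b ln(X)), which equals aX^b.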

To compare and assess the precision of the models obtained, the following metrics were used (Asuero et al. 2006; Ratner 2009; Chicco et al. 2021): (a) coefficient of determination (R²), as a measure of the proportion of the variance in the dependent variable (i.e., BW) that is predictable from the independent variables (i.e., morphometric traits). Models with R² ≥ 98% were considered strong. (b) Mean absolute error (MAE), as an estimate of the absolute difference between the actual and predicted values. (c) Mean absolute percentage error (MAPE), as an estimate of the percentage difference between predicted values and actual values. MAPE is the percentage equivalent of MAE. (d) Root mean square error (RMSE), as a measure of the standard deviation of residuals. In the present data, the best models were initially sought among those with the highest R² (Chicco et al. 2021). Once these were identified, selection was refined among those models with the lowest error terms.
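The four metrics listed above can be computed from paired actual and predicted BW values as in the following sketch. It uses the standard textbook definitions; the exact conventions applied by the statistical software (e.g., the divisor used for RMSE, or whether R² for the power model is computed on the logarithmic scale) are not stated in the text, so the choices below (divisor n, original scale) are assumptions.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, MAE, MAPE (%) and RMSE for paired observations.

    Assumptions: all metrics are computed on the original BW scale,
    RMSE divides by n, and no actual value is zero (required by MAPE).
    """
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": 100.0 * sum(abs(r) / a for r, a in zip(residuals, actual)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }
```

Because MAPE divides each error by the actual value, the same absolute error weighs more heavily for small fish than for large ones, which is why MAE and MAPE can rank models differently across BW ranges.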

Results

Whole data set (50–1000 g)

The mean, median, standard deviation, minimum-maximum value, and coefficient of variation for BW and for the morphometric traits recorded and calculated (i.e., ratios) for the whole data set (BW range 50–1000 g) are shown in Table 2. Spearman rank correlation coefficients (Table 3) of BW with each of the traits determined were highly significant. Regarding the traits measured, the strongest correlation (r = 0.9904) was detected between BW and BA (monolateral, fins excluded), while the weakest was between BW and parameters related to head and eye (i.e., HL, HA, ED, EA). Correlations of BW with body lengths and body height (i.e., TBL, FBL, SBL, BH) were also strong (r > 0.98). Regarding the ratios calculated, a strong correlation was found between BW and the ratios BA/SBL and BA/BH. Correlation coefficients for BW with BA/HA and BA/EA were lower, while no association was indicated between BW and the ratio SBL/BH.

Table 2 Descriptive statistics for body weight (50–1000 g) and morphometric traits recorded in gilthead seabream used in the present study
Table 3 Spearman rank correlation coefficients between body weight and morphometric traits recorded in gilthead seabream 50–1000 g

Regression analysis performed in the present study focused on SBL, BH, BA, BA/SBL, and BA/BH as possible estimators of BW (Table 4). Simple power regression (Eqs. 1–5, Table 4) showed that the relationship with the highest R² and the lowest errors (MAPE, MAE, RMSE) was the one that considers BA as X (Eq. 3, Table 4, Fig. 2). Although the other power equations also had high R², their error levels were much higher. Among the multiple linear models investigated (Eqs. 6–8, Table 4), the best was the equation that involved SBL, BA, BA/SBL, and BA/BH (Eq. 8, Table 4). Its R² and error levels were comparable to the values obtained in the power regression (Eq. 3, Table 4). Finally, the 2nd-order polynomial regression (Eqs. 9–13, Table 4) did not reveal a more accurate equation than those observed with power and multiple linear models. BA was the best estimator for BW (Eq. 11, Table 4), but its R² and error levels were still inferior to those observed for Eqs. 3 and 8.

Table 4 Power, multiple linear and polynomial regression results for body weight (Y=BW) with morphometric traits (X) in gilthead seabream of 50–1000 g
Fig. 2 Plot of fitted power model for gilthead seabream of 50–1000 g; ln(BW) = −8.846 + 1.525 × ln(BA) or BW = exp(−8.846 + 1.525 × ln(BA)); R² = 98.98%; r = 0.9949; P < 0.001; n = 3312. Yellow areas mark the 95% confidence intervals

Data subgroups (50–100 g, 100–500 g, 500–1000 g)

Descriptive statistics (Table SM1), correlation (Table SM2), and regression analysis data (Tables SM3-SM5, Fig. SM1-SM3) for each subgroup of BW examined are presented in Supplementary Material (SM). All correlation coefficients were slightly lower compared to those obtained when the whole data set was considered as one, but the results obtained regarding their relative importance were identical. Regression analysis applied to each subgroup showed, in all equations, lower R² and similar or smaller error levels than those obtained from the whole data set. Apart from these differences, in each regression model category (i.e., power, multiple linear, 2nd-order polynomial), the most accurate (in terms of R² and errors) equations obtained were the same as those detected for the weight range 50–1000 g, i.e., power equation with BA, multiple linear equation with the involvement of SBL, BH, and BA, and polynomial equation with BA, differing only in the related constants.

Discussion

In the present study, morphometric traits considered to be suitable to predict BW in a regression model were those that were strongly (r > 0.95) correlated with BW. Strong (i.e., r > 0.80) association of BW with BA, SBL, BH, and HL has also been previously reported for gilthead seabream (Boulton et al. 2011; Navarro et al. 2016) and for other species (e.g., de Verdal et al. 2014; Fernandes et al. 2020). Similarly to the present results, a weak association was detected between BA and ED in spotted scat Scatophagus argus (Chen et al. 2022) and between BA and the ratio BH/SBL (the inverse of the ratio examined here) in pirapitinga Piaractus brachypomus (Ribeiro et al. 2019). Thus, all traits found to be strongly associated with BW had adequate potential to be included in the regression models investigated, while traits like HL, HA, ED, EA, BA/HA, BA/EA, and SBL/BH could be excluded. Among body lengths, TBL and FBL were also excluded, despite the high correlation coefficients observed, since they involve the caudal fin, which may lead to inaccurate measurements (Viazzi et al. 2015; Konovalov et al. 2019; Jongjaraunsuk and Taparhudee 2021).

The present study showed that models predicting BW of gilthead seabream from morphometric traits are more accurate, in terms of R², when the whole BW range is treated as one data set than when the defined BW subgroups are used. On the other hand, models identified in the latter case had similar or lower values of the error terms. However, all MAPE (mean absolute percentage error) values obtained for the best equations identified in each regression model category (regardless of the data set) lay at the lower limit of the percentage errors (4–14%) reported in similar studies for other fish species (e.g., Viazzi et al. 2015, Konovalov et al. 2018, 2019; Fernandes et al. 2020, Jongjaraunsuk and Taparhudee 2021). Besides, according to Chicco et al. (2021), the coefficient of determination (R²) is considered a more “informative and truthful” metric for the evaluation of regression analyses and for the comparison among different regression models. Overall, the present results suggest that predicting BW from the regression models of each BW subgroup examined does not offer an advantage over predicting it from the equations of the whole data set.

A common feature of all the best BW prediction models obtained in the present study, regardless of the BW range to which they were applied, is the involvement of fish body area (BA) as an important predictor. Previous studies that investigated the best estimator variables for BW prediction and included fish area along with other measurements (e.g., contour, length, width, height, eccentricity) concluded that fish area alone is sufficient to produce the most accurate regression models, at least for the species they studied (Balaban et al. 2010a; de Verdal et al. 2014; Viazzi et al. 2015; Fernandes et al. 2020; Gümüş et al. 2021; Taparhudee and Jongjaraunsuk 2023). Furthermore, the fact that fins were excluded from the present BA measurement probably adds to the accuracy of the models obtained, as was observed in Jade perch (Viazzi et al. 2015) and Asian seabass (Konovalov et al. 2019; Jongjaraunsuk and Taparhudee 2021), although no effect was found for Alaskan pollock (Balaban et al. 2010b). Fins contribute much to the area but little to BW; they continuously fold and unfold in a swimming fish, and they may also be damaged. BA measured without the fins is therefore a more reliable variable, which explains the improvement of the related predictions, and it may also be better suited to computer vision systems.

Focusing on the prediction models obtained from the whole data set (i.e., 50–1000 g), the models with the highest R² and the lowest errors were either the power regression of BW with BA or the multiple linear regression of BW with SBL, BA, BA/SBL, and BA/BH as predictors. In the latter case, besides BA, two more measurements are involved, i.e., SBL and BH. The accuracy of the two models is quite similar, and for reasons of simplicity, the power regression is advantageous. In the literature, the power curve has been extensively used to evaluate allometric relationships between fish BW and, mostly, body length in fisheries and marine biology studies (e.g., Sangun et al. 2007; Robinson et al. 2010; Mathiassen et al. 2011; Karachle and Stergiou 2012; Sinopoli et al. 2022). Morphometric traits have also been used in aquaculture science to identify shape-related differences between farmed and wild fish (e.g., Arechavala-Lopez et al. 2012; González et al. 2016; Fragkoulis et al. 2017) and skeletal deformities (Costa et al. 2013), as well as to differentiate between the sexes (Çoban et al. 2011).

In conclusion, the power equation between BW and BA reported in the present study [ln(BW) = −8.846 + 1.525 × ln(BA) or BW = exp(−8.846 + 1.525 × ln(BA))] can accurately predict gilthead seabream BW, in the range of 50–1000 g, through the measurement of only one trait, i.e., monolateral body area without fins. It remains for machine vision-based methods (Li et al. 2020; Li and Du 2022) to develop a way to acquire the image, exclude the fins, and measure the body area while overcoming all the practical difficulties that an underwater measurement of a great number of swimming fish involves.
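As an illustration of how the reported equation could be applied, the sketch below converts a monolateral body area (mm², fins excluded) to a predicted BW using ln(BW) = −8.846 + 1.525 × ln(BA). The coefficients are those reported above; the function names, the range check, and the example area are hypothetical and serve only to show the calculation.

```python
import math

# Coefficients of the reported power equation (BA in mm^2, BW in g).
LN_A = -8.846
B = 1.525

# BW range covered by the data used to fit the equation (g).
FITTED_RANGE_G = (50.0, 1000.0)


def predict_bw_g(body_area_mm2):
    """Predicted body weight (g) from monolateral body area without fins."""
    return math.exp(LN_A + B * math.log(body_area_mm2))


def within_fitted_range(bw_g):
    """True if a prediction lies inside the 50-1000 g range of the data."""
    low, high = FITTED_RANGE_G
    return low <= bw_g <= high


# Hypothetical example: a body area of 12,000 mm^2.
example_bw = predict_bw_g(12000.0)
```

For the hypothetical area of 12,000 mm², the predicted BW is roughly 240 g. Predictions falling outside 50–1000 g would be extrapolations beyond the data on which the equation was fitted and should be treated with caution.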