
Towards Automated Biometric Identification of Sea Turtles (Chelonia mydas)

Abstract

Passive biometric identification enables wildlife monitoring with minimal disturbance. Using a motion-activated camera placed at an elevated position and facing downwards, images of sea turtle carapaces were collected, each belonging to one of sixteen Chelonia mydas juveniles. Then, co-variant and robust image descriptors were learned from these images, enabling indexing and retrieval. In this paper, several classification results of sea turtle carapaces using the learned image descriptors are presented. It was found that a template-based descriptor, i.e. Histogram of Oriented Gradients (HOG), performed much better during classification than keypoint-based descriptors. For our dataset, a high-dimensional descriptor is necessary because of the minimal gradient and color information in the carapace images. Using HOG, we obtained an average classification accuracy of 65%.

1 Introduction

Biometric identification of sea turtles within a population is essential for behavioral and ecological study, allowing researchers to estimate vital statistics such as growth rate, survivorship, foraging patterns and population size. Traditional methods of permanent marking and artificial tagging induce stress and possibly harm to the animals. Furthermore, tag loss is common because of various factors, namely elapsed time after tagging, study area, target species, size of animal, piercing site and tag properties (e.g. material, color and design) [1-5]. Individual sea turtles can also be recognized via photographic identification of their natural marks, for example, based on coloration patterns around the head area [6], facial profiles [7] and facial scute patterns [8]. Still, the mark-recapture process puts a considerable amount of stress on the animal.

We propose a passive biometric identification system for sea turtles based on robust and co-variant image descriptor matching, see Figure 1. A distant camera remotely captures aerial images of sea turtles at their nesting site. These images are each learned as a robust and co-variant image descriptor, enabling indexing and retrieval. The set-up is non-invasive, i.e. it uses a remote camera and no flash photography. Using this set-up, images of sixteen juveniles were collected at a private breeding farm in Lundu, Sarawak. The images were taken at night (since females nest at night) inside a perimeter that imitated an actual nesting site. The image descriptor was learned from the part of the sea turtle that is most visible from the air, i.e. its carapace. A Chelonia mydas carapace, see Figure 2, contains a distinctive scute pattern that can be used to identify individuals [9].

Figure 1 Our proposed framework. Matching is a minimizing function, \(\omega(\cdot, \cdot)\).

Figure 2 Left to right. An outline drawing of a Chelonia mydas carapace, sourced from [10] and an actual image of a juvenile Chelonia mydas kept in a private breeding farm in Lundu, Sarawak.

Recent works on biometric identification of animals, not limited to sea turtles, were motivated by the use of computer vision pattern matching algorithms. An automated approach is a natural progression from manual inspection by human experts. Burghardt, et al. [11] used an extended version of Belongie, et al.'s Shape Contexts [12] to encode unique phase singularities of spot patterns on individual African penguins. In a more recent work [13] they proposed the detection of shape curls to represent individual Turing-patterned animals. Dabarera and Rodrigo [14] implemented an eigenface-based identification mechanism to recognize individual African elephants using frontal-view face images. Loos and Pfitzer [15] combined global and local facial features for visual identification of primates. Taha, et al. [16] learned SIFT [17] features from muzzle images of individual horses, later using RANSAC to remove outliers during matching. Also using muzzle images, Monteiro [18] combined graph matching and local invariant features to recognize individual cattle. Li, et al. [19] learned Zernike moments from tailhead images of Holstein dairy cows to recognize individuals. To the best of our knowledge, this is the first time a passive biometric identification system for recognizing individual sea turtles has been developed.

2.1 Linear Deformation of Scute Pattern

Matching sea turtle individuals based on images captured by a stationary camera is non-trivial due to the linear deformation of salient image features. In such settings, deformation may consist of scale change and in-plane translation and rotation. Dorai, et al. [20] suggested several pre-emptive strategies to limit the impact of image deformation when collecting biometric data. Their approach requires multiple measurements to be taken concurrently, later sorted according to the severity of deformity.

Building on a different strategy, image feature descriptors such as [17,21,22] and [23] provide a better way to match scute patterns. The gradient-based feature descriptors are co-variant, or at least robust, to various image transformations. When paired with the bag-of-words framework, the set-up enables partial matching of models, i.e. using probability to find the closest match. Because low-light image capture yields almost no color information, and sea turtle carapaces have minimal texture, we theorized that keypoint-based descriptors such as SIFT [17] would not fare well on our dataset. As for template-based descriptors, such as HOG [23], the higher dimensionality should provide a more robust description of the scute pattern. Nevertheless, template-based descriptors are not co-variant to rotation.

2.2 Matching of Scute Patterns

Between unique landings it is probable for an individual sea turtle's carapace to acquire new permanent markings, for example predation marks, scarring and barnacles. It may also acquire new ephemeral markings, for example, algae, sand particles and dirt. Our feature descriptor must be robust against such noise when matching scute patterns. Existing keypoint-based and template-based feature descriptors should solve this problem by providing a degree of robustness against noise during matching. However, the degree of robustness depends on the properties of the captured images.

3 Our Dataset

We collected between two and eleven RGB images (of \(3,264 \times 2,448\) resolution) each for sixteen Chelonia mydas juveniles, see Table 1. A total of 70 images were taken without flash (as turtles are very sensitive to light). The complete dataset (CC-BY-4.0), see Figure 3, is available on the corresponding author's website. The juvenile sea turtles were kept in captivity inside a private breeding farm in Lundu, Sarawak. During data capture, each individual was placed inside a perimeter and was allowed to move around freely for 3 to 5 minutes. This set-up aimed to replicate an environment similar to a sea turtle's nesting site. All sixteen individuals, as part of a larger group, were released back to the sea in December 2015 [24].

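| No | Number of Images | No | Number of Images |
|---|---|---|---|
| 1 | 3 | 9 | 4 |
| 2 | 3 | 10 | 6 |
| 3 | 6 | 11 | 3 |
| 4 | 3 | 12 | 6 |
| 5 | 7 | 13 | 3 |
| 6 | 11 | 14 | 2 |
| 7 | 4 | 15 | 3 |
| 8 | 3 | 16 | 3 |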

Table 1 PANDAN-CHELOMY dataset.

The dataset was pre-processed prior to classification. Images were converted to gray scale and manually rotated to position the carapace in an upright pose, see Figure 4. The pose correction is required to enable the matching of template-based image descriptors. Inside each image, we manually set an ROI window to exclude most of the background. The remaining background elements inside the ROI were later removed via a smoothing function. The rotated images and ROI information are both included inside our PANDAN-CHELOMY dataset.

Figure 3 From left to right, top to bottom. Sample raw images of each Chelonia mydas juvenile, Turtle 1 to Turtle 16, taken from our dataset PANDAN-CHELOMY.

Figure 4 Rotated images from PANDAN-CHELOMY, each with a visualized ROI.

4 Matching of Carapace Images

4.1 Classification

The PANDAN-CHELOMY dataset contains 70 images belonging to sixteen juveniles. Prior to matching, the ROIs were smoothed using a \(4 \times 4\) Gaussian kernel to suppress the remaining background elements. We found this kernel size to be optimal for removing sand features. A larger kernel erodes more gradient, resulting in a higher loss of discriminative features inside the ROI. A smaller kernel retains more noise, which reduces the classification's accuracy.
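The smoothing step can be sketched in plain Python as follows. The paper does not state the Gaussian's standard deviation or how image borders are handled, so `sigma=1.0` and edge clamping are assumptions here, and the names `gaussian_kernel` and `smooth` are hypothetical. Because a kernel of even size has no central pixel, this sketch anchors it one pixel up and left of the geometric centre; that is also an assumption, not the authors' stated choice.

```python
import math

def gaussian_kernel(size=4, sigma=1.0):
    """Return a normalized size x size Gaussian kernel as nested lists."""
    half = (size - 1) / 2.0  # taps are sampled symmetrically around the centre
    taps = [math.exp(-((i - half) ** 2) / (2.0 * sigma ** 2)) for i in range(size)]
    total = sum(taps) ** 2   # sum of the outer product of taps with itself
    return [[a * b / total for b in taps] for a in taps]

def smooth(image, kernel):
    """Filter a 2D list of gray values, clamping coordinates at the borders."""
    h, w, k = len(image), len(image[0]), len(kernel)
    off = (k - 1) // 2  # anchor offset (assumed for even-sized kernels)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(k):
                yy = min(max(y + j - off, 0), h - 1)
                for i in range(k):
                    xx = min(max(x + i - off, 0), w - 1)
                    acc += kernel[j][i] * image[yy][xx]
            out[y][x] = acc
    return out

# Toy ROI: a dark patch with one bright pixel standing in for a sand grain.
roi = [[0.0] * 8 for _ in range(8)]
roi[4][4] = 255.0
smoothed = smooth(roi, gaussian_kernel(4, sigma=1.0))
```

The isolated bright pixel is spread over its neighbours, which lowers its gradient magnitude. This is the trade-off described above: a wider kernel would suppress more sand features but also flatten more of the scute edges.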

Using k-fold cross-validation, a matching score for each ROI of the test set is obtained against the other ROIs of the training set based on the nearest-neighbor distance ratio (NNDR) scheme. The threshold values were varied from 0.0 to 1.0. We calculated the score, \(\beta\), as follows:

\[\beta = \omega_{top} \left( d_x, d_m \right) / \omega_{2nd} \left( d_x, d_n \right) \tag{1}\] where \(\omega_{top}\) is the distance between the template descriptor, \(d_x\), and the best-matching target descriptor, \(d_m\), and \(\omega_{2nd}\) is the distance between \(d_x\) and \(d_n\), i.e. the second best-matching target descriptor.

If both \(d_x\) and \(d_m\) belong to the same individual, the classification function, \(\xi\), returns two possible values:

\[\xi(d_x, d_m) = \begin{cases} \text{True Positive} & \beta \leq \text{threshold} \\ \text{False Negative} & \text{otherwise} \end{cases} \tag{2}\]

Otherwise,

\[\xi(d_x, d_m) = \begin{cases} \text{False Positive} & \beta \leq \text{threshold} \\ \text{True Negative} & \text{otherwise} \end{cases} \tag{3}\]

In the event the classification function returns multiple top matches, we count the result as a false negative. Based on the total number of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) obtained from our classification exercise, the true positive rate, TPR = TP/(TP + FN), and the false positive rate, FPR = FP/(FP + TN), for each threshold value are estimated.
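The scoring and classification rules above can be illustrated with a minimal Python sketch. It assumes the distance \(\omega\) is the Euclidean (L2) distance, that the second-best match is simply the second-nearest descriptor in the training set, and that "multiple top matches" means tied top distances; the paper does not spell out these details. The function names and the toy two-dimensional descriptors are hypothetical.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def classify(query, query_id, gallery, threshold):
    """Return 'TP', 'FN', 'FP' or 'TN' for one test descriptor.

    gallery is a list of (individual_id, descriptor) pairs (the training set).
    """
    ranked = sorted((l2(query, d), ident) for ident, d in gallery)
    (top, top_id), (second, _) = ranked[0], ranked[1]
    if top == second:
        return "FN"               # tied top matches are penalized (assumed rule)
    beta = top / second           # NNDR score, Eq. (1)
    accepted = beta <= threshold
    if top_id == query_id:        # Eq. (2)
        return "TP" if accepted else "FN"
    return "FP" if accepted else "TN"  # Eq. (3)

def rates(outcomes):
    """Compute (TPR, FPR) from a list of outcome labels."""
    tp, fn, fp, tn = (outcomes.count(k) for k in ("TP", "FN", "FP", "TN"))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr

# Toy gallery: two individuals with two descriptors each.
gallery = [(1, [0.0, 0.0]), (1, [0.1, 0.0]), (2, [1.0, 1.0]), (2, [1.1, 1.0])]
outcomes = [
    classify([0.02, 0.0], 1, gallery, 0.9),  # correct top match, accepted
    classify([0.9, 1.0], 1, gallery, 0.9),   # wrong top match, accepted
]
tpr, fpr = rates(outcomes)
```

Repeating this over every test ROI for each threshold between 0.0 and 1.0 yields one (FPR, TPR) point per threshold, which is how an ROC curve is traced.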

4.2 Image Descriptors

Scale-invariant Feature Transform (SIFT) [17], Speeded Up Robust Features (SURF) [21] and Oriented FAST and Rotated BRIEF (ORB) [22] were selected as keypoint-based descriptors. Histogram of Oriented Gradients (HOG) [23] was chosen as the sole template-based descriptor. All parameters were set to the default values as suggested in the original publications [17,21,22], except for HOG.

For HOG, we rescale each ROI to \(96 \times 128\) resolution, which translates to a descriptor length of \(\left(\frac{96}{8} - 1\right) \times \left(\frac{128}{8} - 1\right) \times (2 \times 2) \times 9 = 5,940\), with \(8 \times 8\) pixel cells, blocks of \(2 \times 2\) cells, and a 9-bin orientation histogram (0°-180°). Our ROIs were larger than the \(64 \times 128\) resolution used in [23] because a sea turtle carapace is typically almost equal in width and height.
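The descriptor length can be checked with a short calculation. The sketch below assumes blocks overlap with a stride of one cell, which is what the "minus one" terms in the formula imply; the helper name `hog_length` is hypothetical.

```python
def hog_length(width, height, cell=8, block=2, bins=9):
    """Length of a HOG descriptor with overlapping blocks (stride of one cell)."""
    cells_x, cells_y = width // cell, height // cell
    blocks_x = cells_x - block + 1   # block positions along x
    blocks_y = cells_y - block + 1   # block positions along y
    return blocks_x * blocks_y * block * block * bins

print(hog_length(96, 128))  # 5940, the ROI size used here
print(hog_length(64, 128))  # 3780, the window size used in [23]
```

Widening the window from 64 to 96 pixels adds four block columns, which accounts for the growth from 3,780 to 5,940 values.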

Figure 5 Example of a SIFT matching result (acceptance threshold of 0.8) between Turtle 8a and Turtle 6d. The number of positive matches was 5. Image brightness was increased to improve visual clarity.

Figure 6 Visualized HOG descriptor for Turtle 8a (left) and Turtle 6d (right). The L2-norm value was 6.82. Image brightness was increased to improve visual clarity.

Additionally, for SIFT, SURF and ORB, the acceptance threshold was varied during keypoint matching from 0.2 to 0.8. The classification result for each acceptance threshold was plotted separately. See Figure 5 for an example of a SIFT matching result and Figure 6 for an example of visualized HOG descriptors for two paired carapace images.

5 Results and Discussion

Our classification of 70 sea turtle carapaces, using k-fold cross-validation, produced the ROC curve plots shown in Figure 7. As predicted, all keypoint-based descriptors, with acceptance threshold values ranging from 0.2 to 0.8, performed worse than random guessing. Only HOG performed better than random guessing, with an average accuracy of 65%. The classification accuracy expected from random guessing on this dataset is 6.25%, i.e. one in sixteen. Evidently, the nature of our ROI images, i.e. minimal color information and lack of texture, contributed to the failure of SIFT, SURF and ORB. Our implementation of HOG produced a descriptor length of 5,940, which is far greater than SIFT's 128, making it more discriminative and robust.

The optimal threshold value for HOG is 0.9. This reveals that with HOG, even though we managed to obtain an average classification accuracy of 65%, the distance between the top match and the second-best match is nominal. The confusion matrix using the optimal configuration is shown in Figure 8. There are 16 actual classes and 16+1 predicted classes. The additional predicted class, i.e. class MANY, represents cases where our classification function returned multiple top matches. Such cases were considered a false negative to penalize the configuration.

Figure 7 ROC curve plots obtained from all classification results.

Predicted classes
Tunte;Tunks.Tunte 3TuntesTurde'STuntesTurbe,7Turde 8Tunk 8Turde 10Tunde 1.7Tunie 13Turne 13Tunie 14Tunte 15Tunte 16NAM
Actual classesTurtle 11.000.000.000.000.000.000.000.000.000.000.000.000.000.000.000.000.00
Turtle 20.001.000.000.000.000.000.000.000.000.000.000.000.000.000.000.000.00
Turtle 30.000.000.330.000.000.000.000.000.000.000.170.000.000.000.000.000.50
Turtle 40.000.000.000.670.000.000.000.000.000.000.000.000.330.000.000.000.00
Turtle 50.000.000.000.000.140.000.000.000.000.000.000.000.290.000.000.000.57
Turtle 60.000.000.000.000.090.090.000.000.000.000.000.000.000.000.090.000.73
Turtle 70.000.000.000.000.250.000.500.000.000.000.000.000.000.000.000.000.25
Turtle 80.000.000.000.000.000.000.001.000.000.000.000.000.000.000.000.000.00
Turtle 90.000.000.000.000.000.000.000.000.750.000.000.000.000.000.000.000.25
Turtle 100.000.000.000.000.000.000.000.000.000.500.000.000.000.000.000.000.50
Turtle 110.000.000.000.000.000.000.000.000.000.001.000.000.000.000.000.000.00
Turtle 120.000.000.000.000.000.000.000.000.000.000.000.500.000.000.000.000.50
Turtle 130.000.000.000.000.000.000.000.000.000.000.000.001.000.000.000.000.00
Turtle 140.000.000.500.000.000.000.000.000.000.000.000.000.000.000.000.000.50
Turtle 150.000.000.000.000.000.000.000.000.000.000.000.000.000.001.000.000.00
Turtle 160.000.000.000.000.000.000.000.000.000.000.000.000.000.000.001.000.00

Figure 8 Confusion matrix obtained from the classification results using HOG [23], with threshold value for NNDR scheme set to 0.9.
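As a quick check on how the headline figure relates to the confusion matrix, the sketch below takes the diagonal entries (the per-individual recognition rates for Turtle 1 to Turtle 16) and computes their unweighted mean. Treating the reported 65% as this per-class average is an assumption; the variable names are illustrative.

```python
# Diagonal of the confusion matrix in Figure 8, Turtle 1 to Turtle 16.
diagonal = [1.00, 1.00, 0.33, 0.67, 0.14, 0.09, 0.50, 1.00,
            0.75, 0.50, 1.00, 0.50, 1.00, 0.00, 1.00, 1.00]

mean_accuracy = sum(diagonal) / len(diagonal)  # unweighted mean over individuals
chance = 1.0 / len(diagonal)                   # one in sixteen
at_least_75 = [i + 1 for i, v in enumerate(diagonal) if v >= 0.75]

print(round(mean_accuracy, 3))  # 0.655
print(chance)                   # 0.0625
print(at_least_75)              # [1, 2, 8, 9, 11, 13, 15, 16]
```

The unweighted mean comes to about 65.5%, which agrees with the reported average, and eight of the sixteen individuals reach 75% or higher.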

6 Conclusion

Based on the results of this study, it can be concluded that the recognition of Chelonia mydas individuals using aerial images of their carapace is possible. By learning these ROI images as HOG descriptors, an average classification accuracy of 65% was obtained, rising to 75% or higher for certain individuals. By dealing with cases where multiple top matches are returned from a single classification instance, it should be possible to further improve the average classification accuracy. Solutions such as a cumulative voting scheme [25,26] and modular classification [27] will be explored in the future. Another potential solution is to represent the scutes using 3D features captured with an RGB-D sensor, such as surface normals [28].

Constrained by our grant's limitation, we deliberately ignored the effect of scute deformation over time. The dataset contains images captured during a single landing. In future, we plan to collect images of multiple landings at an actual nesting site over a longer period of time.

Acknowledgement

This work was supported by Universiti Malaysia Sarawak through the following Small Grant Scheme (SGS) grant: F08(S160)/1171/2014(25).


References

  1. Balazs, G.H., Factors Affecting the Retention of Metal Tags on Sea Turtles, Marine Turtle Newsletter, 20, pp. 11-14, 1982.
  2. Mrosovsky, N. & Shettleworth, S.J., What Double Tagging Studies Can Tell Us, Marine Turtle Newsletter, 22, pp. 11-15, 1982.
  3. Limpus, C.J., Estimation of Tag Loss in Marine Turtle Research, Wildlife Research, 19, pp. 457-469, 1992. DOI: 10.1071/wr9920457
  4. van Dam, R.P. & Diez, C.E., Differential Tag Retention in Caribbean Hawksbill Turtles, Chelonian Conservation and Biology, 3(2), pp. 225-229, 1999.
  5. Bellini, C., Godfrey, M.H. & Sanches, T.M., Metal Tag Loss in Wild Juvenile Hawksbill Sea Turtles (Eretmochelys imbricata), Herpetological Review, 32(3), pp. 172-173, 2001.
  6. McDonald, D.L. & Dutton, P.H., Use of PIT Tags and Photoidentification to Revise Remigration Estimates of Leatherback Turtles (Dermochelys coriacea) Nesting in St. Croix, U. S. Virgin Islands, 1979-1995, Chelonian Conservation and Biology, 2, pp. 148-152, 1996.
  7. Bennett, P., Keuper-Bennett, U. & Balazs, G. H., Photographic Evidence for the Regression of Fibropapillomas Afflicting Green Turtles at Honokowai, Maui, in the Hawaiian Islands, in Proceedings of the Nineteenth Annual Symposium on Sea Turtle Biology and Conservation. NOAA Technical Memorandum NMFS-SEFSC-443, 2000.
  8. Reisser, J.W., Proietti, M.C., Kinas, P.G. & Sazima, I., Photographic Identification of Sea Turtles: Method Description and Validation, with an Estimation of Tag Loss, Endangered Species Research, 5, pp. 73-82, 2008. DOI: 10.3354/esr00113
  9. Wyneken, J. & Witherington, D., The Anatomy of Sea Turtles, National Marine Fisheries Service, 2001.
  10. Boulenger, G.A., Fauna of British India, Reptilia And Batrachia, Taylor & Francis, 1890.
  11. Burghardt, T., Barham, P.J., Campbell, N., Cuthill, I.C., Sherley, R.B. & Leshoro, T.M., A Fully Automated Computer Vision System for the Biometric Identification of African Penguins (Spheniscus Demersus) on Robben Island, in 6th International Penguin Conference (IPC07), Hobart, Tasmania, Australia, E.J. Woehler Ed., 2007.
  12. Belongie, S., Malik, J. & Puzicha, J., Shape Context: A New Descriptor for Shape Matching and Object Recognition, in Advances In Neural Information Processing Systems, 2001.
  13. Burghardt, T. & Campbell, N., Generic Phase Curl Localisation for an Individual Identification of Turing-Patterned Animals, Visual Observation and Analysis of Animal and Insect Behavior, pp. 17-21, 2010.
  14. Dabarera, R. & Rodrigo, R., Vision Based Elephant Recognition for Management and Conservation, in 5th International Conference on Information and Automation for Sustainability, 2010. DOI: 10.1109/iciafs.2010.5715653
  15. Loos, A. & Pfitzer, M., Towards Automated Visual Identification of Primates Using Face Recognition, in 19th International Conference on Systems, Signals and Image Processing, 2012.
  16. Taha, A., Darwish, A. & Hassanien, A.E., Arabian Horse Identification System Based on Live Captured Muzzle Print Images, in International Conference on Advanced Intelligent Systems and Informatics, 2017. DOI: 10.1007/978-3-319-64861-3_73
  17. Lowe, D.G., Object Recognition from Local Scale-Invariant Features, in Proceedings of the 7th IEEE International Conference on Computer Vision, 1999. DOI: 10.1109/iccv.1999.790410
  18. Monteiro, F.C., Automatic Cattle Identification Using Graph Matching Based on Local Invariant Features, in International Conference Image Analysis and Recognition, 2016. DOI: 10.1007/978-3-319-41501-7_88
  19. Li, W., Ji, Z., Wang, L., Sun, C. & Yang, X., Automatic Individual Identification of Holstein Dairy Cows Using Tailhead Images, Computers and Electronics in Agriculture, 142, pp. 622-631, 2017. DOI: 10.1016/j.compag.2017.10.029
  20. Dorai, C., Ratha, N.K. & Bolle, R.M., Detecting Dynamic Behavior in Compressed Fingerprint Videos: Distortion, in IEEE Conference on Computer Vision and Pattern Recognition, 2000.
  21. Bay, H., Tuytelaars, T. & Van Gool, L., SURF: Speeded Up Robust Features, in ECCV, pp. 404-417, 2006. DOI: 10.1007/11744023_32
  22. Rublee, E., Rabaud, V., Konolige, K. & Bradski, G., ORB: An Efficient Alternative to SIFT or SURF, in IEEE International Conference on Computer Vision (ICCV), 2011. DOI: 10.1109/iccv.2011.6126544
  23. Dalal, N. & Triggs, B., Histograms of Oriented Gradients for Human Detection, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. DOI: 10.1109/cvpr.2005.177
  24. Ting, R.R., Nurhartini Bids Farewell to Her Turtles, 5 Dec 2016. [Online]. [Accessed 10/1/18].
  25. Hipiny, I. & Mayol-Cuevas, W., Recognising Egocentric Activities from Gaze Regions with Multiple-Voting Bag of Words, Technical Report CSTR12-003, University of Bristol, 2012.
  26. Hipiny, I., Egocentric Activity Recognition Using Gaze, PhD thesis, University of Bristol, 2013.
  27. Ujir, H., Sing, L.C. & Hipiny, I., A Modular Approach and Voting Scheme on 3D Face Recognition, in International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), 2014. DOI: 10.1109/ispacs.2014.7024451
  28. Ujir, H., Spann, M. & Hipiny, I., 3D Facial Expression Classification using 3D Facial Surface Normals, The 8th International Conference on Robotic, Vision, Signal Processing & Power Applications, pp. 245-253, 2014. DOI: 10.1007/978-981-4585-42-2_29