A large-scale image dataset of wood surface defects for automated vision-based quality control processes

Pavel Kodytek; Alexandra Bodzas; Petr Bilik

doi:10.12688/f1000research.52903.1

Data Note

[version 1; peer review: 2 approved with reservations]

Pavel Kodytek¹, Alexandra Bodzas¹, Petr Bilik¹

PUBLISHED 16 Jul 2021

¹ Department of Cybernetics and Biomedical Engineering, VSB-Technical University of Ostrava, Ostrava, 70800, Czech Republic

Pavel Kodytek
Roles: Conceptualization, Investigation, Methodology, Software, Validation, Writing – Original Draft Preparation

Alexandra Bodzas
Roles: Software, Writing – Original Draft Preparation, Writing – Review & Editing

Petr Bilik
Roles: Funding Acquisition, Project Administration, Supervision

Abstract

The wood industry is facing many challenges. The high variability of raw material and the complexity of manufacturing processes result in a wide range of visible structure defects, which have to be controlled by trained specialists. These manual processes are not only tedious and biased, but also less effective. To overcome the drawbacks of manual quality control, several automated vision-based systems have been proposed. Even though some studies achieved a higher recognition rate than trained experts, researchers have to deal with a lack of large-scale databases and authentic data in this field. To address this issue, we performed a data acquisition experiment in an industrial environment, where we were able to acquire an extensive set of authentic data from a production line. For this purpose, we designed and implemented a complex technical solution suitable for high-speed acquisition under harsh manufacturing conditions. In this data note, we present a large-scale dataset of high-resolution sawn timber surface images containing more than 43 000 labelled surface defects and covering 10 types of the most common wood defects. Moreover, with each image record, we provide two types of labels, allowing researchers to perform semantic segmentation as well as defect classification and localization.

Keywords

wood surface defects, high resolution dataset, wood industry, wood processing, wood quality control process, wood defects dataset

Corresponding author: Alexandra Bodzas

Competing interests: No competing interests were disclosed.

Grant information: This work was supported by the “Student Grant System” of VSB-TU Ostrava, project number SP2021/123.

Copyright: © 2021 Kodytek P et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Kodytek P, Bodzas A and Bilik P. A large-scale image dataset of wood surface defects for automated vision-based quality control processes [version 1; peer review: 2 approved with reservations]. F1000Research 2021, 10:581 (https://doi.org/10.12688/f1000research.52903.1) First published: 16 Jul 2021, 10:581 (https://doi.org/10.12688/f1000research.52903.1) Latest published: 27 Jun 2022, 10:581 (https://doi.org/10.12688/f1000research.52903.2)

Introduction

In the wood industry, each step of the manufacturing process affects material utilization and cost efficiency.¹ The heterogeneity of wood material, together with the complexity of these manufacturing processes, may result in various defects, which not only degrade the mechanical properties of the wood, such as strength and stiffness, but also reduce its aesthetic value.² These mechanical and aesthetic defects furthermore have a large impact on the commercial value of the wood and can diminish the utilization of such materials for further processing. There are many types of defects arising from many different causes. The major wood defects include knots, fungal damage, cracks, warping, slanting, wormholes, and pitch defects. The seriousness of a defect, and therefore the grade and the cost of the material, is primarily determined by four criteria: the size, the location, and the type of the defect, and the purpose for which the wooden product will be used.³,⁴

Even though automation in this industrial sector is growing, many market-leading companies still rely on trained domain experts to detect undesirable features and to perform quality grading.⁵ Besides the fact that manual examination is tedious and biased, it was found that domain experts are not able to check large production volumes. Moreover, the study conducted by Urbonas et al.⁶ stated that due to factors such as eye fatigue or distraction, manual inspection rarely achieves 70 % reliability. To overcome the drawbacks of manual examination, researchers try to develop automated systems that are accurate and do not slow down the manufacturing process. With regard to the repeatability and quality of the inspection, the study performed by Lycken⁷ has already shown that automatic systems slightly outperformed human graders. Most of these systems were based on conventional image processing techniques in combination with supervised learning algorithms; over the last decade, however, deep learning has achieved remarkable success in the forestry and wood products industry.⁸ Although researchers in the field were able to achieve satisfying results, with an average recognition rate above 90 %,⁹ most of the authors worked with small-scale image datasets obtained in laboratory conditions using self-developed vision system setups. Performing experiments in such conditions usually entails the disadvantage of a limited number of available products. In most of the studies,²,⁶,¹⁰,¹¹ researchers compensate for the lack of real products by using data augmentation techniques, which can expand the dataset up to 10 times its original size. From one point of view, data augmentation is considered an excellent tool to generalize the classification model and therefore prevent overfitting.¹² Nonetheless, it cannot ensure that the variability of the observed phenomenon will be sufficiently captured, especially in cases where the variability might be limitless.

In order to address the lack of extensive databases in the field, we performed an experiment with the goal of acquiring a large-scale dataset of timber surface defects. Unlike other studies, our experiment took place in an industrial environment during real production, which allowed us to acquire a large amount of authentic data from the production line. To face the challenges arising from the manufacturing process, such as the high speed of the conveyor belt and heavy vibrations, we designed a hardware and a software solution that enabled the acquisition of high-resolution images at a line rate of 66 kHz. In this experiment, we acquired 20 276 original data samples of sawn timber surface, of which 1 992 images were without any surface defects and 18 284 images captured one or more defects, covering 10 types of common wood surface defects overall. The most frequent defects are live knots and dead knots, which occur in 58.8 % and 41.2 % of the dataset images, respectively. Furthermore, to provide more valuable information in this data descriptor, all dataset samples were complemented with two types of labels: a semantic label map for semantic segmentation and a bounding box label.

Methods

Due to the industrial environment in which the experiment was set, the most challenging part of this work was the dataset acquisition. Performing data acquisition in such an environment entailed several negative factors. One of those factors was that the sawmill production line used for this experiment runs more than 300 days per year with minimal pauses in order to maximize the manufacturer's profit. We also had to deal with the high speed of the sawmill conveyor belt, which reached 9.6 m s⁻¹ at the place of the acquisition. This high conveyor speed causes constant heavy vibrations, which at their peaks may result in fluctuations on the order of centimetres. The main goal of the technical solution was therefore to create a robust and at the same time portable construction that can be easily installed in the sawmill environment.

Acquisition equipment

To overcome the limitations of this environment, we developed a mechanical construction for carrying the camera and the light source. The final construction, assembled from ITEM aluminium profiles, was fixed at the place of the acquisition to both the production line construction and the floor, which helped to avoid the acquisition of blurry images. Although this solution didn’t deal directly with the heavy vibrations, it ensured that the mounted camera vibrated in step with the conveyor. The final mechanical solution implemented in the sawmill environment is shown in Figure 1.

Figure 1. The mechanical construction, including the mounted camera and light source.

The line scan camera and the light source are mounted 40 and 15 centimetres from the conveyor belt, respectively.

In order to obtain high-quality images at a speed of 9.6 m s⁻¹, a trilinear line scan camera SW-4000TL-PMCL manufactured by JAI was chosen. This camera was able to acquire 3 × 4096 pixels per line at a line rate of 66 kHz. The required image acquisition speed was achieved by connecting the camera interface to a high-performance Camera Link frame grabber configured in 10-tap mode. For this application, we selected the Silicon Software microEnable 5 marathon VCLx frame grabber with a PCIe interface, which allows on-board high-speed data processing and a data throughput of up to 1800 MB s⁻¹. The required field of view, which covers a 15 cm wide part of the sawn timber piece along its full length of 500 cm, was achieved by using the Kowa LM50LF line scan camera lens. The selected camera, together with the 50 mm focal length lens placed at a distance of 40 cm from the measured object, led to a horizontal resolution of 16.66 pixels per millimetre. The vertical resolution R_v of the image was computed before the experiment by the following formula.

(1)

R_{v} = \frac{v_{c}}{\frac{v_{w}}{60} \cdot L}

where v_w is the velocity of the conveyor, L is the number of lines per image, and v_c is the line rate of the camera. The resulting vertical resolution of 6.67 pixels per millimeter was afterward experimentally verified during the acquisition process.

Since the shutter of the camera was set to 3 μs to ensure the high-speed image acquisition, we had to use a powerful light source, which would sufficiently illuminate the desired field of view. For this purpose, we selected one of the most powerful light sources on the market, a linear LED light Corona II by Chromasens with the ability to provide a light intensity of 3.5 million lux. To achieve the best possible images, a white spectrum of the light was utilized.

Data acquisition

Instead of saving every single line during the acquisition process, we captured blocks of 1024 lines, which resulted in an image resolution of 1024×4096. Such a high-resolution colour image takes up approximately 12 MB of disk space. The sampling frequency of 66 kHz, together with the total number of captured pixels, resulted in a data transfer speed of 773 MB s⁻¹, which means that we were able to capture approximately 64 images per second. Even though we used a very powerful computer, saving this amount of data at such a high speed proved quite challenging. To overcome this, we had to separate image acquisition and image saving into two processes. The acquisition process captured a set of 84 images and stored them in the PC's RAM, while the only task of the saving process was to transfer the images from the RAM to the local hard disk drive. For this experiment, we employed two external 1 TB hard drives. To save CPU time during the acquisition and saving processes, no online processing was performed.
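The figures in this paragraph follow from the line rate and the block size. The short sketch below reproduces the arithmetic. It assumes 3 bytes per pixel (8-bit RGB) and that "MB" denotes 2^20 bytes; both are assumptions made for illustration, not settings taken from the acquisition software, and all names are hypothetical.

```python
# Back-of-the-envelope check of the acquisition figures.
# Assumptions (not stated in the text): 8-bit RGB, i.e. 3 bytes per pixel,
# and "MB" meaning 2**20 bytes.
LINE_RATE_HZ = 66_000      # camera line rate
PIXELS_PER_LINE = 4096     # sensor width
LINES_PER_IMAGE = 1024     # lines grouped into one image
BYTES_PER_PIXEL = 3        # assumed 8-bit RGB
MIB = 2 ** 20


def image_size_mib() -> float:
    """Size of one uncompressed 1024-line image."""
    return LINES_PER_IMAGE * PIXELS_PER_LINE * BYTES_PER_PIXEL / MIB


def data_rate_mib_per_s() -> float:
    """Raw pixel data produced per second at the full line rate."""
    return LINE_RATE_HZ * PIXELS_PER_LINE * BYTES_PER_PIXEL / MIB


def images_per_second() -> float:
    """Number of 1024-line images produced per second."""
    return LINE_RATE_HZ / LINES_PER_IMAGE


print(f"image size: {image_size_mib():.1f} MiB")
print(f"data rate : {data_rate_mib_per_s():.1f} MiB/s")
print(f"frame rate: {images_per_second():.2f} images/s")
```

Under these assumptions the image size and data rate match the values quoted above, and the frame rate is simply the line rate divided by the 1024 lines per image.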

Because transferring such a large amount of data between different software applications would have a negative impact on CPU utilization and would decrease the frame rate, we used optimized frame grabber software, microDisplay X (runtime version 5.7) from Silicon Software.¹³ To use this software in an automated way, we developed an automatic clicker with a feedback loop based on the captured computer screen. In simple terms, the clicker reads the desired information from the screen and, based on this information, decides whether the acquisition or saving process has already completed. Additionally, it automatically assigns an incrementing filename to each captured image. This was mainly realized by using the Windows library user32.dll, which allows the control of various aspects of mouse motion and button clicking. Since the saving process (loop) is almost 10 times slower than the acquisition process, the acquisition loop had to be temporarily stopped in each cycle. Although this causes a loss of data continuity, it does not affect the validity and reliability of the study. We estimated that the acquisition process, together with the other support subroutines, takes approximately 1.4 s, while the saving process lasts 7.5 s. To maintain a predictable acquisition speed, including software delays, we introduced synchronization, which started a new cycle every 9 s.
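The timing budget of one acquire-then-save cycle can be checked with a few lines of arithmetic. The sketch below uses only the durations quoted above and the burst size of 84 images mentioned under Data acquisition; the function names are hypothetical and the durations are the approximate values from the text, not measurements.

```python
# Timing budget of one acquire-then-save cycle, using the figures from the text.
IMAGES_PER_BURST = 84   # images captured into RAM per cycle
ACQUIRE_S = 1.4         # acquisition plus support subroutines (approximate)
SAVE_S = 7.5            # transfer from RAM to disk (approximate)
CYCLE_S = 9.0           # fixed synchronization period


def slack_s() -> float:
    """Idle time left in each cycle after acquiring and saving."""
    return CYCLE_S - (ACQUIRE_S + SAVE_S)


def effective_images_per_s() -> float:
    """Average stored frame rate once the saving pause is included."""
    return IMAGES_PER_BURST / CYCLE_S


def capture_duty() -> float:
    """Fraction of wall-clock time during which images are being captured."""
    return ACQUIRE_S / CYCLE_S


print(f"slack per cycle : {slack_s():.1f} s")
print(f"effective rate  : {effective_images_per_s():.2f} images/s")
print(f"capture fraction: {capture_duty():.1%}")
```

The small positive slack is what lets a fixed 9 s period absorb software delays without the cycles drifting.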

Data processing

During four hours of acquisition, 60 480 images were acquired overall. Because of the limited functionality of the third-party software, the acquisition had to run in a continuous mode, without any triggering option. This resulted in a large number of images showing an empty conveyor or a partly captured wood surface. To filter these meaningless data from the dataset, an offline histogram-based algorithm was created. The basic idea behind this algorithm is to calculate the sum of the histogram of the image's green channel. In the next step, this sum can be divided by any number from the range of 5 to 10 (the values in this range were deduced from the size of the images). The last step of the algorithm is a simple threshold: all images with a resulting value of less than 10 were removed. Using this threshold value ensured that only images containing at least 40 % of the wood surface in the horizontal direction were kept. Since this approach proved 100 % reliable in removing images with no wooden surface on 1500 randomly selected and manually sorted samples, we applied it to the whole dataset. The filtering process reduced the dataset to a final number of 20 275 images.
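The text describes the filter only in outline, so the sketch below does not reproduce the authors' histogram computation. Instead it illustrates the stated outcome, keeping images in which wood covers at least 40 % of the width, with a simpler stand-in: the share of image columns whose mean green value exceeds a cut-off. The cut-off `background_level`, the assumption that the empty conveyor is darker than timber, and both function names are hypothetical.

```python
import numpy as np


def wood_coverage(image_rgb: np.ndarray, background_level: int = 40) -> float:
    """Fraction of image columns whose mean green value exceeds the cut-off.

    background_level is a hypothetical cut-off separating a dark, empty
    conveyor from brighter timber; it is not a value from the paper.
    """
    green = image_rgb[:, :, 1].astype(np.float64)
    column_means = green.mean(axis=0)
    return float((column_means > background_level).mean())


def keep_image(image_rgb: np.ndarray, min_coverage: float = 0.4) -> bool:
    """Keep images in which at least min_coverage of the width shows wood."""
    return wood_coverage(image_rgb) >= min_coverage


# Synthetic example: a 8 x 10 image whose left half is bright in green.
demo = np.zeros((8, 10, 3), dtype=np.uint8)
demo[:, :5, 1] = 200
print(wood_coverage(demo), keep_image(demo))
```

Any real use would need the cut-off tuned on manually sorted samples, in the same way the authors validated their filter on 1500 images.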

In addition to the filtration, we performed image cropping to remove the undesirable background from the images. This operation not only reduced the file size but also decreased the potential computation time for future use. To automatically crop each image in the dataset without any loss of relevant data, we employed a simple straight-line edge detection technique in the vertical direction. The main principle of the algorithm is to find as many rising edge points in the desired direction as are needed to construct a line. The cropping operation was then performed on the image bounding box derived from the following formula.

(2)

BB (x_{1}, y_{1}, x_{2}, y_{2}) = \left(\frac{L_{x_{1}} + L_{x_{2}}}{2} - 150,\; L_{y_{1}},\; \frac{L_{x_{1}} + L_{x_{2}}}{2} + 2650,\; L_{y_{2}}\right)

where $BB (x_{1}, y_{1}, x_{2}, y_{2})$ is the cropped bounding box, and $L_{x_{1}}, L_{y_{1}}, L_{x_{2}}, L_{y_{2}}$ stand for the image coordinates of the end points of the detected straight edge. Cropping changed the image resolution to 2800 × 1024 and reduced the overall dataset size by almost 80 GB. An example of an image after the crop operation is shown in Figure 2.
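Equation 2 can be written as a small helper. This is a minimal sketch: the function name is hypothetical, rounding to whole pixels is an assumption, and no clamping to the image borders is performed.

```python
from typing import Tuple

LEFT_MARGIN = 150     # pixels kept to the left of the detected edge
RIGHT_EXTENT = 2650   # pixels kept to the right of the detected edge


def crop_box(lx1: float, ly1: float, lx2: float, ly2: float) -> Tuple[int, int, int, int]:
    """Return (x1, y1, x2, y2) of the crop for an edge from (lx1, ly1) to (lx2, ly2)."""
    edge_x = (lx1 + lx2) / 2                 # mean x position of the near-vertical edge
    x1 = int(round(edge_x - LEFT_MARGIN))    # rounding is an assumption
    x2 = x1 + LEFT_MARGIN + RIGHT_EXTENT     # crop width is always 2800 pixels
    return x1, int(ly1), x2, int(ly2)


# Edge detected between x = 400 (top) and x = 410 (bottom) of a 1024-line image.
print(crop_box(400, 0, 410, 1023))
```

Using the mean of the two x coordinates makes the crop tolerant of a slightly slanted edge, and the fixed offsets give every cropped image the same width.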

Figure 2. A dataset example of a sawn timber surface with dead knots.

Ground truth labelling

The dataset annotation in this study was performed manually by a trained person. To accelerate this time-consuming process, we developed a customizable annotation tool. In comparison with other annotation tools available on the market, which didn’t fulfil our requirements, we created a universal application with the ability to manage bounding box labels, as well as labels for the semantic segmentation at the same time.¹⁴

For every single image, we created a BMP file representing a semantic map of the labeled defects. During the labeling process, the user manually painted zones in a displayed image, where each zone drawn with a selected color represents a specific defect. Each drawn zone was then automatically assigned the corresponding label and enclosed in a bounding rectangle. From the created zones, the tool automatically generated the coordinates of each defect (left, top, right, and bottom, respectively), expressed as percentages of the image dimensions divided by 100. For each processed image from the dataset, the annotation tool therefore created a text file containing the labels and bounding box coordinates, and a semantic segmentation map with the configured color labels.

Data records

The dataset containing the data acquired in this experiment is publicly available.¹⁵ The dataset includes 1 992 images of sawn timbers without any defects and 18 283 timber images with one or more surface defects. On average, there are 2.2 defects per image, while only 6.7 % of images contain more than three defects. The highest occurrence of defects, which was captured during the experiment, was 16 defects per image. In this dataset, we present altogether 10 types of wood surface defects, including several types of knots, cracks, blue stains, resins, or marrows. An overall overview of all available wood surface defects with a number of occurrences is summarized in Table 1.

Table 1. Wood surface defects included in the database with the number of particular occurrences and an overall occurrence within the dataset.

Defect type	Number of occurrences	Number of images with the defect	Overall occurrence in the dataset [%]
Live knot	21 224	11 912	58.8
Dead knot	11 985	8 350	41.2
Knot with crack	2 276	1 835	9.1
Crack	2 169	1 578	7.8
Resin	3 455	2 624	12.9
Marrow	1 181	1 060	5.2
Quartzity	1 075	847	4.2
Knot missing	503	478	2.4
Blue stain	96	77	0.4
Overgrown	10	6	0.03

Each colour image, with a resolution of 2800×1024, is provided in BMP format in 10 separate zip folders labelled Images.¹⁵ Additionally, we provide two types of annotations: semantic label maps and bounding box labels. Both are provided in separate zip folders. The bounding box labels are located in the folder Bounding_Boxes and named imagenumber_anno.txt, where the image number corresponds to the name of the original image in the dataset. Each original image therefore has one assigned text file, which can contain multiple label records, one for each defect in the image. All bounding box labels have the following structure, where the first field is the object label and the subsequent values correspond to the left, top, right, and bottom positions of the defect, expressed as fractions of the image width and height (percent divided by 100).

Knot_OK 0,421786 0,819336 0,571429 1,000000
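A record of this kind can be converted to pixel coordinates as sketched below. The sketch assumes that each record is one whitespace-separated line (a label without spaces followed by four fractions, possibly written with a decimal comma), that the order is left, top, right, bottom as given in the Ground truth labelling section, and that the image measures 2800×1024 pixels. The names `Box`, `parse_record`, and `parse_annotations` are hypothetical.

```python
from typing import List, NamedTuple

IMAGE_WIDTH = 2800
IMAGE_HEIGHT = 1024


class Box(NamedTuple):
    label: str
    left: int
    top: int
    right: int
    bottom: int


def parse_record(line: str) -> Box:
    """Parse one annotation record into pixel coordinates."""
    fields = line.split()
    label = fields[0]
    # Accept both "0,421786" (decimal comma) and "0.421786".
    left, top, right, bottom = (float(f.replace(",", ".")) for f in fields[1:5])
    return Box(
        label,
        round(left * IMAGE_WIDTH),
        round(top * IMAGE_HEIGHT),
        round(right * IMAGE_WIDTH),
        round(bottom * IMAGE_HEIGHT),
    )


def parse_annotations(text: str) -> List[Box]:
    """Parse the full content of one annotation file, skipping blank lines."""
    return [parse_record(line) for line in text.splitlines() if line.strip()]


print(parse_record("Knot_OK 0,421786 0,819336 0,571429 1,000000"))
```

For the example record, the fractions map to whole pixel positions in a 2800×1024 image, which is consistent with coordinates stored relative to the image size.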

Semantic label maps, used for semantic segmentation, are located in a folder, Semantic Maps. For each image in the dataset there exists just one semantic map in a BMP format with the label name in the form of imagenumber_segm.bmp, where the image number represents the corresponding name of the original image. In comparison to bounding box labels, each pixel of the semantic map image has its label, which is determined by a specified colour (see Figure 3).

Figure 3. Example of a semantic segmentation label.

The red label represents dead knots, the green label stands for live knots, and the dark yellow represents knots with cracks.

For the exact label specification of the provided wood surface defect dataset, refer to the Semantic Map Specification text file¹⁵ or to Table 2.

Table 2. Annotation colour specification for the provided dataset with hexadecimal colour codes.

Defect type	Colour	HEX colour code
Live knot	Green	00FF00
Dead knot	Red	FF0000
Knot with crack	Dark Yellow	FFAF00
Crack	Pink	FF0064
Resin	Magenta	FF00FF
Marrow	Blue	0000FF
Quartzity	Purple	640064
Knot missing	Orange	FF6400
Blue stain	Cyan	10FFFF
Overgrown	Dark Green	004000
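For training a segmentation model, the colour-coded maps usually have to be turned into arrays of class indices. The sketch below does this with NumPy using the hex codes of Table 2. The index order, the treatment of every unlisted colour as background (index 0), and the function names are assumptions made for illustration.

```python
import numpy as np

# Colour codes copied from Table 2 (RGB hex).
DEFECT_COLOURS = {
    "Live knot": "00FF00",
    "Dead knot": "FF0000",
    "Knot with crack": "FFAF00",
    "Crack": "FF0064",
    "Resin": "FF00FF",
    "Marrow": "0000FF",
    "Quartzity": "640064",
    "Knot missing": "FF6400",
    "Blue stain": "10FFFF",
    "Overgrown": "004000",
}
# Index 0 is reserved for pixels that match none of the listed colours.
CLASS_NAMES = ["Background"] + list(DEFECT_COLOURS)


def hex_to_rgb(code: str) -> tuple:
    """Convert 'RRGGBB' to an (R, G, B) tuple of integers."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def colour_map_to_classes(rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 colour map to an (H, W) array of class indices."""
    classes = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for index, name in enumerate(CLASS_NAMES[1:], start=1):
        colour = np.array(hex_to_rgb(DEFECT_COLOURS[name]), dtype=np.uint8)
        classes[np.all(rgb == colour, axis=-1)] = index
    return classes


# Synthetic 2 x 2 map: one live-knot pixel, one dead-knot pixel.
demo = np.zeros((2, 2, 3), dtype=np.uint8)
demo[0, 0] = (0, 255, 0)
demo[1, 1] = (255, 0, 0)
print(colour_map_to_classes(demo))
```

The input array can come from any image reader that returns RGB channel order; exact colour matching works here because BMP is a lossless format.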

Technical validation

The technical validation of the dataset was conducted by assessing the quality of the assigned labels with deep learning-based classification. For this purpose, we utilized a standard state-of-the-art convolutional neural network detector based on the ResNet-50 model.¹⁶ The selected neural network architecture was modified by adding Batch Normalization and ReLU layers after each convolution layer. The input layer of the network was set to 1024×357, and all dataset images were downsampled accordingly. To train the neural network, we employed a transfer learning paradigm using pre-trained weights from the COCO dataset.¹⁷ Moreover, we performed data augmentation, including horizontal and vertical flips, translation, and scaling, and divided the dataset into a training and a testing set in a ratio of 40/60. To improve the detection of the labelled defects by the ResNet-50 model, several parameters were additionally modified on a trial-and-error basis. These included sizes, strides, ratios and scales (see Table 3).

Table 3. A detailed specification of the modified neural network parameters.

Parameter	Values
Sizes	[32, 64, 128, 256, 512]
Strides	[8, 16, 32, 64, 12]
Ratios	[0.3, 0.55, 1, 2, 3.5]
Scales	[0.6, 0.8, 1]
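The parameter names in Table 3 resemble the anchor configuration of RetinaNet-style detectors. Assuming that reading (the text does not name the detection head), the sketch below shows how sizes, ratios, and scales would combine into anchor box shapes, with the ratio interpreted as height divided by width. This convention and the function name are assumptions; strides only set the grid spacing and are not needed for the shapes.

```python
import math
from typing import List, Tuple

SIZES = [32, 64, 128, 256, 512]      # base anchor size per pyramid level
RATIOS = [0.3, 0.55, 1, 2, 3.5]      # assumed to mean height / width
SCALES = [0.6, 0.8, 1]               # multipliers applied to the base size


def anchor_shapes(size: float) -> List[Tuple[float, float]]:
    """Return (width, height) of every anchor generated for one base size."""
    shapes = []
    for ratio in RATIOS:
        for scale in SCALES:
            area = (size * scale) ** 2       # the area depends only on the scale
            width = math.sqrt(area / ratio)
            height = width * ratio
            shapes.append((width, height))
    return shapes


for size in SIZES:
    shapes = anchor_shapes(size)
    print(size, len(shapes), [f"{w:.0f}x{h:.0f}" for w, h in shapes[:3]])
```

Under this reading, each feature-map location carries 15 anchors (5 ratios × 3 scales), with wider ratio and scale ranges than the usual defaults to cover elongated defects such as cracks.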

At the beginning of the training, the first four layers of the network were frozen. The neural network was then tuned by unfreezing the layers in reverse order, except for the Batch Normalization layer. Finally, the whole neural network was fine-tuned at a low learning rate. The overall number of training epochs was 30, while the learning rate ranged from 10⁻⁴ at the beginning to 10⁻⁶ at the end of the training.
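The text gives only the end points of the rate schedule (10⁻⁴ and 10⁻⁶) and the number of epochs (30), not the shape of the decay. As one plausible illustration, the sketch below interpolates geometrically between the two values. The geometric form and the function name are assumptions, not the authors' schedule.

```python
def learning_rate(epoch: int, epochs: int = 30,
                  start: float = 1e-4, end: float = 1e-6) -> float:
    """Geometric interpolation from start (epoch 0) to end (last epoch)."""
    if not 0 <= epoch < epochs:
        raise ValueError("epoch out of range")
    fraction = epoch / (epochs - 1)          # 0.0 at the first epoch, 1.0 at the last
    return start * (end / start) ** fraction


for epoch in (0, 10, 20, 29):
    print(epoch, f"{learning_rate(epoch):.2e}")
```

A geometric decay spends equal numbers of epochs in each order of magnitude, which is a common choice when the rate has to span two decades.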

The trained ResNet-50 model resulted in an accuracy of 81 %. Since the neural network output a large number of false positives, the dataset was re-evaluated by a trained person who didn’t participate in the primary dataset labelling process.

Data availability

Underlying data

Zenodo: Underlying data for A large-scale image dataset of wood surface defects for automated vision-based quality control processes. ‘Deep Learning and Machine Vision based approaches for automated wood defect detection and quality control’. http://doi.org/10.5281/zenodo.4694695.¹⁵

This project contains the following underlying data:

• Bounding boxes
• Images 1–10
• Semantic map specification
• Semantic maps

Data are available under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0).

Software availability

Zenodo: Software for labeling wood surface defects and managing images. ‘Supporting tools for managing and labeling raw wood defect images’. http://doi.org/10.5281/zenodo.4904736.¹⁴

This project contains the following underlying data:

Labeler tool:

SubVI

• Labeler_software.vi
• Readme.txt
• Labeler.ini

Support Utils:

• Cutter.vi
• Sorter.vi

Data are available under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0).

References

1. Broman O, Fredriksson M: Wood material features and technical defects that affect yield in a finger joint production process. Wood Mater. Sci. Eng. 2012.
2. Ding F, Zhuang Z, Liu Y, et al.: Detecting defects on solid wood panels based on an improved SSD algorithm. Sensors. 2020.
3. Prokhorov M: Great Soviet Encyclopedia: A Translation of the Third Edition. New York: Collier Macmillan Publishers; 1973.
4. Çetiner I, Var AA, Çetiner H: Wood surface analysis with image processing technique. 22nd Signal Processing and Communications Applications Conference (SIU). 2014.
5. Gu IYH, Andersson H, Vicen R: Automatic classification of wood defects using support vector machines. In: Bolc L, Kulikowski JL, Wojciechowski K, editors. Lecture Notes in Computer Science. Berlin: Springer Science+Business Media; 2009. p. 356–367.
6. Urbonas A, Raudonis V, Maskeliūnas R, et al.: Automated identification of wood veneer surface defects using faster region-based convolutional neural network with data augmentation and transfer learning. Appl. Sci. 2019.
7. Lycken A: Comparison between automatic and manual quality grading of sawn softwood. Forest Prod. J. 2006; 56: 13–18.
8. Liu Z, Peng C, Work T, et al.: Application of machine-learning methods in forest ecology: Recent progress and future challenges. Environmental Reviews. 2018; 26.
9. Kryl M, Danys L, Jaros R, et al.: Wood recognition and quality imaging inspection systems. J. Sens. 2020; 2020.
10. He T, Liu Y, Xu C, et al.: A Fully Convolutional Neural Network for Wood Defect Location and Identification. IEEE Access. 2019.
11. Gao M, Chen J, Mu H, et al.: A Transfer Residual Neural Network Based on ResNet-34 for Detection of Wood Knot Defects. Forests. 2021; 12.
12. Jackson PTG, Amir A-A, Bonner S: Style augmentation: data augmentation via style randomization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
13. Basler AG: microDisplay X – The Reliable Path to Your First Image: Basler. Basler AG. 2021, May 28.
14. Kodytek P, Bodzas A: Supporting tools for managing and labeling raw wood defect images. Zenodo. 2021.
15. Kodytek P, Bodzas A, Bilik P: Supporting data for Deep Learning and Machine Vision based approaches for automated wood defect detection and quality control. Zenodo. Dataset. 2021.
16. He K, Zhang X, Ren S, et al.: Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016.
17. Lin T, Maire M, Belongie S, et al.: Microsoft COCO: Common objects in context. ECCV. 2014.

Open Peer Review

Reviewer Report 26 May 2022

Sri Rahayu, Universitas Nusa Mandiri, East Jakarta, Indonesia

Approved with Reservations

https://doi.org/10.5256/f1000research.56234.r136809

Wood defect datasets are still rare, so this research is very helpful for the wood industry and researchers interested in this field. The authors explain in detail the reasons for building this database with clear data acquisition and collection techniques.

Some of the methods used to construct semantic images and image testing may be replicated in others. The image acquisition technique used will be of great value if you mention the acquisition technique in other similar studies.

It would be even better if the authors provided example images for each image class and also presented the stages of the research in the form of a chart.

Is the rationale for creating the dataset(s) clearly described? Yes

Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Partly

Are the datasets clearly presented in a useable and accessible format? Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Single-cell technologies

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 27 Jun 2022

Alexandra Bodzas, Department of Cybernetics and Biomedical Engineering, VSB-Technical University of Ostrava, Ostrava, 70800, Czech Republic

The point-by-point responses to comments:

1. We complemented the Introduction section with a paragraph describing wood defects acquisition techniques used in other studies (Paragraph 3).

2. We complemented the paper with a diagram presenting the particular stages of the dataset acquisition (Figure 1) and a figure containing image examples for each class (Figure 4).

Competing Interests: No competing interests were disclosed.

Reviewer Report 25 Nov 2021

Mariusz Pelc, Faculty of Electrical Engineering, Automatic Control and Informatics, Opole University of Technology, Opole, Poland; School of Mathematical and Computing Sciences, University of Greenwich, London, UK

Approved with Reservations

https://doi.org/10.5256/f1000research.56234.r100946

This paper deals with a relevant problem: the detection of wood surface defects.

For the purpose of the research, the authors have come up with a coherent methodology that allows them to acquire all the required data and then map and detect defects. From an algorithmic viewpoint it all makes sense; in addition, the whole methodology/algorithm has been validated.

So, the paper ticks pretty much all the boxes (relevance, novelty, etc.) and as such it qualifies for publication.

However, the paper requires some substantial changes in the following areas:

1. There is no related work section, which makes it really difficult to understand the authors' contribution to the field. I would recommend adding such a section (even if it is brief) where similar solutions would be discussed and compared with what the authors are proposing in this paper.

2. Every single paper should include a conclusion section allowing all readers to understand the key findings of the research. This paper is lacking a conclusion section, which is quite an omission.

3. Some tables (e.g. Table 1) should be redone, as the versions included in the paper are hardly readable. Usually one look at a table provides a lot of information about the results, whilst in this paper this is not the case. I would suggest the authors re-format all tables to make all the data gathered in the tables easy to see and understand.

4. The "Software availability" section should be rewritten. I would suggest the authors make this section easily comprehensible by adding some more description of the software tools used and maybe outlining some key feature(s) of the software. Also, based on the section contents, a section title better reflecting this would be e.g. "Supporting software tools", where the first paragraph should say that the following software was used in this research (then outline the software and how it was used).

5. Referencing: I only want to make sure that the authors have used the proper referencing style, since the most frequently used are either Harvard or IEEE, whilst the authors have used a footnote-like referencing style.

6. The whole paper is written in English that is maybe not error-free but still quite coherent and comprehensible. I would still recommend at least one more round of proofreading to make sure that there are no obvious mistakes left in the text.

Based on the above considerations, I would recommend accepting the paper for indexing after revision.

Is the rationale for creating the dataset(s) clearly described?
Yes

Are the protocols appropriate and is the work technically sound?
Yes

Are sufficient details of methods and materials provided to allow replication by others?
Yes

Are the datasets clearly presented in a useable and accessible format?
Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Computer science, data / signal processing, automation and robotics, bio-medical engineering, expert systems.

Author Response 27 Jun 2022

Alexandra Bodzas, Department of Cybernetics and Biomedical Engineering, VSB-Technical University of Ostrava, Ostrava, 70800, Czech Republic

The point-by-point responses to comments:

1. Our paper was written in accordance with the journal guidelines for a Data Note article, which slightly differs from an original research article. Concerning the Data Note article, there is no related work section included in the article structure. However, to fulfill your requirements, we complemented the introduction section with a paragraph where we discussed similar solutions. The importance of this research is then explained in subsequent paragraphs.

2. The conclusion section was likewise omitted intentionally, since we followed the Data Note article guidelines.

3. The tables in the article cannot be re-formatted since the article had been formatted by the editorial team before the publication. Table formatting is within the scope of an editorial team that formats the table according to the journal standards. Since the article is an online article, the tables are accessible and visible in full size after clicking on them.

4. The original Software availability section was rewritten before the publication to fulfill the editorial team's requirements. However, we complemented this section by adding some descriptions of the software. The title of this section cannot be changed since it follows the guidelines for a data note article and journal standards.

5. We unified the reference styles within the reference list. All references in the reference list follow the Harvard referencing style. The in-text citations, on the other hand, follow the journal standards, and the footnotes were added by the editorial team during the typesetting process.

6. The paper was proofread and edited. The obvious mistakes were corrected.

Competing Interests: No competing interests were disclosed.

Version 2

VERSION 2 PUBLISHED 16 Jul 2021

Open Peer Review

Reviewer Status

Reviewer Reports

	Invited Reviewers
	1	2
Version 2 (revision) 27 Jun 22	read	read
Version 1 16 Jul 21	read	read

Mariusz Pelc, Opole University of Technology, Opole, Poland; University of Greenwich, London, UK
Sri Rahayu, Universitas Nusa Mandiri, East Jakarta, Indonesia

Reviewer Report

12 Jul 2022 | for Version 2

Sri Rahayu, Universitas Nusa Mandiri, East Jakarta, Indonesia

Approved

The author has fulfilled the suggestions given.

A minor suggestion: provide some examples of currently available wood defect image datasets from previous researchers, and add more references. Since the paper itself produces a wood defect image dataset, I think it would be better for the authors to acknowledge other work, so that readers are aware that similar datasets are also available. One example of a currently available wood defect image dataset from previous researchers is Riana et al., 2021¹.

References

1. Riana D, Rahayu S, Hasan M, Anton: Comparison of segmentation and identification of swietenia mahagoni wood defects with augmentation images. Heliyon. 2021; 7 (6): e07417.

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Single-cell technologies

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report

08 Jul 2022 | for Version 2

Approved

I have read the revised version of the paper and I am happy to say that the paper has now reached the appropriate standard for indexing. Authors have made changes to the key elements (including use of English).

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Computer science, data / signal processing, automation and robotics, bio-medical engineering, expert systems.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, or sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions

[1] 1. Broman O, Fredriksson M: Wood material features and technical defects that affect yield in a finger joint production process. Wood Mater. Sci. Eng. 2012. Publisher Full Text

[2] 2. Ding F, Zhuang Z, Liu Y, et al.: Detecting defects on solid wood panels based on an improved SSD algorithm. Sensors. 2020. PubMed Abstract | Publisher Full Text | Free Full Text

[3] 3. Prokhorov M: Great Soviet Encyclopedia: A Translation of the Third Edition . New York: Collier Macmillan Publishers; 1973.

[4] 4. Çetiner I, Var AA, Çetiner H: Wood surface analysis with image processing technique. 22nd Signal Processing and Communications Applications Conference (SIU). 2014. Publisher Full Text

[5] 5. Gu IYH, Andersson H, Vicen R: Automatic classification of wood defects using support vector machines. In: Bolc L, Kulikowski JL, Wojciechowski K, editors. Lecture Notes in Computer Science. Berlin: Springer Science+Business Media; 2009. p. 356–367.

[6] 6. Urbonas A, Raudonis V, Maskeliūnas R, et al.: Automated identification of wood veneer surface defects using faster region-based convolutional neural network with data augmentation and transfer learning. Appl. Sci. 2019. Publisher Full Text

[7] 7. Lycken A: Comparison between automatic and manual quality grading of sawn softwood. Forest Prod. J. 2006; 56: 13–18.

[8] 8. Liu Z, Peng C, Work T, et al.: Application of machine-learning methods in forest ecology: Recent progress and future challenges. Environmental Reviews. 2018; 26. Publisher Full Text

[9] 9. Kryl M, Danys L, Jaros R, et al.: Wood recognition and quality imaging inspection systems. J. Sens. 2020; 2020. Publisher Full Text

[10] 10. He T, Liu Y, Xu C, et al.: A Fully Convolutional Neural Network for Wood Defect Location and Identification. IEEE Access. 2019. Publisher Full Text

[11] 11. Gao M, Chen J, Mu H, et al.: A Transfer Residual Neural Network Based on ResNet-34 for Detection of Wood Knot Defects. Forests. 2021; 12. Publisher Full Text

[12] 12. Jackson PTG, Amir A-A, Bonner S: Style augmentation: data augmentation via style randomization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.

[13] 13. Basler AG: microDisplay X – The Reliable Path to Your First Image: Basler. Basler AG. 2021, May 28. Reference Source

[14] 14. Kodytek P, Bodzas A: Supporting tools for managing and labeling raw wood defect images. Zenodo. 2021. Publisher Full Text

[15] 15. Kodytek P, Bodzas A, Bilik P: Supporting data for Deep Learning and Machine Vision based approaches for automated wood defect detection and quality control. Zenodo. Dataset. 2021. Publisher Full Text

[16] 16. He K, Zhang X, Ren S, et al.: Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. Publisher Full Text

[17] 17. Lin T, Maire M, Belongie S, et al.: Microsoft COCO: Common objects in context. ECCV. 2014.
