Thermal imaging with drone

Most methods use 2D thermal images because they are easily obtainable. However, the procedure for 3D model reconstruction proposed in this paper can make the interpretation and evaluation of thermal data more automatic, faster and more objective. Unlike many other imaging methods, thermal imaging works in environments without ambient light and can penetrate obscurants such as smoke, haze and fog. The paper addresses the challenge of 3D reconstruction from thermal infrared (TIR) images. It seeks to demonstrate that commercial computer vision software can orient a sequence of TIR images taken from a drone and produce a 3D point cloud without any GNSS/INS information about the position and attitude of the images and without camera calibration parameters. The paper also proposes a procedure based on the Iterative Closest Point (ICP) algorithm to produce a model that combines the geometric precision and high resolution of RGB images with the thermal information from TIR images.
Keywords: Photogrammetry, Unmanned Aerial Vehicles, Computer Vision, Thermal Infrared Images

List of Acronyms
UAV Unmanned Aerial Vehicle
TIR Thermal Infrared
RGB Red, Green and Blue
CPU Central Processing Unit
CUDA Compute Unified Device Architecture
GCP Ground Control Points
GPU Graphics Processing Unit
MSAC M-estimator Sample Consensus
GNSS Global Navigation Satellite System
SFM Structure from motion

Introduction
Thermal imaging improves the visibility of objects in constrained environments by detecting the infrared radiation they emit and using that information to create an image. Thermal imaging systems are currently used in applications such as landslide assessment and the inspection of photovoltaic plants. Thermography is a non-contact technology that captures thermal irregularities resulting from damage or flaws localized in buildings, so it avoids destructive interventions aimed at identifying the source of a problem. Originally developed for the military during the Korean War, thermal imaging cameras have since migrated into many other fields, including search and rescue, law enforcement and building inspection. In the US, firefighters are increasingly using the technology as a result of the increased availability of state equipment grants following the 9/11 attacks.
Recent developments in UAV platforms offer great advantages for data collection by introducing low-cost solutions that can quickly deliver spatial and temporal information. A thermal imaging camera mounted on a UAV allows easy inspection of facades and building roofs that cannot be captured in terrestrial images.
Although heat distributions can be provided as 2D projections, they give no 3D or depth information about the source of heat, so the distance from the scanner cannot be readily deduced from a single image (Sampaio and Simoes, 2014, p.20). Analysing TIR images is also complicated because it is normally done manually, and images are interpreted individually with no overall view of the building. To monitor big structures, 3D building models become essential. Digital Surface Models from aerial thermal imagery normally have precision comparable to those generated from visible images. A fusion of TIR images and “time-of-flight depth images” can be used to create an accurate 3D point cloud for subsequent segmentation. Co-registering images taken with TIR and RGB cameras gives an advantage in the density and accuracy of the resulting 3D point cloud.
Photogrammetry is the method of measuring positions in 3D space by the use of photography, and it is the technique employed here for producing 3D models. It works well and is becoming common for creating 3D models of landscapes from photos captured by drones. The commercial use of thermal cameras has been on the rise, especially since the introduction of drones. Thermal cameras normally give video outputs that indicate the level of radiated energy.
Project objectives
The aim of the project is to orient sequences of TIR images taken from an Unmanned Aerial Vehicle (UAV) and to produce a 3D point cloud without using the geospatial position or attitude of the aerial images or any camera calibration parameters. In other words, the paper aims to show that high-quality 3D reconstructions of buildings can be retrieved from TIR images in an automatic and fast way using commercial photogrammetry and 3D modelling software. The paper also proposes an automatic methodology for the integration of RGB and TIR images. The overall goal is to improve building inspection by producing a 3D reconstruction with thermal imagery acquired from a drone.
Literature Review
A study conducted by the US Department of Energy shows that commercial and residential buildings are responsible for about 40 percent of the primary energy consumption in the country. 71% of this value is attributed to electrical energy consumption, at a cost of nearly $400 billion annually (Energy Information Administration, 2017). Whenever an object is at a higher temperature than its surroundings, thermodynamics dictates that power is transferred by radiation from the warm body to the cold one. Thus, a cold area in a thermogram indicates an object that is absorbing the radiation emitted by a warmer object. The European Commission estimates that the largest cost-effective energy savings potential “lies in residential (27%) and commercial (30%) buildings” (Zachariadis et al., 2018). “Building envelope” is a term that encompasses the roofs, walls, windows, skylights and doors of a building, through which energy is transferred when ambient temperatures change.
According to Maurizio Carlini, the energy exchange through the envelope between the outside environment and the conditioned interior is a function of the pressure and temperature differences between the two (Maurizio Carlini, 2014, p.395). This exchange can be a source of operating inefficiency in a building. The energy losses of a building are primarily due to the infiltration of unconditioned air into conditioned spaces, poor installation of thermal insulation and ageing of the structure.
A study conducted by El-Hakin (2003, p. 302) argues that photogrammetry today increasingly uses terrestrial laser scanning to generate 3D models. However, to fully exploit the 3D data, a typical limitation of conventional orthorectification must be addressed, namely its inability to handle both image occlusions and surface self-occlusions. There is therefore a need for an approach that allows the automatic generation of perspective views and orthoimages based on 3D models from laser scanning and multi-image texture interpolation. The occlusion problem is solved first, by identifying the surface points that are visible in the projection direction. The texture is then interpolated by blending all the images that view each particular surface point.
According to Raheem Ariwoola (2016), studying building envelopes gives a good analytical and qualitative understanding of the thermal performance of the major elements of the envelope. It points out major deficiencies and assists in developing suitable energy conservation measures to improve the performance of a building. Better energy performance leads to lower carbon emissions, better human health and lower energy costs, and it contributes to the bottom line of sustainability.
According to El-Hakin (2003, p. 302), automatic and interactive image-based approaches to 3D modelling each have limitations and merits. It is agreed, however, that in many cases terrestrial laser scanning allows accurate surface modelling and therefore provides an accurate 3D basis for producing 2D projections or texture mapping, and for creating ortho-projections in particular. The main limitations of conventional orthorectification software concern occlusions, which lead to double projection and blind areas. In addition, self-occlusions in the projection direction must be established, a challenge that is irrelevant in typical cases of 2.5D modelling but vital for 3D models. Conventional software may not respond well even in the simple case of 2.5D models because it does not recognize image occlusions (Oniga, Chirilă, and Stătescu, 2017, p. 551). Rigorous tools for digital ortho-projection therefore have to address both aspects of occlusion. The issue of visibility extends beyond the strict limits of ortho-projection to the creation of perspective views, which places the matter in the wider framework of texture mapping. Methods that use multiple oblique aerial images to texture 3D models obtained through terrestrial and aerial laser scanning may be considered a generalization of the orthoimage.
Why use 3D modelling?
Recent technological advances in the construction industry create an opportunity to significantly improve safety, accelerate project delivery and minimize the number of insurance claims and change orders. Although heat distributions can be provided as 2D projections, they cannot provide depth or 3D information about the heat source, so the distance from the scanner cannot be readily deduced from a single image. As a result, multiple images are used to build a 3D model. For large buildings housing heating plants and air conditioning, each image reproduces only a small portion of the structure because of the building size. It is therefore highly recommended to create a 3D model and an orthophoto to obtain a comprehensive view of the structure and to make thermal data easier to evaluate and interpret. The technology enables more and better information to be collected efficiently while safety is also improved.
However, many constructors still use traditional tools that are inefficient and that put workers in unsafe situations in order to verify results or measure quantities. As technology adoption among constructors increases, the technology divide between inspectors and constructors deepens, putting many resident engineers in a difficult position. Inspectors who do not use 3D data miss opportunities to catch issues pre-emptively, to verify the construction process in real time and to collect rich 3D records of construction that support measurements for payments and daily diary entries. Inspectors should therefore be challenged to keep pace with modern tools that use 3D data to improve safety, accelerate project delivery and minimize the number of insurance claims and change orders.
3Dflow 3DF Zephyr
3DF Zephyr is a commercial photogrammetry and 3D modelling software package developed by 3Dflow. It is a complete photogrammetry program that includes tools for 3D modelling, content creation, post-processing and measurement. It reconstructs 3D models from videos and photos, automatically extracting frames from video and selecting the most appropriate ones for computation. The process is totally automatic, and no manual editing, special equipment or coded targets are required. It comes with a user-friendly interface and can export to several 3D formats or generate lossless video without external tools. It can also calculate volumes, angles, areas and contour lines, and generate digital elevation models and true orthophotos. In other words, it is a good tool for modelling from reality. 3DF Zephyr employs several technologies developed in-house (Oniga, Chirilă, and Stătescu, 2017, p. 551). They include proprietary algorithms such as 3DF Samantha and 3DF Stasia, which have received considerable recognition in the area of Computer Vision.
3DF Zephyr is known as one of the most accurate and complete photogrammetry programs currently available (Sampaio and Simoes, 2014, p. 20). It is among the first software packages to integrate a complete photogrammetry solution with the management of laser scan data, whether pre-aligned or not and coloured or not. It is therefore a complete tool for editing and managing 3D content, and it does more than create 3D models from photos. It can apply a wide range of filters, draw CAD polylines, extract sections and level curves, and compute volumes and other measures from one’s data.
Three algorithms in 3DF Zephyr were used to produce the 3D model. The first is 3DF Samantha, a structure from motion technology. It recovers the orientation of images automatically and has an auto-calibration algorithm that makes it possible to work with any digital camera, including TIR ones (Luhmann, Piechel, and Roelfs, 2013, p. 27). In the scientific community, it is known as one of the most advanced and effective technologies for recovering photo positions. It can determine the direction in which the cameras are looking, even from sparse images. The second is 3DF Stasia (multi-view stereo). It extracts accurate and dense point clouds by multi-view stereo, working together with Samantha to make the reconstruction as accurate as possible. It is the technology that creates the 3D point cloud starting from the 2D images. The third is 3DF Sasha (mesh extraction), which produces a “texture mapped triangular mesh” and enables the retrieval of sharp edges on the 3D model. Together, the three algorithms make 3D modelling easy because a single software package does it all.
It is important to note that this software is computationally demanding and takes advantage of all available CPU cores. It can also use CUDA technology when available. A CUDA-enabled video card can drastically improve performance, and multi-GPU configurations are supported. It is normally recommended to close any other running application while performing a reconstruction with 3DF Zephyr. After the cameras have been aligned, one can proceed with the extraction of the dense reconstruction, which requires the input camera orientations to be correct. This step is powered by Stasia, the multi-view stereo technology, and different dense reconstructions can be created by varying its parameters. Starting from the dense reconstruction, one can produce a triangular mesh. Depending on what is being reconstructed, one may select the generation of sharp surfaces, which fit buildings. Surfaces of this type are handled by Sasha, the latest algorithm developed by 3Dflow.
Methodology
The structure from motion (SFM) technique was used to estimate 3D structure from 2D image sequences coupled with local motion signals (Toldo et al., 2015).
Model georeferencing was needed because the camera orientations and the coordinates of the 3D points are expressed in a local reference frame and defined only up to scale. Georeferencing involves assigning real-world coordinates to every pixel of the raster. Using these coordinates, or Ground Control Points (GCP), the images are warped and fitted within the selected coordinate system. To georeference an image, one first establishes control points, inputs the known “geographic coordinates of the control points”, selects the coordinate system and projection parameters, and minimizes the residuals. Residuals are the differences between the coordinates predicted by the geographic model and the real coordinates of the control points.
The residual value can be calculated using:

𝜀 is the thermal emissivity value on the spectrum
𝜎 is the Stefan-Boltzmann constant.
The residuals provide a way of determining the accuracy levels of georeferencing process.
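As an illustration of how residuals quantify georeferencing accuracy, the following minimal Python sketch computes per-point residuals (predicted minus measured control-point coordinates, as defined above) and summarizes them as a root-mean-square error. The function names and the sample coordinates are hypothetical and are not taken from the project data.

```python
import math

def gcp_residuals(predicted, measured):
    """Per-point residual vectors: predicted minus measured coordinates."""
    return [tuple(p - m for p, m in zip(pp, mm))
            for pp, mm in zip(predicted, measured)]

def rmse(residuals):
    """Root-mean-square of the residual vector lengths."""
    squared = [sum(c * c for c in r) for r in residuals]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical control points (map units); only the first one is off.
predicted = [(100.0, 200.0), (150.0, 250.0), (300.0, 400.0)]
measured = [(103.0, 204.0), (150.0, 250.0), (300.0, 400.0)]

res = gcp_residuals(predicted, measured)
print(res[0])     # residual of the first control point
print(rmse(res))  # overall accuracy indicator
```

A lower RMSE indicates that the fitted model reproduces the control points more closely.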

Figure 4: Georeferencing
Multi-view stereo was used to recover a dense point cloud representing the imaged surface, given the position and attitude of the images (Toldo et al., 2013). With this technology, different dense reconstructions can be created by varying parameters. Structure from motion first orients the images, that is, it estimates the camera parameters, the orientation and the location of each photo. Multi-view stereo then takes this orientation and location from SFM to build the dense 3D point cloud. It takes a large set of images and constructs a credible 3D geometry that explains these images under certain reasonable assumptions. “A correlation profile Cj(ζ), which is parameterised with the depth ζ, computed for every pixel m and its neighbour image Ij ∈ η(Ii).”
The camera geometry can be exploited here thanks to epipolar geometry: the match for a pixel must lie on the corresponding epipolar line in the neighbouring image.

Figure 5: Depth image of the dense point cloud obtained from the thermal (TIR) dataset.
Mesh generation and texture mapping techniques were then used to create a textured triangular mesh. The colour value from every image corresponding to a specific orthophoto pixel is interpolated through convolution because of its smoothing effect. A TIR texture map is finally built by making use of “the view angle” and visibility information. The UV coordinates can be calculated using

3DF Sasha enables the retrieval of sharp edges on the 3D model, so it is suitable for applications such as industrial survey, architecture and urban monitoring.
Figure 6: “3D reconstruction from TIR images: 1) sparse point cloud, 2) dense point cloud, 3) mesh with texture, 4) orthophoto.”
3DF SAMANTHA
The first step is key-point extraction. The detector identifies blobs, with their associated scale levels, from the scale-space extrema of the scale-normalized Laplacian. As for the descriptor, Samantha implements a 128-dimensional descriptor based on the accumulated responses of steerable derivative filters. The combination of detector and descriptor works in a way comparable to SIFT, which has proved suitable for feature detection in low-resolution thermal imagery, while avoiding patent issues. Only a given number of key-points with the strongest responses are retained. Since the images may be unordered, the next task is to recover the graph that tells which images overlap with each other, i.e., the epipolar graph. This has to be done in a computationally efficient manner, without trying to match key-points between every image pair.
In this broad phase, only a small constant number of descriptors is considered for each image, namely those of the key-points with the highest scales, because their descriptors are normally more representative of the entire image content. Each key-point descriptor is matched to its approximate nearest neighbours in feature space, and a 2D histogram is built to register the number of matches between each pair of images. Rather than simply selecting the pairs with the highest scores in the histogram, one may take maximum spanning trees of the resulting graph, which yields stronger inter-clique connections.
Key-point matching follows the nearest-neighbour approach with a “ratio test”. Matches that are not injective are discarded. Fundamental matrices and homographies between pairs of matching images are then computed using “M-estimator Sample Consensus” (MSAC), and outliers are rejected. If the matches remaining between two images are fewer than twenty percent of the matches before MSAC, they are all discarded. The rationale is that the original matches are altogether unreliable if an excessive proportion of outliers has been detected.
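The matching rules in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Samantha's implementation: descriptors are plain tuples, the nearest neighbours are found by brute force, and the ratio threshold of 0.8 is an assumed value that the text does not specify. The helper `pair_is_reliable` encodes the twenty-percent rule applied after MSAC.

```python
import math

def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with a ratio test and an injectivity check.

    ratio=0.8 is an assumed threshold, not a value given in the text.
    """
    candidates = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best distance is clearly smaller
        # than the second best one.
        if d1 < ratio * d2:
            candidates.append((i, j1))
    # Discard non-injective matches: a descriptor of B claimed more than once.
    counts = {}
    for _, j in candidates:
        counts[j] = counts.get(j, 0) + 1
    return [(i, j) for i, j in candidates if counts[j] == 1]

def pair_is_reliable(n_after_msac, n_before_msac, min_fraction=0.2):
    """Twenty-percent rule: keep an image pair only if enough matches survive."""
    return n_after_msac >= min_fraction * n_before_msac

# Toy 2D descriptors (real descriptors are 128-dimensional).
desc_a = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.0)]
desc_b = [(0.0, 0.05), (10.0, 10.1), (50.0, 50.0)]
matches = match_ratio_test(desc_a, desc_b)
print(matches)
```

In the toy data two descriptors of the first image compete for the same descriptor of the second image, so both of those matches are dropped and only the unambiguous one survives.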
Subsequently, the key-point matches across multiple images are connected into tracks. Consider the undirected graph whose nodes are the key-points and whose edges represent the matches; a track is a connected component of this graph. Each vertex can be labelled with the image the key-point belongs to, and an inconsistency arises if a label appears more than once in a track. Tracks that are inconsistent or shorter than three frames are discarded. A track represents the projection of a single 3D tie-point imaged in several exposures. The next step organizes the images into a tree (dendrogram) by agglomerative clustering, using a “measure of overlap” as the affinity. The computation of structure-and-motion then follows this tree: images are stored in the leaves, while partial models correspond to the internal nodes. The entire hierarchical approach relies on four photogrammetric procedures, which are implemented in Samantha with close attention to resilience to outliers.
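Track building as described here, connected components of the match graph with a consistency check, can be sketched with a union-find structure. The Python code below is a minimal illustration; the representation of a key-point as an `(image_id, keypoint_id)` pair and the function name `build_tracks` are assumptions made for the example.

```python
def build_tracks(matches, min_length=3):
    """Group matched key-points into tracks (connected components).

    matches: iterable of ((image_id, kp_id), (image_id, kp_id)) pairs.
    Tracks that contain the same image twice (inconsistent) or that are
    shorter than min_length are discarded.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    components = {}
    for node in list(parent):
        components.setdefault(find(node), []).append(node)

    tracks = []
    for nodes in components.values():
        images = [image_id for image_id, _ in nodes]
        consistent = len(images) == len(set(images))
        if consistent and len(nodes) >= min_length:
            tracks.append(sorted(nodes))
    return sorted(tracks)

matches = [
    ((0, 5), (1, 2)), ((1, 2), (2, 7)),                    # valid track
    ((0, 9), (1, 4)),                                      # too short
    ((0, 1), (1, 1)), ((1, 1), (2, 1)), ((2, 1), (0, 3)),  # image 0 twice
]
tracks = build_tracks(matches)
print(tracks)
```

Only the first group survives: the second spans two frames and the third visits image 0 twice.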
• Intersection
Intersection, also known as triangulation, is the procedure of calculating the coordinates of a 3D point from corresponding points in several images. For example, control points are manually identified in the images and their 3D positions are estimated by intersection, while their reference coordinates are independently measured by GNSS and expressed in the international reference frame.
• Resection
Resection involves recovering the camera matrix from known 3D-2D correspondences, that is, estimating the parameters of the camera that produced the image.
• Relative Orientation
This is the task of obtaining the relative attitude and position of two cameras from corresponding points in the images. The position and orientation of one camera relative to the other must be determined before corresponding points taken by the two cameras can be used to recover object distances in a scene. It is a classic photogrammetric problem that underlies the interpretation of binocular stereo information.
• Absolute Orientation
This requires computing the rigid transformation that brings two models sharing some tie-points into one reference frame. It seeks the best transformation parameters relating the stereo model coordinates to geodetic coordinates. In other words, it is the levelling, attitude correction and scaling of the model to fit the ground control.
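To make the intersection step concrete, the sketch below triangulates a 3D point from two viewing rays with the midpoint method: it finds the closest points on the two rays and returns their midpoint. The source does not state which triangulation method Samantha uses, so the midpoint method is a simple stand-in chosen for illustration, and the camera centres and ray directions are made-up values.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment joining two viewing rays.

    c1, c2: camera centres; d1, d2: ray directions (need not be unit length).
    """
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no stable intersection")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))

# Two cameras one unit apart, both looking at the point (0.5, 0, 2).
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
print(point)
```

With noisy measurements the two rays do not meet exactly, which is why the midpoint of the shortest connecting segment is used rather than an exact intersection.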
The algorithm may be summarized as follows:
I. At the leaves, solve several independent relative orientation problems, producing many independent stereo models.
II. Traverse the tree. At each node, either add one image to a model by resection followed by intersection, or merge two independent models with absolute orientation. After the models have been registered, the tie-points are updated by intersection. The new model is then refined with bundle block adjustment.
If the tree reduces to a chain, the algorithm performs steps one and two in sequence. If the tree is perfectly balanced, only the merging part of the second step is taken. This framework has lower computational complexity than the standard sequential method. It copes better with drift and is independent of the choice of the initial pair of images, which is a typical weakness of the sequential framework. The auto-calibration approach of Samantha can calculate the inner parameters of the cameras without any control point. The method is based on the enumeration of the inherently bounded space of the inner parameters of the cameras in order to find the collineation of space that upgrades a given projective reconstruction to Euclidean. Every sample of the search space defines a consistent plane at infinity. This, in turn, gives a tentative, nearly Euclidean upgrade of the entire reconstruction, which is then scored according to the expected intrinsic parameters of the camera.
Model Geo-referencing
Images taken by cameras mounted on a UAV are normally provided with GNSS or INS information. This information may be used as initial values for estimating the unknown orientation parameters, but its availability is not always guaranteed. Samantha does not require ancillary information such as calibration parameters or GNSS/INS measurements (Luhmann, Piechel, and Roelfs, 2013, p. 27). The 3D point coordinates and the resulting camera orientations are defined up to scale and expressed in a local reference frame. To remove the scale ambiguity and transfer the coordinates into a global system, a similarity transformation has to be computed. This can be done as soon as three Ground Control Points are available. Their coordinates are independently measured by GNSS and expressed in the international reference frame. The control points are manually identified in the images, and their 3D positions are estimated by intersection. The correspondences between the estimated and true 3D coordinates are then used to transform the model with the similarity that aligns the control points in the least-squares sense. GCPs may also be exploited as constraints to optimize the 3D reconstruction.
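The least-squares similarity transformation mentioned above can be sketched with NumPy. The closed-form solution used below is the SVD-based method commonly attributed to Umeyama; the source does not say which solver the software uses, so this is an illustrative choice. The three control points and the "true" transformation are invented for the example.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares s, R, t such that dst ~ s * R @ src + t (row-wise points)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)            # cross-covariance of the two sets
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                    # avoid returning a reflection
    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)    # variance of the source points
    s = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

# Hypothetical model-frame control points and an invented ground truth:
# scale 2, rotation of 90 degrees about z, translation (10, 20, 30).
model_pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([10.0, 20.0, 30.0])
world_pts = 2.0 * model_pts @ R_true.T + t_true

s, R, t = fit_similarity(model_pts, world_pts)
georeferenced = s * model_pts @ R.T + t
print(s, t)
```

Three non-collinear points are the minimum for a unique solution; with more control points the same function returns the least-squares fit and the remaining misfit gives the residuals.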
Georeferencing images calls for establishing control points first and then inputting their known coordinates. The coordinate system and projection parameters are selected, and the residuals are then minimized. Residuals are the differences between the coordinates predicted by the geographic model and the real coordinates of the control points, and they provide a way of determining the accuracy of the georeferencing process. Images produced at different points in time may contain essential information, so it may be necessary to compare older data with the data currently available. The comparison can then be used to analyse changes in features over a period. Different maps may use different projection systems, but georeferencing tools normally include methods for combining and overlaying maps with minimum distortion. With this method, data retrieved from surveying tools can be given a reference from topographic maps that are already available.
3DF STASIA (Multi-view Stereo)
The aim is to recover the dense point cloud that represents the imaged surface, given the position and attitude of the images. In 3DF Zephyr, this procedure is carried out by Stasia. The first step is the extraction of depth hypotheses, which estimates a number of candidate depths for every pixel of every image. The hypotheses are later used as labels in a Markov Random Field that retrieves the final depth. As in many multi-view stereo algorithms, pixel-level matching along epipolar lines is used, with Normalized Cross Correlation (NCC) as the matching metric, which gives a good trade-off between speed and robustness to photometric nuisances. Every depth map is created independently. The extraction considers the reference image and three neighbouring views, selected on the basis of the sparse structure and the visibility information given by Samantha. The candidate depths for every pixel are then searched along the optical ray or, equivalently, along the epipolar line, using block matching and NCC. In this way, a correlation profile parameterized by depth is calculated for every pixel and every neighbouring image. The search range of each pixel's depth may heavily affect the performance of the algorithm, so information from structure-and-motion is used to restrict it.
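The role of NCC in building a correlation profile can be illustrated with a small one-dimensional Python sketch. It assumes rectified images, so that the epipolar line is a scanline and each candidate depth corresponds to a candidate disparity; this simplification, the function names and the toy intensity rows are assumptions of the example and not part of Stasia.

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross correlation of two equally sized patches (flat lists)."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation undefined, treated as no match
    return sum(x * y for x, y in zip(da, db)) / denom

def correlation_profile(ref_row, nbr_row, x, half, candidates):
    """NCC score of the patch centred at x for each candidate disparity."""
    ref_patch = ref_row[x - half: x + half + 1]
    profile = []
    for d in candidates:
        start = x - d - half
        nbr_patch = nbr_row[start: start + 2 * half + 1]
        profile.append((d, ncc(ref_patch, nbr_patch)))
    return profile

# Toy scanlines: the bright feature is shifted by two pixels between views.
ref_row = [10, 10, 10, 50, 90, 50, 10, 10, 10, 10]
nbr_row = [10, 50, 90, 50, 10, 10, 10, 10, 10, 10]
profile = correlation_profile(ref_row, nbr_row, x=4, half=1, candidates=[0, 1, 2])
best = max(profile, key=lambda item: item[1])
print(best)
```

Because NCC subtracts the mean and divides by the patch energy, it is unaffected by gain and offset changes between images, which is the robustness to photometric nuisances referred to above. Restricting `candidates` to a narrow range corresponds to limiting the depth search with structure-and-motion information.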
Discrete MRF optimization over the image grid is used to generate the “final depth map from the depth hypothesis.” The depth maps are then lifted into 3D space to give a photoconsistency volume, which is represented by an octree that accumulates the scores originating from every depth map. To minimize the loss of precision, “a moving average approach” is used inside every bin. At the end of this lifting process, every cell contains a 3D position, which may be shifted relative to the cell centre, and a photoconsistency value given by the sum of the correlation scores of the points falling in that bin. At this stage, the photoconsistency volume contains many spurious points that do not belong to an actual surface. They share certain features: they normally occlude actual surface points, and their photoconsistency is lower than that of real surface points. This observation prompts an iterative strategy in which the photoconsistency of an occluder is reduced by a proportion of that of the occluded point, and points that end up with a negative value are removed. In computer vision, the whole process is commonly known as Multi-view Stereo. 3DF Stasia works together with Samantha to make the reconstruction as accurate as possible, and it is designed to exploit every pixel of the input images to produce the dense cloud.
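The accumulation of lifted depth-map points into bins can be sketched as follows. For brevity the Python example replaces the tree structure with a flat dictionary of cubic cells, which is a simplification assumed for illustration; each cell keeps a running (moving) average of the positions falling inside it and the sum of their correlation scores.

```python
def accumulate(points, cell_size):
    """Accumulate scored 3D points into cubic cells.

    points: iterable of ((x, y, z), score).
    Returns {cell_index: (count, mean_position, total_score)}.
    """
    cells = {}
    for (x, y, z), score in points:
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        count, mean, total = cells.get(key, (0, (0.0, 0.0, 0.0), 0.0))
        count += 1
        # Incremental mean: avoids snapping the point to the cell centre.
        mean = tuple(m + (c - m) / count for m, c in zip(mean, (x, y, z)))
        cells[key] = (count, mean, total + score)
    return cells

# Hypothetical lifted points with their NCC scores.
points = [((0.1, 0.1, 0.1), 0.9), ((0.3, 0.1, 0.1), 0.7), ((1.5, 0.2, 0.2), 0.4)]
cells = accumulate(points, cell_size=1.0)
print(cells[(0, 0, 0)])
```

Storing the averaged position rather than the cell centre is what limits the loss of precision caused by discretization.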
3DF SASHA (Mesh Generation and Texture Mapping)
3DF Sasha is the proprietary algorithm for mesh extraction. It starts from the candidate depths for every pixel, which were matched along epipolar lines with Normalised Cross Correlation (NCC) as the matching metric. For texturing, the colour value from every image that corresponds to a specific orthophoto pixel is interpolated through convolution because of its smoothing effect. If the images differ significantly in scale, it is reasonable to fix the “interpolation window size” in object space and project it into the images. It is important to acknowledge that pixel adjacency does not necessarily mean adjacency in space. The orthophoto pixel corresponding to a model point that is not visible in any image is left blank. This situation can be detected in advance, since the algorithm creates a map that shows how many images view every orthophoto pixel. During texture mapping, such gaps may be treated with “hole-filling.” The final texture of every orthophoto pixel is the weighted average of the corresponding colour values from the images that are entitled to contribute. The weights depend on the viewing (intersection) angle and on the resolution in space, which determine the “size of 2D image” triangle.
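The weighted averaging of colour values can be written as a short Python sketch. How the weights are derived is not specified in detail in the text; here each weight is assumed to be already computed from the viewing angle and resolution, and the function name `blend_pixel` is hypothetical.

```python
def blend_pixel(samples):
    """Weighted average of the colour samples for one orthophoto pixel.

    samples: list of (colour_tuple, weight). Returns None when no image
    contributes, which marks a gap to be treated by hole-filling.
    """
    total_weight = sum(weight for _, weight in samples)
    if total_weight == 0:
        return None
    channels = len(samples[0][0])
    return tuple(
        sum(colour[k] * weight for colour, weight in samples) / total_weight
        for k in range(channels)
    )

# Single-channel TIR values seen by two images with assumed weights 3 and 1.
value = blend_pixel([((100.0,), 3.0), ((200.0,), 1.0)])
gap = blend_pixel([])
print(value, gap)
```

An image that views the surface more frontally or at higher resolution would be given the larger weight and therefore dominates the blended value.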
The preceding procedure recovers the dense point cloud: the final depth map is obtained from the depth hypotheses using a discrete Markov Random Field (MRF) over the image grid, and the depth maps are then lifted into 3D space to give a photoconsistency volume. When that process ends, the third module, Sasha, generates a surface with the Poisson algorithm, starting from the normal calculated at each point (Kazhdan, Bolitho & Hoppe, 2006, p.61). The normal is calculated by fitting a plane to the nearest neighbours of the point, and its direction is then disambiguated using visibility. Since the dense point cloud carries a great deal of detail, it is important to preserve as much of it as possible when extracting the surface.
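Normal estimation by plane fitting, followed by disambiguation with visibility, can be sketched with NumPy. The example fits the plane through an eigen-decomposition of the neighbours' covariance matrix and flips the normal so that it faces a camera known to see the point. The function name and the sample points are assumptions made for illustration, and the neighbour search itself is omitted.

```python
import numpy as np

def estimate_normal(neighbours, camera_centre):
    """Unit normal of the best-fit plane, oriented towards the camera."""
    pts = np.asarray(neighbours, dtype=float)
    centroid = pts.mean(axis=0)
    cov = np.cov((pts - centroid).T)
    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # smallest eigenvalue is the direction of least variance, i.e. the normal.
    _, eigvecs = np.linalg.eigh(cov)
    normal = eigvecs[:, 0]
    to_camera = np.asarray(camera_centre, dtype=float) - centroid
    if np.dot(to_camera, normal) < 0:
        normal = -normal  # visibility: the normal must face the viewing camera
    return normal

# Neighbours lying on the plane z = 0, observed by a camera above them.
patch = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
normal_up = estimate_normal(patch, camera_centre=(0.5, 0.5, 5.0))
normal_down = estimate_normal(patch, camera_centre=(0.5, 0.5, -5.0))
print(normal_up, normal_down)
```

Plane fitting alone determines the normal only up to sign, which is why the visibility information is needed before the oriented points are passed to Poisson reconstruction.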

Figure 7: Some of the sample images from the TIR dataset.
Optimization
The surface may be optimized further inside the software with an optimization algorithm based on photoconsistency. The initial surface construction approach is interpolatory, and the point cloud may contain a considerable amount of noise. The initial mesh is therefore often noisy and can fail to capture finer details. The mesh is refined with a variational multi-view stereo method that uses all the image information: the initial mesh serves as the initial condition for a gradient descent on a suitable energy functional. A TIR texture map is finally built by making use of “the view angle” and visibility information. 3DF Sasha enables the retrieval of sharp edges on the 3D model, so it is suitable for applications such as industrial survey, architecture and urban monitoring.
The optimization is dedicated to refining the meshes produced by 3DF Zephyr. It greatly increases the level of detail of the reconstructed surface without degrading its precision or altering the shape of the original scene. Although it needs some computation time, it is worth using. The number of neighbour cameras used for photoconsistency can be chosen, as can the image resolution. Better results are normally obtained when the image resolution used for optimization is greater than the resolution employed in the stereo step. The photoconsistency mesh optimization can also be run as a mesh filter from the tools menu. Its main parameters are described below.
Target projection area: This controls the size that each triangle will reach during the photoconsistency process. For a given triangle, the final reprojection area on the neighbour camera will tend towards the stated value. Lowering the value produces a denser mesh, and the resulting mesh will have a variable level of detail, with a higher vertex density in the areas viewed by neighbour cameras. The default value normally works well. When the mesh is already very good, such as one that has already been through a photoconsistency step, the parameter may be reduced to extract more detail. With low-quality images and noisy matches, the parameter may be increased.
Number of neighbour cameras: For every camera, a given number of neighbouring cameras must be selected to form the pairs that the photoconsistency algorithm will use. Increasing this parameter increases both the computation time and the final accuracy. The parameter should be decreased only in special cases, such as preliminary tests or when high accuracy is not needed.
Use symmetric pairs: When this option is enabled, the algorithm analyzes every pair of cameras symmetrically during every iteration. The final level of detail is usually very similar whether the option is on or off. The option can therefore be left unchecked, which halves the computation time. However, it can be used for very small datasets.
Maximum iterations: The default value, say 40, can be left unchanged since the algorithm will normally converge to an optimal solution. The value may be decreased if the starting mesh already has good detail, for instance when the input mesh is the output of an earlier photoconsistency step.
Hierarchical subdivision: When this value is greater than zero, the photoconsistency algorithm is applied several times in sequence, automatically adjusting the number of iterations and the image resolution. The same results may therefore be attained by running the algorithm several times by hand with appropriate settings.
Convergence relative tolerance: This determines when the minimization algorithm should halt because the solution has converged, that is, when further iterations would give similar results. Increasing the value too much may halt the algorithm too early and produce less detail. Conversely, reducing the value too much risks too many iterations, up to the maximum iteration value. Reducing the value will not worsen the result, but it will increase the computation time.
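How the maximum iterations and the convergence relative tolerance interact can be shown with a small sketch. This is a generic gradient descent on a one-dimensional energy, not the photoconsistency functional used by 3DF Zephyr; the function `minimise`, its `step` parameter and the example energy are assumptions made for illustration.

```python
def minimise(energy, gradient, x0, step=0.1, max_iterations=40, rel_tol=1e-3):
    """Gradient descent that halts on a relative-tolerance test or an iteration cap.

    Returns the final estimate and the number of iterations actually run.
    """
    x = x0
    previous = energy(x)
    for iteration in range(1, max_iterations + 1):
        x = x - step * gradient(x)
        current = energy(x)
        # Stop when the energy changes by less than rel_tol relative to its size.
        if abs(previous - current) <= rel_tol * max(abs(previous), 1e-12):
            return x, iteration
        previous = current
    # The tolerance was never met: the iteration cap ends the run.
    return x, max_iterations


# Illustrative energy with its minimum at x = 3.
def example_energy(x):
    return (x - 3.0) ** 2 + 1.0

def example_gradient(x):
    return 2.0 * (x - 3.0)
```

A looser tolerance stops the loop after fewer iterations, while a very tight tolerance lets it run until the iteration cap is reached, which mirrors the trade-off described for the two parameters.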
Thermal Imaging Camera
A thermal imaging camera is a device that forms images from infrared radiation, just as a common camera forms images from visible light. Thermal cameras normally consist of several components, such as the detector, optical system, amplifier, signal processing and display. Because the thermal camera senses electromagnetic radiation that the human eye cannot detect, it builds and records a visible picture of that radiation. The thermal imaging camera employs various mathematical algorithms to do so. To evaluate the applicability of the suggested method, we acquire a dataset of 130 TIR images using an Optris PI450 LW infrared camera with a wavelength range of 8 to 14 μm. The detector should have 80 (h) × 60 (v) active pixels and a thermal sensitivity of <50 mK. The camera needs to be mounted on an octorotor drone with a 4 kg payload, 20 minutes of flight time and pitch and roll axis stabilization. The flight should be planned so that the images have a side and forward overlap of 80 percent and a ground sampling distance of 94 mm on average. The subject is a large building that houses a heating plant and an air conditioner. Many solar panels are installed on the roof and, because of the size of the building, each image reproduces only a limited portion of it. For this reason it is recommended to create an orthophoto and a 3D model to get a comprehensive view of the building and to ease the interpretation and evaluation of the thermal data.
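The flight-planning figures above imply a ground footprint per image and a spacing between exposures. The following sketch works this out from the quoted pixel count, ground sampling distance and overlap, taking those figures at face value. The function name `flight_spacing` is hypothetical, and treating the short image side as the along-track direction is an assumption not stated in the text.

```python
def flight_spacing(pixels_h, pixels_v, gsd_m, forward_overlap, side_overlap):
    """Ground footprint of one image and the spacing implied by the overlaps.

    Assumes the vertical (short) image side points along the flight direction.
    All lengths are in metres; overlaps are fractions between 0 and 1.
    """
    footprint_h = pixels_h * gsd_m          # across-track footprint
    footprint_v = pixels_v * gsd_m          # along-track footprint
    # Each new image advances by the part of the footprint that is not overlapped.
    exposure_spacing = footprint_v * (1.0 - forward_overlap)
    strip_spacing = footprint_h * (1.0 - side_overlap)
    return footprint_h, footprint_v, exposure_spacing, strip_spacing


footprint_h, footprint_v, exposure_spacing, strip_spacing = flight_spacing(
    pixels_h=80, pixels_v=60, gsd_m=0.094, forward_overlap=0.8, side_overlap=0.8
)
```

With these inputs each image covers roughly 7.5 m × 5.6 m on the ground, so consecutive exposures are about 1.1 m apart and adjacent strips about 1.5 m apart. This shows why a single image reproduces only a small portion of a large building.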
When using the thermal imaging camera, safety should be a major concern, hence the need to use drones with standard features such as automatic return-to-home and a 5-rotor fail-safe mode, which allows the drone to land safely if one of its rotors fails in mid-flight. The battery duration needs to be reliable. The thermal imaging camera display shows differences in infrared input, so two objects at the same temperature will normally appear in the same colour. However, most thermal imaging cameras use grayscale to represent objects at normal temperatures and highlight very hot surfaces in different colours. Apart from building inspections, thermal imaging cameras are widely used by firefighters to see areas of heat through darkness, smoke and heat-permeable barriers.
Following the procedure described above, the structure-and-motion algorithm is first run to obtain the altitude and position of each image. After bundle adjustment, the RMSE is 0.31 pixel and the estimated variance is 0.35 squared pixel. The model is then georeferenced using the coordinates of three GCPs, which were measured by highly accurate differential GPS positioning in a previous survey. This step may be quite tricky because the GCPs on the edges of the building can be difficult to locate in the TIR images. The dense point cloud is subsequently created with the multi-view stereo algorithm (Oniga, Chirilă, and Stătescu, 2017, p. 551). Apart from the sidewalk and certain portions of the building facades, the point cloud has a high density and most details are reconstructed, including the solar panels. The textured mesh is obtained, and a high-quality thermal orthophoto is created. The whole process requires no more than thirty minutes on a computer with 8 GB of RAM and an Intel Core i5-4200M CPU @ 2.50GHz. Among the colour values from the various images to be merged for texturing, outlying values may be included. They may arise from orientation or view-dependent features or from modelling errors, particularly at occlusion borders. Such values are to be discarded before assigning the texture to orthophoto pixels. The effects of lens distortion, however, cannot be treated as blunders: if they are not corrected directly, they may ruin the outcome.
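The text does not say how the outlying values are detected before texturing. One common choice, shown here purely as an assumption, is to compare each sample with the median of all samples for a pixel and reject those that deviate by more than a few scaled median absolute deviations. The function name `robust_texture_value` and the threshold of 3 are illustrative.

```python
from statistics import median

def robust_texture_value(samples, threshold=3.0):
    """Blend the values seen for one orthophoto pixel after discarding outliers.

    `samples` holds the value of the same surface point in several images.
    Outliers are found with a median / MAD test (an assumed rule, not the
    one documented for the software).
    """
    med = median(samples)
    mad = median(abs(s - med) for s in samples)
    if mad == 0:
        # At least half of the samples agree exactly: use that value.
        return med
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    inliers = [s for s in samples if abs(s - med) <= threshold * scale]
    return sum(inliers) / len(inliers)
```

For example, four views reporting values near 20 and one view reporting 35, as might happen at an occlusion border, yield a blended value close to 20 because the fifth sample is rejected.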
Results
The results show how multi-view stereo and structure-and-motion algorithms may be applied successfully to create a highly detailed dense point cloud, orthophoto and 3D model directly from uncalibrated and unordered TIR images, without requiring ancillary information or additional RGB images. This makes TIR images immediately usable, which shortens the processing time considerably. However, because of the low contrast and low resolution of TIR images, the 3D point clouds generated directly from them have a lower accuracy than those retrieved from RGB images. Thus, when RGB images of the same area are already available, it is wise to use them. To obtain detailed 3D models that also retain the thermal information, we propose a procedure that combines RGB and TIR images.
Figure 8: Integrating RGB and TIR images. Image 1) is the 3D model from RGB and 2) is the texture from TIR images.
The technique may be carried out automatically by 3DF Zephyr on a set of RGB and TIR images (Li, Guo, and Zhang, 2017, p.152). This means that an RGB image does not need a corresponding thermal infrared (TIR) image taken from the same position and with the same orientation.

Figure 9: The simplified procedure of creating 3D building models from RGB and TIR models
Point Clouds Generation
The first step of this procedure resembles the one described in the section above. One first establishes control points, inputs their known geographic coordinates, selects the coordinate system and projection parameters, and minimizes the residuals. The residuals are the differences between the coordinates predicted by the geographic model and the real coordinates of the control points. The RGB and TIR images are oriented separately through structure-and-motion algorithms, without any prior knowledge of their altitude and position. The resulting models are then scaled and georeferenced using at least three GCPs. At this step, the model derived from TIR images needs only georeferencing, because it will subsequently be registered to the RGB point cloud. The multi-view stereo algorithm then generates the RGB and TIR point clouds independently.
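Scaling and georeferencing a model from GCPs amounts to estimating a similarity transformation (scale, rotation and translation) and checking the residuals at the control points. The sketch below uses a standard closed-form least-squares solution based on singular value decomposition (Umeyama's method), which is named here as an illustrative choice and is not confirmed to be what the software applies. The function names `fit_similarity` and `gcp_residuals` are hypothetical.

```python
import numpy as np

def fit_similarity(model_pts, world_pts):
    """Least-squares s, R, t such that world ≈ s * R @ model + t.

    Needs at least three non-collinear point pairs (the GCPs).
    """
    X = np.asarray(model_pts, dtype=float)
    Y = np.asarray(world_pts, dtype=float)
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    # Cross-covariance between the centred point sets and its SVD.
    U, S, Vt = np.linalg.svd(Yc.T @ Xc / len(X))
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0  # keep a proper rotation, not a reflection
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / Xc.var(axis=0).sum()
    t = my - s * R @ mx
    return s, R, t

def gcp_residuals(model_pts, world_pts, s, R, t):
    """Distance between predicted and surveyed coordinates for each GCP."""
    predicted = s * (np.asarray(model_pts, dtype=float) @ R.T) + t
    return np.linalg.norm(predicted - np.asarray(world_pts, dtype=float), axis=1)
```

The mean of the returned residuals corresponds to the kind of figure reported after a least-squares alignment on the control points.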
As a result of georeferencing, the TIR and RGB point clouds lie close together in 3D space and have almost the same scale. The TIR point cloud may therefore be registered to the one retrieved from RGB images using the Iterative Closest Point (ICP) algorithm. Given an estimate of the transformation, the algorithm computes the correspondences between the two point sets, updates the transformation based on those correspondences, and iterates until convergence is reached. For the final model extraction, the orientation parameters of the TIR images are transformed automatically into the reference system by aligning the TIR point cloud with the RGB one. This makes it possible to relate the thermal information from the TIR dataset to the 3D point cloud computed from RGB images. To maintain the geometric accuracy and high level of detail of the images acquired in the visible spectrum, the final surface is retrieved from the RGB cloud through the Poisson algorithm (Kazhdan, Bolitho & Hoppe, 2006, p. 61). The texture coordinates are obtained by projecting the vertices of the RGB mesh into the thermal infrared images. To evaluate the proposed procedure for integrating TIR and RGB images, an existing dataset of twenty-seven RGB images taken over the same area can be used.
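The registration loop described above (find correspondences, update the transformation, repeat until convergence) can be sketched as a basic point-to-point ICP. This is a simplified illustration under stated assumptions: it estimates a rigid transformation only, since the clouds are already at almost the same scale, and it uses a brute-force nearest-neighbour search that is practical only for small clouds. The function names are hypothetical and the code is not the implementation used by the software.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ≈ src @ R.T + t."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((dst - cd).T @ (src - cs))
    # Force a proper rotation (determinant +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    return R, cd - R @ cs

def icp(source, target, max_iterations=50, tol=1e-8):
    """Register `source` (e.g. the TIR cloud) to `target` (e.g. the RGB cloud)."""
    src = np.asarray(source, dtype=float).copy()
    tgt = np.asarray(target, dtype=float)
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_error = np.inf
    for _ in range(max_iterations):
        # 1. Correspondences: nearest target point for every source point.
        d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        error = np.sqrt(d2[np.arange(len(src)), nearest]).mean()
        # 2. Update the transformation from the current correspondences.
        R, t = best_rigid_transform(src, tgt[nearest])
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        # 3. Stop once the mean distance no longer changes.
        if abs(prev_error - error) < tol:
            break
        prev_error = error
    return R_total, t_total, src
```

The accumulated `R_total` and `t_total` are what would be applied to the TIR image orientations so that they share the reference system of the RGB cloud. Because ICP only converges from a good starting estimate, the prior georeferencing of both clouds is what makes this step reliable.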
After the block adjustment with Samantha, the estimated reference variance is 0.52 squared pixels. Georeferencing the model is easier in this case than for the TIR dataset, since the GCPs on the edges of the building, on the edges of nearby utility holes or on road signs are easily recognizable in RGB images. A least-squares alignment is then performed on 12 GCPs, with a mean residual of 0.1 m. The dense point cloud is then generated from the RGB images and used as the reference for aligning the previously extracted TIR point cloud by calculating a similarity transformation. The TIR orientation is updated accordingly. After registration through ICP, the average distance between corresponding 3D points of the two point clouds is 0.15 m, with a standard deviation of 0.07 m. The RGB and TIR clouds can be compared to evaluate the results further in qualitative terms. In this project, image coordinate errors were within a tolerance of <5 pixels, which indicates a high level of accuracy.
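The mean distance and standard deviation quoted above are cloud-to-cloud statistics. A minimal way to compute such figures, assuming that each point is paired with its nearest neighbour in the other cloud and that the population standard deviation is wanted, is sketched below. The function name `cloud_to_cloud_stats` is hypothetical, and the brute-force search suits only small clouds.

```python
import math
from statistics import mean, pstdev

def cloud_to_cloud_stats(cloud_a, cloud_b):
    """Mean and standard deviation of nearest-neighbour distances from A to B.

    Both clouds are sequences of (x, y, z) tuples in the same reference system.
    """
    distances = [min(math.dist(p, q) for q in cloud_b) for p in cloud_a]
    return mean(distances), pstdev(distances)
```

Calling `cloud_to_cloud_stats(tir_points, rgb_points)` on the registered clouds, where both variable names are placeholders, would give the two summary figures in the units of the coordinates.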

Figure 10: Results of comparison between RGB and TIR clouds.
Considering the distribution of the residual absolute distances between points in the figure above, there is no appreciable deformation of the TIR block. The process described above enabled the creation of an accurate surface from the RGB point cloud, which is subsequently textured with TIR images as shown in the above figure.
Evaluation
The results show that the computer vision software 3DF Zephyr can be a solution for monitoring large structures through 3D reconstruction. It was developed for processing RGB images but can also be used for 3D reconstruction from TIR data. The multi-view stereo and structure-and-motion algorithms may be applied successfully to produce highly detailed 3D models, orthophotos and dense point clouds from TIR images without requiring additional images from the visible spectrum (Farenzena, Fusiello and Gherardi, 2009, p. 13). This makes TIR images immediately usable, which reduces the processing time considerably. When RGB images are also available, on the other hand, the procedure proposed in the previous section automatically integrates the RGB and thermal infrared images. When the models from the RGB and TIR datasets are compared, the RGB one shows more detail thanks to the higher resolution of RGB images. It also benefits from building edges, which are less perceptible in the TIR than in the visible spectrum.
Figure 11: 1) is the RGB model details while 2) is the TIR model details (the edges of the building in the TIR model are less visible).
The edges appear sharper and better reconstructed in the RGB model than in the TIR model. The sizes and distribution of the polygons in the RGB mesh tend to be more regular than in the TIR mesh. Integrating TIR and RGB therefore guarantees high geometric accuracy while maintaining the thermal information.
It is vital to note what distinguishes this procedure. It allows the adjustment of the image block even with no GNSS/INS information, which at times may not be available. Sometimes the GNSS/INS data have low accuracy, hence the value of this procedure (Li, Guo, and Zhang, 2017, p.152). Scaling and georeferencing may be done subsequently, using three GCPs with known coordinates that have been identified in the images. However, the use of GCPs may be avoided when it is unnecessary to express the coordinates of the model in a given international reference frame. In that case, only a control distance is required to scale the reconstructed model (Lindeberg, 1998, p. 79). In the above procedure, the geometric calibration of the images is not required, thanks to the auto-calibration algorithm implemented in Samantha. It also works with the thermal infrared dataset, as shown by the experimental evaluation.
Provided there is a precise image calibration and an accurate 3D mesh, the presented algorithms identify the visibility of object and image so that textured projections can be synthesized automatically. The pixel colouring then follows from the overall contributions of the viewing images. Some challenges may arise for inspectors or constructors using 3D engineering models in construction inspection tasks, such as unfamiliarity with the tools and field survey methods and a lack of software and equipment to exploit the 3D models in the field or the office. Inspectors and resident engineers need practical means of overcoming these challenges to realize the goals of safer, faster, more accurate and more transparently accepted construction. Using the 3D model successfully calls for controlling possible sources of measurement error by using the same survey setup, 3D data, control network and instrument type in the work inspection as were used during construction. It also calls for close cooperation with the contractor to streamline the process and minimize measurement errors. As for the performance in computing the coordinates of the object, the results are surprisingly positive in terms of processing time, process automation, point cloud density and accuracy. Although a traditional survey may not be replaced, drones will augment it in a manner that gives significantly better results for almost the same cost and effort.
Various elements need to be considered when controlling the potential sources of measurement error. For instance, there is a need for a survey plan, including a survey method for managing grid scale factor distortions and a source of real-time kinematic corrections for global navigation satellite system positioning. There is also a need for a control network and a maintenance plan for passive site control. Unique identification and management of the 3D model used for inspection, quantity calculations and construction are needed. The furnishing by the constructor of software, hardware and training for inspectors, where applicable, is also necessary. Calibration protocols and daily validation of the field technologies need to be considered. Finally, notification and timing requirements also need to be in place to measure and verify the completed work.
Conclusion
The model is a good tool during the various stages of the life of a building, such as planning, construction and maintenance. Having access to all the information about a building makes the model advantageous and reliable for maintenance and monitoring. Maintaining a building can prove to be a complex task, since identifying and analyzing the various kinds of construction failure calls for considerable effort from those responsible for monitoring and maintaining the facility. Inspectors should therefore be challenged to keep pace with modern tools that use 3D data to help improve safety, accelerate project delivery and minimize the number of insurance claims and change orders.
The advancement of technology is creating new, safer, faster and more accurate methods for building construction. The benefits of digital, repeatable and data-driven measurement make it critical as far as construction practices are concerned. The synergistic benefits of transparent, fast, digital records of construction progress and of repeatable measurements make it ripe for construction partnering. The engineer and the constructor may partner to produce a 3D model that accurately reflects the ground, which helps identify and resolve accountability issues. It may give the constructor and the engineer the mutual benefit of avoiding the burden of data exchange by providing them with technology for manipulating 3D data. The technology avoids destructive interventions aimed at identifying the sources of the problem. High-quality 3D reconstructions of buildings can be retrieved quickly and automatically from TIR images using commercial photogrammetry and 3D modelling software, which makes it a well-suited tool for building inspection and other applications. However, some challenges may arise for inspectors or constructors using 3D engineering models in construction inspection tasks. They include unfamiliarity with the tools and field survey methods and a lack of software and equipment to exploit the 3D models in the field or the office. Others include a lack of confidence or familiarity in manipulating the 3D models and unfamiliarity with reliable datasets of 3D models that accurately reflect the field conditions or the design intent. There are tools that can be employed to handle sources of texturing error. Despite those tools, the accuracy of the initial input data is of prime importance. This refers to both the coherence of the reconstructed configurations and the accuracy of the modelling. In other words, the image blocks need to be processed in a single self-calibrating adjustment. More than the approximated accuracy is normally valuable in this context.
To conclude, the procedure for 3D model reconstruction proposed in this paper can provide helpful support in making the interpretation and evaluation of thermal data more automatic, faster and more objective. One can imagine 3D models being available for every building, so that architects can use them to analyze the heat distribution of a building and identify the necessary modifications. The procedure need not be applied only in building and construction. It can also be applied in many other fields, including search and rescue, law enforcement, building inspection and power line maintenance.
El-Hakim, S. F., Beraldin, J.-A., Picard, M. and Vettore, A., 2003. Effective 3D modeling of heritage sites. In: 4th International Conference on 3D Imaging and Modeling, Banff, pp. 302-309.
Energy Information Administration, 2017. How much energy is consumed in U.S. residential and commercial buildings? [online] Available at: [Accessed 24 Mar. 2018].
Kazhdan, M., Bolitho, M. and Hoppe, H., 2006. Poisson surface reconstruction. In: Eurographics Symposium on Geometry Processing, pp. 61–70.
Li, M., Guo, B. and Zhang, W., 2017. An Occlusion Detection Algorithm for 3D Texture Reconstruction of Multi-View Images. International Journal of Machine Learning and Computing, 7(5), pp. 152-155.
Lindeberg, T., 1998. Feature detection with automatic scale selection. International Journal of Computer Vision, 30, pp. 79–116.
Luhmann, T., Piechel, J. and Roelfs, T., 2013. Geometric calibration of thermographic cameras. In: Thermal Infrared Remote Sensing, Springer, pp. 27–42.
Farenzena, M., Fusiello, A. and Gherardi, R., 2009. Structure-and-motion pipeline on a hierarchical cluster tree. In: Proceedings of the IEEE International Workshop on 3-D Digital Imaging and Modeling, Kyoto, October 2009, p. 13.
Oniga, E., Chirilă, C. and Stătescu, F., 2017. Accuracy Assessment of 3D Model Reconstructed from Images Taken with a Low-Cost UAS. ISPRS – International Archives of Photogrammetry, XLII-2/W3, pp. 551-558.
Toldo, R., 2013. Towards automatic acquisition of high-level 3D models from images. Ph.D. Thesis, University of Verona, pp. 2-15.
Sampaio, A. and Simoes, D., 2014. Building Information Modelling Concept Applied in Maintenance of Buildings. [online] pp. 20-28. Available at: [Accessed 28 Mar. 2018].
Toldo, R., Fantini, F., Giona, L., Fantoni, S. and Fusiello, A., 2013. Accurate Multiview Stereo Reconstruction with Fast Visibility Integration and Tight Disparity Bounding. ISPRS – International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, XL-5/W1, pp. 243-249.
Zachariadis, T., Michopoulos, A., Vougiouklakis, Y., Piripitsi, K., Ellinopoulos, C. and Struss, B., 2018. Determination of Cost-Effective Energy Efficiency Measures in Buildings with the Aid of Multiple Indices. Energies, 11(1), p. 191.
Carlini, M., Allegrini, E., Zilli, D. and Castellucci, S., 2014. Simulating Heat Transfers through the Building Envelope: a Useful Tool in the Economical Assessment, p. 395.
Ariwoola, R. T., 2016. Use of Drone and Infrared Camera for a Campus Building Envelope Study. Electronic Theses and Dissertations, Paper 3018.
