A LAScatalog object is a representation in R of a las file or a
collection of las files not loaded in memory. A regular computer cannot
load an entire point cloud in R if it covers a broad area, and for very
high density datasets even loading a single file can fail (see also the
“LAS formal class” vignette). In lidR, we use a LAScatalog to process
datasets that cannot fit in memory.
A LAScatalog contains an sf object with POLYGON geometries plus some
extra slots that store information about how the LAScatalog will be
processed.
The slot data of a LAScatalog object contains the sf object with the
most important information read from the header of the .las or .laz
files. Reading only the header provides an overview of the content of
the files very quickly, without actually loading the point cloud. The
columns of the table are named after the LAS specification version 1.4.
The other slots are documented in help("LAScatalog-class"), so they are
not described in this vignette.
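As a minimal sketch, assuming the sample file `Megaplot.laz` bundled with lidR can stand in for a real collection, the following builds a LAScatalog and looks at the header table held in its data slot. Only the file header is read here.

```r
library(lidR)

# Assumption: the sample file shipped with lidR stands in for a folder of
# las/laz files; readLAScatalog() also accepts a directory path.
LASfile <- system.file("extdata", "Megaplot.laz", package = "lidR")
ctg <- readLAScatalog(LASfile)

# The data slot is an sf table with one row per file, built from headers only.
header_table <- ctg@data
class(header_table)
head(names(header_table))
```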
The purpose of a LAScatalog is not to manipulate spatial data in R but
to represent a set of existing las/laz files. Thus a LAScatalog cannot
be modified, because it must stay consistent with the actual content of
the files. The following throws an error:
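A hedged illustration, assuming the bundled `Megaplot.laz` sample as the catalog and `Min.Z` as an example of a reserved header column. The assignment is wrapped in `tryCatch()` so the script keeps running after the error.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))

# Min.Z is read from the file header, so assigning to it should be rejected.
outcome <- tryCatch({
  ctg$Min.Z <- 0
  "modified"
}, error = function(e) "error")
outcome
```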
Obviously it is always possible to modify an R object by bypassing such simple restrictions. In that case the user will break something internally and a correct output is not guaranteed.
However, it is possible to add and modify attributes whose names are not reserved. The following is allowed:
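A short sketch under the same assumption (bundled sample file as the catalog). The attribute name `newattr` is purely illustrative; any name that does not clash with a header column should behave the same way.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))

# 'newattr' is a hypothetical, non-reserved attribute name.
ctg$newattr <- 0
has_newattr <- "newattr" %in% names(ctg@data)
has_newattr
```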
Users commonly report bugs that arise because the point cloud is
invalid. This is why we introduced the function las_check() to inspect
LAScatalog objects. This function checks whether a LAScatalog object is
consistent (for example, whether all files are of the same type). It
may happen that a collection mixes files of type 1 with files of type 3,
or files with different scale factors.
las_check(ctg)
#>
#> Checking headers consistency
#> - Checking file version consistency... ✓
#> - Checking scale consistency... ✓
#> - Checking offset consistency... ✓
#> - Checking point type consistency... ✓
#> - Checking VLR consistency... ✓
#> - Checking CRS consistency... ✓
#> Checking the headers
#> - Checking scale factor validity... ✓
#> - Checking Point Data Format ID validity... ✓
#> Checking preprocessing already done
#> - Checking negative outliers...
#>  ⚠ 2 file(s) with points below 0
#> - Checking normalization... no
#> Checking the geometry
#> - Checking overlapping tiles...
#>  ⚠ Some tiles seem to overlap each other
#> - Checking point indexation... no
When applied to a LAScatalog, the function las_check() does not perform
a deep inspection of the point cloud, unlike when it is applied to a LAS
object, because the point cloud is not actually read.
lidR provides a simple plot() function to plot a LAScatalog object:
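A minimal sketch, again assuming the bundled sample file as the catalog. With a real collection the plot shows one rectangle per file.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))

# Draws the footprint of each file using base graphics.
plot(ctg)
```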
The option mapview = TRUE displays the LAScatalog on an interactive map
with pan and zoom and allows the addition of a satellite map in the
background. It uses the package mapview internally. This is often useful
to check whether the CRS of the files is properly registered. In our
experience, the EPSG codes recorded in las files are sometimes
incorrect.
Because the attributes of the las files are stored in an sf object, it
is easy to display the metadata of the files. In the following we can
immediately see that the catalog is not normalized and is likely to
contain outliers:
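One way to look at this metadata, sketched with the bundled sample file rather than the collection discussed in the text: drop the geometry and read the elevation range recorded in the headers. The choice of the `Min.Z` and `Max.Z` columns is an assumption made for illustration.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))

# Header attributes as a plain data.frame, one row per file.
meta <- sf::st_drop_geometry(ctg@data)[, c("Min.Z", "Max.Z")]
meta
```

A minimum elevation far from 0 suggests heights are not normalized, and extreme values hint at outliers.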
Most lidR functions are compatible with a LAScatalog and work almost as
they do with a single point cloud loaded in memory. In the following
example we use the function pixel_metrics() to compute the mean
elevation of the points. The output is a continuous wall-to-wall raster.
It works exactly as if the input were a LAS object.
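A sketch of such a call, assuming the bundled sample file as the catalog and a 20 m resolution chosen only for illustration.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE  # do not draw the progress map

# Mean elevation per 20 m pixel, same call as for a LAS object.
hmean <- pixel_metrics(ctg, ~mean(Z), 20)
hmean
```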
However, processing a LAScatalog usually requires some tuning of the
processing options to get better control of the computation. If the
catalog is huge, the output is likely to be huge as well and may not fit
in memory. For example, normalize_height() throws an error if used ‘as
is’ without tuning the processing options: in the following call the
expected output would be a huge point cloud loaded in memory, so the
lidR package forbids it:
output <- normalize_height(ctg, tin())
#> Error: This function requires that the LAScatalog provides an output file template.
Instead, one can use the processing option opt_output_files().
Processing options drive how the big files are split into small chunks
and how the outputs are either returned into R or written to disk.
opt_output_files(ctg) <- "folder/where/to/store/outputs/{ORIGINALFILENAME}_normalized"
output <- normalize_height(ctg, tin())
Here the output is not a point cloud but a LAScatalog pointing to the
newly created files. The user can check how the collection will be
processed by calling summary():
summary(ctg)
#> class : LAScatalog (v1.2 format 1)
#> extent : 883166.1, 895250.2, 625793.6, 639938.4 (xmin, xmax, ymin, ymax)
#> coord. ref. : NAD83 / UTM zone 17N
#> area : 111.1 km²
#> points : 0 points
#> density : 0 points/m²
#> num. files : 62
#> proc. opt. : buffer: 30 | chunk: 0
#> input opt. : select: * | filter:
#> output opt. : in memory | w2w guaranteed | merging enabled
#> drivers :
#> - Raster : no parameter
#> - stars : NA_value = -999999
#> - Spatial : no parameter
#> - SpatRaster : overwrite = FALSE NAflag = -999999
#> - SpatVector : overwrite = FALSE
#> - LAS : no parameter
#> - sf : quiet = TRUE
#> - data.frame : no parameter
The plot function can also display the chunk pattern, i.e. how the
dataset is split into small chunks that will be processed sequentially.
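A hedged sketch of displaying the chunk pattern, assuming lidR version 4 or later (where the plot argument is named `chunk`) and the bundled sample file as the catalog. The 100 m chunk size and 10 m buffer are arbitrary values for this small sample.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))

opt_chunk_size(ctg) <- 100   # chunk width in the units of the CRS
opt_chunk_buffer(ctg) <- 10  # buffer loaded around each chunk

# Assumption: 'chunk' is the argument name used by lidR >= 4.0.
plot(ctg, chunk = TRUE)
```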
It is possible to flag some files so that they are not processed but are still used to load a buffer if required. In the following example only the central files will be processed; the other files are not removed and will be used to buffer the processed ones.
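A minimal sketch of the flagging mechanism, assuming it is driven by a logical `processed` attribute. The selection rule based on `Min.X` is hypothetical and stands in for whatever criterion identifies the central files of a real collection.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file catalog.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))

# Start with every file flagged as buffer-only.
ctg$processed <- FALSE

# Hypothetical rule: process the files in the eastern half of the collection.
keep <- ctg$Min.X >= median(ctg$Min.X)
ctg$processed[keep] <- TRUE
ctg$processed
```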
Load a collection. Process each file sequentially. Returns a raster into R.
Load a collection. Process each file sequentially. For each file, write a raster on disk named after the processed file. Returns a lightweight virtual raster mosaic.
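A sketch of the on-disk variant, assuming the bundled sample file as the collection, a temporary output folder, and pixel_metrics() with a mean elevation metric as the processing step. The `_hmean` suffix is illustrative.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file collection.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE

# Temporary folder standing in for a real output directory.
out_dir <- file.path(tempdir(), "hmean_tiles")
dir.create(out_dir, showWarnings = FALSE)

# One raster per input file, named after that file.
opt_output_files(ctg) <- file.path(out_dir, "{ORIGINALFILENAME}_hmean")
hmean <- pixel_metrics(ctg, ~mean(Z), 20)

written <- list.files(out_dir, pattern = "\\.tif$")
written
```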
Load a single big file too big to be actually loaded in memory. Process small chunks of 100 x 100 meters at a time. Returns a raster into R.
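A sketch of chunked processing of a single file. The bundled sample is small, so it only imitates a file too big for memory. The buffer is set to 0 on the assumption that a per-pixel metric does not need neighbouring points.

```r
library(lidR)

# Assumption: the bundled sample stands in for one very large file.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE

opt_chunk_size(ctg) <- 100   # process 100 x 100 m chunks
opt_chunk_buffer(ctg) <- 0   # no buffer needed for per-pixel metrics

# The per-chunk rasters are merged and returned into R.
hmean <- pixel_metrics(ctg, ~mean(Z), 20)
hmean
```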
Load a collection. Process small chunks of 200 x 200 meters at a time. Each chunk is loaded with an extra 20 m buffer. Returns spatial points into R.
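A sketch of a buffered, chunked run that returns points. The text does not name the function; individual tree detection with locate_trees() and a local maximum filter is assumed here because it benefits from a buffer and returns spatial points.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file collection.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE

opt_chunk_size(ctg) <- 200    # 200 x 200 m chunks
opt_chunk_buffer(ctg) <- 20   # each chunk is loaded with a 20 m buffer

# Assumed processing step: tree tops from a 5 m local maximum filter.
ttops <- locate_trees(ctg, lmf(ws = 5))
ttops
```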
This is forbidden. The output would be too big.
Load a collection. Process small chunks of 500 x 500 meters
sequentially. For each chunk, write a laz file on disk named after the
coordinates of the chunk. Returns a lightweight LAScatalog.
Note that the original collection has been retiled.
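A sketch of writing chunks to disk, assuming catalog_retile() as the processing step (the text does not say which function was used) and a temporary output folder. The sample file is smaller than 500 m, so it yields very few tiles.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file collection.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE

# Temporary folder standing in for a real output directory.
out_dir <- file.path(tempdir(), "retiled")
dir.create(out_dir, showWarnings = FALSE)

opt_chunk_size(ctg) <- 500
opt_chunk_buffer(ctg) <- 0
opt_laz_compression(ctg) <- TRUE  # write .laz instead of .las
opt_output_files(ctg) <- file.path(out_dir, "tile_{XLEFT}_{YBOTTOM}")

# Assumed processing step: plain retiling of the collection.
newctg <- catalog_retile(ctg)
tiles <- list.files(out_dir, pattern = "\\.laz$")
tiles
```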
Load a collection. Load a shapefile of plot centers. Extract the plots. Returns a list of extracted point clouds in R.
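A sketch of plot extraction. No shapefile is assumed to be available, so two hypothetical plot centers are built in code as an sf object. With real data they would come from something like `sf::st_read()`. The 10 m radius is arbitrary.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file collection.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE

# Hypothetical plot centers located inside the sample file's extent.
centers <- sf::st_as_sf(
  data.frame(PLOTID = c("P1", "P2"),
             x = c(684850, 684900),
             y = c(5017850, 5017900)),
  coords = c("x", "y"),
  crs = sf::st_crs(ctg@data)
)

# Circular plots of 10 m radius, returned as a list of LAS objects.
plots <- clip_roi(ctg, centers, radius = 10)
length(plots)
```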
Load a collection. Load a shapefile of plot centers. Extract the plots
and immediately write each one to a file named after the coordinates of
the plot and an attribute of the shapefile (here PLOTID, if such an
attribute exists in the shapefile). Returns a lightweight LAScatalog.
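A sketch of the same extraction written to disk. The plot centers and the output folder are hypothetical, and the template is assumed to accept the attribute name `PLOTID` alongside `{XCENTER}` and `{YCENTER}`.

```r
library(lidR)

# Assumption: bundled sample file used as a one-file collection.
ctg <- readLAScatalog(system.file("extdata", "Megaplot.laz", package = "lidR"))
opt_progress(ctg) <- FALSE

# Hypothetical plot centers carrying a PLOTID attribute.
centers <- sf::st_as_sf(
  data.frame(PLOTID = c("P1", "P2"),
             x = c(684850, 684900),
             y = c(5017850, 5017900)),
  coords = c("x", "y"),
  crs = sf::st_crs(ctg@data)
)

# Temporary folder standing in for a real output directory.
out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE)
opt_output_files(ctg) <- file.path(out_dir, "{XCENTER}_{YCENTER}_{PLOTID}")

# Each plot is written to its own file; a LAScatalog of those files is returned.
newctg <- clip_roi(ctg, centers, radius = 10)
plot_files <- list.files(out_dir)
plot_files
```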