Compressing LIDAR Data
This page discusses the results of various tests aimed at reducing the size of observed LIDAR data. The goal is a method that is fast and produces small data products without losing any of the scientific information they contain; in other words, the compression must be lossless.
The Data
The LIDAR data I'm testing with looks like this:
QTC Version : 15
Projection : UTM
UTM Zone : 18 N
282085.999966 4772036.940039 130.183709 247
282085.999966 4772036.940039 130.183709 247
282086.129971 4772036.910010 130.181709 255
282086.129971 4772036.910010 130.181709 255
282086.160000 4772036.879980 130.171709 231
282086.160000 4772036.879980 130.171709 231
282086.290005 4772036.849951 130.168709 250
282086.290005 4772036.849951 130.168709 250
282086.310024 4772036.819922 130.142709 255
282086.310024 4772036.819922 130.142709 255
282086.440029 4772036.789893 130.117709 235
282086.440029 4772036.789893 130.117709 235
282086.460049 4772036.760107 130.128709 221
This output is from an Optech LIDAR system.
The header shows that the observation is in UTM Zone 18 North. Each observation point has four columns: the UTM coordinates X (easting), Y (northing), and Z (elevation), followed by the intensity.
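To make the layout concrete, here is a minimal Python sketch of a reader for this format. It assumes that every header line contains a colon and that every point line has exactly four whitespace-separated fields, which is what the sample above shows; the function name `parse_lidar` is made up for illustration and is not part of any Optech tool.

```python
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float, int]  # x, y, z, intensity


def parse_lidar(lines: Iterable[str]) -> Tuple[Dict[str, str], List[Point]]:
    """Split the text into 'key : value' header fields and point records."""
    header: Dict[str, str] = {}
    points: List[Point] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ":" in line:
            # Assumption: only header lines contain a colon.
            key, _, value = line.partition(":")
            header[key.strip()] = value.strip()
            continue
        x, y, z, intensity = line.split()
        points.append((float(x), float(y), float(z), int(intensity)))
    return header, points


sample = """QTC Version : 15
Projection : UTM
UTM Zone : 18 N
282085.999966 4772036.940039 130.183709 247
282086.129971 4772036.910010 130.181709 255
"""
header, points = parse_lidar(sample.splitlines())
```

For the real 311 MB file, the lines would be streamed from an open file object rather than held in a string.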
The file I'm testing with contains 311,168,033 bytes of this ASCII data.
Single File, Original ASCII
This first series of tests runs one single file in the above form through various compressors.
These tests were done on a single-CPU 450 MHz G4 Cube running OS X 10.4, while another process was also running. Each of the following runs consumed 50-60% of the processor time. With the exception of 7zip (7za), they all used roughly 1 MB of physical RAM and 28 MB of virtual RAM. 7za used roughly 26 MB of physical and 60 MB of virtual.
| compression | user time (min:sec) | compressed size (bytes) | ratio (compressed:original, % of original) | additional info |
| gzip | 1:36 | 41,043,960 | 1:7.58 13.1% | |
| gzip -3 | 1:29 | 56,402,108 | 1:5.52 18.1% | |
| gzip -9 | 4:47 | 40,473,564 | 1:7.69 13.0% | |
| bzip2 | 6:21 | 34,629,760 | 1:8.99 11.1% | |
| bzip2 -3 | 4:54 | 35,514,311 | 1:8.76 11.4% | |
| bzip2 -9 | 6:17 | 34,629,760 | 1:8.99 11.1% | |
| zip | 1:31 | 41,044,083 | 1:7.58 13.1% | |
| 7za a (-mx=5) | 24:11 | 30,118,731 | 1:10.33 9.7% | used 3 threads, is multi-CPU friendly |
| 7za a -mx=3 | 3:34 | 38,147,961 | 1:8.16 12.3% | |
| 7za a -mx=9 | 108:22 | 29,676,419 | 1:10.49 9.5% | consumed 302 MB physical, 370 MB virtual |
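The ratio column is the original size divided by the compressed size, and the percentage is the compressed size as a fraction of the original. The Python sketch below reproduces that arithmetic and shows one way a comparable timing run could be scripted. It is only an approximation of the tests above: it uses the standard-library `gzip`, `bz2`, and `lzma` modules as stand-ins for the command-line tools (the `lzma` module writes xz streams, not 7z archives), and it measures CPU time of the Python process (user plus system) rather than the user time reported by the shell. The names `ratio_columns`, `CODECS`, and `benchmark` are made up for this example.

```python
import bz2
import gzip
import lzma
import time
from typing import Callable, Dict, List, Tuple


def ratio_columns(original_size: int, compressed_size: int) -> Tuple[str, float]:
    """Return the '1:N' ratio string and the compressed size as a percent of the original."""
    ratio = original_size / compressed_size
    percent = 100.0 * compressed_size / original_size
    return f"1:{ratio:.2f}", percent


# Stand-ins from the Python standard library, not the command-line tools themselves.
CODECS: Dict[str, Callable[[bytes], bytes]] = {
    "gzip level 6": lambda d: gzip.compress(d, compresslevel=6),
    "gzip level 9": lambda d: gzip.compress(d, compresslevel=9),
    "bzip2 level 9": lambda d: bz2.compress(d, compresslevel=9),
    "lzma (xz) default": lambda d: lzma.compress(d),
}


def benchmark(data: bytes) -> List[Tuple[str, float, int, str]]:
    """Compress data with each codec and record CPU seconds, size, and ratio."""
    rows = []
    for name, compress in CODECS.items():
        start = time.process_time()
        compressed = compress(data)
        cpu_seconds = time.process_time() - start
        ratio, _ = ratio_columns(len(data), len(compressed))
        rows.append((name, cpu_seconds, len(compressed), ratio))
    return rows


if __name__ == "__main__":
    # Highly repetitive toy input; real LIDAR data will compress far less well.
    sample = b"282085.999966 4772036.940039 130.183709 247\n" * 10000
    for name, cpu_seconds, size, ratio in benchmark(sample):
        print(f"{name:20s} {cpu_seconds:8.3f}s {size:12,d} {ratio}")
```

Calling `ratio_columns(311168033, 41043960)` gives the `1:7.58` shown in the gzip row.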
Split Files, Original ASCII
This series of tests will split the one file into many smaller files. This might make a difference in the amount of time it takes to generate the dictionary for the resulting file.
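One way such a split test could be set up is sketched below in Python. It cuts the data on line boundaries so that no point record is torn in half, compresses each piece separately, and adds up the compressed sizes. The chunk size, the helper names `split_lines` and `total_compressed_size`, and the choice of the standard-library `gzip` module are all assumptions made for illustration; no results are implied.

```python
import gzip
from typing import Iterator, List


def split_lines(lines: List[bytes], lines_per_chunk: int) -> Iterator[bytes]:
    """Yield consecutive chunks, each holding at most lines_per_chunk whole lines."""
    for start in range(0, len(lines), lines_per_chunk):
        yield b"".join(lines[start:start + lines_per_chunk])


def total_compressed_size(lines: List[bytes], lines_per_chunk: int) -> int:
    """Compress every chunk on its own and return the summed compressed size."""
    return sum(len(gzip.compress(chunk)) for chunk in split_lines(lines, lines_per_chunk))


# Toy input: 1000 copies of one point line, newline included.
lines = [b"282085.999966 4772036.940039 130.183709 247\n"] * 1000
chunks = list(split_lines(lines, 300))
whole_size = total_compressed_size(lines, len(lines))
split_size = total_compressed_size(lines, 300)
```

Comparing `whole_size` with `split_size` for several chunk sizes, together with the time each run takes, would show whether splitting helps.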
Binary file
This test will parse the ASCII text file and emit it as a binary file. This binary file will then be run through various compression routines.
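The binary layout has not been decided here, so the following Python sketch is only one possibility. It assumes each point is stored as three little-endian 64-bit floats plus one unsigned byte for the intensity (25 bytes per point, against 44 bytes for a text line in the sample), and that intensities always fit in 0-255 as they do in the sample. The names `RECORD`, `pack_point`, and `unpack_point` are hypothetical.

```python
import struct

# Assumed record layout: x, y, z as float64, intensity as uint8, no padding.
RECORD = struct.Struct("<dddB")


def pack_point(line: str) -> bytes:
    """Convert one ASCII point line into a fixed-size binary record."""
    x, y, z, intensity = line.split()
    return RECORD.pack(float(x), float(y), float(z), int(intensity))


def unpack_point(record: bytes) -> str:
    """Convert a binary record back into the original six-decimal text form."""
    x, y, z, intensity = RECORD.unpack(record)
    return f"{x:.6f} {y:.6f} {z:.6f} {intensity}"


line = "282085.999966 4772036.940039 130.183709 247"
record = pack_point(line)
restored = unpack_point(record)
```

Because the goal is lossless storage, it is worth checking on the full data set that `unpack_point(pack_point(line))` returns every line unchanged. The values in the sample have at most 13 significant digits, which a 64-bit float can carry without loss.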
Binary Processed
This test will take the binary file and emit a few files, including one holding the brightness values (column 4 of the original data), which will be run through various image compression routines. Other processing might be attempted as well.
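As a rough illustration of the idea, the Python sketch below separates a binary point file into a coordinate stream and a one-byte-per-point brightness stream. It assumes a record layout of three little-endian 64-bit floats followed by one unsigned byte, which is an assumption of this example and not a decided format. Since no particular image compression routine has been chosen yet, `zlib` is used purely as a placeholder for whatever routine ends up being tested.

```python
import struct
import zlib
from typing import Tuple

# Assumed record layout: x, y, z as float64, intensity as uint8, no padding.
RECORD = struct.Struct("<dddB")
COORDS = struct.Struct("<ddd")


def split_streams(binary: bytes) -> Tuple[bytes, bytes]:
    """Separate packed records into a coordinate stream and an intensity stream."""
    coords = bytearray()
    intensity = bytearray()
    for x, y, z, value in RECORD.iter_unpack(binary):
        coords += COORDS.pack(x, y, z)
        intensity.append(value)
    return bytes(coords), bytes(intensity)


binary = (
    RECORD.pack(282085.999966, 4772036.940039, 130.183709, 247)
    + RECORD.pack(282086.129971, 4772036.910010, 130.181709, 255)
)
coords, intensity = split_streams(binary)

# Placeholder compressor; an image codec would be substituted here.
packed_intensity = zlib.compress(intensity, 9)
```

Keeping the streams separate lets each one be compressed with a method suited to its content, and the original records can be rebuilt by reading both streams in step.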