Submarine Topography -- Database Development

Most Recent Bathymetry Dataset For The Sea Of
Cortez And Surrounding Pacific Ocean Including GEODAS Data, Satellite
Altimetry Below 1000 meters, and DBDB-V Interpolated Data

Introduction
I am attempting to develop a master database of bathymetric and
terrestrial heightfield data for the Sea of Cortez and surrounding waters. This
page contains notes on this work. The picture at the top of this page is a
current visualization of the bathymetric and terrestrial database. The current
geographical boundaries for the database run from 34N 117.5W (northwest corner) to 19N 104W (southeast corner). This region
includes terrestrial watersheds that are important to the Gulf of California.
Notes
Master Database Design
- Fundamental
Considerations
- Since marine life
is ultimately keyed to space and time, the master database must use space and
time in its fundamental structure.
- The database must
be dimensionally expandable to accommodate data for any number of biological
and physical processes.
- The database must
be useable by both internally developed visualization software and certain
commercial products.
- GIS products
- IDL
- DEM viewers like Dlgv32.
- The database must
be useable over the Internet.
- Basic Structure
- The use of a
geometric grid to contain spatial data is an efficient and compact data
structure
- The geographic
location of a data element is represented by its position on the
grid.
- Therefore the
actual geographic position does not have to be stored.
- Only the
starting position and the grid resolution need be provided (usually in a
header).
- The grid structure
can be used to store heightfield (bathymetric and terrestrial) data.
- Each grid cell
represents a given latitude and longitude.
- Each grid cell
can either supply data to visualization software or the entire grid can become
a bitmap.
- The grid structure
can also be used to store data on biological and physical processes.
- One grid is
used for each data type (e.g. biomass, water temperature)
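The implicit-position idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `GridHeader` class, its field names, and the two helper functions are hypothetical, and row 0 is assumed to be the northern edge of the grid.

```python
from dataclasses import dataclass

@dataclass
class GridHeader:
    """Hypothetical header: center of the upper-left cell plus cell size in degrees."""
    ul_lat: float
    ul_lon: float
    cell_deg: float
    nrows: int
    ncols: int

def cell_to_latlon(hdr, row, col):
    """Return the latitude/longitude of a cell center (row 0 is the northern edge)."""
    return hdr.ul_lat - row * hdr.cell_deg, hdr.ul_lon + col * hdr.cell_deg

def latlon_to_cell(hdr, lat, lon):
    """Return the (row, col) of the cell whose center is nearest to lat/lon."""
    row = round((hdr.ul_lat - lat) / hdr.cell_deg)
    col = round((lon - hdr.ul_lon) / hdr.cell_deg)
    if not (0 <= row < hdr.nrows and 0 <= col < hdr.ncols):
        raise ValueError("position lies outside the grid")
    return row, col

# 30 arc-second cells covering 34N 117.5W to 19N 104W (1800 rows x 1620 columns)
CELL = 30.0 / 3600.0
hdr = GridHeader(ul_lat=34.0 - CELL / 2, ul_lon=-117.5 + CELL / 2,
                 cell_deg=CELL, nrows=1800, ncols=1620)
```

Only the five header values are stored; every cell position is recovered from its row and column.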
- Macro Structure
- The use of a
spatial grid structure for any one data element permits "stacking" of grids
into an expandable and multidimensional database.
- Only one DEM style
header needs to be used for all data grids because the header defines spatial
information for all grids.
- A metadata header
is also required to describe the data contained in each grid
plane.
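A rough sketch of the "stacked grids" idea described above, using only plain arrays. The `GridStack` class, its method names, and the example plane names are hypothetical; the point is only that one set of spatial dimensions serves every plane while each plane carries its own metadata.

```python
from array import array

class GridStack:
    """One shared spatial header plus any number of named data planes."""

    def __init__(self, nrows, ncols, nodata=32767):
        self.nrows, self.ncols, self.nodata = nrows, ncols, nodata
        self.planes = {}    # plane name -> flat array of signed 16-bit values
        self.metadata = {}  # plane name -> description of what the plane holds

    def add_plane(self, name, units, description):
        """Add a plane filled with the unknown-value marker."""
        self.planes[name] = array("h", [self.nodata]) * (self.nrows * self.ncols)
        self.metadata[name] = {"units": units, "description": description}

    def set(self, name, row, col, value):
        self.planes[name][row * self.ncols + col] = value

    def get(self, name, row, col):
        return self.planes[name][row * self.ncols + col]

    def column(self, row, col):
        """Return every plane's value at one geographic cell."""
        return {name: self.get(name, row, col) for name in self.planes}

stack = GridStack(nrows=4, ncols=5)
stack.add_plane("height", "meters", "bathymetric and terrestrial heightfield")
stack.add_plane("water_temp", "0.01 deg C", "hypothetical temperature plane")
stack.set("height", 2, 3, -1450)
```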
- Utilization
- Since arrays are a
fundamental construct of most programming languages, a grid structured database
is easy to program without using a lot of sophisticated and/or proprietary
database libraries.
- Common Digital
Elevation Model (DEM) datasets use the grid structure approach
- A combined
marine and terrestrial grid structured database can be used by many common DEM
viewers.
- The DEM grid model
can be incorporated into the HDF format.
- The HDF format
can be used by some commercial visualization software including
IDL.
- Since the HDF
model has metadata capabilities, it can be used over the
Internet.
- The grid model,
which is a raster database, can be accessed by commercial GIS
software.
- Quadcode mapping
can be accomplished with the grid model
- Territories can
be represented at different resolutions in different areas.
- Important in
marine research because data come from a wide variety of sources with different
spatial resolutions.
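As an illustration of how quadcodes sit on top of the grid model, the sketch below assumes a square grid of 2**levels cells on a side and an arbitrary quadrant numbering (0 = NW, 1 = NE, 2 = SW, 3 = SE). Both function names are hypothetical. A short code names a large block and a long code names a single cell, which is how one structure can represent different resolutions in different areas.

```python
def quadcode(row, col, levels):
    """Return the quadcode of a cell in a 2**levels x 2**levels grid.

    Each digit picks a quadrant: 0 = NW, 1 = NE, 2 = SW, 3 = SE (assumed numbering).
    """
    digits = []
    for level in range(levels - 1, -1, -1):
        south = (row >> level) & 1
        east = (col >> level) & 1
        digits.append(str(2 * south + east))
    return "".join(digits)

def quadcode_block(code, levels):
    """Return (first_row, first_col, size) of the square block named by a quadcode prefix."""
    row = col = 0
    for digit in code:
        q = int(digit)
        row = (row << 1) | (q >> 1)
        col = (col << 1) | (q & 1)
    shift = levels - len(code)
    return row << shift, col << shift, 1 << shift
```

For example, in an 8x8 grid the cell at row 5, column 2 has the code "212", and the two-digit prefix "21" names the 2x2 block that contains it.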
Bathymetric and Terrestrial Heightfield Data (Fundamental data
plane)
- Basic Database
Design
- Fundamental grid
design is based on the
GTOPO30
format
- 30 arc second
(about 900 meter) resolution grid.
- Height data are
signed short integers (2 bytes)
- Row major
ordering.
- Binary data
file (*.dem).
- Data are
stored in Motorola byte format ("big endian") which stores the most significant
byte first.
- DEC Alpha
and most PCs use the Intel byte order ("little endian") which stores the least
significant byte first.
- We'll
continue using the big-endian binary format for storage
- Makes
our data available to Macs and Unix machines.
- Most
DEM viewers (e.g. Dlgv32) require big-endian.
- Can
easily convert the data as it's brought into our own viewers.
- Header file
(*.hdr)
- Holds
information on the fundamental data format and layout
- Byte
order
- Number
of rows and columns
- NaN
value
- GTOPO30 uses -9999 as a numerical indicator for an unknown (NaN)
value
- Since -9999 falls within the range of valid ocean depths, the NaN value for
extracted datasets will be 32767.
- Latitude and longitude of the center of the upper-left cell.
- X and Y
dimension (precision) of a cell.
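The storage conventions above (big-endian signed 16-bit values, row-major order, -9999 remapped to 32767) can be sketched as follows. The function names are hypothetical, and the number of rows and columns is assumed to have already been read from the header file.

```python
import struct
import sys
from array import array

GTOPO30_NODATA = -9999   # GTOPO30's unknown-value marker
MASTER_NODATA = 32767    # unknown-value marker chosen for the extracted datasets

def decode_dem(raw, nrows, ncols):
    """Decode big-endian 16-bit heights into a row-major list of rows."""
    if len(raw) != nrows * ncols * 2:
        raise ValueError("byte count does not match the header's rows and columns")
    cells = array("h")
    cells.frombytes(raw)
    if sys.byteorder == "little":
        cells.byteswap()          # Motorola order -> Intel order
    return [[MASTER_NODATA if v == GTOPO30_NODATA else v
             for v in cells[r * ncols:(r + 1) * ncols]]
            for r in range(nrows)]

def encode_dem(rows):
    """Encode rows of heights back to big-endian bytes for storage."""
    flat = [v for row in rows for v in row]
    return struct.pack(">%dh" % len(flat), *flat)

# Six sample cells as they would appear in a *.dem file
raw = struct.pack(">6h", 12, -3500, -9999, 0, 850, -1)
grid = decode_dem(raw, nrows=2, ncols=3)
```

The byte swap happens once as the data are read, so the stored file stays big-endian for the DEM viewers while the in-memory grid uses the machine's native order.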
- Bathymetric and Terrain
Data Sources
- Digital Bathymetric
Database - Variable (DBDB-V) From The Naval Oceanographic Office
(NAVOCEANO)
- A digital
bathymetric database that provides ocean depths at various gridded resolutions.
This online database may
be queried by specifying point location, an arc of a great circle, or a
bounding rectangle. Information and specifications about the database can be
found
here .
- I extracted
their bathymetry data for our geographic range. Their data are derived by
digitizing bathymetric contours of hard copy maps. Their grid resolution (1, 2,
and 5 arc minutes) varies by geographic area. Missing values are filled in
using a multi-stage minimum curvature spline algorithm which interpolates the
digitized values to derive a depth value for each node. The Web site permits
one to receive interpolated data down to 0.5 arc minute spacing. Shoreline
discrepancies are resolved by creating a land mask using the World Vector
Shoreline, or higher resolution shorelines.
- After
downloading the data at a 30 arc-second interpolated resolution, I created a
GTOPO30 DEM of their data and used this DEM as the basis for building our
database
- There
appear to be some terrain inconsistencies in the land mask. Since I can resolve
these inconsistencies by merging the standard GTOPO30 terrain data into the DEM
file, I used DBDB-V to build my first DEM into which the other data sources
(terrain and bathymetric) would be merged.
- GTOPO30
Database
- This format has
world-wide terrestrial coverage but no bathymetric data.
- There are good
terrestrial data for areas that surround the Gulf of California.
- We need
surrounding terrestrial data to portray watersheds and coastal
wetlands.
- For the Sea of
Cortez, the GTOPO30 DEM file to use is the
W140N40
tile.
- The data for
our geographic area was extracted from this tile and merged into the master
GTOPO30 DEM.
- The GEODAS
bathymetry data set comes from the
Marine Trackline Geophysics
database CD (Version 4.0)
- Data are a
collection of 30 years of single and multibeam echo soundings from various
world-wide institutions.
- A coastline
data set is also on the CD. These data were merged into the master database to
set sea level points for use in interpolation.
- Software
supplied on the GEODAS CD permits the extraction of data (in ASCII format) for
a selected geographical area.
- CD costs
$250.00.
- Bathymetric
Estimates Based On Gravimetric Information -- Sandwell and Smith at Scripps and
NOAA.
- 2 minute
resolution.
- Longitude
cell size is 2 minutes
- Latitude
cell size is 2 minutes * cos(latitude).
- Combines a
corrected version of the GEODAS bathymetry data set with estimates generated
from gravity data derived from satellite altimetry. According to Smith "The
depth data were obtained by screening 6905 surveys from the NGDC (Marine
Trackline Geophysics CD-ROM version 3.2), the Scripps Institution of Oceanography
and Lamont - Doherty Earth Observatory databanks, and other data, using quality
control procedures based on those of Smith [J. Geophys. Res. 98, 9591-9603,
1993]. The satellite gravity field combines all data from the ERS-1 and GEOSAT
satellites including the data declassified in 1995".
- In email correspondence with David
Sandwell of Scripps, I was cautioned not to use any gravimetric data at depths
of less than 1000 meters. David says that "The gravity data provide almost no
information in the shallow ocean." He also stated that "We did the margins just
to make it look nice." Based on David's comments, I'm using only the
gravimetric bathymetry estimates from his dataset and only those values at 1000
meters or deeper.
- Data are
considered "estimates" because derivative processes are used.
- Because
these derivative processes use Fourier techniques, I suspect that the results
are realistically smoothed just as in the fractal terrain forgeries that use
Fourier methods.
- Smith provides
overviews and
detailed
descriptions of his process. He also provides a very clear
readme file
which nicely defines his file format.
- Scripps
provides the full dataset for
download. This
worldwide file is huge (136 MB). The download is free, but a CD will
apparently be available soon at an unknown cost.
- While he did
his work using the GMT (Generic Mapping Tool) on Unix, Smith provides nice
C code snippets that
can be used to extract portions of his data. Ergo, we don't have to "reinvent
the wheel" for the NT extraction code.
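The cell-size rule noted earlier for this dataset (2 minutes of longitude, 2 minutes * cos(latitude) of latitude) can be checked with a short calculation. This is a sketch only: the function name is hypothetical, and the conversion of 1.852 km per arc minute assumes a spherical Earth.

```python
import math

KM_PER_ARC_MINUTE = 1.852  # one nautical mile, an approximation for a spherical Earth

def cell_size(lat_deg, lon_step_min=2.0):
    """Return (lat_step_min, width_km, height_km) for a cell centered at lat_deg."""
    c = math.cos(math.radians(lat_deg))
    lat_step_min = lon_step_min * c
    width_km = lon_step_min * KM_PER_ARC_MINUTE * c   # east-west extent on the ground
    height_km = lat_step_min * KM_PER_ARC_MINUTE      # north-south extent on the ground
    return lat_step_min, width_km, height_km
```

Under this rule the cells stay square on the ground at every latitude, but they shrink toward the poles, so they do not line up one-for-one with the fixed 30 arc-second cells of the master grid.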
- Current Data
Reliability Issues
- The database still
has numerous unknown data points at a 30 arc second resolution. Most of these
unknown points are in wetland areas.
- In addition, the
DBDB-V interpolated data do not match well with the GEODAS trackline
bathymetry south of 29 degrees north. This is probably because the DBDB-V database
south of 29 degrees north uses a 5 minute grid basis for its interpolations
instead of the 1 minute grid basis that it uses at and north of 29
degrees.
- Data Extraction
Process
- From data extracted
from W140N40, create a GTOPO30 file for the designated geographic area.
- Extract GEODAS ship
track bathymetry data for the designated geographic area and place them in the
new GTOPO30 file.
- Extract GEODAS
coastline data and place them in the new GTOPO30 file.
- Extract satellite
generated bathymetry estimates from the Smith file and place them in the new
GTOPO30 file.
- Use the ordinary
kriging methods described below to estimate the remaining unknown points at a 30
arc second resolution.
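The extraction steps above can be sketched as a single merge routine. Two things here are assumptions rather than part of the notes: each source only fills cells that are still unknown (so earlier sources take precedence), and the function and argument names are hypothetical. The 1000 meter cutoff for the satellite estimates follows the caution quoted earlier.

```python
NODATA = 32767          # unknown-value marker used for the extracted datasets
GRAVITY_LIMIT = -1000   # keep satellite estimates only at 1000 meters or deeper

def merge_sources(terrain, soundings, coastline, gravity):
    """Merge sources into a copy of the terrain grid, filling unknown cells only.

    terrain  : list of rows of heights, NODATA where unknown
    soundings: {(row, col): depth} from ship tracks
    coastline: iterable of (row, col) cells set to sea level
    gravity  : {(row, col): depth} from the satellite estimates
    """
    grid = [row[:] for row in terrain]

    def fill(cells):
        for (r, c), value in cells:
            if grid[r][c] == NODATA:
                grid[r][c] = value

    fill(soundings.items())
    fill((rc, 0) for rc in coastline)
    fill((rc, z) for rc, z in gravity.items() if z <= GRAVITY_LIMIT)
    # Whatever is still unknown is left for the interpolation step.
    unknown = [(r, c) for r, row in enumerate(grid)
               for c, v in enumerate(row) if v == NODATA]
    return grid, unknown
```

The returned list of unknown cells is what the interpolation step would then have to estimate.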
- Geospatial
Interpolation Methods To Improve The Current Dataset
- There are a number
of techniques for interpolating geospatial data that are described in lecture
notes and outlines from various institutions.
- Unless I find
something better, ordinary kriging has been chosen to estimate the remaining
unknowns.
- Most other
interpolation methods are either too linear (e.g. triangulation) or handle end
points very poorly (e.g. splines).
- Advantages of
kriging
- Estimate is
based on the distance the unknown point is from known points. The farther a
known point is from the unknown, the less the influence it has on estimation of
height. This makes a lot of sense in terrain heightfield estimations because
nearby terrain has a much stronger influence on the height estimate for the
unknown.
- Estimate
can be obtained using a non-linear random function base model. And, one can
choose the base model that is used. Our world is rarely linear.
- The range
of influence can be pre-defined. Kriging permits the exclusion of known data if
it exceeds a certain range of influence. This makes sense because known height
data 50 km away probably has little influence on a local
heightfield.
- Disadvantages
of kriging
- Like any
other stochastic process, its relationship to the real world is subject to the
model that is chosen.
- It is a
mathematically intensive process that can slow a computer to a crawl.
- There
are lots of square roots as distances are being computed.
- Matrix
inversion and multiplication are required.
- An
effective kriging process requires careful thought about the search strategy
that is used to find and use known values.
- A large
search window may bring in too many known local data points and drastically
slow the computer down as distances are calculated.
- A
poorly defined strategy for moving the search window will result in some
unknowns remaining as unknowns.
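For reference, a minimal ordinary kriging estimate looks like the sketch below. It is not the code used for this database: the exponential semivariogram, its sill and range values, and all function names are assumptions chosen for illustration, and coordinates are in arbitrary units such as grid cells. It does show where the square roots and the matrix solution mentioned above come from.

```python
import math

def variogram(h, sill=1.0, rng=10.0):
    """Exponential semivariogram; the model and its parameters are assumptions."""
    return sill * (1.0 - math.exp(-3.0 * h / rng))

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def ordinary_krige(knowns, target):
    """Estimate the value at target=(x, y) from knowns=[(x, y, z), ...]."""
    n = len(knowns)
    # Semivariances between knowns, bordered by the unbiasedness constraint.
    a = [[variogram(distance(knowns[i], knowns[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(distance(k, target)) for k in knowns] + [1.0]
    weights = solve(a, b)[:n]          # last element is the Lagrange multiplier
    return sum(w * k[2] for w, k in zip(weights, knowns))
```

The weights always sum to one, and with no nugget term the estimate at a known point reproduces the known value.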
- My current
kriging process
- Methods to
speed up the process
- Each
search/computation window is limited to a 9x9 cell matrix. This limits the
number of data points and the search distance to reasonable
values.
- Precompute a 20x20 distance matrix on the first kriging operation.
This process eliminates the need for any further distance
calculations.
- Limit
the number of known values in the kriging computation to the 10 closest to the
unknown.
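The speed-up measures above amount to a lookup table plus a sort. In the sketch below the table is indexed by absolute row and column offsets in cell units; that indexing scheme and the function name are assumptions, while the 20x20 table size and the limit of 10 knowns follow the notes.

```python
import math

TABLE = 20        # precomputed table covers offsets 0..19 in each direction
MAX_KNOWNS = 10   # use at most the 10 closest known values

# Distance in cell units for every row/column offset, computed once so that
# no further square roots are needed during the kriging passes.
DIST = [[math.hypot(dr, dc) for dc in range(TABLE)] for dr in range(TABLE)]

def closest_knowns(knowns, row, col, limit=MAX_KNOWNS):
    """Return up to `limit` (distance, row, col, value) tuples nearest to (row, col).

    knowns is a list of (row, col, value) tuples taken from one search window.
    """
    ranked = sorted((DIST[abs(r - row)][abs(c - col)], r, c, v)
                    for r, c, v in knowns)
    return ranked[:limit]
```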
- Search and
computation strategy (all using a 9x9 cell window - 8.1km square)
- Start
in the lower left (SW) corner.
- Within
the window, search out and list all known points. If there are fewer than 8
knowns, or if all the knowns are zero (the beach), shift the window block one
cell to the right. (Fewer than 8 knowns should only happen with the first
computation set because subsequent window sets include some of the previously
calculated data.)
- Using a
predetermined primary computation order for 9 cells (four corner cells, then
middle edge cells between the corners, etc.)
- Sort the list of knowns in order of their distance from the
unknown.
- Using no more than 10 of these knowns, perform a kriging operation
to estimate the unknown.
- Store the result.
- If
a cell defined in the primary computation order contains a known value, don't
do the above computation.
- This computational pass is based on bathymetric data
only.
- Using a
predetermined secondary computation order for 16 cells (in between the
previously computed primary cells)
- Sort the list of known bathymetric data and the estimates
calculated in the primary process in order of their distance from the unknown.
- Using no more than 10 of these points, perform a kriging operation
to estimate the unknown.
- Store the result.
- If
a cell defined in the secondary computation order contains a known value, don't
do the above computation.
- Using
similar procedures, compute estimates for the remaining cells.
- After
finishing the solution process in a window, shift the cell matrix to the right
by 8. In shifting by 8, the left column contains data from the previous window.
- If at
the max right cell matrix, reset cell matrix indices to far left and shift up 8
cells. In shifting up by 8, the lower row contains data from the previous
window.
- Continue the process until all areas are
completed.
- Redo
the first window if there are missing data
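The window traversal and the two computation orders can be sketched as follows. Several details are my reading of the notes rather than stated facts: row 0 is taken as the northern edge, an extra window is added when the grid size is not a multiple of the shift, and the 16 secondary cells are taken to be the remaining cells on the even rows and columns of the window.

```python
WINDOW = 9   # window is 9x9 cells
STEP = 8     # shift by 8 so adjacent windows share one row or column

# Cell offsets inside a window, as (row, col)
PRIMARY = [(0, 0), (0, 8), (8, 0), (8, 8),     # four corners
           (0, 4), (4, 0), (4, 8), (8, 4),     # middle of each edge
           (4, 4)]                             # center
SECONDARY = [(r, c) for r in range(0, 9, 2) for c in range(0, 9, 2)
             if (r, c) not in PRIMARY]         # 16 cells between the primaries

def window_origins(nrows, ncols, window=WINDOW, step=STEP):
    """Yield the (top_row, left_col) of each window, starting in the SW corner,
    sweeping left to right, then moving up. Row 0 is the northern edge."""
    def starts(n):
        last = n - window
        s = list(range(0, last + 1, step))
        if s[-1] != last:
            s.append(last)      # extra window so the far edge is still covered
        return s

    for from_south in starts(nrows):
        top_row = nrows - window - from_south
        for left_col in starts(ncols):
            yield top_row, left_col
```

For a 17x17 grid this yields four windows, in the order SW, SE, NW, NE, each sharing one row or column with its neighbor.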
- Interpolation Of The
Dataset At Resolutions Finer Than 30 Arc Seconds Using Fractal
Interpolation
- At 30 arcseconds
resolution, the resulting 3D rendering provides only a general idea of the
shape of the submarine terrain. It would be interesting to find a way to
estimate more detail as the viewer's eye gets closer.
- One possible
estimation technique is to:
- Supply
heightfield anchor points from the GTOPO30 grid.
- Create a second
GTOPO30 grid that has georeferenced data on the substrate roughness -- which is
usually known. This roughness can be expressed as fractal
dimension.
- With the anchor
points and fractal dimension, a fractal heightfield can be generated within a
grid cell. Using RTIN techniques, this finer resolution could be rendered
within the known dataset.
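One way to try the idea above is a diamond-square (midpoint displacement) fill inside a single cell. This is a sketch under assumptions, not a tested part of the database: the function name, the `amplitude` and `levels` parameters, and the use of uniform random displacements are all hypothetical, and the Hurst exponent is taken as 3 minus the fractal dimension, the usual relation for surfaces. The RTIN rendering step is not shown.

```python
import random

def fractal_cell(nw, ne, sw, se, fractal_dim, levels=3, amplitude=50.0, seed=0):
    """Fill one grid cell with a (2**levels + 1)-square fractal heightfield.

    The four corner heights are the anchors and are left untouched. The Hurst
    exponent is taken as H = 3 - fractal_dim.
    """
    rnd = random.Random(seed)
    hurst = 3.0 - fractal_dim
    n = 2 ** levels
    z = [[0.0] * (n + 1) for _ in range(n + 1)]
    z[0][0], z[0][n], z[n][0], z[n][n] = nw, ne, sw, se
    step, scale = n, amplitude
    while step > 1:
        half = step // 2
        # Diamond step: center of each square from its four corners.
        for r in range(half, n, step):
            for c in range(half, n, step):
                mean = (z[r - half][c - half] + z[r - half][c + half] +
                        z[r + half][c - half] + z[r + half][c + half]) / 4.0
                z[r][c] = mean + rnd.uniform(-scale, scale)
        # Square step: edge midpoints from their in-bounds neighbors.
        for r in range(0, n + 1, half):
            for c in range((r + half) % step, n + 1, step):
                near = [z[rr][cc] for rr, cc in ((r - half, c), (r + half, c),
                                                 (r, c - half), (r, c + half))
                        if 0 <= rr <= n and 0 <= cc <= n]
                z[r][c] = sum(near) / len(near) + rnd.uniform(-scale, scale)
        # Smaller displacements at finer scales; a rougher surface decays more slowly.
        step, scale = half, scale * 2.0 ** (-hurst)
    return z
```

Because the cell edges are displaced randomly, neighboring cells generated this way would not match along their shared edges without extra work, such as seeding each edge from the coordinates of its two anchor points.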