What is the UEA Seaglider Toolbox?

The toolbox is a collection of integrated Matlab (v.2014a+) scripts aimed at making Seaglider (University of Washington & Kongsberg) data easier to process. Glider data suffer from numerous issues, namely sensor lags and badly constrained flight models, but these can generally be corrected for in post-processing, assuming the necessary information is present in the dive data.

A glider can fly under a wide range of parameters, but few of them are optimal for data collection. This page details some suggested practices for obtaining better glider data and processing it through the toolbox.

Broadly speaking, the toolbox will:

1. Load the glider data, preserving raw data from the .eng, .log and sg_calib_constants.m files.

2. Output processed data in the .hydrography substructure with the following corrections and caveats:

 2a. Recalculation of pressure/depth based on the "pseudometers" output by the gliders. There is still no pressure hysteresis correction included in the toolbox. I would welcome any suggestions.

 2b. The toolbox also does not correct for time and pressure offsets between recorded timestamps and actual sample times caused by the single-thread processor. Both people at the University of Washington and I have found this impossible to do, which is why they are developing a separate science controller.

 2c. The toolbox then calculates a first-guess estimate of glider flight to do a first correction of conductivity (based on Morison et al. 1994).

 2d. It launches a GUI for flight model regressions, which works broadly along the same lines as the University of Washington code and the methods described in Frajka-Williams et al. (2011). Testing at a workshop in Bergen (May 2016) showed that the UW and toolbox code produced near-identical results.

 2e. Using the improved hydrodynamic parameters, the toolbox recalculates the conductivity correction as described in Garau et al. 2011 using code developed as part of the SOCIB glider toolbox.

3. The toolbox then moves on to the secondary sensors, each with their own independent routines. The only one presenting significant complexity in processing is the Aanderaa 4330 optode. The algorithms are still being trialled and the results are not yet ideal. I plan to incorporate the Bittig method when I can find the time.

Essentially, it provides the same capabilities as the UW/Kongsberg basestation 2.04/2.05 processing, with added features (conductivity corrections, improved flight regression model and time-varying drag, better handling of oxygen data). UW/Kongsberg processing as of v2.08 includes different CT corrections developed by Charlie Eriksen, but these are currently not complete in the version distributed by Kongsberg. The improved (complete?) version should come out with basestation software version 2.09. Eriksen's approach goes back to first principles (Lueck & Picklo 1990), modelling heat flow through the CT cell to better estimate temperature within the cell and therefore obtain better salinity and density. It provides a significant improvement if you sample at high frequencies (above roughly 1 Hz). At low frequencies (0.2 Hz, like most current gliders), the improvements do not seem to be significant relative to the Garau et al. 2011 method. Further tests still need to be done.

Pre-deployment tips

Before deployment, the single most important element to quantify, as accurately as possible, is the mass of the glider. Once the mass and Sea-Bird CT coefficients are accurate both on the glider and in sg_calib_constants.m, you can regress all other flight-model parameters.

While piloting

It is generally safe to assume that the first 15 dives will not yield accurate science data; they are much more useful for setting up the glider so that it flies significantly better during the rest of the mission. The first 5 or so dives will be used to trim the glider (VBD, then pitch, then roll).

Once the glider is trimmed, there should be two dives dedicated to the in situ compass calibration. As a general rule, I recommend always flying using $COMPASS_USE,4. Not only does this tend to provide the best compass data, it also outputs the compass data for future reprocessing if necessary.

Finally, I recommend about 8 dives trying to cover the widest pitch-buoyancy parameter space for the flight model regression. You should aim for some shallow dive angles ($GLIDE_SLOPE or T_DIVE) with both high and low buoyancy ($MAX_BUOY), then steep dives, again both fast and slow. This will provide the very best estimates of the lift and drag parameters of the glider.

With this information, you will have a glider with trimmed C_VBD, C_PITCH/PITCH_GAIN and C_ROLL_*. The newly regressed hd_a, hd_b and volmax coefficients should be updated in sg_calib_constants.m and the first two uploaded to the glider via a CMDFILE. Once the glider has the improved hd_a and hd_b, verify the CT constants and update the $RHO parameter to be equal to the bottom density observed by the glider. This will allow it to determine its own "volmax" equivalent and therefore fly as well as possible. This is the preferred approach for obtaining symmetrical dive profiles; the added benefit is that the flight profile is now well determined by the T_DIVE parameter, as the glider will be able to determine the best buoyancy and pitch parameters independently.
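As a rough illustration of the update step above, the following Matlab sketch formats regressed coefficients and a bottom density as cmdfile-style lines. The numeric values are made up, $HD_A and $HD_B are assumed to be the on-glider names for hd_a and hd_b, and the "$NAME,value" layout simply follows the $COMPASS_USE,4 example above. In practice $RHO would be sent in a later cmdfile, once the glider is already flying with the new hd_a and hd_b. The directive line that ends a real cmdfile is deliberately left out; check the pilot's guide for your glider before uploading anything.

```matlab
% Hypothetical values; replace with your own regression results.
hd_a = 0.0035;        % regressed lift coefficient
hd_b = 0.011;         % regressed drag coefficient
rho_bottom = 1.0275;  % bottom density seen by the glider (g/cm^3 assumed)

names  = {'$HD_A', '$HD_B', '$RHO'};   % parameter names are assumptions
values = [hd_a, hd_b, rho_bottom];

% Build one "$NAME,value" line per parameter.
cmd_lines = cell(numel(names), 1);
for k = 1:numel(names)
    cmd_lines{k} = sprintf('%s,%g', names{k}, values(k));
end

% Print for review rather than writing straight to a file.
fprintf('%s\n', cmd_lines{:});
```

Printing the lines first makes it easy to compare them against the values in sg_calib_constants.m before anything is sent to the glider.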

Further details on the flight model are presented below.

How do I run it?

The glider toolbox can be downloaded from the Bitbucket repository: UEA Seaglider Toolbox

Simply add it to your Matlab path, ensure you also have the Gibbs Seawater (GSW) toolbox installed, and you're set to go. For more advanced processing, it helps if you have a C compiler (gcc is easily installed on Mac and Linux alike). For Windows users, MinGW is a good solution. Either install it manually, or open the Matlab built-in Add-On manager, located in the toolbar in the main window, and look for MinGW.
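A quick way to confirm the setup is to check that Matlab can see both toolboxes. This is only a sketch: the install folder is a hypothetical location, and gsw_SA_from_SP is used simply as a representative GSW function to look for, not because the toolbox is known to call it.

```matlab
% Hypothetical install location; point this at your own copy.
toolbox_dir = fullfile('~', 'matlab', 'uea-seaglider-toolbox');
if exist(toolbox_dir, 'dir') == 7
    addpath(genpath(toolbox_dir));
end

% exist(...) returns 2 for an .m file found on the path.
has_toolbox = exist('gt_sg_load_merge_data', 'file') == 2;
has_gsw     = exist('gsw_SA_from_SP', 'file') == 2;

if ~has_toolbox
    warning('gt_sg_load_merge_data not found: check the toolbox path.');
end
if ~has_gsw
    warning('gsw_SA_from_SP not found: check the GSW installation.');
end
```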

If your .log, .eng and sg_calib_constants.m files are in the same folder, then it's as simple as loading up the data:

raw = gt_sg_load_merge_data(502);
first_pass = gt_sg_process_data(raw);

This will ask you to select which sensors to process, then show you the flight model regression GUI. Once the flight model is regressed, you can close the window and it will finish processing and output a variable with the following substructures:

.log: .log file data, unmodified
.eng: .eng file data, unmodified
.sg_calib_const: sg_calib_constants.m settings, unmodified
.hydrography: Calculated hydrographic variables (the end product)
.flight: Output from the flight model part of the toolbox
.gt_sg_settings: Settings used for processing, mostly the same as sg_calib_constant except for newly calculated parameters. Will be used preferentially to sg_calib_const if set.
.date: Date in Matlab format
.gps_postdive: GPS from the end of the dive when the glider surfaces
.gps_postpreviousdive: "gps_postdive" from the previous dive, GPS coordinate from just before the last phone call
.gps_predive: GPS from the start of the dive, before the glider dives
.flag: Not yet implemented. Flags will indicate data point quality
.gt_sg_log: Log of actions performed by toolbox

Flight model regression

The flight model regression GUI has 3 particularly useful tabs.

The first, "Data Selection", allows you to subsample the glider dataset to regress only on useful data and to speed up the overall process. The key thing to remember here is that it is more important to regress over a large pitch-buoyancy parameter space than to regress over a lot of dives. In particular, the model relies on the glider being in a steady state (no ac/deceleration, turning, etc.). I suggest selecting about 50-100 dives covering a wide range of dP/dt (central panel) and pitches (bottom panel). The apogee - and more importantly the surface - cutoffs (top left) are very useful. As the glider is not meant to be ac/decelerating, density gradients can be an issue. Also, in warmer insolated regions, the glider hull may warm up at the surface, confounding the glider buoyancy calculations. In these cases, I often cut the top 120 m and also remove the bottom 30 m, as data tend to be dubious. Once you select "Save settings", it switches over to the regression page.

The second tab, "Parameter Regression", is where the magic happens. The bottom right plot shows the mean state of the model. The red lines show vertical velocity as calculated by the change in pressure over time, and the blue line as per the buoyancy model. The difference between the two is an estimation of vertical advection (WH2O). The regression is based on the assumption that there should be almost no WH2O.
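The relationship between the two curves can be sketched with synthetic numbers. Everything here is an assumption made for illustration: the depth record, the constant model velocity, the sign convention (positive up) and the mean-absolute score, which stands in for whatever scoring function the toolbox actually uses.

```matlab
% Synthetic dive segment (all values made up for illustration).
dt = 5;                                 % sample interval (s)
t = (0:dt:600)';                        % time since start of dive (s)
depth = 0.10*t + 0.5*sin(2*pi*t/200);   % depth, positive down (m)

% Observed vertical velocity from the rate of change of depth (positive up).
w_obs = -gradient(depth, dt);

% Stand-in for the buoyancy-model velocity: a steady 0.10 m/s descent.
w_model = -0.10*ones(size(t));

% Vertical water velocity estimate: observed minus modelled.
wh2o = w_obs - w_model;

% Assumed score: mean absolute WH2O (the toolbox may use something else).
score = mean(abs(wh2o));
fprintf('Mean |WH2O| = %.4f m/s\n', score);
```

In this toy case the residual is just the sinusoidal wiggle added to the depth record; a regression would adjust the model parameters until the residual is as small as possible.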

To start, I recommend selecting the "Attempt auto regress" button. This will perform the regression suggested by the University of Washington. It will try to minimise the mean WH2O by changing the lift (hd_a), drag (hd_b) and glider volume (volmax) parameters. It will do two passes by default. On the second pass, look at the improvement on the minimisation score. If the change is small (< 1%) then there will be little to no further improvements to be gained with further auto-regressions. At this point, most users will be satisfied with the results and will be able to move ahead.

For gliders with particularly unusual and large sensor loads, large quantities of syntactic foam or in very extreme environments (i.e. temperature gradients greater than 20 degrees), it may be worth playing with the other parameters. In these cases, you'll usually see a poorly constrained flight model in certain vertical regions, and/or skewed estimates of WH2O (the green line). In such cases, I usually deselect hd_a, hd_b and volmax so that they remain fixed and select hd_c (induced drag, if I have a very large and protruding sensor load), absolute compression or thermal expansion (if I have large amounts of syntactic foam, reagent bags or large non-compressible sensors). I then click the "Regress parameters" button and inspect the bottom right plot to assess what improvements may have occurred. I strongly recommend reading the Frajka-Williams et al. (2011) paper for more information on the effects of the various parameters and the different scoring functions.

The third tab, "Visualisation", is useful for assessing the quality of your regression. At present, this tab shows 3 separate plots. The top left shows estimated WH2O. Ideally, you'll see almost no signal, or a signal consistent with the hydrographic properties of the section. What you DO NOT want to see is a pattern consistent with changes in flight shape; this indicates a poorly constrained flight model. The bottom left plot shows differences between up and down casts of WH2O. In this plot, you're essentially looking for the opposite. In ideal conditions, it will appear as random noise. You DO NOT want to see structure similar to hydrographic changes (or flight parameters for that matter). It is rare, even in well-constrained flight models, for this to be completely the case. It is very hard to assess whether what you observe is real or not, and doing so will always require in-depth investigation separate from the toolbox. Finally, the third plot on the right shows an estimate of stratification against WH2O. In a well-constrained flight model you will see exponentially decreasing WH2O as buoyancy frequency increases. Proper assessment of WH2O from gliders is still more of an art than a science, at least for now. Don't trust the output as it comes; it should be considered a qualitative rather than quantitative estimate.

If you wish to return to the flight model regression, I would recommend re-running the toolbox processing, as this will lead to the derived variables (conductivity, salinity, density, oxygen) also being updated:

second_pass = gt_sg_process_data(first_pass);

Otherwise, if this is not desired, it can be called independently. This is not generally recommended, but can be useful when trying to determine time-varying changes in drag (hd_b):

new_flight_model = gt_sg_flightmodelregression(second_pass);

To use time-varying lift and drag parameters, set data.gt_sg_settings.hd_a, hd_b and hd_c to arrays whose length equals the number of dives, with each element holding the value for that dive. This will bypass the flight regression GUI and simply recalculate the flight model. This functionality was recently introduced for the Scottish Association for Marine Science and has not been fully tested. Its benefits have not been quantified.
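A minimal sketch of setting up such arrays is shown below, using a mock settings structure so that it runs without the toolbox. The coefficient values, the number of dives, the 30% linear increase in drag and the row-vector orientation are all assumptions; only the field names come from the description above.

```matlab
% Mock settings standing in for a processed toolbox structure.
n_dives = 200;                       % hypothetical number of dives
data.gt_sg_settings.hd_a = 0.0035;   % made-up scalar values
data.gt_sg_settings.hd_b = 0.011;
data.gt_sg_settings.hd_c = 1e-5;

% Assumed scenario: drag increases linearly by 30% over the mission.
hd_b_start = data.gt_sg_settings.hd_b;
data.gt_sg_settings.hd_b = linspace(hd_b_start, 1.3*hd_b_start, n_dives);

% Lift and induced drag held constant, expanded to one value per dive.
data.gt_sg_settings.hd_a = repmat(data.gt_sg_settings.hd_a, 1, n_dives);
data.gt_sg_settings.hd_c = repmat(data.gt_sg_settings.hd_c, 1, n_dives);

% On a real structure, reprocessing would presumably follow, e.g.:
% data = gt_sg_process_data(data);
```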

Conductivity lag for salinity and density spikes

As the conductivity lag regression is very time consuming, it is only run on a subset of 100 dives when the toolbox goes through an autonomous run, and only the first time the data are processed (it skips the regression if the variable first_pass.gt_sg_settings.sbe_cond_coefficients is already set).

If you wish to improve conductivity corrections further, you can then reprocess your output structure (here called "first_pass") by doing:

data_improvedCT = gt_sg_sensors_sbect(first_pass,n);

where n is either an integer representing the dive downsampling factor (1: run on all dives; 15: run on one in fifteen dives) or an array of dive numbers. This will rerun the minimisation, which reduces the area between the up and down theta-S curves as described in Garau et al. 2011, on the selected dives, update the parameters in .gt_sg_settings.sbe_cond_coefficients and output the new conductivity (and resulting salinity, density, etc.) in .hydrography.

To assess the change, the variable salinity_nocorr is also created from temperature and raw conductivity, and can be compared to the regressed one.
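One simple way to make that comparison is a per-dive mean absolute difference. The sketch below uses a mock structure array with made-up numbers; it assumes that both variables sit side by side in .hydrography and that the corrected field is called salinity, which should be checked against your own output.

```matlab
% Mock per-dive hydrography (made-up numbers) standing in for data.hydrography.
for d = 1:3
    hydro(d).salinity        = 35 + 0.01*(1:5)';
    hydro(d).salinity_nocorr = hydro(d).salinity + 0.002*d;
end
hydro(2).salinity(3) = NaN;   % gaps are common in real data

% Mean absolute difference per dive, ignoring NaNs.
nan_mean = @(x) mean(x(~isnan(x)));
mean_abs_diff = arrayfun(@(h) nan_mean(abs(h.salinity - h.salinity_nocorr)), hydro);

disp(mean_abs_diff);
```

Dives where the difference is unusually large are the ones worth plotting in theta-S space to see whether the correction has closed the gap between the up and down casts.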

Useful parameters and some code snippets

To extract a timeseries from a top-level structure:

temp_timeseries = [data.hydrography(10:20).cons_temp];

Extracting a timeseries from a sub-level structure is unfortunately not as straightforward:

tmp = arrayfun(@(x) x.trajectory_latlon_estimated(:,2),data.flight(12:36),'UniformOutput',false);
estimated_subsurface_longitude = vertcat(tmp{:});
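The two patterns above can be tried on a mock structure without any glider data. This sketch assumes that per-dive variables such as cons_temp are row vectors and that trajectory_latlon_estimated holds latitude in column 1 and longitude in column 2; the get_col helper is hypothetical and not part of the toolbox.

```matlab
% Mock structure (made-up values) mimicking the layout used above.
for d = 1:3
    data.hydrography(d).cons_temp = 10 + d + (0:3);      % 1x4 row vector
    data.flight(d).trajectory_latlon_estimated = ...
        [60 + 0.01*d*(1:4)', -5 + 0.02*d*(1:4)'];        % 4x2: [lat, lon]
end

% Per-dive row vectors concatenate directly.
temp_timeseries = [data.hydrography(1:3).cons_temp];

% One column of a per-dive matrix needs arrayfun, then concatenation.
get_col = @(s, field, col) ...
    arrayfun(@(x) x.(field)(:, col), s, 'UniformOutput', false);
lon_cells = get_col(data.flight(1:3), 'trajectory_latlon_estimated', 2);
lon = vertcat(lon_cells{:});
```

Wrapping the arrayfun call in a small helper keeps the column choice and field name in one place when several sub-level variables need extracting.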