The following post was written by one of Praescient's Advanced Analysis Fellows, a former marine biologist working with the Initiatives Group, and articulates how traditional research and analysis techniques can be (and ought to be) applied across disciplines to support special projects. The Initiatives Group is a collective of thought leaders, subject-matter experts, data specialists, and global operators responsible for solving the most complex problems facing the Nation. Our highly respected team is dedicated to aiding legislators, decision makers, executives, and senior leaders across federal, private, and public-sector organizations.
Fish ID #473-A, named Bob in honor of the man who caught him, was a prime specimen of Morone saxatilis, better known as the Striped Bass. Bob weighed 38 pounds and measured close to 40 inches long when he was caught near the mouth of the river. He had been surgically tagged with an acoustic transmitter that emitted a unique signal burst every five seconds. Whenever the tag came within range of one of the eight hydrophones in the local river system, the signal was logged with a corresponding date and time, a water temperature reading, and a water salinity reading for future analysis.
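For readers who think in database terms, each hydrophone hit reduces to a small, self-describing record. Below is a minimal sketch in Python of what one logged detection might look like; the field names and values are illustrative assumptions, not the study's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Detection:
    """One logged hydrophone hit for a tagged fish."""
    fish_id: str         # e.g. "473-A" (Bob)
    station_id: int      # which of the eight hydrophones heard the ping
    heard_at: datetime   # date and time the signal was logged
    water_temp_c: float  # water temperature reading at the station
    salinity_psu: float  # water salinity reading at the station

# A single illustrative record (values invented for the example):
ping = Detection("473-A", 3, datetime(2014, 6, 1, 5, 42, 10), 18.4, 22.1)
```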
Bob, along with many other fish just like him, was part of a study exploring Striped Bass behavior in New England estuary river systems. By looking at the collected data points, the scientists would try to develop a pattern of life for each fish, and then extrapolate that pattern of life to the entire tagged population. Linking data points of movement to environmental and temporal data helped build a behavioral model that could accurately predict what fish were doing in the river without having to continuously sample the entire population.
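As a rough illustration of how a pattern of life might be assembled from such records, one could tally where each fish is heard at each hour of the day and read the result as a presence profile. This is a deliberate simplification of the study's modeling, reusing the hypothetical Detection record sketched above.

```python
from collections import Counter

def presence_profile(detections):
    """Tally detections per (station, hour of day) as a crude pattern of life."""
    profile = Counter()
    for d in detections:
        profile[(d.station_id, d.heard_at.hour)] += 1
    return profile

# A profile that peaks at, say, (station 1, hours 4-6) would suggest Bob
# routinely passes the downstream gate around dawn.
```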
While correlations between time, temperature, and salinity proved very informative, some of the more interesting information was derived from the relatively large portions of "missing data". Due to constraints, the hydrophone listening stations were placed only at certain "choke points" in the river system. By design, fish were geolocated on the river only as they passed through relatively small "gated" areas; thus, for most of the time Bob spent in the river, he was transmitting without an active receiver in range.
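That "missing data" is recoverable as the silent intervals between consecutive detections. A minimal sketch, again using the assumed record format from above:

```python
def silent_intervals(detections, min_gap_s=60):
    """Yield (start, end, from_station, to_station) spans with no detections.

    Any span much longer than the five-second ping rate means the fish was
    out of range of every hydrophone, not that it stopped transmitting.
    """
    ordered = sorted(detections, key=lambda d: d.heard_at)
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = (nxt.heard_at - prev.heard_at).total_seconds()
        if gap >= min_gap_s:
            yield prev.heard_at, nxt.heard_at, prev.station_id, nxt.station_id
```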
Remaining aware of assumptions and hindrances to analysis can prove vital.
Two such constraints stood out. First, researchers lost granularity (i.e., exact location and environmental conditions) even on a positive signal return, since a detection placed Bob only somewhere within a hydrophone's range. Second, scientists structured their research around rules that existed in the data; when analyzing Bob's activity, however, both environmental and technical constraints shaped what the data could actually say.
Identifying gaps in analysis is essential to arriving at valuable conclusions.
The technical constraints of one transmission every five seconds and a hydrophone reception failure rate of roughly 5% set the upper bound on the number of signals that could possibly be recorded. Environmentally, Bob was a fish living in a river with a dam on one end, followed by a series of hydrophone sonic gates, and bracketed by the ocean on the other. This meant that every time Bob's transmissions were not being recorded, he was, by elimination, either in a zone out of range between two hydrophones or in the ocean.
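Those two constraints translate into simple arithmetic. The sketch below computes the expected detection count for a fish that stays in range over a given interval; the five-second ping rate and ~5% miss rate come from the study itself, while the function and constant names are invented for illustration.

```python
PING_INTERVAL_S = 5   # the transmitter emits one burst every five seconds
MISS_RATE = 0.05      # ~5% of in-range pings go unrecorded

def expected_detections(duration_s):
    """Expected number of logged pings for a fish in range for duration_s."""
    max_possible = duration_s // PING_INTERVAL_S  # upper bound on recordable pings
    return max_possible * (1 - MISS_RATE)

# One hour in range should yield roughly 684 detections (720 * 0.95), so an
# hour of silence cannot be blamed on reception failure alone: Bob must have
# been between gates or out in the ocean.
print(expected_detections(3600))  # -> 684.0
```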
Metadata analysis allows analysts to scale unwieldy or otherwise unworkable data into existing analysis workflows.
For example, by examining Bob's null returns on signal (i.e., a signal that exists but is not heard), scientists could leverage the metadata for traffic analysis. While the new data was limited to a wider "bin" with fewer intrinsic properties (e.g., a geolocation zone with only date/time properties), some of these handicaps could be overcome by adding external data layers describing historic environmental factors. Also, by expanding the focus area and sacrificing some data properties, scientists obtained new data points to submit for robust statistical analysis.
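A minimal sketch of that binning step, under the same assumed names: each silent interval collapses to a coarse zone plus its date/time, and an external lookup (here a hypothetical table of historic water temperatures keyed by date) restores some of the context the hydrophones never captured.

```python
def bin_null_return(start, end, from_station, to_station, temp_by_date):
    """Collapse a silent interval into a coarse, enrichable metadata record."""
    if from_station == to_station:
        zone = f"near station {from_station}"  # lingering out of range nearby
    else:
        zone = f"between stations {from_station} and {to_station}"
    return {
        "zone": zone,        # the wider 'bin': no exact location survives
        "start": start,
        "end": end,
        # the external data layer fills in what the hydrophones never measured:
        "est_temp_c": temp_by_date.get(start.date()),
    }
```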
The analyst’s ability to extract meaning from the data depends largely on how analysis is conducted within the known boundaries.
The scientists conducted analysis within the known rules and constraints of the database. Often, a null return was simply void, carrying no meaningful information, and recognizing this prevented researchers from assigning patterns and properties where none truthfully existed. The real value of Bob's null-return data set emerged only when it was analyzed first against the environmental and technical system rules, and second with the external data overlays.
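That ordering can be made concrete: apply the technical system rules first, so a gap is promoted to a meaningful data point only when reception failure cannot plausibly explain it. A sketch under the same assumptions as the earlier examples:

```python
def classify_gap(gap_s, ping_interval_s=5, tolerated_misses=3):
    """Apply the technical system rules before assigning meaning to a gap.

    With a ~5% miss rate, a few consecutive missed pings are plausible by
    chance, so short gaps stay void rather than becoming false 'movement'.
    """
    if gap_s <= ping_interval_s * tolerated_misses:
        return None                    # void: explainable by reception failure
    return "out-of-range interval"     # meaningful: a candidate for overlays
```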
Team Praescient is committed to actively shaping the future of analysis. Our company is structured around small, focused teams composed of intelligent and passionate people to encourage ownership and innovation. Many of us have deployed to combat zones across the globe, others have fought corruption from the halls of premier academic institutions, and still others have made their mark by going head-to-head with the media moguls of Hollywood. We believe that this diversity is what makes us high-impact leaders and uniquely positions us to make a difference.