The geological sciences, perhaps more than any other discipline, rely heavily on field studies to evaluate and interpret the natural world. It is increasingly important for field-based geoscientists to incorporate digital data management systems into their workflow to increase efficiency, data sharing, and collaboration and, ultimately, to answer larger scientific questions. Unfortunately, there is no one-size-fits-all data management system; field practices vary widely across disciplines, and individual researchers tend to customize these practices further.
However, with sufficient community involvement and communication, we can design a shared cyberinfrastructure to support most of the requirements of field-based research. A community cyberinfrastructure would preserve all field data and relevant metadata and make it publicly available while eliminating time-consuming and tedious postfieldwork digitization and data entry.
As part of the National Science Foundation’s (NSF) EarthCube initiative, NSF funded a Research Coordination Networks (RCN) project to facilitate dialogue between field-based geologists who currently lack an efficient digital workflow and computer scientists who specialize in databases linked by cybertechnologies. To this end, the project, called Earth-Centered Communication for Cyberinfrastructure (EC3), organized a field excursion in August 2014 to Yosemite National Park and Owens Valley that brought together representatives of these two groups.
During this field trip, the computer scientists learned about specific challenges related to field data collection, and the geoscientists were exposed to new technologies and infrastructure concepts that could allow them to work more efficiently and collaboratively. This trip facilitated lively group discussions about field technologies and their supporting cyberinfrastructure. Here we report to the broader community the results of these interactions to help build consensus and direct future funded projects.
Portable Electronic Tools
Much of the discussion focused on the benefits and downsides of the new wave of handheld devices and computers (including smartphones, tablets, rugged laptops, GPS, and digital compasses) for field data collection. Unfortunately, there is no current standard regarding the usage of these tools; each field geologist develops his or her own methodologies and conventions. This lack of standardization makes it challenging to share data and to design an effective and transferable cyberinfrastructure.
As an example of new field technology, participants from the British Geological Survey (BGS) presented its free software package, System for Integrating Geoscience Mapping (SIGMA), which enables scientists to collect, visualize, and map data on rugged tablet PCs. BGS’s experiences in developing SIGMA provide lessons for our communities about developing field tools in tandem with cyberinfrastructure.
In particular, BGS developers work closely with geoscientists, often in the field, in an iterative process to develop an effective workflow for geologic mapping, sampling, and macroscopic data collection. SIGMA offers a wide variety of tools for sketching, stratigraphic logging, data entry, mapping structural contours, outcrop projection, and customizing one’s workspaces. One downside is that SIGMA relies on the Environmental Systems Research Institute’s (ESRI) proprietary ArcGIS software, which limits user access.
In addition to SIGMA, there are a number of applications for Windows, Android, and iOS mobile devices that assist geological data collection. A partial list includes the following:
- University of Kansas implementation of ArcGIS
- University of Texas at El Paso’s implementation of the open-source alternative QGIS
- Geology Sample Collector
- Strike and Dip
- eGEO Compass
Unfortunately, all of these apps either have issues that limit their functionality or have customized interfaces that might not fit all workflows.
One application that could ease paper-centric field geoscientists into a digital workflow with minimal disruption is Capturx. This product uses a digital pen that can capture handwritten text, numbers, and sketches from a specially designed field notebook. Once the data are uploaded from the pen, the user can convert the handwritten notes and numbers to digital form with handwriting recognition software. This approach may work particularly well for researchers who are not yet comfortable with mobile devices and software. Other field trip participants experimented with smartphone apps that offer voice recognition (such as Evernote) to capture field notes, with significant success.
One common family of apps is point-based orientation data loggers that make use of the compass-inclinometer sensors built into many smartphones and tablets (Figure 1). During the field trip, many participants collected data using their phones (a variety of brands and models) and assessed the quality of their measurements using several of these apps. The results of this experiment were rather sobering. The range in their data for a single planar orientation was greater than 50° in the strike direction (a typical analog compass can measure the strike of a plane to within a few degrees). Much of the scatter originated from devices that lacked calibration modules, but other sources of error remain unknown.
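For readers who want to quantify this kind of scatter in their own device comparisons, the following is a minimal sketch, not the procedure used on the field trip. It assumes strike is treated as axial data (a strike of 0° is the same line as 180°) and uses the standard angle-doubling approach from circular statistics. The function names and the sample readings are hypothetical and purely illustrative.

```python
import math


def axial_difference(a_deg, b_deg):
    """Smallest angle between two strike lines, in degrees (0 to 90)."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def axial_mean_and_resultant(strikes_deg):
    """Mean strike (0 to 180) and mean resultant length (0 to 1).

    Angles are doubled so that strikes 180 degrees apart, which describe
    the same line, reinforce rather than cancel each other.
    """
    doubled = [math.radians(2.0 * (s % 180.0)) for s in strikes_deg]
    c = sum(math.cos(a) for a in doubled) / len(doubled)
    s = sum(math.sin(a) for a in doubled) / len(doubled)
    mean_strike = (math.degrees(math.atan2(s, c)) / 2.0) % 180.0
    resultant = math.hypot(c, s)  # 1 = all readings identical, 0 = no preferred direction
    return mean_strike, resultant


def max_pairwise_spread(strikes_deg):
    """Largest axial difference between any two readings, in degrees."""
    return max(
        (axial_difference(a, b)
         for i, a in enumerate(strikes_deg)
         for b in strikes_deg[i + 1:]),
        default=0.0,
    )


if __name__ == "__main__":
    # Hypothetical readings of one plane from four devices (made-up values).
    readings = [170.0, 175.0, 5.0, 10.0]
    mean_strike, resultant = axial_mean_and_resultant(readings)
    print("mean strike:", mean_strike)
    print("resultant length:", resultant)
    print("max pairwise spread:", max_pairwise_spread(readings))
```

Treating strike as axial avoids overstating the scatter when readings straddle north: in the sample above, 170° and 10° differ by 20°, not 160°.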
The Ideal Field Data Collection System
To assess the priorities of the EC3 field trip participants for a mobile data collection system, we had them vote on suggestions that had been proposed in small breakout groups. The ranked list of recommendations in Figure 2 shows that our participants were very interested in an all-in-one type of device as long as they had confidence in the accuracy of the sensors built into that device.
This all-in-one system would eliminate the need to bring separate devices for measuring GPS locations and collecting orientation data. Also, most people want an app to have the “feel” of a traditional field notebook. We interpret this result to mean that researchers do not want to be limited by the input format of a given app: They don’t want to wade through a series of drop-down menus, and the freedom to customize the screen is important to them.
Several researchers emphasized the importance of sketches, not only because they are an important way of recording information but also because they play a cognitive role in thinking through scientific hypotheses, such as potential subsurface geometries and the temporal evolution of large-scale structures. The consensus was that the ability to easily draw interpretive lines directly on digital photographs while in the field is a real advancement.
Participants also felt strongly that any application that we developed should have underlying open-source code that community members could easily alter to fit their individual needs and that there should be a forum where researchers could share their modified code and modules.
Last, although the software in digital fieldwork is critical, there are some significant hardware challenges as well. Community members were concerned that the current generation of ruggedized devices has screens that are very difficult to see clearly in direct sunlight, especially if several people are trying to look at the same screen. These concerns have existed since the inception of digital mapping, and some devices address them much better than others.
Management of metadata in a digital field workflow was another thought-provoking topic. The participants were asked to list all of the metadata that they felt were important to collect during fieldwork (Figure 3).
The initial part of this discussion focused on the very definition of “metadata” and how relative a term it truly is. For instance, in a study on the accuracy of location techniques, information such as the number of satellites and location error would be considered data, whereas in nearly all other studies they would be considered metadata.
Conversely, lithologies, orientations, fossil assemblages, and similar information would be considered primary data in most geological studies. However, in certain contexts, they would be considered metadata. Some computer scientists felt that it wasn’t important to classify data as either “meta” or “primary,” as long as community standards exist for what needs to be included.
A community-developed field app would help researchers capture data and metadata using community standards. For instance, a GPS device knows how many satellites it has used to determine its position, but most are not programmed to record that information, and those data are lost. Developing a field data collection app in tandem with community conversations about data and metadata standards may go a long way toward automating this process for many field-based researchers.
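As a rough illustration of what capturing such metadata alongside a measurement could look like, here is a minimal sketch. It does not represent any existing app or agreed community standard: every class name, field name, and value below is a hypothetical assumption chosen for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class GpsFix:
    """Position plus the fix-quality details that are often discarded."""
    latitude_deg: float
    longitude_deg: float
    satellites_used: Optional[int] = None       # None if the device does not report it
    horizontal_error_m: Optional[float] = None  # None if the device does not report it


@dataclass
class OrientationRecord:
    """One planar orientation measurement with its accompanying metadata."""
    strike_deg: float
    dip_deg: float
    fix: GpsFix
    device_model: str = "unknown"
    # Timestamp is filled in automatically, in UTC, when the record is created.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_json(record: OrientationRecord) -> str:
    """Serialize a record, nested fix included, to a JSON string."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    # Made-up values for illustration only.
    record = OrientationRecord(
        strike_deg=312.0,
        dip_deg=45.0,
        fix=GpsFix(latitude_deg=37.75, longitude_deg=-119.59,
                   satellites_used=9, horizontal_error_m=4.2),
        device_model="example-phone",
    )
    print(to_json(record))
```

Keeping the fix-quality fields optional lets a record be stored even when a device does not expose them, while making the gap explicit rather than silent.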
Some participants asserted that the most important function of metadata is to document a workflow and ensure the reproducibility of results. During a group exercise, participants split into groups and tried to describe a detailed workflow from a preselected article.
They gained an appreciation for the fact that method sections in scientific journal articles are rarely sufficient to fully reproduce a researcher’s workflow. Some argued that full methodologies should be given their own unique digital object identifier, which subsequent publications could then cite. Metadata could play a vital role in documenting those methodologies and helping researchers assess the quality of a given data set.
A Meeting of the Minds
A unique aspect of the EC3 field trip was that it brought together field-based geoscientists and computer scientists. By all accounts, this was a resounding success. Geoscientists were introduced to the range of technological possibilities, and the computer scientists gained a much deeper appreciation of the issues associated with field geological data. It was exciting and rewarding to see how the shared field experiences on the outcrop led to conversations that simply would not have occurred otherwise.
An important result of the EC3 field trip was the consensus on the need for developing an open-source field data collection application that adopts community standards on data and metadata. How that is accomplished and funded is beyond the scope of the EC3 project, but we hope that those who eventually develop this software will take our observations into consideration.
These are exciting times for geologic studies. We face a new era in which we can integrate field data into multidisciplinary geologic data systems, and a community-developed app will play a significant role in ushering it in.
Matty Mookerjee and Daniel Vieira, Department of Geology, Sonoma State University, Rohnert Park, Calif.; Marjorie A. Chan, Department of Geology and Geophysics, The University of Utah, Salt Lake City; Yolanda Gil, Information Sciences Institute, University of Southern California, Marina del Rey; Terry L. Pavlis, Department of Geological Sciences, University of Texas at El Paso, El Paso; Frank S. Spear, Department of Earth and Environmental Science, Rensselaer Polytechnic Institute, Troy, N.Y.; and Basil Tikoff, Department of Geoscience, University of Wisconsin–Madison, Madison
Mookerjee, M., D. Vieira, M. A. Chan, Y. Gil, T. L. Pavlis, F. S. Spear, and B. Tikoff (2015), Field data management: Integrating cyberscience and geoscience, Eos, 96, doi:10.1029/2015EO036703. Published on 13 October 2015.