GSAC Web Services for Data Repositories

GSAC is UNAVCO's Geodesy Seamless Archive package, supplying web services for geodesy archives, developed through NASA ACCESS funding to UNAVCO and to partners CDDIS and SOPAC. Additional development by UNAVCO in 2012 and 2013 was funded by the U.S. National Science Foundation, in support of the COOPEUS program.

GSAC is web services. GSAC provides complete, modern, and consistent web services at geoscience data repositories, for discovery, sharing, and access to data. GSAC web services enable remote users to discover site and instrument information, and to download data files, from a data repository. GSAC supports queries for standard information about geodesy sites and instruments, and provides access to instrumental data files.

Data centers with geodesy data, such as GPS/GNSS instrument files, can use GSAC to participate in worldwide data sharing with scientists. Using GSAC, a geodesy data center can offer standard and consistent web services for scientists to query the data center about sites and instruments and to download data files.

A GSAC archive can be queried for information and data files with forms on its web page, through API calls made as HTTP requests, with a client program supplied with GSAC, or directly from the command line in Linux. A single request to GSAC can find and download hundreds of GNSS data files, or complete information about a network of sites with instruments.
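As a rough illustration of an API call made as an HTTP request, the sketch below only assembles a search URL; it does not contact any server. The host name, the path, and the argument names (site.code, output) are assumptions made for this example. The arguments a given repository actually accepts are the ones it publishes for its own API.

```python
from urllib.parse import urlencode

# Hypothetical base URL of a GSAC installation (an assumption, not a real server).
GSAC_BASE = "https://gsac.example.org/gsacws/gsacapi"


def build_query_url(resource, params):
    """Return a GET URL for a GSAC-style search.

    resource: "site" for site metadata, "file" for data files.
    params:   dict of query arguments; the names used here are illustrative
              and must be checked against the repository's published API.
    """
    if resource not in ("site", "file"):
        raise ValueError("resource must be 'site' or 'file'")
    # Sort the arguments so the same query always yields the same URL.
    query = urlencode(sorted(params.items()))
    return f"{GSAC_BASE}/{resource}/search?{query}"


site_url = build_query_url("site", {"site.code": "P123", "output": "site.csv"})
print(site_url)
```

The resulting URL could be pasted into a browser, or passed to a command line tool such as wget or curl on Linux.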

GSAC Web Services

GSAC is a free, open-source software package. Any organization with data files from instruments at sites, and with information in a database about the data files, their sites (monuments or stations), and instruments, can provide web services about its data repository to the community by installing GSAC. GSAC uses current software technology to install and provide modern web services. A GSAC introduction is online: UNAVCO GSAC Web Services for Geodesy Data Repositories.

GSAC is built on the concept of sites with instruments making data files. GSAC is not just for GPS receivers and RINEX files. Two tables in the prototype database let a GSAC installation handle different kinds of sites (instruments) and different kinds of data files from instruments. The database information about a site has one of the "station_style" values, such as "GPS/GNSS Continuous" or "Strainmeter." Each data file's information in the database has one of the "file_type" values, such as "RINEX GLONASS navigation file" or "BSM (borehole strainmeter) Processed." GSAC servers in operation already handle GNSS, VLBI, SLR, DORIS, and tide gauge instruments and data files. No code changes to GSAC are needed to use new instruments or data file types. As a practical matter, to allow consistent use of all GSACs, GSAC supplies standard, unchanging names for types of instruments and files in the prototype database schema. You can see the current prototype database schema in the installation page on this web site. If you add a new type of instrument or data file to your GSAC, please inform UNAVCO so we can include it in new GSACs.
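The idea that a new instrument type is a new database row rather than a code change can be sketched with two small lookup tables. This is a minimal illustration using an in-memory SQLite database. The table layout, the column names, and the station name "TG01" are assumptions for the example and are not the GSAC prototype schema, which is documented on the installation page.

```python
import sqlite3

# In-memory database for illustration only; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE station_style (
    station_style_id INTEGER PRIMARY KEY,
    description      TEXT NOT NULL UNIQUE
);
CREATE TABLE file_type (
    file_type_id INTEGER PRIMARY KEY,
    description  TEXT NOT NULL UNIQUE
);
CREATE TABLE station (
    station_id       INTEGER PRIMARY KEY,
    name             TEXT NOT NULL,
    station_style_id INTEGER NOT NULL REFERENCES station_style(station_style_id)
);
""")
conn.executemany(
    "INSERT INTO station_style (description) VALUES (?)",
    [("GPS/GNSS Continuous",), ("Strainmeter",)],
)
conn.executemany(
    "INSERT INTO file_type (description) VALUES (?)",
    [("RINEX GLONASS navigation file",), ("BSM (borehole strainmeter) Processed",)],
)


def add_station(conn, name, style):
    """Insert a station whose style must already exist in station_style."""
    row = conn.execute(
        "SELECT station_style_id FROM station_style WHERE description = ?", (style,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown station style: {style}")
    conn.execute(
        "INSERT INTO station (name, station_style_id) VALUES (?, ?)", (name, row[0])
    )


def stations_by_style(conn, style):
    """Return the names of all stations with the given style."""
    rows = conn.execute(
        "SELECT s.name FROM station s JOIN station_style y "
        "ON s.station_style_id = y.station_style_id "
        "WHERE y.description = ? ORDER BY s.name",
        (style,),
    ).fetchall()
    return [name for (name,) in rows]


# Supporting a new kind of instrument is one new row, not new code.
conn.execute("INSERT INTO station_style (description) VALUES (?)", ("Tide Gauge",))
add_station(conn, "TG01", "Tide Gauge")
print(stations_by_style(conn, "Tide Gauge"))
```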

GSAC has a well-defined purpose, to provide complete and consistent web services for data repositories. GSAC enables powerful tools for discovery, sharing, and access to data. Please note that GSAC is not a management system for operation of a data repository. GSAC does not do data processing, or error checking of data values. To start a data repository which can use GSAC, see "Operating a Basic Geodesy Data Center" below.

A Collaboration for a Web-Services-Enabled Geodesy Seamless Archive

The NASA ROSES ACCESS Program funded UNAVCO, CDDIS and SOPAC to expand and modernize metadata exchange definitions and technologies first developed in the late 1990s as the GPS Seamless Archive Centers (see the old GSAC Web pages for historical references about the original GPS Seamless Archive Centers). The University of Nevada, Reno (UNR) was a fourth partner in the new GSAC project, and tested the web services by incorporating them into its daily GNSS data processing scheme. The new GSAC is completely new software, developed separately from the original project of the 1990s.


UNAVCO was funded by the NSF, from 2012 to the end of 2013, to participate in the COOPEUS project, which connects major environment-related research infrastructures for "the efficient access to and the open sharing of data and information produced by the environmental research infrastructures... the COOPEUS project will serve as a testbed for new standards and methods."

About GSAC Web Services and Software

GSAC is a suite of open-source Java code for geodesy data repositories, with a web-browser-based UI for data search and data file downloading. GSAC has a self-describing web services RESTful API through URLs, and a client process to work with the API. GSAC provides site (station) log information in formats including SINEX and GAMIT station.info files, as well as new IT formats such as XML and JSON.

GSAC is ready-to-use middleware between your database and file system on one hand and a remote user needing to query metadata or access data files on the other. GSAC uses your database, data files, and web server, and creates web services for data discovery and data downloads from your repository. There is also a GSAC command line client, a Java-based client program for accessing a GSAC repository through its API, that allows users to do programmatic searches of a GSAC repository and to download files. Like other web services, GSAC accepts an incoming URL-based request, handles the request, and returns a result. The result may be metadata information, or data file access information.

The format of GSAC site metadata results is user-selectable. GSAC offers site query results as web pages (HTML); in the geodesy formats SOPAC XML site log, SINEX, and GAMIT station.info; as two CSV formats designed for computer use; and as a plain text file for human visual checks. You can see samples of GSAC site query result files in differing formats.

From a data file search you can view a web page listing the FTP or HTTP URLs for file downloads from your server, or get a wget script that fetches, by FTP or HTTP, all the files found by that search. The script is suitable for automated downloading of many files.
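To make the wget script idea concrete, here is a minimal sketch that turns a list of file URLs into a shell script with one wget command per file. It is not the script GSAC itself generates; the host name and file names are invented for the example, and the layout of the output is an assumption.

```python
import shlex


def make_wget_script(urls):
    """Return the text of a shell script that downloads each URL with wget."""
    lines = ["#!/bin/sh", "# Download every file found by one file search."]
    for url in urls:
        if not url.startswith(("ftp://", "http://", "https://")):
            raise ValueError(f"unsupported URL scheme: {url}")
        # --no-clobber skips files that already exist locally;
        # shlex.quote protects any shell-special characters in the URL.
        lines.append(f"wget --no-clobber {shlex.quote(url)}")
    return "\n".join(lines) + "\n"


# Hypothetical search results (not real files).
found = [
    "ftp://files.example.org/pub/rinex/obs/2007/055/abcd0550.07d.Z",
    "ftp://files.example.org/pub/rinex/obs/2007/055/efgh0550.07d.Z",
]
script = make_wget_script(found)
print(script)
```

Saving the text to a file and running it with sh downloads every listed file in turn.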

GSAC Data Access Principle

GSAC uses the UNAVCO open data policy, which is similar to the data sharing principles of the Group on Earth Observations (GEO):

    All shared data, metadata and products being free of charge or no more than cost of reproduction will be encouraged for research and education.

You may choose not to share some data. You need not share all your data, with everyone, all the time. GSAC provides a way to hide from public searches selected data files and/or site metadata in your data center, when needed.

In general, GSAC offers free access to data files and site metadata without restriction. GSAC does not track or authenticate users. GSAC does not recognize or manage user accounts or passwords. If you wish to provide user authentication for data file downloading, you need to add that as part of your FTP or HTTP data file download server (which GSAC knows nothing about).

Since GSAC is designed to participate in federated data access, adding user accounts to GSAC itself is contrary to its fundamental design principles. You may be able to add user accounts on top of GSAC at one location, but then that GSAC will not work with other GSACs. Such a user-account-controlled GSAC server will conflict with end users' expectations of GSAC operations.

Operating a Basic Geodesy Data Center

You can create and operate a new geodesy data center for public access to your geodesy holdings, using GSAC for web services. In addition to GSAC, you need a collection of data files from instruments, and an FTP or HTTP server to provide them for download. See for example this typical URL for an FTP geodesy data file server, ftp://data-out.unavco.org/pub/rinex/obs/2007/055/ , at UNAVCO. You also need a database with complete information about the data files, and information about the sites (stations, monuments) and about the instruments where the data was collected. The GSAC prototype database schema, available from the GSAC web site on the GSAC installation page, is a suitable database. You need to populate the database yourself. You will need to maintain the data files, the FTP server, and the database as new data arrives and as instruments and sites change. Using a database is a much better approach to archive management than, for example, keeping site information in free-form (ASCII) files such as SINEX files or IGS site log files, which are extremely error-prone.

UNAVCO may, in future, provide a software package to help manage a geodesy archive, but this will be new software, and separate from GSAC. That new code may use GSAC for the public interface to the data repository, but the archive management software will not be GSAC. GSAC is web services.

Federating GSAC

GSAC software supports single and federated repositories. You can provide a data repository with no ties to other organizations or archives. GSAC can be as simple or as complex as your data holdings. GSAC becomes particularly powerful when used in the federated mode for search and download from several different GSAC-enabled data repositories. A federated GSAC system dynamically queries two or more participating GSAC repositories directly through their locally implemented GSAC APIs. A federated GSAC does not maintain its own database to copy other GSAC data holdings, nor does it mirror collections of data files. GSAC software was designed from the beginning to support simultaneous queries and results from multiple cooperating GSAC repositories.

An independent GSAC at one data repository provides data search, discovery and download mechanisms to remote users, for that one repository. Every GSAC repository publishes on its web site an XML document of its API (a capabilities document). Remote users, including federated GSAC servers, can employ those REST API arguments in commands for their queries.
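A client can read a capabilities document to learn which query arguments a repository accepts before sending a request. The sketch below parses a small XML sample and reports any arguments in a planned query that the repository does not list. The XML layout, element and attribute names, and argument names are invented for this example; a real GSAC repository defines its own document.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real capabilities document has its own layout.
SAMPLE_CAPABILITIES = """\
<repository name="Example GSAC">
  <capability id="site.code" type="string" label="Site code"/>
  <capability id="bbox.north" type="number" label="North latitude"/>
</repository>
"""


def list_capabilities(xml_text):
    """Map each advertised argument name to its declared type."""
    root = ET.fromstring(xml_text)
    return {
        element.get("id"): element.get("type")
        for element in root.iter("capability")
    }


def unsupported_arguments(xml_text, query):
    """Return the query argument names that the repository does not advertise."""
    known = list_capabilities(xml_text)
    return sorted(name for name in query if name not in known)


print(list_capabilities(SAMPLE_CAPABILITIES))
print(unsupported_arguments(SAMPLE_CAPABILITIES, {"site.code": "P123", "site.colour": "red"}))
```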

A Federated GSAC is a GSAC server that provides joint (federated) search and data file downloads from two or more independent GSAC-enabled data repositories, in one search and retrieval tool. It is a separate GSAC installation which uses the services of other GSAC repositories, and operates without any local database or geodesy data files of its own. A federated GSAC need not be located with a data repository.

Federated searches are possible because GSAC can read and use the capabilities document of other GSAC repositories (published in the "Repository information xml" file, available on each GSAC's web site). Federated GSAC servers query two or more GSAC data repositories in parallel. A federated GSAC web service does not copy instrument information or data files from remote servers, but rather knows how to query remote servers for the information they hold.
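The pattern of querying several repositories in parallel and merging the answers, without keeping any local copy, can be sketched as follows. This is an illustration of the pattern only and not GSAC's Java implementation. The repositories here are in-memory stand-ins rather than network calls, and the repository names, site codes, and the "code_prefix" query argument are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_stub_repository(sites):
    """Return a search function standing in for one remote GSAC repository."""
    def search(query):
        prefix = query["code_prefix"]
        return [site for site in sites if site["code"].startswith(prefix)]
    return search


# Hypothetical repositories; a real federated server would send HTTP requests.
REPOSITORIES = {
    "repo_a": make_stub_repository([{"code": "P123"}, {"code": "Q001"}]),
    "repo_b": make_stub_repository([{"code": "P456"}]),
}


def federated_search(repositories, query):
    """Send one query to every repository in parallel and merge the results.

    Nothing is stored locally: each record is tagged with the repository
    it came from and returned to the caller.
    """
    workers = max(1, len(repositories))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            name: pool.submit(search, query)
            for name, search in repositories.items()
        }
        merged = []
        for name, future in futures.items():
            for record in future.result():
                merged.append({**record, "repository": name})
    return merged


results = federated_search(REPOSITORIES, {"code_prefix": "P"})
print(results)
```

Tagging each record with its source repository lets the user see where to download a file from, since the federated server holds no data files of its own.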

An unresolved question is access permissions for federation of a GSAC data repository. As GSAC is coded now, anyone can create a federated GSAC service using any other GSAC data repositories. Some kind of registry may be wanted or needed to limit federation to approved collaborators.

In principle, one could build a network or hierarchy of federated GSACs, one federated GSAC calling several other federated GSACs. (This has not been tested.) Multilayered federated GSACs could provide data discovery and download from, potentially, dozens of single GSACs, for example, providing a single search for many data types in one geographic region, from many data centers.

Instructions for installing a federated GSAC are in the README Part 1 file in the GSAC package. See the installation page on this web site.

More about GSAC

Installing GSAC is much simpler than creating your own set of web services for a data repository. Modern web services call for professional software engineering; GSAC provides a complete and tested package of modern web services. The code required for web services must provide a browser-based user interface, an API with published capabilities, request handling, and output handling such as making SINEX files and GAMIT station.info files in the correct format from miscellaneous data values. GSAC has all the code for these functions, plus a client program. GSAC also delivers results in several formats, including geodesy formats such as SINEX. GSAC software is supported and improved by UNAVCO.

GSAC has a precisely defined and consistent goal: to serve information and data files from your archive to remote users. GSAC is not software for geodesy archive operations, or for managing data files, site logs, or metadata. GSAC does not check, verify, or ensure the "correctness" of any data it provides; that is the responsibility of the data center operator. In UNAVCO's experience, managing data files and related information is best done separately from the software for remote data discovery and download. Serving data discovery and download requests is not the time to check for errors in data.

UNAVCO is considering development of a new code package for geodesy archive operations. In that case GSAC may be used as the public web service interface to the archive.