Station Database and Queries
Station Lookup
The station_info command lets you search for stations by name fragment or ID:
$ station_info francisco
See dms_datastore Command Reference for the full CLI reference.
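To make the "name fragment or ID" matching concrete, here is a minimal sketch on a toy table using pandas. It is not the station_info implementation: the helper find_stations and the sample rows are hypothetical, and only the id and name column names are taken from the station database description.

```python
import pandas as pd

# Toy stand-in for the station database; these rows are invented for illustration.
stations = pd.DataFrame(
    {
        "id": ["aaa", "bbb", "ccc"],
        "name": ["San Francisco Example Pier", "Example Slough", "Example River"],
    }
)


def find_stations(df: pd.DataFrame, query: str) -> pd.DataFrame:
    """Hypothetical helper: match rows by exact id or by name fragment, ignoring case."""
    q = query.lower()
    id_match = df["id"].str.lower() == q
    # regex=False treats the query as a literal substring, not a pattern
    name_match = df["name"].str.lower().str.contains(q, regex=False)
    return df[id_match | name_match]


matches = find_stations(stations, "francisco")
print(matches)
```

Matching on a lowercase copy of both columns keeps the search case-insensitive without altering the stored names.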
Configuration System
The datastore uses YAML files and Python modules to manage station metadata, variable mappings, and screening configurations.
Main configuration files
dstore_config.yaml: The central configuration file. Defines paths to station databases, repository locations, source priorities, and screening configurations.
dstore_config.py: Python module that reads the YAML configuration and exposes helper functions.
Key data files
station_dbase.csv: Master database of all stations. Key columns:
  id: internal unique identifier
  agency_id: ID used by the collecting agency
  name: descriptive station name
  lat, lon: geographic coordinates
  x, y: projected coordinates (SCHISM mesh-corrected)
variable_mappings.csv: Maps agency-specific variable codes/names to standardized variable names used within the datastore.
variable_definitions.csv: Defines standard variables with their units and descriptive information.
station_subloc.csv: Defines sublocations (e.g., depths, sensor positions) for stations where the station ID alone is insufficient to identify a unique datastream.
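As an illustration of how a mapping table can translate agency-specific codes into standardized names, the sketch below does a left join in pandas. The column names (agency, agency_variable, standard_variable) and every row are assumptions made for the example; the actual layout of variable_mappings.csv may differ.

```python
import pandas as pd

# Hypothetical mapping table; column names and rows are assumptions.
mappings = pd.DataFrame(
    {
        "agency": ["agency_a", "agency_a", "agency_b"],
        "agency_variable": ["00060", "00065", "EC"],
        "standard_variable": ["flow", "elev", "ec"],
    }
)

# Hypothetical raw records as they might arrive from two agencies.
raw = pd.DataFrame(
    {
        "agency": ["agency_a", "agency_b", "agency_a"],
        "agency_variable": ["00060", "EC", "99999"],
        "value": [1250.0, 480.0, 3.2],
    }
)

# A left join keeps every raw record and leaves NaN where no mapping exists.
standardized = raw.merge(mappings, on=["agency", "agency_variable"], how="left")

# Unmapped codes surface as missing standard names, so they are easy to audit.
unmapped = standardized[standardized["standard_variable"].isna()]
print(standardized)
```

Keying the join on both agency and code avoids collisions when two agencies reuse the same code for different quantities.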
Configuration API
The dstore_config module exposes these functions:
station_dbase(): Returns the station database as a pandas.DataFrame.
sublocation_df(): Returns the sublocations table.
configuration(): Returns the full configuration dictionary.
config_file(label): Returns the path to a named configuration file. Checks the current working directory first, then the built-in config_data/ package directory.
All functions cache their results to avoid repeated filesystem reads.
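The lookup order and caching described above can be sketched with the standard library alone. This is not the dstore_config source: find_config is a hypothetical helper, and two temporary directories stand in for the working directory and the packaged config_data/ directory.

```python
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def find_config(name: str, search_dirs: tuple) -> Path:
    """Hypothetical helper: return the first existing file, scanning search_dirs in order."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name} not found in {search_dirs}")


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "cwd"            # stands in for the working directory
    builtin = Path(tmp) / "config_data"  # stands in for the packaged defaults
    local.mkdir()
    builtin.mkdir()
    (builtin / "station_dbase.csv").write_text("id,name\n")
    (builtin / "variable_mappings.csv").write_text("a,b\n")
    (local / "station_dbase.csv").write_text("id,name\n")  # local override

    dirs = (str(local), str(builtin))
    overridden = find_config("station_dbase.csv", dirs)    # found in the local directory
    fallback = find_config("variable_mappings.csv", dirs)  # only in the packaged directory
    find_config("station_dbase.csv", dirs)                 # repeat call is served from the cache
    hits = find_config.cache_info().hits
```

One consequence of caching in this style is that a file added or edited mid-session is not noticed until the cache is cleared (here, find_config.cache_clear()) or the process restarts.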
Screen Configuration
The screening configuration YAML (referenced by screen_config in dstore_config.yaml) drives auto_screen and contains rule sets for:
Bounds checking — acceptable min/max values per variable
Spike detection — parameters for flagging data spikes
Repetition checking — rules for flagging suspicious repeated values
Custom screening functions — advanced algorithms for specific data types
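The first three rule types can be illustrated on a toy series with pandas. This sketch does not reproduce auto_screen or its configuration schema: the thresholds, the rolling-median spike test, and the run-length repetition test are assumptions chosen for the example.

```python
import pandas as pd

values = pd.Series([10.0, 10.2, 30.0, 10.1, -3.0, 9.9, 9.9, 9.9, 9.9, 10.3])

# Bounds checking: flag values outside an assumed acceptable range.
lower, upper = 0.0, 40.0
out_of_bounds = (values < lower) | (values > upper)

# Spike detection: flag points far from a centered rolling median (assumed method).
spike_threshold = 5.0
rolling_median = values.rolling(window=3, center=True, min_periods=1).median()
spike = (values - rolling_median).abs() > spike_threshold

# Repetition checking: flag runs of identical consecutive values of an assumed minimum length.
max_repeats = 4
run_id = (values != values.shift()).cumsum()      # new id whenever the value changes
run_length = run_id.map(run_id.value_counts())    # length of the run each point belongs to
repeated = run_length >= max_repeats

flagged = out_of_bounds | spike | repeated
screened = values.mask(flagged)  # flagged points become NaN
print(screened)
```

Keeping each check as its own boolean mask makes it possible to report which rule flagged a point before the masks are combined.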