Cache Round-Trip Example with Metadata
This notebook demonstrates how to use the enhanced caching system that:
- Caches DataFrames based on selected function arguments
- Saves and reloads cached content via CSV with full metadata
- Supports recovery of clean DataFrames with datetime indexing
Features
- Metadata headers: cached_function, index_name, keys, and col_keys
- Optional preservation of some key fields in data via col_keys
- Automatic datetime index recovery on reload
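To make the idea of metadata headers concrete, here is a minimal sketch of a CSV round trip that carries metadata in leading comment lines and restores a datetime index on reload. The `# key: value` header layout and the helper names `to_csv_text` and `from_csv_text` are assumptions made for illustration; the actual on-disk format used by `dms_datastore.caching` may differ.

```python
import io
import pandas as pd

def to_csv_text(df, meta):
    """Serialize meta as '# key: value' lines (assumed layout), then the frame as CSV."""
    header = "".join(f"# {key}: {value}\n" for key, value in meta.items())
    return header + df.to_csv(index_label=meta["index_name"])

def from_csv_text(text):
    """Parse the leading comment lines into a dict and reload the frame."""
    meta = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            break
        key, _, value = line[1:].partition(":")
        meta[key.strip()] = value.strip()
    # comment="#" makes pandas skip the metadata lines; the index column is
    # looked up by the name stored in the metadata and parsed as datetimes.
    df = pd.read_csv(io.StringIO(text), comment="#",
                     index_col=meta["index_name"], parse_dates=True)
    return meta, df

idx = pd.date_range("2020-01-01", periods=3, freq="D")
df = pd.DataFrame({"value": [0, 1, 2], "param": "flow"}, index=idx)
meta = {"cached_function": "ec_data", "index_name": "datetime",
        "keys": "station,variable"}  # illustrative values only

text = to_csv_text(df, meta)
meta_back, df_back = from_csv_text(text)
```

Storing the index name in the header lets the reader recover the index without guessing which column holds the timestamps.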
[1]:
import pandas as pd
from dms_datastore.caching import *
[2]:
@cache_dataframe(key_args=["station", "variable"])
def ec_data(station, variable, subloc="upper"):
    # Simulated time series
    idx = pd.date_range("2020-01-01", periods=100, freq="D")
    df = pd.DataFrame({"value": range(100), "station": station, "param": variable}, index=idx)
    return df
[3]:
df1 = ec_data(station="vns", variable="flow")
df2 = ec_data(station="vns", variable="ec")
df3 = ec_data(station="old", variable="ec")
[7]:
cache_to_csv() # Writes individual CSVs with function-specific metadata
[8]:
LocalCache.instance().clear()
load_cache_csv("ec_data.csv")
[9]:
df_reloaded = LocalCache.instance()[generate_cache_key("ec_data", station="vns", variable="flow")]
df_reloaded = coerce_datetime_index(df_reloaded)
print(df_reloaded.dtypes)
df_reloaded.head()
value int64
station.1 object
param object
dtype: object
[9]:
| datetime | value | station.1 | param |
|---|---|---|---|
| 2020-01-01 | 0 | vns | flow |
| 2020-01-02 | 1 | vns | flow |
| 2020-01-03 | 2 | vns | flow |
| 2020-01-04 | 3 | vns | flow |
| 2020-01-05 | 4 | vns | flow |
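The index recovery step in the last cell can be pictured with plain pandas. The sketch below is an assumption about what such a coercion might do, not the body of `coerce_datetime_index`; the helper name `to_datetime_index` is hypothetical.

```python
import pandas as pd

def to_datetime_index(df):
    """Return df with its index converted to a DatetimeIndex if it is not one already."""
    if isinstance(df.index, pd.DatetimeIndex):
        return df
    out = df.copy()
    out.index = pd.to_datetime(out.index)
    return out

# A frame as it might look after a CSV reload without date parsing:
# the index holds date strings rather than timestamps.
raw = pd.DataFrame(
    {"value": [0, 1]},
    index=pd.Index(["2020-01-01", "2020-01-02"], name="datetime"),
)
fixed = to_datetime_index(raw)
```

After conversion, time-based operations such as `fixed.loc["2020-01"]` or resampling become available, which is the practical reason for restoring the index type after a CSV round trip.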