# PER-* and INST-* period types

HEC DSS uses a different convention than most datetime libraries in use. The one that is difficult to deal with is the end-of-interval labeling.

For example, monthly data in DSS is stored at the end of the month. Let's take a look.

The monthly value for January 1990 is stored with the datetime 31JAN1990 2400, which first of all cannot be parsed as such by Python libraries. If you take the time as stored (a long) and convert it to a datetime in Python, it is displayed as 01FEB1990 0000.

The last timestamp that is considered to belong to January is 31JAN1990 2359.9999... to nanosecond precision. However, the timestamp stored in DSS files is the stroke of midnight at the end of January, and Python datetime libraries treat that as a February date.



In [1]:
import pandas as pd

In [2]:
pd.to_datetime('31JAN1990 2400')

ParserError: hour must be in 0..23: 31JAN1990 2400
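Since the parser rejects hour 24, one possible workaround is to treat 2400 as midnight of the following day. The sketch below is only an illustration: `parse_dss_time` is a hypothetical helper (not part of pandas or of any DSS library), and it assumes the `DDMMMYYYY HHMM` layout shown above with English month abbreviations.

```python
from datetime import datetime

import pandas as pd


def parse_dss_time(text):
    """Parse a 'DDMMMYYYY HHMM' string, allowing the 2400 hour (hypothetical helper)."""
    date_part, time_part = text.split()
    # '31JAN1990' -> '31Jan1990' so that %b matches the month abbreviation
    day = pd.Timestamp(datetime.strptime(date_part.title(), "%d%b%Y"))
    if time_part == "2400":
        # 2400 on day D is the same instant as 0000 on day D+1
        return day + pd.Timedelta(days=1)
    hours, minutes = int(time_part[:2]), int(time_part[2:])
    return day + pd.Timedelta(hours=hours, minutes=minutes)


stamp = parse_dss_time("31JAN1990 2400")
```

Note that the result is still a February timestamp; the parse succeeds, but the labeling problem remains.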

In [5]:
last_day = pd.to_datetime('31JAN1990')
last_day

Timestamp('1990-01-31 00:00:00')

See this discussion on how to get the end of the month: https://stackoverflow.com/questions/37354105/find-the-end-of-the-month-of-a-pandas-dataframe-series

In [12]:
from pandas.tseries.offsets import MonthEnd
last_day + MonthEnd(0)

Timestamp('1990-01-31 00:00:00')

None of those will give 31JAN1990 2400 as the answer. If you force the issue, e.g. by asking for the last nanosecond of the month, you get this:

In [15]:
last_day + pd.offsets.MonthBegin() - pd.offsets.Nano()

Timestamp('1990-01-31 23:59:59.999999999')

In [16]:
pd.to_datetime('01FEB1990 0000')

Timestamp('1990-02-01 00:00:00')

In [18]:
pr = pd.period_range(start='01JAN1990',periods=2,freq='M')
pr

PeriodIndex(['1990-01', '1990-02'], dtype='period[M]')

## So what is the answer to this dilemma?

The best that can be done IMHO is to use the pandas PeriodIndex, which represents a span of time rather than an instant.
E.g. monthly data is read in with a period of 'M' and shifted left by one 'M' interval (-1) for data stored in DSS.

This is not ideal because it relies on the period_type 'PER-AVG' as the trigger for shifting the data and converting it to periods.
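A minimal sketch of that shift, assuming the monthly values have already been read into a pandas Series whose timestamps are the end-of-interval stamps as Python sees them (the start of the following month). The sample values and the `period_type` variable are made up for illustration; how a real reader exposes the period type is an assumption here.

```python
import pandas as pd

# Hypothetical monthly values with DSS end-of-interval stamps:
# "31JAN1990 2400" appears in Python as 1990-02-01 00:00, and so on.
stamps = pd.DatetimeIndex(["1990-02-01", "1990-03-01", "1990-04-01"])
monthly = pd.Series([10.0, 20.0, 30.0], index=stamps)

period_type = "PER-AVG"  # assumed to come from the DSS record metadata

if period_type.startswith("PER-"):
    # Label each value with the month containing its stamp,
    # then shift one month to the left (-1).
    monthly.index = monthly.index.to_period("M") - 1
```

After the shift the first value is labeled `1990-01`, which matches the month it actually describes.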

Also, math operations are not as easy with data indexed by a PeriodIndex as with a DatetimeIndex. These should be handled by users, since the right approach depends on the use case.

One example would be to resample data indexed by a DatetimeIndex to monthly and then add it to period-indexed data.
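A rough sketch of that case, using made-up daily and monthly series. It takes a monthly mean and converts the result to periods before adding, which is just one of several reasonable choices; the aggregation and the month-start (`"MS"`) labeling are assumptions for the example.

```python
import pandas as pd

# Hypothetical daily series indexed by a DatetimeIndex
daily = pd.Series(1.0, index=pd.date_range("1990-01-01", "1990-02-28", freq="D"))

# Hypothetical monthly series indexed by a PeriodIndex
monthly = pd.Series(
    [100.0, 200.0],
    index=pd.period_range(start="1990-01", periods=2, freq="M"),
)

# Resample to monthly means labeled at month start, then convert the
# timestamps to monthly periods so both series share the same index type.
daily_monthly = daily.resample("MS").mean().to_period("M")

# With matching PeriodIndex labels the addition aligns month by month
total = daily_monthly + monthly
```

Without the `to_period` step the two indexes are of different types and the addition will not line up month by month.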

Another issue is that when resampling, the user would want to use different options so that the

These are pandas issues and can be dealt with in Python or pandas.