DA Prices seems to have broken. #346

PeteGilbert98 · 2024-10-04T12:53:06Z

I've noticed that as of today, the function getting the da_prices from entsoe seems to have broken. My assumption is that the format sent from the entsoe api must have changed.

As you can see, all prices now seem to be the same.

Here's what I think might be causing the issue:

On line 36, I would have expected the _extract_timeseries generator to produce multiple soup objects for the different days that I have queried. Instead only one object is returned.

This means that, on line 88, the keys in the dictionary (created on line 83) are overwritten, as the keys are not unique per day.

On line 92, the datetime index appears to be correct, covering the whole timespan of the (buffered) query.

When the (seemingly correct) index is combined with the data, things start to break.

LuisTellezSirocco · 2024-10-04T12:57:46Z

Hi, I'm not here to answer you about the problem, but to tell you that you have leaked your API Key in one of the images...

PeteGilbert98 · 2024-10-04T13:39:41Z

> Hi, I'm not here to answer you about the problem, but to tell you that you have leaked your API Key in one of the images...

Thanks, I changed the token.

borg42 · 2024-10-04T14:24:16Z

I have the same problem. It worked fine yesterday, but now I see the same behaviour as described by @PeteGilbert98. I think this means that Entso-E must have changed something in the XML, right?

fboerman · 2024-10-04T14:29:38Z

Dear all, thank you for the fast report. I see my dashboard also having the same behaviour now. I am currently on holiday without access to my computer but will be back soon. In the meantime a pull request with a fix is always welcome!

binboupan · 2024-10-04T14:31:50Z

As of today I am getting entsoe.exceptions.NoMatchingDataError, it worked fine for years so something has definitely changed.

sakvaua · 2024-10-04T14:43:39Z

Same here.
I think this is related to how entsoe-py treats indexes as just date ranges with a fixed start, end, and frequency, while the actual data returned in the XML may be missing some hours.
`index = pd.date_range(start=start, end=end, freq=delta, inclusive='left')`
I downloaded a random interval and noticed this
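
To illustrate the point about fixed date ranges, here is a minimal sketch with made-up numbers (the `points` list is hypothetical and is not taken from a real response). It only shows the general failure mode of pairing values with a fixed range positionally when a position is absent; it is not a reproduction of the library's internals.

```python
import pandas as pd

start = pd.Timestamp("2024-10-04T00:00Z")
delta = pd.Timedelta(hours=1)

# Hypothetical (position, price) points; position 3 is absent from the response.
points = [(1, 10.0), (2, 12.0), (4, 15.0)]

# Naive pairing: assume one value per step of a fixed-frequency range.
naive_index = pd.date_range(start=start, periods=len(points), freq=delta)
naive = pd.Series([value for _, value in points], index=naive_index)

# Position-aware pairing: derive each timestamp from the point's own position.
aware = pd.Series({start + (pos - 1) * delta: value for pos, value in points})

print(naive)  # the price of position 4 ends up on the third hour
print(aware)  # the price of position 4 stays on the fourth hour
```

With the naive pairing every value after the gap is shifted one step early, while the position-aware version keeps the gap visible so it can be filled deliberately.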

PeteGilbert98 · 2024-10-04T15:02:16Z

I have a suggested fix which looks like it works for the prices. But I haven't checked extensively.

Here is how I have adjusted parse_prices (in parsers.py) (I had to add in this mapping because it wasn't behaving as expected without):

```python
def parse_prices(xml_text):
    """
    Parameters
    ----------
    xml_text : str

    Returns
    -------
    pd.Series
    """
    time_mapping = {
        '15min': '15min',
        '30min': '30min',
        '60min': '60min',
        '15T': '15min',
        '30T': '30min',
        '1H': '60min',
        '1h': '60min',
        'h': '60min',
        '0.25H': '15min',
        '0.5H': '30min',
    }

    series = {}
    for soup in _extract_timeseries(xml_text):
        soup_series = _parse_timeseries_generic(soup, 'price.amount')
        series[time_mapping[soup_series.index.freqstr]] = soup_series

    return series
```

Here is how I changed _parse_timeseries_generic ( in series_parsers.py):

```python
def _parse_timeseries_generic(soup, label='quantity', to_float=True):
    # Create a list to store all time series data
    all_data = []

    # Iterate over each period
    for period in soup.find_all("period"):
        # Extract start time and resolution for each period
        start_time_str = period.find("start").text
        resolution_str = period.find("resolution").text  # PT6H, PT30M, etc.
        start_time = pd.to_datetime(start_time_str)

        # Convert ISO 8601 duration to a pandas Timedelta
        resolution_timedelta = pd.to_timedelta(resolution_str)

        # Loop over each point and extract position and price
        for point in period.find_all("point"):
            position = int(point.find("position").text)
            value = point.find(label).text
            if to_float:
                value = float(value)

            # Calculate the timestamp for this point based on the position and resolution
            timestamp = start_time + resolution_timedelta * (position - 1)

            # Append the data
            all_data.append([timestamp, value])

    # Create a DataFrame from the combined data
    df_combined = pd.DataFrame(all_data, columns=['Timestamp', label])

    # Use the timestamps as the index
    df_combined.set_index('Timestamp', inplace=True)

    if soup.find('curvetype').text == 'A03':
        # with A03 it's possible that positions are missing, this is when values are repeated
        # see docs: https://eepublicdownloads.entsoe.eu/clean-documents/EDI/Library/cim_based/Introduction_of_different_Timeseries_possibilities__curvetypes__with_ENTSO-E_electronic_document_v1.4.pdf
        # so let's reindex on a continuous range, which creates gaps if positions are missing,
        # then forward fill, so repeat the last valid value, to fill the gaps

        # Create a complete date range using the resolution of the last period parsed
        complete_range = pd.date_range(start=df_combined.index.min(),
                                       end=df_combined.index.max(),
                                       freq=resolution_timedelta)

        df_combined = df_combined.reindex(complete_range)

        # Forward fill missing values
        df_combined[label] = df_combined[label].ffill()

    return df_combined[label]
```

Again, very untested, but it looks promising.

**Potential errors:** if the first or last value it queries is missing, the returned time series will be shorter than expected.

I haven't tested this on any of the other functionality at all!!!
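
The boundary caveat above can be made concrete with a small sketch. It assumes the period's own start and (exclusive) end are available and uses hypothetical points; whether the real parser has access to the interval end at that point is an assumption here, not something confirmed in this thread.

```python
import pandas as pd

start = pd.Timestamp("2024-10-04T22:00Z")
end = pd.Timestamp("2024-10-05T02:00Z")  # assumed exclusive end of the period
step = pd.Timedelta(hours=1)

# Hypothetical points: positions 3 and 4 repeat position 2 and are omitted.
points = [(1, 5.0), (2, 7.0)]
sparse = pd.Series({start + (pos - 1) * step: value for pos, value in points})

# Range built from the observed timestamps: the trailing repeats are lost.
observed = pd.date_range(sparse.index.min(), sparse.index.max(), freq=step)
short = sparse.reindex(observed).ffill()

# Range built from the period interval: the trailing repeats are restored.
full = pd.date_range(start, end, freq=step, inclusive="left")
complete = sparse.reindex(full).ffill()

print(len(short), len(complete))
```

Forward filling cannot recover a missing first value, so a gap at the very start of a period would still surface as `NaN` in this sketch.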

fboerman · 2024-10-04T15:05:56Z

Please put this in a proper pull request and I'll look at it after my holiday, thanks.

PeteGilbert98 · 2024-10-04T15:07:46Z

@fboerman will do this eve. thanks

GeneralCP · 2024-10-04T18:42:37Z

For anyone looking for a quick fix: https://github.com/GeneralCP/entsoe2
This also fills in missing positions in the XML output.

As far as I can see, there was a duplicate price today (the 4th) for the Netherlands: the 22nd and 23rd positions have the same price. Apparently the API then only gives position 22 and skips 23. Not sure if this 'feature' is new or if this is just the first time we have had the exact same price two hours in a row.
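
A quick way to check a response for skipped positions is sketched below. The helper name `missing_positions` is hypothetical, and the expected count of 24 assumes an hourly period covering a normal day.

```python
def missing_positions(positions, expected_count):
    """Return the positions in 1..expected_count that are absent."""
    present = set(positions)
    return [p for p in range(1, expected_count + 1) if p not in present]


# Hypothetical hourly day in which position 23 was skipped.
positions = list(range(1, 23)) + [24]
skipped = missing_positions(positions, expected_count=24)
print(skipped)
```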

binboupan · 2024-10-04T21:41:30Z

day ahead prices are also broken; all of the prices are the same.

JaniKallankari · 2024-10-05T07:27:32Z

Day-ahead prices seem to be broken. Does this change in the Entsoe platform https://transparency.entsoe.eu/news/widget?id=66f5203d792e84032cbb9b71 have something to do with this?

borg42 · 2024-10-05T07:39:48Z

> Day-ahead prices seem to be broken. Does this change in the Entsoe platform https://transparency.entsoe.eu/news/widget?id=66f5203d792e84032cbb9b71 have something to do with this?

Yes, this is exactly the problem. The day-ahead prices now use "variable sized blocks", while entsoe-py expects every position to be present.
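
As a rough illustration of what such a response can look like and how the omitted positions might be expanded, here is a sketch. The XML fragment is hypothetical and heavily trimmed (no namespace, a single `Period`), and the 60-minute step is hard-coded on the assumption that it matches `PT60M`.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical, trimmed Period fragment; position 3 is not transmitted.
PERIOD_XML = """
<Period>
  <timeInterval><start>2024-10-04T22:00Z</start><end>2024-10-05T02:00Z</end></timeInterval>
  <resolution>PT60M</resolution>
  <Point><position>1</position><price.amount>10.5</price.amount></Point>
  <Point><position>2</position><price.amount>11.0</price.amount></Point>
  <Point><position>4</position><price.amount>9.0</price.amount></Point>
</Period>
"""

period = ET.fromstring(PERIOD_XML)
start = pd.Timestamp(period.findtext("timeInterval/start"))
end = pd.Timestamp(period.findtext("timeInterval/end"))
step = pd.Timedelta(minutes=60)  # assumed to correspond to PT60M

# Place each transmitted point on its own timestamp.
sparse = pd.Series({
    start + (int(point.findtext("position")) - 1) * step:
        float(point.findtext("price.amount"))
    for point in period.findall("Point")
})

# Expand to every position of the period and repeat the last transmitted value.
dense = sparse.reindex(pd.date_range(start, end, freq=step, inclusive="left")).ffill()
print(dense)
```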

Roeland54 · 2024-10-05T17:34:28Z

There is another change I have noticed. If I request prices for today and tomorrow (Belgium), the prices for today use the 60min resolution as usual, but the prices for tomorrow use the 15min resolution. For other countries the response still uses the 60min resolution. I am confused.

So they make breaking changes to a public API on a Friday and only announce it 5 days before it happens...

JaniKallankari · 2024-10-06T13:22:41Z

I made a parser for the new data type. It has only been tested with day-ahead prices, so be careful. All TimeSeries in the response are combined into one pandas.Series with possibly (though not likely) unequal time steps. Feel free to use this code if you find it useful.

import re
import xml.etree.ElementTree as ET

import pandas as pd

def parse_timeseries(self, xml_text, value_key='price.amount', to_float=True):    
	resolution_map = {
		'PT60M': pd.Timedelta(60, 'min'),
		'P1Y'  : pd.Timedelta(365,'day'),
		'PT15M': pd.Timedelta(15, 'min'),
		'PT30M': pd.Timedelta(30, 'min'),
		'P1D'  : pd.Timedelta(1,  'day'),
		'P7D'  : pd.Timedelta(7,  'day'),
		'P1M'  : pd.Timedelta(30, 'day'),
	}
	time_stamps = []
	values      = []
	xml_text = re.sub(' xmlns="[^"]+"', '', xml_text, count=1) #Remove namespace
	root        = ET.fromstring(xml_text)
	for time_serie in root.findall('TimeSeries'):
		#curve_type = time_serie.find('curveType').text
		for period in time_serie.findall('Period'):
			start_time = pd.Timestamp(period.find('timeInterval').find('start').text)
			resolution = resolution_map[period.find('resolution').text]
			for point in period.findall('Point'):
				position = float(point.find('position').text)-1
				time_stamps.append(start_time+position*resolution)
				if to_float:
					values.append(float(point.find(value_key).text))
				else:
					values.append(point.find(value_key).text)
	return pd.Series(data=values,index=time_stamps)

miikasda · 2024-10-06T20:15:22Z

The above fix by @JaniKallankari works for day-ahead prices in Finland. Here is a small guide to include it as a hotfix in entsoe-py if others need it as well:

Edit entsoe/parsers.py in Python's site-packages and add the following:

# Hotfix for dayahead prices, see GitHub issue below for more information
# https://github.com/EnergieID/entsoe-py/issues/346
import xml.etree.ElementTree as ET
import re
def parse_timeseries(xml_text, value_key='price.amount', to_float=True):    
    resolution_map = {
        'PT60M': pd.Timedelta(60, 'min'),
        'P1Y'  : pd.Timedelta(365,'day'),
        'PT15M': pd.Timedelta(15, 'min'),
        'PT30M': pd.Timedelta(30, 'min'),
        'P1D'  : pd.Timedelta(1,  'day'),
        'P7D'  : pd.Timedelta(7,  'day'),
        'P1M'  : pd.Timedelta(30, 'day'),
    }
    time_stamps = []
    values      = []
    xml_text = re.sub(' xmlns="[^"]+"', '', xml_text, count=1) #Remove namespace
    root        = ET.fromstring(xml_text)
    for time_serie in root.findall('TimeSeries'):
        #curve_type = time_serie.find('curveType').text
        for period in time_serie.findall('Period'):
            start_time = pd.Timestamp(period.find('timeInterval').find('start').text)
            resolution = resolution_map[period.find('resolution').text]
            for point in period.findall('Point'):
                position = float(point.find('position').text)-1
                time_stamps.append(start_time+position*resolution)
                if to_float:
                    values.append(float(point.find(value_key).text))
                else:
                    values.append(point.find(value_key).text)
    return pd.Series(data=values,index=time_stamps)

And then change the entsoe/entsoe.py to use this new parser. First import the new parser function in line 14:

from .parsers import parse_timeseries, parse_prices, parse_loads, parse_generation, \
    parse_installed_capacity_per_plant, parse_crossborder_flows, \
    parse_unavailabilities, parse_contracted_reserve, parse_imbalance_prices_zip, \
    parse_imbalance_volumes_zip, parse_netpositions, parse_procured_balancing_capacity, \
    parse_water_hydro,parse_aggregated_bids, parse_activated_balancing_energy_prices

And then change the query_day_ahead_prices function defined in line 1202 to actually use it:

def query_day_ahead_prices(
            self, country_code: Union[Area, str],
            start: pd.Timestamp,
            end: pd.Timestamp,
            resolution: Literal['60min', '30min', '15min'] = '60min') -> pd.Series:
        """
        Parameters
        ----------
        resolution: either 60min for hourly values,
            30min for half-hourly values or 15min for quarterly values, throws error if type >
        country_code : Area|str
        start : pd.Timestamp
        end : pd.Timestamp

        Returns
        -------
        pd.Series
        """
        if resolution not in ['60min', '30min', '15min']:
            raise InvalidParameterError('Please choose either 60min, 30min or 15min')
        area = lookup_area(country_code)
        # we do here extra days at start and end to fix issue 187
        text = super(EntsoePandasClient, self).query_day_ahead_prices(
            country_code=area,
            #start=start-pd.Timedelta(days=1),
            start = start,
            end = end
            #end=end+pd.Timedelta(days=1)
        )
        #series = parse_prices(text)[resolution]
        series = parse_timeseries(text)
        if len(series) == 0:
            raise NoMatchingDataError
        series = series.tz_convert(area.tz)
        series = series.truncate(before=start, after=end)
        # because of the above fix we need to check again if any valid data exists after trun>
        if len(series) == 0:
            raise NoMatchingDataError
        return series

fboerman · 2024-10-06T20:18:41Z

@JaniKallankari @miikasda thank you for your suggestions. As a tip, it's usually much more readable if you submit such proposals as a pull request instead of pasting a copy into an issue. I am using your code as a hint for the fix I am currently writing for this issue. Many thanks to all in this thread for their suggestions!

borg42 · 2024-10-06T20:25:36Z

Regarding the suggestion above:

resolution = resolution_map[period.find('resolution').text]
for point in period.findall('Point'):
    position = float(point.find('position').text)-1
    time_stamps.append(start_time+position*resolution)
    if to_float:
        values.append(float(point.find(value_key).text))
    else:
        values.append(point.find(value_key).text)

This only works if the xml only contains one resolution, which is generally not the case. E.g. for DE_LU it usually contains 60min and 15min data. This would need to return a dict with the resolutions with one series per resolution to work with all regions.
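
One possible shape for that, sketched under assumptions: the helper `group_by_resolution` and its input format (resolution label, period start, list of `(position, value)` pairs) are hypothetical and only illustrate returning one series per resolution; they are not the package's API.

```python
from collections import defaultdict

import pandas as pd


def group_by_resolution(periods):
    """Build one series per resolution from (resolution, start, points) tuples."""
    buckets = defaultdict(list)
    for resolution, start, points in periods:
        step = pd.Timedelta(resolution)  # e.g. '60min' or '15min'
        series = pd.Series(
            {start + (pos - 1) * step: value for pos, value in points},
            dtype=float,
        )
        buckets[resolution].append(series)
    # Concatenate all periods that share a resolution into one sorted series.
    return {res: pd.concat(parts).sort_index() for res, parts in buckets.items()}


start = pd.Timestamp("2024-10-04T22:00Z")
result = group_by_resolution([
    ("60min", start, [(1, 10.0), (2, 11.0)]),
    ("15min", start, [(1, 10.0), (2, 10.5), (3, 11.0)]),
])
print(result["60min"])
print(result["15min"])
```

A caller could then pick `result['60min']` or `result['15min']`, mirroring the `parse_prices(text)[resolution]` lookup that appears commented out in the hotfix above.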


fboerman · 2024-10-06T22:06:08Z

Many thanks to all for the discussions. I have released a new version, 0.6.9, which fixes this issue. If you still encounter issues on this version, please open a new issue!

fboerman · 2024-10-06T22:10:55Z

Oh, and @Roeland54, I believe this is probably a mistake on Elia's side. I have sent them an informal nudge, so hopefully they will fix it soon. I can't fix that in the package.

ruupert mentioned this issue Oct 6, 2024

wraps _parse_timeseries_generic and call it for each period #349

Closed

rhpijnacker mentioned this issue Oct 6, 2024

Fetching data from entso-e server results in flat line SeitaBV/flexmeasures-entsoe#43

Closed

fboerman mentioned this issue Oct 6, 2024

day ahead price is the same for every hour #348

Closed

fboerman added a commit that referenced this issue Oct 6, 2024

fixes generic parser to correctly handle various scenarios that can happen discussion and inspiration for fixes from #347 and #346

4d6b401

fboerman closed this as completed Oct 6, 2024
