Error when getting scenario template #60

amy-wang · 2025-03-17T18:45:03Z

Issue overview

I've followed all of the setup steps. When I try to run the model (dgen_model.py) for New York State, I get this error:

Traceback (most recent call last):
  File "/Users/amywang/dgen/dgen_os/python/dgen_model.py", line 195, in main
    solar_agents.on_frame(agent_mutation.elec.apply_export_tariff_params, [net_metering_state_df, net_metering_utility_df])
  File "/Users/amywang/dgen/dgen_os/python/agents.py", line 137, in on_frame
    results_df = self.run_with_runtime_tests(how_to_apply='on_frame', func=func, func_args=func_args, cores=None, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amywang/dgen/dgen_os/python/agents.py", line 254, in run_with_runtime_tests
    assert initial_len == post_len, "agent_df len changed by a function applied on_frame"
           ^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: agent_df len changed by a function applied on_frame
It appears that the function apply_export_tariff_params is changing the length of agent_df.

After adding some print statements, I can see that the length of agent_df doubles:
Rows before function: 1147
Rows after function: 2294
Checking if agent_id duplicates exist: 1147
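
An exact doubling like this is consistent with a left merge against a lookup table in which each key combination appears twice. Below is a minimal sketch of that effect using small made-up frames; the values and the `compensation_style` column are illustrative assumptions, not the actual dGen tables.

```python
import pandas as pd

# Hypothetical stand-in for agent_df: one row per agent.
agents = pd.DataFrame({
    "agent_id": [1, 2, 3],
    "state_abbr": ["NY", "NY", "NY"],
    "sector_abbr": ["res", "com", "ind"],
})

# Hypothetical stand-in for net_metering_state_df, where every
# (state_abbr, sector_abbr) row has been loaded twice.
nem_state = pd.DataFrame({
    "state_abbr": ["NY"] * 6,
    "sector_abbr": ["res", "res", "com", "com", "ind", "ind"],
    "compensation_style": ["net metering"] * 6,
})

keys = ["state_abbr", "sector_abbr"]

# A left merge emits one output row per matching right-hand row,
# so duplicated keys on the right multiply the agent rows.
merged = pd.merge(agents, nem_state, how="left", on=keys)

rows_before = len(agents)
rows_after = len(merged)
dup_keys = int(nem_state.duplicated(subset=keys).sum())

print("Rows before merge:", rows_before)
print("Rows after merge:", rows_after)
print("Duplicate key rows in lookup table:", dup_keys)
```

Running the same `duplicated(subset=keys)` check on the real lookup frames should show whether they are the source of the extra rows.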

Details

Some additional details for this issue (if relevant):

Platform: Mac
Version of dGen: Most recent (forked on 3/15)
Python version: Python 3.12.7
matplotlib: 3.9.2
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1


michaelavs · 2025-03-17T20:28:09Z

Hi @amy-wang,
This is an issue I have come across myself. I found a solution and implemented it locally, but not here in the repository, because I assumed it came from an error in my own database and did not occur elsewhere. In short, for one reason or another the values in the SQL database were duplicated, and once they are imported and merged onto the agent file, the agent file essentially duplicates itself. To resolve this, I added .drop_duplicates() to the merged-in dataframe in every pd.merge() call in the elec.py and financial_functions.py code.

For example, the function apply_export_tariff_params in elec.py contained:

temp_df = pd.merge(dataframe, net_metering_utility_df, how='left', on=['eia_id','sector_abbr','state_abbr'])
and
agents_without_utility_nem = pd.merge(agents_without_utility_nem, net_metering_state_df, how='left', on=['state_abbr', 'sector_abbr'])

I changed it to be:
temp_df = pd.merge(dataframe, net_metering_utility_df.drop_duplicates(), how='left', on=['eia_id','sector_abbr','state_abbr'])
and
agents_without_utility_nem = pd.merge(agents_without_utility_nem, net_metering_state_df.drop_duplicates(), how='left', on=['state_abbr', 'sector_abbr'])

If you implement this change, I recommend adding .drop_duplicates() only to the dataframe being merged into the original dataframe (i.e., the second argument to pd.merge()), as this is generally the dataframe that has been created or pulled from the SQL database. Let me know if that helps or if you run into any further issues. Now that I know this issue has affected other users, I will also add it to our project board, and I have assigned it to myself to work on adding the fix to the codebase!
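
As a rough sketch of the approach described above, the example below deduplicates only the right-hand frame and also passes validate="many_to_one" to pd.merge(), so that any remaining duplicate keys raise an error instead of silently growing the agent frame. The validate argument is an extra safeguard that is not part of the change described in this thread, and the frames and the `rate` column are hypothetical stand-ins for the real dGen data.

```python
import pandas as pd
from pandas.errors import MergeError

keys = ["eia_id", "sector_abbr", "state_abbr"]

# Hypothetical stand-in for the agent dataframe.
agents = pd.DataFrame({
    "agent_id": [1, 2],
    "eia_id": [100, 200],
    "sector_abbr": ["res", "com"],
    "state_abbr": ["NY", "NY"],
})

# Hypothetical stand-in for net_metering_utility_df with every row duplicated.
nem_utility = pd.DataFrame({
    "eia_id": [100, 100, 200, 200],
    "sector_abbr": ["res", "res", "com", "com"],
    "state_abbr": ["NY"] * 4,
    "rate": [0.1, 0.1, 0.2, 0.2],
})

# Without deduplication, validate="many_to_one" rejects the merge
# because the right-hand keys are not unique.
try:
    pd.merge(agents, nem_utility, how="left", on=keys, validate="many_to_one")
    caught = False
except MergeError:
    caught = True

# Deduplicating the right-hand frame keeps one row per agent.
merged = pd.merge(
    agents,
    nem_utility.drop_duplicates(),
    how="left",
    on=keys,
    validate="many_to_one",
)

print("Duplicate keys detected:", caught)
print("Rows after merge:", len(merged))
```

Note that drop_duplicates() with no arguments only removes rows that are identical in every column. If two rows share the merge keys but differ elsewhere, the validated merge would still raise, which is arguably the safer outcome.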

amy-wang · 2025-03-17T20:49:33Z

That's great to know, thank you! I'm going to try what you suggested.

amy-wang · 2025-03-20T16:59:46Z

Hey @michaelavs! I tried to update the code locally and I'm now running into this error:

TypeError: unhashable type: 'list'

It seems like I should add .drop_duplicates() only if the type isn't a list, is that right? The error occurs at line 156, which I updated to: dataframe = pd.merge(dataframe, deprec_sch[['sector_abbr', 'deprec_sch', 'year']].drop_duplicates(), how='left', on=['sector_abbr', 'year'])
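
One possible workaround is sketched below, assuming the deprec_sch column holds Python lists (which the error message suggests but the thread does not confirm). drop_duplicates() has to hash every column it compares, and lists are not hashable. Restricting the comparison to the hashable key columns with subset= avoids hashing the list column. The frame contents are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for deprec_sch: the deprec_sch column holds lists,
# and the ("res", 2026) row appears twice.
deprec_sch = pd.DataFrame({
    "sector_abbr": ["res", "res", "com"],
    "year": [2026, 2026, 2026],
    "deprec_sch": [[0.2, 0.32], [0.2, 0.32], [0.2, 0.32]],
})

# Comparing all columns tries to hash the lists and fails.
try:
    deprec_sch.drop_duplicates()
    raised = False
except TypeError:
    raised = True

# Comparing only the merge keys never touches the list column.
deduped = deprec_sch.drop_duplicates(subset=["sector_abbr", "year"])

print("Plain drop_duplicates() raised TypeError:", raised)
print("Rows after key-based deduplication:", len(deduped))
```

Deduplicating on the keys keeps the first row for each key combination, so it is only safe if rows sharing a key are true copies of each other.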

amy-wang · 2025-03-20T17:00:10Z

I also see that you have a PR out (thank you so much!). If you can prioritize getting that merged, I'll update my GitHub repo instead of making local changes.

amy-wang · 2025-03-20T18:48:18Z

Hi @michaelavs: Side note, I'm still having trouble running this code. I'm doing some work for a think tank (www.switch.box) and we were just hoping to understand how dgen works to see how we can leverage it in the future. Is there any chance you could share output that you got from running a model for any state (though preferably New York)?

I'm just trying to see what the output looks like to understand the model and resolving these bugs hasn't been the best use of my time. Thank you!

michaelavs self-assigned this Mar 17, 2025

michaelavs added the bug Something isn't working label Mar 17, 2025

michaelavs added this to the General Code Clean-up milestone Mar 17, 2025

michaelavs linked a pull request Mar 18, 2025 that will close this issue

Add "drop_duplicates" to pd.merge() calls #61

Open
