Error when getting scenario template #60

Open
amy-wang opened this issue Mar 17, 2025 · 5 comments · May be fixed by #61

@amy-wang

Issue overview

I've followed all of the setup steps. When I try to run the model (dgen_model.py) for New York State, I get this error:

Traceback (most recent call last):
  File "/Users/amywang/dgen/dgen_os/python/dgen_model.py", line 195, in main
    solar_agents.on_frame(agent_mutation.elec.apply_export_tariff_params, [net_metering_state_df, net_metering_utility_df])
  File "/Users/amywang/dgen/dgen_os/python/agents.py", line 137, in on_frame
    results_df = self.run_with_runtime_tests(how_to_apply='on_frame', func=func, func_args=func_args, cores=None, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amywang/dgen/dgen_os/python/agents.py", line 254, in run_with_runtime_tests
    assert initial_len == post_len, "agent_df len changed by a function applied on_frame"
           ^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: agent_df len changed by a function applied on_frame
It appears that the function apply_export_tariff_params is changing the length of agent_df.

When I added some print statements, it appears that the length of agent_df is doubled:
Rows before function: 1147
Rows after function: 2294
Checking if agent_id duplicates exist: 1147
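
For anyone trying to reproduce the doubling outside of dGen, here is a minimal sketch with toy data. It assumes the cause is a lookup table whose rows were all loaded twice. The frames and the compensation_style column are hypothetical stand-ins, not the real dGen tables:

```python
import pandas as pd

# Toy stand-ins for agent_df and a state-level lookup table (all values are made up).
agents = pd.DataFrame({
    "agent_id": [1, 2, 3],
    "state_abbr": ["NY", "NY", "NY"],
    "sector_abbr": ["res", "com", "ind"],
})
nem_state = pd.DataFrame({
    "state_abbr": ["NY", "NY", "NY"],
    "sector_abbr": ["res", "com", "ind"],
    "compensation_style": ["net metering"] * 3,  # hypothetical column
})

# Simulate a lookup table in which every row appears twice.
nem_state_dup = pd.concat([nem_state, nem_state], ignore_index=True)

keys = ["state_abbr", "sector_abbr"]
merged = agents.merge(nem_state_dup, how="left", on=keys)

rows_before = len(agents)
rows_after = len(merged)
# Each agent now appears twice, so one copy of every agent_id counts as a duplicate.
dup_agent_ids = int(merged["agent_id"].duplicated().sum())
# Rows in the lookup table that repeat an earlier key combination.
dup_key_rows = int(nem_state_dup.duplicated(subset=keys).sum())

print(rows_before, rows_after, dup_agent_ids, dup_key_rows)
```

A left merge emits one output row per matching row in the right-hand frame, so a fully duplicated lookup table doubles the agent count. That would be consistent with the 1147 and 2294 figures above.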

Details

Some additional details for this issue (if relevant):

  • Platform: Mac
  • Version of dGen: Most recent (forked on 3/15)
  • Python version: Python 3.12.7
  • matplotlib: 3.9.2
  • pandas: 2.2.2
  • numpy: 1.26.4
  • scipy: 1.13.1
@michaelavs michaelavs self-assigned this Mar 17, 2025
@michaelavs
Collaborator

Hi @amy-wang,
This is an issue I have come across myself. I found a solution and implemented it locally, but not here in the repository, because I assumed it came from an error in my own database and did not affect anyone else. In short, for one reason or another the values from the SQL database were duplicated, so once they are imported and merged onto the agent file, the agent file essentially duplicates itself. To resolve this, I added .drop_duplicates() to the dataframes passed to every pd.merge() call in elec.py and financial_functions.py.

For example, the function apply_export_tariff_params in elec.py contained:

temp_df = pd.merge(dataframe, net_metering_utility_df, how='left', on=['eia_id','sector_abbr','state_abbr'])
and
agents_without_utility_nem = pd.merge(agents_without_utility_nem, net_metering_state_df, how='left', on=['state_abbr', 'sector_abbr'])

I changed it to be:
temp_df = pd.merge(dataframe, net_metering_utility_df.drop_duplicates(), how='left', on=['eia_id','sector_abbr','state_abbr'])
and
agents_without_utility_nem = pd.merge(agents_without_utility_nem, net_metering_state_df.drop_duplicates(), how='left', on=['state_abbr', 'sector_abbr'])

If you implement this change, I recommend adding .drop_duplicates() only to the dataframe being merged into the original dataframe (i.e., the second argument to pd.merge()), as this is generally the dataframe that has been created or pulled from the SQL database. Let me know if that helps or if you run into any further issues. Now that I know this issue has affected other users, I will also add it to our project board, and I have assigned it to myself to work on adding the fix to the codebase!
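
As a rough sketch of the same idea in one place, the helper below de-duplicates only the second argument and also asks pandas to verify that the merge keys are unique afterwards. The merge_lookup name and the toy frames are hypothetical and not part of dGen. validate="many_to_one" is a standard pd.merge() option that raises MergeError when the right-hand keys are not unique:

```python
import pandas as pd
from pandas.errors import MergeError


def merge_lookup(left, right, keys):
    """Left-merge `right` onto `left` without changing the row count of `left`."""
    deduped = right.drop_duplicates()
    # Raises MergeError if `keys` are still not unique in `deduped`,
    # e.g. when two rows share the same keys but carry different values.
    return pd.merge(left, deduped, how="left", on=keys, validate="many_to_one")


keys = ["state_abbr", "sector_abbr"]
agents = pd.DataFrame({
    "agent_id": [1, 2],
    "state_abbr": ["NY", "NY"],
    "sector_abbr": ["res", "com"],
})

# Case 1: exact duplicate rows. drop_duplicates() removes them and the merge is safe.
exact_dups = pd.DataFrame({
    "state_abbr": ["NY", "NY", "NY", "NY"],
    "sector_abbr": ["res", "com", "res", "com"],
    "rate": [0.10, 0.12, 0.10, 0.12],  # made-up values
})
merged = merge_lookup(agents, exact_dups, keys)

# Case 2: same keys, conflicting values. drop_duplicates() cannot resolve this.
conflicting = pd.DataFrame({
    "state_abbr": ["NY", "NY"],
    "sector_abbr": ["res", "res"],
    "rate": [0.10, 0.15],  # made-up values
})
try:
    merge_lookup(agents, conflicting, keys)
    conflict_detected = False
except MergeError:
    conflict_detected = True

print(len(agents), len(merged), conflict_detected)
```

The validation step is optional, but it turns a silent change in row count into an explicit error at the merge that caused it.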

@michaelavs michaelavs added the bug Something isn't working label Mar 17, 2025
@michaelavs michaelavs added this to the General Code Clean-up milestone Mar 17, 2025
@amy-wang
Author

That's great to know, thank you! I'm going to try what you suggested.

@michaelavs michaelavs linked a pull request Mar 18, 2025 that will close this issue
@amy-wang
Author

Hey @michaelavs! I tried to update the code locally and I'm now running into this error:

TypeError: unhashable type: 'list'

It seems like I should add .drop_duplicates() only if the type isn't a list, is that right? The error occurs on line 156, which I have updated to: dataframe = pd.merge(dataframe, deprec_sch[['sector_abbr', 'deprec_sch', 'year']].drop_duplicates(), how='left', on=['sector_abbr', 'year'])
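
One possible workaround, sketched with toy data: this assumes the deprec_sch column holds a Python list in each row, which the TypeError suggests but which is not confirmed in this thread. drop_duplicates() has to hash every value it compares, so restricting it to the hashable key columns with subset= avoids the list column:

```python
import pandas as pd

# Toy stand-in for the deprec_sch table (values are made up).
deprec_sch = pd.DataFrame({
    "sector_abbr": ["res", "res", "com", "com"],
    "year": [2026, 2026, 2026, 2026],
    "deprec_sch": [[0.2, 0.32], [0.2, 0.32], [0.2, 0.32], [0.2, 0.32]],
})

# Without subset=, pandas tries to hash the list values and fails.
try:
    deprec_sch.drop_duplicates()
    raised_type_error = False
except TypeError:
    raised_type_error = True

# De-duplicate on the merge keys only; the list column is carried along untouched.
deduped = deprec_sch.drop_duplicates(subset=["sector_abbr", "year"])

print(raised_type_error, len(deprec_sch), len(deduped))
```

Note that subset= keeps the first row for each key combination, so if two rows shared the same keys but held different lists, the later one would be dropped silently.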

@amy-wang
Author

I also see that you have a PR out (thank you so much!). If you can prioritize getting that merged, I'll update my GitHub repo instead of making local changes.

@amy-wang
Author

Hi @michaelavs: Side note, I'm still having trouble running this code. I'm doing some work for a think tank (www.switch.box), and we were just hoping to understand how dGen works to see how we can leverage it in the future. Is there any chance you could share the output you got from running the model for any state (though preferably New York)?

I'm just trying to see what the output looks like in order to understand the model, and resolving these bugs hasn't been the best use of my time. Thank you!
