Download_Mode for File, S3_File and Enso_File #12017

wip
GregoryTravis committed Nov 13, 2024
commit 2c44654013c85c68bbeaee2eb61992f0aef8e80a
5 changes: 3 additions & 2 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data.enso
@@ -1,4 +1,5 @@
import project.Any.Any
export project.Data.Download.Downlad_Mode.Download_Mode
import project.Data.Pair.Pair
import project.Data.Read.Many_Files_List.Many_Files_List
import project.Data.Read.Return_As.Return_As
@@ -440,8 +441,8 @@ post (uri:(URI | Text)=(Missing_Argument.throw "uri")) (body:Request_Body=..Empt
- headers: The headers to send with the request. Defaults to an empty vector.
@uri (Text_Input display=..Always)
@headers Header.default_widget
download : (URI | Text) -> Writable_File -> HTTP_Method -> Vector (Header | Pair Text Text) -> File ! Request_Error | HTTP_Error
download (uri:(URI | Text)=(Missing_Argument.throw "uri")) file:Writable_File (method:HTTP_Method=..Get) (headers:(Vector (Header | Pair Text Text))=[]) =
Member

API suggestion: what if we rename mode to replace_existing?

IMO it will be much clearer for the user.

If I do:

Data.download url1 file mode=..If_Not_Exists
Data.download url2 file mode=..If_Not_Exists

tbh I'd still expect the second expression to download. That is because I am now downloading a different file to that destination. So while the destination exists, as a user I would expect it to be overwritten, because I've changed the URL - e.g. I was working with a report from June relying on the cache to only download it once, but now I want to start working with reports from July. I change the URL and expect the file to get redownloaded even if it exists - because I expect the data is new. I even reset caches and am confused why I'm still seeing June data in the file.

I understand that this is not what this was designed for, but I think the above is a likely user scenario.

Now, if we rename the parameter to replace_existing, the code reads as:

Data.download url1 file replace_existing=..If_Not_Exists
Data.download url2 file replace_existing=..If_Not_Exists

Now it is obvious to me that the second statement will do nothing if the first one succeeded - because the file is redownloaded only if it didn't exist in the first place (regardless of the URL). And now that semantics (what is currently implemented) is completely clear when reading the calls.
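The semantics described above can be sketched outside Enso. This is a minimal Python illustration, not the PR's implementation: `download_if_not_exists` and `fake_fetch` are hypothetical names, and the fetch is a stand-in so the sketch runs offline. It shows that only the destination is consulted, so a changed URL does not trigger a new download.

```python
import tempfile
from pathlib import Path
from typing import Callable


def download_if_not_exists(url: str, dest: Path, fetch: Callable[[str], bytes]) -> bool:
    """Write fetch(url) to dest only when dest is absent.

    Returns True if a fetch happened, False if the existing file was kept.
    """
    if dest.exists():
        # The URL is never consulted here: only the destination decides.
        return False
    dest.write_bytes(fetch(url))
    return True


def fake_fetch(url: str) -> bytes:
    # Stand-in for an HTTP request (assumption, for offline illustration only).
    return f"contents of {url}".encode()


with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.csv"
    first = download_if_not_exists("https://example.com/june", report, fake_fetch)
    second = download_if_not_exists("https://example.com/july", report, fake_fetch)
    report_text = report.read_text()
```

After both calls the file still holds the June contents, which is the June/July scenario from the comment.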

Member

And regardless of the parameter name, we need to update the method documentation to include it and ideally describe the expected semantics.

Member

Given that we rely on the 'refresh' button to clear caches in many cases, we should probably add a note that this download method works only based on file existence/age, so the refresh button does not affect it.

As a user I might expect the refresh button to ensure the file is redownloaded (whether that should work this way or not is up to discussion, I think current semantics are ok) - but with current semantics the refresh button just does nothing for download. I think it would be good if the documentation mentioned that, so that the user can know what to expect / can see that this is expected and not a bug if they get confused seeing that refresh button does nothing.

Contributor Author

Renamed to replace_existing, and added documentation.

I am not sure about how the refresh button interacts with the Always option, and I cannot run the front end to find out, so I did not mention that.

Member

With Always we always redownload the file, right? How could the refresh button interfere with that?

Contributor Author

What I mean is, I assume that refresh will also cause a download.

download : (URI | Text) -> Writable_File -> Download_Mode -> HTTP_Method -> Vector (Header | Pair Text Text) -> File ! Request_Error | HTTP_Error
download (uri:(URI | Text)=(Missing_Argument.throw "uri")) file:Writable_File (mode:Download_Mode=..If_Not_Exists) (method:HTTP_Method=..Get) (headers:(Vector (Header | Pair Text Text))=[]) =
    Context.Output.if_enabled disabled_message="As writing is disabled, cannot download to a file. Press the Write button ▶ to perform the operation." panic=False <|
        response = HTTP.fetch uri method headers cache_policy=Cache_Policy.No_Cache
        case Data_Link.is_data_link response.body.metadata of
@@ -0,0 +1,8 @@
import project.Data.Time.Duration.Duration

type Download_Mode
    ## Download the file if it does not already exist on disk.
    If_Not_Exists

    ## Download the file if it is older than the specified age.
    If_Older_Than age:Duration
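
The two modes declared above (download when the file is missing, download when it is older than a given age) amount to a small decision function. The following Python sketch is an illustration only, not the Enso implementation: the names `IfNotExists`, `IfOlderThan` and `needs_download` are hypothetical, and two behaviours are assumptions not confirmed by the diff: that a missing file is always downloaded in the age-based mode, and that age is measured from the file's last-modified time.

```python
import os
import tempfile
import time
from dataclasses import dataclass
from datetime import timedelta
from pathlib import Path
from typing import Optional, Union


@dataclass(frozen=True)
class IfNotExists:
    """Download only when the destination is absent."""


@dataclass(frozen=True)
class IfOlderThan:
    """Download when the destination is older than `age`."""
    age: timedelta


Mode = Union[IfNotExists, IfOlderThan]


def needs_download(dest: Path, mode: Mode, now: Optional[float] = None) -> bool:
    if not dest.exists():
        # Assumption: a missing file is downloaded in every mode.
        return True
    if isinstance(mode, IfNotExists):
        return False
    # Assumption: age is measured from the last-modified time.
    current = time.time() if now is None else now
    file_age = timedelta(seconds=current - dest.stat().st_mtime)
    return file_age > mode.age


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.bin"
    missing_needs = needs_download(target, IfOlderThan(timedelta(hours=1)))

    target.write_bytes(b"x")
    now = time.time()
    two_hours_ago = now - 2 * 3600
    os.utime(target, (two_hours_ago, two_hours_ago))  # backdate the file

    stale_needs = needs_download(target, IfOlderThan(timedelta(hours=1)), now=now)
    fresh_needs = needs_download(target, IfOlderThan(timedelta(hours=3)), now=now)
    exists_needs = needs_download(target, IfNotExists(), now=now)
```

Neither branch looks at the URL or at any HTTP cache, which matches the point raised in the review that a cache refresh does not affect this method.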
1 change: 1 addition & 0 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Main.enso
@@ -8,6 +8,7 @@ export project.Data.Boolean.Boolean.True
export project.Data.Decimal.dec
export project.Data.Decimal.Decimal
export project.Data.Dictionary.Dictionary
export project.Data.Download.Downlad_Mode.Download_Mode
export project.Data.Filter_Condition.Filter_Action
export project.Data.Filter_Condition.Filter_Condition
export project.Data.Hashset.Hashset
29 changes: 29 additions & 0 deletions test/Table_Tests/src/IO/Data_Spec.enso
@@ -0,0 +1,29 @@
from Standard.Base import all

from Standard.Test import all
import Standard.Test.Test_Environment

from enso_dev.Base_Tests.Network.Http.Http_Test_Setup import base_url_with_slash, pending_has_url

main filter=Nothing =
    suite = Test.build suite_builder->
        add_specs suite_builder
    suite.run_with_filter filter


with_test_file f ~action =
    Panic.with_finalizer (f.delete_if_exists)
        action f

add_specs suite_builder =
    suite_builder.group "Download Mode" pending=pending_has_url group_builder->
        url_n_bytes n = base_url_with_slash+'test_download?length='+n.to_text

        group_builder.specify "Will download a file if it does not exist" <|
            with_test_file (enso_project.data / "transient" / "if_not_exist.txt") file->
                file.exists . should_be_false
                Data.download (url_n_bytes 10) file
                first_contents = file.read
                Data.download (url_n_bytes 11) file
                second_contents = file.read
                first_contents . should_equal second_contents