Refine single-value column to treat it as that single value #12120
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,6 +38,7 @@ from project.Errors import Conversion_Failure, Inexact_Type_Coercion, Invalid_Co | |
from project.Internal.Column_Format import all | ||
from project.Internal.Java_Exports import make_string_builder | ||
from project.Internal.Storage import enso_to_java, java_to_enso | ||
from project.Internal.Type_Refinements.Single_Value_Column import refine_with_single_value | ||
|
||
polyglot java import org.enso.base.Time_Utils | ||
polyglot java import org.enso.table.data.column.operation.cast.CastProblemAggregator | ||
|
@@ -87,7 +88,7 @@ type Column | |
example_from_vector = | ||
Column.from_vector "My Column" [1, 2, 3, 4, 5] | ||
from_vector : Text -> Vector -> Auto | Value_Type -> Column ! Invalid_Value_Type | ||
from_vector (name : Text) (items : Vector) (value_type : Auto | Value_Type = Auto) = | ||
from_vector (name : Text) (items : Vector) (value_type : Auto | Value_Type = Auto) -> Column = | ||
## If the type does not accept date-time-like values, we can skip the | ||
additional logic for polyglot conversions that would normally be used, | ||
which is quite costly - so if we can guarantee it is unnecessary, | ||
|
@@ -120,10 +121,12 @@ type Column | |
case needs_polyglot_conversion of | ||
True -> Java_Column.fromItems name (enso_to_java_maybe items) expected_storage_type java_problem_aggregator | ||
False -> Java_Column.fromItemsNoDateConversion name items expected_storage_type java_problem_aggregator | ||
result = Column.from_java_column java_column . throw_on_warning Conversion_Failure | ||
Calling
The workaround is to avoid calling
I think that is the behaviour we need for the intersection type solution to make any sense. We can't make it so easy to 'lose' the intersected types. While in libraries we can try to rely on workarounds, our users will get confused if the column stops being an |
||
result.catch Conversion_Failure error-> | ||
if error.example_values.is_empty then result else | ||
raise_invalid_value_type_error error.example_values.first | ||
multi_result = Column.from_java_column java_column | ||
result = Warning.throw_on_warning multi_result Conversion_Failure | ||
if Meta.is_error result . not then result else | ||
result.catch Conversion_Failure error-> | ||
if error.example_values.is_empty then result else | ||
raise_invalid_value_type_error error.example_values.first | ||
|
||
## PRIVATE | ||
Creates a new column given a name and an internal Java storage. | ||
|
@@ -135,9 +138,9 @@ type Column | |
|
||
## PRIVATE | ||
Creates a new column given a Java Column object. | ||
from_java_column : Java_Column -> Column | ||
from_java_column java_column = | ||
Column.Value java_column | ||
from_java_column java_column:Java_Column -> Column = | ||
This is a PRIVATE function. It doesn't really need a signature. E.g. the
Then the question is: what should be the signature of such a function? @radeusgd has proposed
Is that how you want
The idea was that adding the return type check here affects the semantics of all functions that rely on it. So even if I forget to update the signature in some case, all methods that return a
I was initially thinking to avoid hiding any additional types that were intersected with
I'm happy with either. But overall yes, that was my initial suggestion for how I think
Even if we don't end up using |
||
column = Column.Value java_column | ||
column |> refine_with_single_value | ||
|
||
## PRIVATE | ||
ADVANCED | ||
|
@@ -1202,8 +1205,8 @@ type Column | |
storage_type = Storage.from_value_type_strict common_type | ||
new_storage = Java_Problems.with_problem_aggregator Problem_Behavior.Report_Warning java_problem_aggregator-> | ||
case default of | ||
Column.Value java_col -> | ||
other_storage = java_col.getStorage | ||
col : Column -> | ||
other_storage = col.java_column.getStorage | ||
storage.fillMissingFrom other_storage storage_type java_problem_aggregator | ||
_ -> | ||
storage.fillMissing default storage_type java_problem_aggregator | ||
|
@@ -2699,9 +2702,9 @@ run_vectorized_binary_op column name operand new_name=Nothing fallback_fn=Nothin | |
Java_Problems.with_map_operation_problem_aggregator column.name Problem_Behavior.Report_Warning problem_builder-> | ||
storage_type = resolve_storage_type expected_result_type | ||
case operand of | ||
Column.Value col2 -> | ||
operand_column : Column -> | ||
s1 = column.java_column.getStorage | ||
s2 = col2.getStorage | ||
s2 = operand_column.java_column.getStorage | ||
rs = Polyglot_Helpers.handle_polyglot_dataflow_errors <| | ||
s1.vectorizedOrFallbackZip name problem_builder fallback_fn s2 skip_nulls storage_type | ||
Column.from_storage effective_new_name rs | ||
|
@@ -2792,9 +2795,9 @@ run_vectorized_binary_op_with_fallback_problem_handling column name operand fall | |
_ -> fallback_fn problem_builder | ||
storage_type = resolve_storage_type expected_result_type | ||
case operand of | ||
Column.Value col2 -> | ||
operand_column : Column -> | ||
s1 = column.java_column.getStorage | ||
s2 = col2.getStorage | ||
s2 = operand_column.java_column.getStorage | ||
rs = Polyglot_Helpers.handle_polyglot_dataflow_errors <| | ||
s1.vectorizedOrFallbackZip name problem_builder applied_fn s2 skip_nulls storage_type | ||
Column.from_storage new_name rs | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
private | ||
|
||
from Standard.Base import all | ||
|
||
import project.Column.Column | ||
import project.Value_Type.Value_Type | ||
from project.Internal.Type_Refinements.Single_Value_Column_Extensions import all | ||
|
||
refine_with_single_value (column : Column) = | ||
## We treat a column as single value if it contains a single not-nothing value. | ||
if is_single_value column . not then column else case column.inferred_precise_value_type of | ||
Value_Type.Integer _ -> | ||
# `inferred_precise_value_type` will return Integer if the column was Float (or Mixed) but contained integral values - e.g. [2.0] | ||
# We inspect the actual value to correctly deal with both Float and Mixed base type. | ||
value = column.at 0 | ||
case value of | ||
# If the value was really a float, we preserve that. | ||
_ : Float -> (column : Column & Float) | ||
# Otherwise we treat it as an integer. | ||
_ -> (column : Column & Integer) | ||
Value_Type.Float _ -> (column : Column & Float) | ||
Value_Type.Char _ _ -> (column : Column & Text) | ||
Value_Type.Boolean -> (column : Column & Boolean) | ||
Value_Type.Date -> (column : Column & Date) | ||
Value_Type.Time -> (column : Column & Time_Of_Day) | ||
Value_Type.Date_Time True -> (column : Column & Date_Time) | ||
Value_Type.Decimal _ scale -> | ||
is_integer = scale == 0 | ||
if is_integer then (column : Column & Integer) else (column : Column & Decimal) | ||
# Other types (e.g. Mixed) are not supported. | ||
_ -> column | ||
|
||
is_single_value column:Column -> Boolean = | ||
(column.length == 1) && (column.at 0 . is_nothing . not) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
private | ||
|
||
from Standard.Base import all | ||
|
||
import project.Column.Column | ||
from project.Internal.Type_Refinements.Single_Value_Column import is_single_value | ||
|
||
## This conversion is internal and should never be exported. | ||
Integer.from (that : Column) -> Integer = | ||
Runtime.assert (is_single_value that) | ||
x = that.at 0 | ||
case x of | ||
_ : Integer -> x | ||
_ : Float -> | ||
Runtime.assert (x % 1.0 == 0.0) | ||
x.truncate | ||
|
||
## This conversion is internal and should never be exported. | ||
Float.from (that : Column) -> Float = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 | ||
|
||
## This conversion is internal and should never be exported. | ||
Text.from (that : Column) -> Text = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 | ||
|
||
## This conversion is internal and should never be exported. | ||
Boolean.from (that : Column) -> Boolean = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 | ||
|
||
## This conversion is internal and should never be exported. | ||
Date.from (that : Column) -> Date = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 | ||
|
||
## This conversion is internal and should never be exported. | ||
Time_Of_Day.from (that : Column) -> Time_Of_Day = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 | ||
|
||
## This conversion is internal and should never be exported. | ||
Date_Time.from (that : Column) -> Date_Time = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 | ||
|
||
## This conversion is internal and should never be exported. | ||
Decimal.from (that : Column) -> Decimal = | ||
Runtime.assert (is_single_value that) | ||
that.at 0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -144,13 +144,13 @@ type Table | |
new : Vector (Vector | Column) -> Table | ||
new columns = | ||
invalid_input_shape = | ||
Error.throw (Illegal_Argument.Error "Each column must be represented by a pair whose first element is the column name and the second element is a vector of elements that will constitute that column, or an existing column.") | ||
Error.throw (Illegal_Argument.Error "Each column must be represented by a pair whose first element is the column name and the second element is a vector of elements that will constitute that column, or an existing column. Got: "+columns.to_text) | ||
cols = columns.map on_problems=No_Wrap.Value c-> | ||
case c of | ||
v : Vector -> | ||
if v.length != 2 then invalid_input_shape else | ||
Column.from_vector (v.at 0) (v.at 1) . java_column | ||
Column.Value java_col -> java_col | ||
col : Column -> col.java_column | ||
_ -> invalid_input_shape | ||
Panic.recover Illegal_Argument <| | ||
if cols.is_empty then | ||
|
@@ -2472,9 +2472,9 @@ type Table | |
unique.mark_used self.column_names | ||
|
||
resolved = case value of | ||
_ : Column -> value | ||
_ : Text -> self.make_constant_column value | ||
_ : Expression -> self.evaluate_expression value on_problems | ||
_ : Column -> value | ||
Comment on lines 2474 to -2477
Again, intersection types have changed something fundamental about Enso semantics.
E.g. here a single-text-value column would match both
|
||
_ : Constant_Column -> self.make_constant_column value.value | ||
_ : Simple_Expression -> value.evaluate self (set_mode==Set_Mode.Update && as=="") on_problems | ||
_ -> Error.throw (Illegal_Argument.Error "Unsupported type for `Table.set`.") | ||
|
With intersection types, with
The `a` and `b` are no longer interchangeable. The `T` part could be hidden in `a` (so you cannot pass it into functions that expect `T`) and it can be un-hidden by the `b : T` check, at which point `b` has it 'visible' and `b` can be passed to `f (t : T)` whereas `a` cannot.

So we need to keep in mind that whenever we rely on `case of`, we should not do `_ : T` but name it and use the named component. At least anywhere where we may expect the intersection types to come up.

cc: @jdunkerley @GregoryTravis

This is actually an important change to Enso semantics that we kind of knew about in #11600, but I'm not sure we have appreciated its implications enough yet.
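
A minimal sketch of the naming advice from the comment above, in Enso. The names `needs_integer` and `increment_if_integer` are hypothetical and do not appear in this PR; the comments restate the behaviour described in this thread rather than a verified specification.

```enso
from Standard.Base import all

# Hypothetical helper: its argument must visibly be an `Integer`.
needs_integer (n : Integer) -> Integer = n + 1

# `value` may be an intersection (e.g. `Column & Integer`) whose `Integer`
# part is hidden, as described in the comment above.
increment_if_integer value = case value of
    # Name the matched value and use the name in the branch body:
    # per the discussion, `i` has the `Integer` part visible.
    i : Integer -> needs_integer i
    # With a `_ : Integer ->` pattern, the branch body would have to fall
    # back to `value`, which may still have the `Integer` part hidden.
    _ -> Nothing
```

The same pattern appears in this diff, where `Column.Value col2 ->` branches were replaced by `operand_column : Column ->` and the branch body uses `operand_column`.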
`f.sqrt` and not `c.sqrt`."
Yes, I'm just reminding.