Skip to content

Commit

Permalink
Generalize Bool parse method to AbstractString (JuliaLang#47782)
Browse files Browse the repository at this point in the history
* Generalize Bool parse method to AbstractString

Fixes JuliaStrings/InlineStrings.jl#57.

We currently have a specialization for `parse(Bool, ::Union{String, SubString{String})`
where `true` and `false` are parsed appropriately. The restriction to
`Union{String, SubString{String}}`, however, means we don't get this behavior
for other `AbstractString`s. In the linked issue above, for InlineStrings, we end up
going through the generic integer parsing codepath which results in an `InexactError`
when we try to do `Bool(10)`.

The proposal in this PR takes advantage of the fact that there is only the 2 comparisons
where we do `_memcmp` that require the input string to be "dense" (in memory), and otherwise,
we just do a comparison against a `SubString` of the input string.

Relatedly, I've wanted to introduce the concept of an abstrac type like:

```julia
abstract type MemoryAddressableString <: AbstractString
```

where the additional required interface would be being able to call
`pointer(::MemoryAddressableString)`, since a lot of our string algorithms depend on doing
these kind of pointer operations and hence makes it quite a pain to implement your own
custom string type.

* Apply suggestions from code review

Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>

Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>
  • Loading branch information
3 people authored Dec 6, 2022
1 parent cf5ae03 commit 63830a6
Show file tree
Hide file tree
Showing 2 changed files with 20 additions and 5 deletions.
15 changes: 10 additions & 5 deletions base/parse.jl
Original file line number Diff line number Diff line change
Expand Up @@ -176,7 +176,7 @@ function tryparse_internal(::Type{T}, s::AbstractString, startpos::Int, endpos::
return n
end

function tryparse_internal(::Type{Bool}, sbuff::Union{String,SubString{String}},
function tryparse_internal(::Type{Bool}, sbuff::AbstractString,
startpos::Int, endpos::Int, base::Integer, raise::Bool)
if isempty(sbuff)
raise && throw(ArgumentError("input string is empty"))
Expand All @@ -202,10 +202,15 @@ function tryparse_internal(::Type{Bool}, sbuff::Union{String,SubString{String}},
end

len = endpos - startpos + 1
p = pointer(sbuff) + startpos - 1
GC.@preserve sbuff begin
(len == 4) && (0 == _memcmp(p, "true", 4)) && (return true)
(len == 5) && (0 == _memcmp(p, "false", 5)) && (return false)
if sbuff isa Union{String, SubString{String}}
p = pointer(sbuff) + startpos - 1
GC.@preserve sbuff begin
(len == 4) && (0 == _memcmp(p, "true", 4)) && (return true)
(len == 5) && (0 == _memcmp(p, "false", 5)) && (return false)
end
else
(len == 4) && (SubString(sbuff, startpos:startpos+3) == "true") && (return true)
(len == 5) && (SubString(sbuff, startpos:startpos+4) == "false") && (return false)
end

if raise
Expand Down
10 changes: 10 additions & 0 deletions test/parse.jl
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,16 @@ Base.iterate(::Issue29451String, i::Integer=1) = i == 1 ? ('0', 2) : nothing
@test Issue29451String() == "0"
@test parse(Int, Issue29451String()) == 0

# https://github.com/JuliaStrings/InlineStrings.jl/issues/57
struct InlineStringIssue57 <: AbstractString end
Base.ncodeunits(::InlineStringIssue57) = 4
Base.lastindex(::InlineStringIssue57) = 4
Base.isvalid(::InlineStringIssue57, i::Integer) = 0 < i < 5
Base.iterate(::InlineStringIssue57, i::Integer=1) = i == 1 ? ('t', 2) : i == 2 ? ('r', 3) : i == 3 ? ('u', 4) : i == 4 ? ('e', 5) : nothing
Base.:(==)(::SubString{InlineStringIssue57}, x::String) = x == "true"

@test parse(Bool, InlineStringIssue57())

@testset "Issue 20587, T=$T" for T in Any[BigInt, Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8]
T === BigInt && continue # TODO: make BigInt pass this test
for s in ["", " ", " "]
Expand Down

0 comments on commit 63830a6

Please sign in to comment.