Building ranges #16
Comments
I'd reserved '\xff' ('ÿ') specifically to act as a high key sentinel for this use case, and I've been planning to add an API for encoding queries to make this easy. In the meantime you should be able to get the desired behavior by appending Just to be clear, you'd concat One caveat to this |
@jtremback don't use |
Thanks! I had been attempting to concat
var range = {
  gte: bytewise.encode(['i', query.k.join(',')].concat(gte)),
  lte: bytewise.encode(['i', query.k.join(',')].concat(lte).concat([undefined]))
} |
Ah, one last issue was solved by using |
@jtremback instead of doing |
I'm actually not trying to sort on query.k, it is more of a namespacing thing. You can think of it as a sublevel for a particular index.
// Generate a range that retrieves the documents requested by the query
function makeRange (query, level_opts) {
  // Avoid having to write queries with redundant array notation
  if (!Array.isArray(query.k)) { query.k = [ query.k ] }
  if (!Array.isArray(query.v)) { query.v = [ query.v ] }
  // Gathers values in query value field, generating gte - lte
  var acc = query.v.reduce(function (acc, item) {
    // Avoid having to write queries with redundant array notation
    if (!Array.isArray(item)) { item = [ item ] }
    // Push bottom of range (first array element) into gte
    acc.gte.push(esc(item[0]))
    // If it is not a range, use same value for lte, if it is use top of range
    acc.lte.push(esc(item.length > 1 ? item[1] : item[0]))
    return acc
  }, { gte: [], lte: [] })
  // Eliminate null values
  var lte = compact(acc.lte)
  var gte = compact(acc.gte)
  var range = {
    gte: bytewise.encode(['i', query.k.join(',')].concat(gte).concat([null])),
    lte: bytewise.encode(['i', query.k.join(',')].concat(lte).concat([undefined]))
  }
  if (query.reverse) { range.reverse = true }
  range = merge(level_opts || {}, range)
  return range
} |
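To make the query.v convention in makeRange easier to follow, here is a minimal sketch of just the reduce step. The name splitBounds and the sample values are hypothetical; the esc, compact and merge helpers and the bytewise encoding from the function above are deliberately left out, so this is not the full range builder.

```javascript
// Hypothetical helper: mirrors only the reduce step of makeRange.
// Escaping (esc), null removal (compact) and bytewise encoding are omitted.
function splitBounds (values) {
  return values.reduce(function (acc, item) {
    // A bare value is shorthand for a one-element array
    if (!Array.isArray(item)) { item = [ item ] }
    // The first element is the bottom of the range
    acc.gte.push(item[0])
    // The second element, if present, is the top; otherwise reuse the first
    acc.lte.push(item.length > 1 ? item[1] : item[0])
    return acc
  }, { gte: [], lte: [] })
}

var bounds = splitBounds(['london', [10, 20]])
console.log(bounds.gte) // [ 'london', 10 ]
console.log(bounds.lte) // [ 'london', 20 ]
```

An exact-match value lands in both bounds unchanged, while a two-element array contributes its first element to gte and its second to lte.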
@dominictarr using |
@loveencounterflow okay but how do you write that in JS?
|
AFAIK But the fact that you can still construct a legal js string that sorts higher (regardless of whether it's properly encoded) means you could still miss some string keys, so it'd be a lot cooler if the high (and low) key sentinels existed separate from the data types you're encoding -- and that's what I've been working on in my recent refactoring work. You'll be able to reference Still a WIP, but should be tightened up soon -- I need this to work smoothly for the query generation functionality in bytewise-uri. Oh, and since this wouldn't get utf8-encoded we could probably just use |
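The caveat about missed keys can be illustrated without bytewise at all. The sketch below uses plain JavaScript string comparison (UTF-16 code units) and hypothetical key names; it assumes '\xff' is appended as the upper bound, and shows that any key whose next character is above U+00FF falls outside the range.

```javascript
// Plain JS string comparison only; no bytewise encoding involved.
var prefix = 'user:'
var upper = prefix + '\xff' // 'ÿ' appended as a high-key sentinel

// Hypothetical keys; the last two continue with characters above U+00FF
var keys = ['user:alice', 'user:zoe', 'user:\u0100dam', 'user:\uffff']

var matched = keys.filter(function (key) {
  return key >= prefix && key <= upper
})
console.log(matched) // [ 'user:alice', 'user:zoe' ]
```

This is why a sentinel that lives outside the encoded data types is more robust than a sentinel that is itself a string.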
@dominictarr "how do you write that in JS?"—your example is a bit on the wrong side since what you really intend to do in the first example is
|
@loveencounterflow hmm... so my node version doesn't have String.fromCharCode(0x10ffff) == '\uffff'
//=> true (!!!) |
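The surprising result above is consistent with how String.fromCharCode is specified: each argument is truncated to its low 16 bits. The following sketch contrasts it with the ES6 String.fromCodePoint, assuming a runtime that provides the ES6 methods; the variable name is illustrative.

```javascript
// fromCharCode keeps only the low 16 bits of each argument
console.log(String.fromCharCode(0x10ffff) === '\uffff') // true
console.log((0x10ffff & 0xffff).toString(16)) // 'ffff'

// fromCodePoint (ES6) produces a surrogate pair instead
var highest = String.fromCodePoint(0x10ffff)
console.log(highest.length) // 2
console.log(highest === '\udbff\udfff') // true
console.log(highest.codePointAt(0).toString(16)) // '10ffff'
```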
@dominictarr |
so is there any way to do this in joyent/node? or browsers for that matter... looks like this needs to exist: |
oh no, looks like this one does it: https://www.npmjs.com/package/codepoint |
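For runtimes without String.fromCodePoint, the conversion can also be written by hand. This is a minimal sketch of the standard UTF-16 surrogate-pair arithmetic; the function name is hypothetical and it is not the API of the codepoint package linked above.

```javascript
// Hypothetical fallback: build a string from one code point using only
// String.fromCharCode, which is available in old runtimes.
function codePointToString (cp) {
  if (cp < 0 || cp > 0x10ffff) {
    throw new RangeError('Invalid code point: ' + cp)
  }
  // Code points up to 0xffff fit in a single UTF-16 code unit
  if (cp <= 0xffff) { return String.fromCharCode(cp) }
  // Otherwise split the 20 remaining bits into a high and a low surrogate
  cp -= 0x10000
  return String.fromCharCode(0xd800 + (cp >> 10), 0xdc00 + (cp & 0x3ff))
}

console.log(codePointToString(0x10ffff) === '\udbff\udfff') // true
console.log(codePointToString(0x41)) // 'A'
```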
Nice find! I'll get the |
fromCodePoint and codePointAt are a pair of relatively new methods introduced in ES6; they're accessed as
The complication that we're seeing here arises from the fact that when JavaScript was conceived in 1995, Brendan Eich was smart enough to embrace Unicode. Ironically, however, Unicode was at that point mere months away from going from a 16-bit codespace to a 32-bit codespace, and because of the way it was standardized, it was (seemingly permanently) doomed to deal with codepoints above
As an aggravating factor, some methods (and, hence, people's concepts) got confused about what a 'character' should be.
Long story short: On the one hand, things have gotten much simpler ever since NodeJS switched from CESU-8 to proper UTF-8, but on the other hand, other platforms may not be as smart; I sadly have no details on that because I do very little JS-in-the-browser these days. That said, CESU-8 only differs in how it deals with code points above
@dominictarr "String.fromCharCode(0x10ffff) == '\uffff'": I believe that is caused by
@deanlandolt "AFAIK \uffff is an invalid unicode code point": precise terminology is needed. There are (1) numbers that are outside of what Unicode considers candidates for code point assignment; those candidates are restricted to integer numbers |
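One practical consequence for range sentinels is that the "highest" string depends on whether keys are compared as UTF-16 code units or as UTF-8 bytes. The sketch below assumes Node's Buffer API with proper UTF-8 encoding; the variable names are illustrative.

```javascript
var bmpTop = '\uffff' // highest single 16-bit code unit
var astral = String.fromCodePoint(0x10ffff) // stored as '\udbff\udfff'

// JS compares strings by UTF-16 code unit: 0xffff > 0xdbff
console.log(bmpTop > astral) // true

// UTF-8 bytes preserve code point order: ef bf bf < f4 8f bf bf
var a = Buffer.from(bmpTop, 'utf8')
var b = Buffer.from(astral, 'utf8')
console.log(a.toString('hex'), b.toString('hex')) // efbfbf f48fbfbf
console.log(Buffer.compare(a, b)) // -1
```

So '\uffff' is the largest value under in-memory string comparison, but once keys are stored as UTF-8 bytes, strings containing code points above U+FFFF sort after it.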
This is almost done -- api and rationale explained in this ticket: #17 (comment) @jtremback let me know if the API sketched in that ticket addresses your use case. |
Hey- I'll be able to check this thoroughly tonight. FWIW I made it work in my application with the old bytewise, but these changes look exciting! EDIT: Thanks! I haven't run into any issues with |
Hi, I'm replacing indexes built by concatenating values with ÿ with arrays encoded by bytewise. In my current code, when I retrieve a range, I make it like this:
At the end of the lte, you can see 'ÿÿ', which ensures that the range is inclusive. How should I handle this with bytewise? Concat an extra '\xf0' on the end of the lte array?
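For readers unfamiliar with the string-concatenation approach described in this question, the following is a minimal sketch of the general technique, not the elided code from the issue. The key layout, function names and sample data are all hypothetical, and the comparison is plain JavaScript string ordering rather than an actual LevelDB range read.

```javascript
var SEP = '\xff' // 'ÿ'

// Hypothetical index key layout: <field>ÿ<value>ÿ<document id>
function indexKey (field, value, id) {
  return [field, value, id].join(SEP)
}

// Inclusive range covering every id stored under one field/value pair
function exactRange (field, value) {
  var prefix = [field, value].join(SEP)
  return { gte: prefix + SEP, lte: prefix + SEP + SEP }
}

var keys = [
  indexKey('age', '30', 'doc1'),
  indexKey('age', '30', 'doc2'),
  indexKey('age', '300', 'doc3'),
  indexKey('age', '31', 'doc4')
].sort()

var range = exactRange('age', '30')
var hits = keys.filter(function (key) {
  return key >= range.gte && key <= range.lte
})
console.log(hits.length) // 2
```

The trailing 'ÿÿ' on lte keeps every id under the exact value inside the range, provided no id starts with a character above 'ÿ'.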