Autoformatting with dictation #110
Comments
Thanks for this. I think adding an auto-formatting parameter for It looks like It should be noted that an additional parameter like this wouldn't really work with the recent formatting additions to the You can access the raw dictation list by using the

    def process_text(text):
        print(text.words)

    mapping = {"<text>": Function(process_text)}
    extras = [Dictation("text")]
    rule = MappingRule(mapping=mapping, extras=extras)

I could be wrong, but I don't think you can catch dictation like this and still have commands like "scratch that" work correctly. |
@alexboche If issues arise when using nsformat, please notify. It is a function that is from quite a while ago, and not very intensively used, I think. |
I got the nsformat.py thing to work with a tip from @quintijn. For the "scratch that" thing, we can just store the length of the last dictation string printed out, and then press backspace that many times (using control-backspace, or control-shift-left repeated followed by backspace, would be faster though not as reliable). See the code below for a working example (full code here).

The behavior of nsformat can be easily adjusted. For example, nsformat.py puts double spaces after periods (as well as ? and !). To change this to a single space, simply drop the double space on line 56. I.e. change it from

Note that it appears to be essential to not pass in the nsformat state variable (and similarly with

The state-keeping behavior can sometimes put a space when you don't want it. For example, consider the following sequence of utterances (where

As I see it, the only foolproof spacing solution would be to use information about the cursor position and the text on the page. This can be done through accessibility APIs and I think that is what Dragon itself does. (In principle, the clipboard could be used but I don't think that is worth pursuing.) In the meantime, it is much better to have an extra space than to not have a space when one should have one. To avoid the extra space, one can prefix words by "no-space" or just use a command like

    from dragonfly import *
    import nsformat

    dictation_lengths = []  # lengths of past outputs, for "scratch that" purposes
    input_state = None      # nsformat state carried over from the previous utterance

    def format_dictation(dictation):
        global dictation_lengths, input_state
        print("input_state: ", input_state)
        # Format this utterance, starting from the state left by the previous one.
        formatted_output, output_state = nsformat.formatWords(dictation.words, state=input_state)
        print("output_state: ", output_state)
        formatted_output = str(formatted_output)
        Text(formatted_output).execute()
        # for "scratch that" purposes
        dictation_lengths.append(len(formatted_output))
        input_state = output_state

    def scratch_that():
        try:
            # Erase the most recent output one character at a time.
            for i in range(dictation_lengths[-1]):
                Key("backspace").execute()
            dictation_lengths.pop()
        except IndexError:
            print("to use the command 'strike' you must dictate first")

    class CommandRule(MappingRule):
        # "n" is the repeat count that Repeat(extra='n') looks up; it defaults to 1.
        mapping = {"strike [<n>]": Function(scratch_that) * Repeat(extra='n'), "ross": Key("right")}
        extras = [IntegerRef("n", 1, 10)]
        defaults = {"n": 1}
    command_rule = CommandRule()

    class DictationRule(MappingRule):
        mapping = {"<dictation>": Function(format_dictation),}
        extras = [Dictation("dictation")]

Resetting the state on switching windows is something that might be useful and which nsformat does not do. Another issue is that the Text action is a little bit slow when printing out long dictation utterances. The text can be put on the clipboard and pasted to increase print speed, though it would be better to have the Text action just become faster somehow. Relatedly, the keypress speed is a bit too slow sometimes, including when using backspace for the "scratch that"-style command shown above. Edit: using a mimic or playback action might be a way to increase the speed of the text output. I have not yet had time to look into how the formatting would work with that (maybe Dragon would take care of it automatically). Lastly, the grammar with a command like |
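For what it's worth, the length-tracking idea for "scratch that" can be sketched without any engine at all. The class and method names below (`ScratchHistory`, `record`, `scratch_spec`, `reset`) are made up for illustration and are not part of dragonfly or nsformat. The sketch builds a single key spec with a repeat count (dragonfly's Key action accepts specs of the form "backspace:12"), which might cut down on per-keypress overhead compared to executing one Key action per character, though I have not measured that.

```python
class ScratchHistory(object):
    """Hypothetical helper: remembers how long each printed utterance was."""

    def __init__(self):
        self._lengths = []

    def record(self, text):
        """Call this after typing out a formatted utterance."""
        self._lengths.append(len(text))

    def scratch_spec(self, count=1):
        """Return a key spec erasing the last `count` utterances, or None."""
        total = 0
        for _ in range(count):
            if not self._lengths:
                break  # nothing left to scratch
            total += self._lengths.pop()
        return "backspace:%d" % total if total else None

    def reset(self):
        """Forget everything, e.g. after switching windows."""
        del self._lengths[:]


history = ScratchHistory()
history.record("Hello world.")
history.record(" This works")
spec = history.scratch_spec()  # spec for erasing the last utterance only
```

In a grammar, the returned spec would be passed to something like `Key(spec).execute()` when it is not None, and `reset()` could be called whenever the foreground window changes.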
Thanks again for looking into this. It's good to see there are different ways to implement "scratch that", although I was thinking about it and realised you could probably use "scratch that" and other commands normally by deactivating the As to how you would do that, a Your approach could still be useful for every other engine backend though. |
I would vote to keep this kind of autoformatting functionality (as an optional feature) in Text, not Dictation. Dictation is just capturing what you are saying, whereas this is a property of the action behavior. I do think that the accessibility API functionality I added would provide a clean way to implement this, because that way it integrates smoothly with text changes coming from outside Dragonfly (e.g. physical keypresses). This is currently a little awkward because the accessibility API depends on the Text action. Here are my initial thoughts on the cleanest way to integrate this:
I think this is nice because it provides a layered stack: device control at the lowest layer, accessibility API atop that, and actions atop that. The quick/hacky way to get something up and running would be to target (2) first using Text within the accessibility controller, like I currently do. I'm working on another project right now so I'm not planning on doing this anytime soon, but I'd be supportive if someone else wants to take it on. |
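To make the context-based idea a bit more concrete, here is a rough sketch of how spacing and capitalization could be decided purely from the text in front of the cursor. How that text is obtained (accessibility API or otherwise) is left out entirely, and `format_for_context` is a hypothetical name, not an existing dragonfly function. The rules are deliberately simplistic and English-only.

```python
def format_for_context(preceding_text, text):
    """Pick the leading space and capitalization from the text before the cursor."""
    stripped = preceding_text.rstrip(" \t")
    # Start of the document or of a line: capitalize, never add a space.
    at_start = stripped == "" or stripped.endswith("\n")
    # Right after sentence-ending punctuation: capitalize.
    after_sentence = stripped.endswith((".", "?", "!"))
    if at_start or after_sentence:
        text = text[:1].upper() + text[1:]
    # Add a space only if there is not already whitespace before the cursor.
    needs_space = not at_start and not preceding_text.endswith((" ", "\t", "\n"))
    return (" " if needs_space else "") + text


print(repr(format_for_context("Hello world.", "this works")))
```

Because the decision depends only on what is actually on the page, edits made outside the grammar (typing, mouse clicks, window switches) would not leave stale state behind, which is the main weakness of the state-keeping approach.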
Thanks for your thoughts on this @wolfmanstout and sorry for the (very) late response! Implementing optional autoformatting functionality like this would be nice, although I think it would get pretty complicated even with the abstraction you mentioned. I wanted to touch on the points in your previous posts @alexboche. Having moved to mostly using the Kaldi Dragonfly engine backend, I figured I needed a way to format dictation output. I discovered that the Anyway, I have adapted the Natlink formatting classes to work with the other engines and created a context-aware grammar that keeps track of formatting flags for each window I dictate into, going a step further than the |
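For readers wondering what per-window tracking might look like, here is a minimal sketch, assuming some window identifier is available (in dragonfly that could be something like the foreground window's handle, but this sketch does not depend on it). `PerWindowState` and the flag names are hypothetical and are not taken from the grammar described above.

```python
class PerWindowState(object):
    """Hypothetical store of formatting flags, one set per window."""

    def __init__(self, factory):
        self._factory = factory  # builds the initial flags for a new window
        self._states = {}

    def get(self, window_id):
        """Return the flags for this window, creating them on first use."""
        if window_id not in self._states:
            self._states[window_id] = self._factory()
        return self._states[window_id]

    def forget(self, window_id):
        """Drop the flags for a window, e.g. when it is closed."""
        self._states.pop(window_id, None)


states = PerWindowState(lambda: {"capitalize_next": True, "space_before": False})
editor_flags = states.get(1001)          # 1001 stands in for a window handle
editor_flags["capitalize_next"] = False  # changes persist for that window only
```

Each window then resumes with whatever flags it had when dictation last happened there, instead of inheriting the state of the previously focused window.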
@Danesprite This sounds great! I am looking forward to some nice formatting. I threw together a hacky version of what you are describing, but haven't gotten around to cleaning it up to be generally usable. I would love to try yours out. |
@daanzu Yeah, I'm finding it really useful! I'll try to upload it somewhere today so you can try it out :) |
@daanzu Here is a Gist with the command module and the text formatting code: https://gist.github.com/Danesprite/413895d62a4a699f14a48796f9fda7e7 Place both files with your other command modules and you should be good to go. It is an exclusive grammar with enable/disable commands. Say I've tested it with WSR on Windows and Kaldi on Linux. It works pretty well with both. Because I've re-used the Natlink formatting code, it is tuned to things you would normally say when using DNS. For example:
As you can see, there are special phrases that get translated into characters, such as "full stop". I'm hoping to clean up the Anyway, hope you find this useful! :) |
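As a rough illustration of how spoken phrases could be turned into characters, here is a small sketch. Only "full stop" comes from this thread; the other table entries and the names `SPOKEN_FORMS` and `translate_spoken_forms` are assumptions made up for the example, and the real formatting code in the Gist works differently.

```python
# Hypothetical table of spoken forms; only "full stop" is mentioned in the thread.
SPOKEN_FORMS = {
    ("full", "stop"): ".",
    ("comma",): ",",
    ("question", "mark"): "?",
    ("new", "line"): "\n",
}


def translate_spoken_forms(words):
    """Replace known spoken phrases in a word list with their characters."""
    result = []
    max_len = max(len(key) for key in SPOKEN_FORMS)
    i = 0
    while i < len(words):
        # Try the longest phrase first so "full stop" wins over single words.
        for n in range(max_len, 0, -1):
            key = tuple(w.lower() for w in words[i:i + n])
            if len(key) == n and key in SPOKEN_FORMS:
                result.append(SPOKEN_FORMS[key])
                i += n
                break
        else:
            result.append(words[i])
            i += 1
    return result


print(translate_spoken_forms(["hello", "comma", "world", "full", "stop"]))
```

The output is still a list of tokens, so spacing and capitalization rules can be applied afterwards as a separate step.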
@Danesprite Thanks, I'll try it out! |
@alexboche It would be trivial to add an optional parameter to If set to
I only implemented that parameter recently. It is documented here. Although, I must point out that the formatting output received by the |
I had forgotten about this issue. I still agree with James Stout, it should be optional, implemented in grammars. I'm hoping to add a dictation_format module for the other engines, used optionally. Probably it will only support English. I'll add a question and answer to Dragonfly's FAQ on solutions to (inter-utterance) dictation formatting, using Dragon or another engine. My dictation mode grammar should be enough to get the interested user started. |
(Editing this to make it more concise.)
Dragon does autoformatting with spacing and capitalization. For example, after you say a period, it will capitalize the next word. Regarding spacing, with dictation it is typically good to have a space inserted at the beginning of each utterance unless, e.g., you just switched windows or the last two characters were \n\n or something (possibly just a single \n).
Dragonfly already does something like this in dictation_format.py but it only formats dictation based on the other words said in the same utterance. It does not use information about what was said in the previous utterance (according to my testing). So e.g. if you say "period" in the middle of an utterance, it will capitalize the next word but if you end an utterance with a period and then say a new utterance it will not capitalize the next word after the period.
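To illustrate what I mean by keeping state across utterances, here is a toy sketch that is independent of dragonfly and of dictation_format.py. The class name `UtteranceFormatter` and its attributes are made up for this example. It assumes punctuation already arrives as its own token (".", "?", etc.) and it only handles capitalization after sentence ends and a leading space before later utterances.

```python
class UtteranceFormatter(object):
    """Toy formatter that remembers its state between utterances."""

    SENTENCE_END = (".", "?", "!")
    NO_SPACE_BEFORE = (".", ",", "?", "!", ":", ";")

    def __init__(self):
        self.capitalize_next = True  # start of document: capitalize
        self.space_before = False    # start of document: no leading space

    def format(self, words):
        """Format one utterance (a list of tokens) and update the state."""
        pieces = []
        for word in words:
            if word in self.NO_SPACE_BEFORE:
                pieces.append(word)  # punctuation attaches to the previous word
            else:
                if self.capitalize_next:
                    word = word[:1].upper() + word[1:]
                pieces.append((" " if self.space_before else "") + word)
            # Carry the state forward, possibly into the next utterance.
            self.capitalize_next = word in self.SENTENCE_END
            self.space_before = True
        return "".join(pieces)


formatter = UtteranceFormatter()
first = formatter.format(["hello", "world", "."])
second = formatter.format(["this", "works"])  # capitalized thanks to saved state
```

The key point is that `capitalize_next` and `space_before` survive the end of the first call, so the second utterance starts with a space and a capital letter even though it contains no period itself.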
I think maybe such cross-utterance state-keeping behavior should be included inside dragonfly. That said, it might not be appropriate for when the user is just dictating a few phrases occasionally with mid utterance dictation commands e.g.
"say <dictation>"
rather than working with a dictation grammar where all dictation is filtered through a text action. Perhaps Dictation could have a special parameter for whether to output:

Something simple like this has been implemented in Talon here (see the class AutoFormat) and in Natlink here (see the function formatWords on line 117, though I haven't looked closely) (edit: using nsformat.py in dragonfly is implemented in the post below). I also made a little toy example from scratch for capitalization after periods in my own dragonfly grammar here.

If this is not to be implemented in dragonfly itself, I would want to implement it in my own grammar. For doing that it might be useful for my commands to have access to the raw dictation list outputted by Dragon, i.e. the list of words before it gets formatted in dictation_format.py and before it gets joined into a string. How do I access that?
I have been experimenting with CCR between dictation and commands (with @daanzu's help), basically using

    Repetition(Alternative(RuleRef(dictation_rule), RuleRef(command_rule)))

where the dictation_rule filters dictation through a text action: "<dictation>": Text(dictation) (full working grammar here, based on what daanzu gave me). I would like the Dragon autoformatting to work when I'm filtering through the text action.

Also, a way to use the "scratch that" command would be cool, since that currently doesn't work when you filter dictation through a text action (edit: implemented below).