
scripting.new

Jeremy Faden edited this page Jun 14, 2024 · 1 revision

Purpose: Long-awaited refresh of scripting documentation.

Introduction to Scripting

Autoplot uses the programming language Jython to provide scripting. Jython uses the same syntax as Python, but runs on the Java Virtual Machine rather than relying on C- or FORTRAN-based libraries. With scripting, the scientist can solve many common problems. Some are simple, such as seeing what the ratio of two datasets looks like. Others are more complex, such as creating plots for each day of a ten-year mission. There is practically no limit to what can be done with scripting, because all of Java is available along with the libraries included in Autoplot.

Autoplot's data handling library, QDataSet, is used to represent data. QDataSets are used much like NumPy arrays, but along with the data they carry metadata such as axis labels, timetags, and units. Data is loaded into this environment using the "getDataSet" command and data URIs. For example, consider this script:

ds1= getDataSet( 'http://autoplot.org/data/image/Capture_00158.jpg?channel=greyscale' )
ds2= getDataSet( 'http://autoplot.org/data/image/Capture_00159.jpg?channel=greyscale' )
result= abs( ds2 - ds1 )
plot( result )

It should be fairly clear what the script does. It loads grayscale views of two images, assigns the absolute difference to the variable "result," and plots it. Instead of plotting, we might look for the greatest difference.

Autoplot Jython scripts are Jython plus other stuff

Autoplot scripts have useful symbols and objects imported automatically. We decided early on to favor scripts that are easy to write over scripts that are portable Jython, because the goal was to provide a scripting environment, and whether this was done with Jython or Python wasn't seen as important. Since this decision was made, Python has become quite popular in the space physics community, and those familiar with Python might find the automatic imports objectionable. Because of this, Autoplot Jython scripts are given the extension .jy or .jyds instead of .py.

Some 200-300 commands are imported automatically, including "getDataSet" and "abs". A number of useful objects are also brought into the environment, such as "dom" which is the state of the application, and "monitor" which can report the progress when a task takes a while to perform.

Script Editor

Autoplot provides an editor GUI for working with scripts. Selecting Options→Enable Feature→Script Panel will reveal a tab named "script" where scripts may be entered and executed. The editor provides simple completions for this environment. To see completions, press TAB or ctrl-space.

The editor also has a control for the context where the script will be run. There are two kinds of scripts: application context and data source context. Data source context scripts know only what's needed to load data. These might be referenced by a URI in a .vap file, called by a server, or used to send data to a remote client. They don't have the command "plot," for example, because the Autoplot application isn't running, just its libraries. The other context is the application context, which has access to everything, including references to the running application. Application context scripts can load dataset URIs to make plots, respond to mouse events, and create new plot rendering types.

Data Source Context

Data Source Context scripts can load in data and return a new dataset (or datasets). These are files that can be used as if they were datasets. If the above script were saved in the file "http://autoplot.org/data/imageDiff.jyds," then http://autoplot.org/data/imageDiff.jyds?result would refer to this dataset. (Note "result" is what's plotted by default.) This allows scientists to publish operations done to data as well as the data itself. Note these scripts are unaware of the Autoplot application: they can load and operate on data, but they cannot plot it. Commands available in this context are described below under the section "Ops."

Script examples: https://sourceforge.net/p/autoplot/code/HEAD/tree/autoplot/trunk/JythonDataSource/src/ and http://autoplot.org/cookbook#Scripting.

These scripts are saved with the extension ".jyds" and like all Autoplot data can reside on a web site so that anyone can use them from anywhere.

Application Context

These scripts access the application itself. Take the following:

trs= generateTimeRanges( '$Y-$m-$d', '2020-January' )
for tr in trs:
   plot( 'vap+cdaweb:ds=AC_H0_MFI&id=Magnitude&timerange='+tr )
   writeToPng( '/tmp/ap/%s.png' % tr )

This script would run the application through each day of the month January 2020, making images of each day. All commands are available in the application context.
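To make the loop above concrete, the following plain-Python sketch builds the kind of list the loop iterates over and the PNG file names it would write. The helper daily_ranges is a hypothetical stand-in written for this page, not Autoplot's generateTimeRanges, and the real command's output strings may be formatted differently.

```python
from datetime import date, timedelta

def daily_ranges(year, month):
    """Hypothetical stand-in: one 'YYYY-MM-DD' string per day of the month."""
    day = date(year, month, 1)
    out = []
    while day.month == month:
        out.append(day.strftime('%Y-%m-%d'))
        day += timedelta(days=1)
    return out

trs = daily_ranges(2020, 1)
# Same file-name pattern as the writeToPng call in the script above.
names = ['/tmp/ap/%s.png' % tr for tr in trs]
print(len(trs))
print(names[0])
```

Each element of trs plays the role of tr in the script, so one image file is produced per day.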

Examples are available at https://sourceforge.net/p/autoplot/code/HEAD/tree/autoplot/trunk/Autoplot/src/scripts/, http://autoplot.org/cookbook#Scripting, and https://github.com/autoplot/dev/.

This Javadoc, https://ci-pw.physics.uiowa.edu/job/autoplot-javadoc/lastSuccessfulBuild/artifact/doc/org/autoplot/ScriptContext.html, describes the additional commands available in this Jython Application Context. Note the symbol "dom" is also available in Application Context scripts, and is the state of the application canvas.

Application context scripts are saved with the extension ".jy". They can also be run from memory; that is, they can be run from the editor without first saving the script to a file.

Getting Started

This document aims to introduce scripting well enough that the scientist can understand scripts they encounter and write scripts of their own.
There are hundreds of commands available, and this section introduces scripting through a number of commonly used ones.

Each entry below gives the command name, a description, and an example.

getFile: retrieve a file to make it local to the machine
    f='http://autoplot.org/data/14MB_1.qds'
    print(getFile(f,monitor).length())
getDataSet: load the URI into memory
    u='http://autoplot.org/data/14MB_1.qds'
    ds=getDataSet(u,monitor)
dataset: form a dataset from an array or multiple datasets, adding properties and asserting y is dependent on x
    dataset(x,y,title='y as a function of x')
bundle: form a "bundle" dataset, asserting that two or more datasets are correlated
    bmag=getDataSet(...)
    fce=getDataSet(...)
    fpi=getDataSet(...)
    ds=bundle(bmag,fce,fpi)
ds.putProperty: set a property for the data
    ds.putProperty(QDataSet.TITLE,'14 MB of data')
formatDataSet: format the dataset to a file
    formatDataSet(ds,'/tmp/mydata.cdf')
plot: plot the dataset
    plot(ds,ytitle='14MB')
trim: trim the data to the indices or range
    tr='2000-01-19T01:00/2000-01-20T10:00'
    ds=trim(ds,datumRange(tr))
where: return indices where the condition is true
    r=where(ds.gt(1e5))
sum: return the sum of the data
    print(sum(ds[r]))
reduceMax: return the maximum value found in rows of the dataset
    plot(reduceMax(ds,1))
synchronize: synchronize data to the same timetags by finding nearest neighbors
    (ds1,ds2)=synchronize(ds0,(ds1,ds2))

You can try each of these commands interactively in the log console at the "AP>" prompt, or in the script panel. If your console is not enabled, activate it with [menubar]→Options→Enable Feature→Log Console. You will want to enable the script panel as well.

Here is an example script you can try:

ds=getDataSet('http://autoplot.org/data/14MB.qds',monitor)
r=where(ds.gt(1e5))
plot(ds[r],title='My Data')

The first line loads the data into the variable ds, using the variable monitor to provide progress feedback. When a script is run, a number of variables, like monitor, are already defined. The variable ds is a QDataSet, which is like an array in other languages, but also carries metadata and the time tags.

The second line uses the "where" command, which is much like the IDL "where" command, to return a list of indices where the condition is true. Note that ds is a two-index, or "rank 2," dataset, containing measurements for each time and energy. The result r is also rank 2: the first index runs over the positions where the condition is true, and the second index selects which index of ds is given (0 for the record, 1 for the position within the record). So ds[r[0,0],r[0,1]] is the first value where the condition is true, and ds[r[-1,0],r[-1,1]] is the last.

The third command plots the subset of the data where the condition is true, using "My Data" for the title of the plot.
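The shape of the "where" result for a rank 2 dataset can be pictured with nested lists. This is a plain-Python sketch: where_rank2 is a hypothetical helper written for this illustration, and nested lists stand in for QDataSets, so the indexing syntax differs from the ds[r[0,0],r[0,1]] form used in Autoplot.

```python
# Stand-in for a rank 2 dataset: 3 records (times) by 4 channels (energies).
ds = [[1.0, 2e5, 3.0, 4.0],
      [5.0, 6.0, 7e5, 8.0],
      [9e5, 1.0, 2.0, 3.0]]

def where_rank2(data, condition):
    """Hypothetical helper: return [i, j] pairs where condition(data[i][j]) is true."""
    return [[i, j]
            for i, row in enumerate(data)
            for j, value in enumerate(row)
            if condition(value)]

r = where_rank2(ds, lambda v: v > 1e5)

# Each row of r holds the record index and the position within the record.
first = ds[r[0][0]][r[0][1]]
last = ds[r[-1][0]][r[-1][1]]
print(r)
print(first)
print(last)
```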


Tab completions are useful for looking up command documentation. You might spend some time poking around the documentation to see the hundreds of commands which are available.

Data Set Operators

Jython allows operator overloading, so operators like +,-,*, and / are supported. So if A and B are datasets, then C=A*B will perform C[i]=A[i]*B[i] for each value of i. This is similar to Matlab and IDL. Note if A is rank 1 and B is rank 2, Autoplot will perform C[i,j]=A[i]*B[i,j]. In Jython, ** ("star star") is the "pow" operator (ds**2 squares each element), not ^ ("caret").
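The element-by-element rules can be spelled out with plain lists. The sketch below is only an illustration of the index arithmetic described above; multiply is a hypothetical function written for this page, whereas in Autoplot the * operator does this work on QDataSets directly.

```python
def multiply(a, b):
    """Hypothetical stand-in for A*B, where a is rank 1 and b is rank 1 or rank 2."""
    if isinstance(b[0], list):
        # rank 1 times rank 2: C[i][j] = A[i] * B[i][j]
        return [[a[i] * b[i][j] for j in range(len(b[i]))] for i in range(len(a))]
    # rank 1 times rank 1: C[i] = A[i] * B[i]
    return [a[i] * b[i] for i in range(len(a))]

A = [1.0, 10.0]
B1 = [2.0, 3.0]
B2 = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]

C1 = multiply(A, B1)
C2 = multiply(A, B2)
squares = [x ** 2 for x in B1]   # ** is the power operator, not ^
print(C1)
print(C2)
print(squares)
```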

Additional "infix" operators go in between datasets:

name example description
gt A.gt(10.) returns 1 (True) where A is greater than 10.
ge A.ge(10.) returns 1 where A is greater than or equal to 10.
lt A.lt(10.) returns 1 (True) where A is less than 10.
le A.le(10.) returns 1 where A is less than or equal to 10.
eq A.eq(10.) returns 1 where A is equal to 10.
ne A.ne(10.) returns 1 where A is not equal to 10.

Note that all operators propagate fill values. For example, if A[10] is a fill value, then C = A*1.2 will also have a fill value at C[10].
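Fill propagation can be pictured with a plain list. In this sketch the fill value -1e31 and the function scale are assumptions chosen for illustration; a QDataSet carries its own fill value as metadata, and the operators handle it without any extra code.

```python
FILL = -1e31   # assumed fill value, for this sketch only

def scale(data, factor, fill=FILL):
    """Multiply each element by factor, passing fill values through unchanged."""
    return [fill if v == fill else v * factor for v in data]

A = [1.0, 2.0, FILL, 4.0]
C = scale(A, 1.2)
print(C[2] == FILL)   # the fill at index 2 is still a fill in the result
```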

Building Scripts

Use of Progress Monitor

All scripts have a progress monitor that can be used to provide feedback to the script's invoker. This is the variable monitor and is used like so:

monitor.setTaskSize(200)                     # the number of steps (arbitrary units)
monitor.started()                            # the task is started  

d=getFile('http://autoplot.org/data/14MB.qds',monitor.getSubtaskMonitor(0,100,'load file'))
for i in xrange(100):                        # xrange(100) iterates through 0,1,...,98,99
   if ( monitor.isCancelled() ): break       # check to see if the task has been cancelled.
   monitor.setProgressMessage('at %d' % i)   # this describes actions done to perform the task.  
   monitor.setTaskProgress(i+100)
   sleep(100)                                # sleep for 100 milliseconds

monitor.finished()     # indicate the task is complete

A well-written script will use the monitor to effectively convey information to the person running it. Imagine the scientist is the head of a company, and the script is the Manager of a process. The process is implemented by a Worker. All three parties use the progress monitor. The Worker calls setProgressMessage and setTaskProgress to convey the state of the task. The Worker checks isCancelled to see if they should quit the process. The Manager calls setLabel to convey the overall goal of the process. Typically Autoplot will play the role of the Manager, setting the label "loading data" or "executing script", but it is acceptable for the script to take on this role as well, since it can be more descriptive.
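The division of labor between Manager and Worker can be sketched without Autoplot. SketchMonitor below is a stand-in written for this page that only records what it is told; it borrows the method names mentioned above and is not Autoplot's monitor class. The worker function and its inputs are likewise hypothetical.

```python
class SketchMonitor(object):
    """Stand-in for the progress monitor; it only records the calls made to it."""
    def __init__(self):
        self.label = ''
        self.size = 0
        self.progress = 0
        self.message = ''
        self.cancelled = False
        self.done = False
    def setLabel(self, label): self.label = label
    def setTaskSize(self, size): self.size = size
    def started(self): self.progress = 0
    def setTaskProgress(self, progress): self.progress = progress
    def setProgressMessage(self, message): self.message = message
    def isCancelled(self): return self.cancelled
    def finished(self): self.done = True

def worker(items, monitor):
    """The Worker reports its state and checks for cancellation on each step."""
    monitor.setTaskSize(len(items))
    monitor.started()
    total = 0
    for i, item in enumerate(items):
        if monitor.isCancelled():
            break
        monitor.setProgressMessage('at %d' % i)
        total += item
        monitor.setTaskProgress(i + 1)
    monitor.finished()
    return total

mon = SketchMonitor()
mon.setLabel('summing values')    # the Manager states the overall goal
result = worker([1, 2, 3, 4], mon)
print(result)
```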

Adding Controls

Controls are added to the script using the "getParam" call. This call will get numbers, strings, URIs, and other types, using named parameters. Default values must be specified as well, and a single-line text label is used to describe the control. These controls are usually provided to the scientist using the script as a Java GUI control, but using a call like this allows scripts to be called from the command line with command line arguments, or from a web server where HTML is used to create the controls.

s= getParam( 's', 'deflt', 'string parameter' )                   # gets a string parameter, with default value "deflt"
f= getParam( 'f', 2.34, 'floating point parameter' )              # gets a float parameter, with default value 2.34
i= getParam( 'i', 100, 'array size' )                             # gets an integer parameter.  
vol= getParam( 'vol', 100.0, 'volume' )                           # gets a float parameter. 
e= getParam( 'e', 'RBSPA', 'spacecraft', [ 'RBSPA', 'RBSPB' ] )   # enumeration with the values given
c= getParam( 'c', 'F', 'apply correction', [ 'T', 'F' ] )         # booleans are just enumerations with the values 'T' and 'F'
b= getParam( 'b', False, 'apply correction', [ True, False ] )    # Python booleans can be used as well
l= getParam( 'l', 0, 'threshold', 
             { 'labels':['all','<20','20-50','>50'], 'values':[0,1,2,3] } ) # labels provided for each value of enumeration.

Autoplot looks for getParam calls in a script and automatically adds the corresponding controls to the GUI. The type is determined by the default value. Types are shown above, and there are also "Datums" which are a physical quantity (like 5Hz), "DatumRanges" which span a range from min datum to max datum, often used for time ranges, "URIs" which locate data, and "URLs" which locate web resources. Note files are not a type, and URLs can be used for this purpose with the "file:" prefix.

To create the GUI, Autoplot simplifies the script to just the getParam calls and the code needed to evaluate them (for example, range(1,4), which returns 1,2,3), and then runs the simplified script.

Enumerations are supported as well, where a list of possible values is enumerated. For example:

sensor= getParam( 'sensor', 'left', 'sensor antenna', ['left','right'] )

will get the parameter sensor, which can be either left or right (with left as the default). When a GUI is created, a droplist of possible values is used instead of a text entry field.

Last, booleans are allowed, and a checkbox is used when a GUI is produced:

correct= getParam( 'correct', 'T', 'perform correction on the data', [ 'T', 'F' ] )
if correct=='T': doCorrection()

Note you cannot use the result as a boolean in the Jython code. You must compare it to 'T'. With Autoplot 2020, you can finally use booleans to avoid this problem:

correct= getParam( 'correct', True, 'perform correction on the data', [ True, False ] )
if correct: doCorrection()

Application-context scripts can use getParam as well. When the execute button is pressed the default values are used, but when shift-Execute is pressed a dialog for controlling the parameters is shown. Scripts can be run from the command line as well, and would look like:

java -cp autoplot.jar org.autoplot.AutoplotUI --script /tmp/myscript.jy sensor=right correct=T
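One way to picture how name=value arguments reach the getParam calls is the sketch below. It is a plain-Python stand-in and not Autoplot's implementation: make_get_param is hypothetical, and the rule that 'T' means true for a boolean default is an assumption made for this example. It does follow the rule stated above that the type is determined by the default value.

```python
def make_get_param(argv):
    """Build a hypothetical getParam stand-in from 'name=value' arguments."""
    supplied = dict(arg.split('=', 1) for arg in argv)

    def get_param(name, default, label, values=None):
        if name not in supplied:
            return default
        text = supplied[name]
        # Coerce the text to the type of the default; bool is checked before int
        # because bool is a subclass of int.
        if isinstance(default, bool):
            value = text in ('T', 'True', 'true')
        elif isinstance(default, int):
            value = int(text)
        elif isinstance(default, float):
            value = float(text)
        else:
            value = text
        if values is not None and value not in values:
            raise ValueError('%s must be one of %s' % (name, values))
        return value

    return get_param

getParam = make_get_param(['sensor=right', 'correct=T'])
sensor = getParam('sensor', 'left', 'sensor antenna', ['left', 'right'])
correct = getParam('correct', True, 'perform correction on the data', [True, False])
size = getParam('i', 100, 'array size')   # not supplied, so the default is used
print(sensor)
```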

Exceptions and Logging

Often when running a process, we know exceptional cases may occur and we want to invoke special code to handle them. For example, we may open a sequence of files, and if one is misformatted, we want to log it to a file and carry on with processing other files.

In Jython we use try/except blocks, something like:

try: 
    plot( uri )
    writeToPng( '/tmp/pngs/%s.png' % uri )
except Exception, ex:
    ERROR.write( '# unable to plot ' + uri + ' because of ' + str(ex) )

and sometimes you need to throw an exception, indicating a strange case the script has encountered:

if ds.length()==0:
    raise Exception('Dataset is empty')
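Putting the two pieces together, the following plain-Python sketch processes a sequence of files, records each failure, and carries on with the rest. The load function and file names are hypothetical, and a list stands in for the error log file. The sketch uses the "except Exception as ex" spelling, which assumes Jython 2.7 or standard Python.

```python
def load(name):
    """Hypothetical loader that fails on a misformatted file."""
    if name.endswith('.bad'):
        raise ValueError('misformatted file')
    return [1.0, 2.0, 3.0]

names = ['a.dat', 'b.bad', 'c.dat']
loaded = []
errors = []   # stands in for an error log file

for name in names:
    try:
        data = load(name)
        if len(data) == 0:
            raise Exception('Dataset is empty')
        loaded.append(name)
    except Exception as ex:
        # Record the problem and continue with the next file.
        errors.append('# unable to load %s because of %s' % (name, ex))

print(loaded)
print(errors)
```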

Logging is useful for providing feedback to the script developer when things are broken but suppressing information when it's not needed.

from java.util.logging import Logger,Level
from org.das2.util import LoggerManager
logger= LoggerManager.getLogger('myScript')
logger.setLevel(Level.FINE) # Normally this is set to INFO, meaning only informational messages are printed.
logger.info('loading data')    # the script invoker is interested in these messages
logger.fine('reading record')  # only the developer is interested in this fine detail.

Loggers are useful because you can set the level to Level.FINE when debugging code, then leave it at Level.INFO when using the script for science. The console also has a useful control for setting log levels.

Symbols Automatically Defined

name description
monitor progress monitor, which can be ignored or passed to slow function.
dom object representing the state of the current application
PWD directory containing the script, a URL like http://autoplot.org/data/tools/ or file:///home/user/analysis/

The Autoplot DOM contains the state of the application. This is a tree-like structure containing the list of plots and their positions on the canvas, the data URIs, and the plot elements, each of which references the data to read and the plot onto which the data will be rendered. Any DOM property can be set, for example try dom.plots[0].title='My Data' or dom.plotElements[0].style.color=Color.RED.

DOM refers to a running application, and you will often see references containing "controller" like in dom.plots[0].controller.isPendingChanges(). The plot's controller node is added to manage the run-time data needed for managing the plot. This also provides access to the Das2 objects which are used to implement the plot, and additional functionality is provided by these. (Das2 is the graphics library used to create Autoplot.) Last, the controller node may provide functions useful in scripts like dom.controller.addConnector(dom.plots[0],dom.plots[1]).

PWD can be used in scripts to refer to the folder containing the script. This is a string containing the URI location of the script, be it on a local hard drive or a remote web folder. This will always end with a slash, and will often start with "http://" or "file://". The script editor tab allows a script to be composed and run without saving it to a file. In this case, PWD is not defined.
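Because PWD always ends with a slash, a file next to the script can be located by simple concatenation. The sketch below also guards against the unsaved-script case. It assumes "not defined" means the name PWD is absent (or None), and both the fallback folder and the file name calibration.dat are hypothetical.

```python
try:
    base = PWD          # set by Autoplot when the script was loaded from a file or URL
except NameError:
    base = None         # assumed behavior for a script run from the editor unsaved

if base is None:
    base = 'file:///home/user/analysis/'   # assumed fallback, for this sketch only

uri = base + 'calibration.dat'   # hypothetical file stored next to the script
print(uri)
```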

More Utility

Scripts have a central role in Autoplot. While Autoplot makes it easy to plot data directly from files, often a bit more "massaging" of the data is needed before it's useful for analysis. Massaging data is just one of many use cases fulfilled by scripts.

Scripts allow additional functionality to be added to the application. For example, PNGWalk generation was once just a script. Scripts allow new functionality to be added to Autoplot by anyone, not just Java developers who have commit access to Autoplot's source code. Note that when a script is run, Autoplot provides an option to add it to the Tools menu, so that it's easily run. Note that scripts run from the Tools menu run without any security checks, because scripts referenced there are trusted for use.

Scripts can also be used to create simple web applications. For example, https://jfaden.net/AutoplotServlet/ScriptGUIServlet shows how a script is read by the server to create a web GUI and the image resulting from running the script is displayed alongside the controls. Groups interested in setting up such a server should contact Autoplot's developers.

Simple commands can be created as well, using the command-line interface. For example, download the script https://github.com/autoplot/dev/blob/master/demos/2019/20190726/demoParams.jy and then run at the command line:

/usr/bin/java -jar /home/jbf/bin/autoplot.jar --headless --testPngFilename=/tmp/foo.png --script=demoParams.jy nc=4 noise=0.01

Autoplot's "Run Batch" tool takes a script and generates parameter states and runs the script against each state. For example, suppose you want to detect an error condition in your data. You could write a script that detects if one day's data contains the condition, then use the Run Batch tool to test the entire mission.

Scripts are useful for testing as well. The Autoplot testing server contains hundreds of scripts which verify corners of its functionality which aren't tested using .vap files. Often scripts are provided by others which demonstrate a bug, and these scripts are incorporated into the testing environment to ensure the bug doesn't return.

Run Batch

Another use for scripts is creating batch jobs. A script is created which takes one or two parameters, and the "Run Batch" tool at [menubar]→Tools→Run Batch loads the script and generates inputs for it. The job is then run, with blue or red dots showing successful or failed runs, and the standard output from each run is kept. Optionally, the tool will also capture the canvas image at the end of each run.
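The idea behind a batch run can be sketched in plain Python: generate every combination of the parameter values, run the script once per combination, and keep the outcome of each run. Here run_script, its parameters, and its failure condition are all hypothetical, and the Run Batch tool does this work through its GUI rather than through code like this.

```python
from itertools import product

def run_script(day, sensor):
    """Hypothetical script body that fails when an error condition is detected."""
    if sensor == 'right' and day == 2:
        raise ValueError('error condition found')
    return 'ok'

# One parameter state per combination of the two parameters.
states = [{'day': d, 'sensor': s}
          for d, s in product([1, 2, 3], ['left', 'right'])]

results = []
for state in states:
    try:
        run_script(**state)
        results.append((state, True, ''))
    except Exception as ex:
        results.append((state, False, str(ex)))

failed = [state for state, ok, message in results if not ok]
print(len(states))
print(failed)
```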

Relation Between .jy, .jyds, and .vap Files

With two types of Jython scripts running around, it can be confusing as to what is used where. A .jy script can do everything Java and Autoplot can do, and can have references to .vap files. These extend Autoplot's functionality to perform some science task. For example, a .jy script can load a .vap and create a screenshot. It can also contain a "getDataSet" call to a .jyds file as well. A .vap file can contain references to a .jyds file (to load data), but it will not be able to use a .jy script.

Using GitHub (and GitLab) to Store and Serve Scripts

Autoplot tries to work "in the web" so that people at different institutions can share vap files that grab files from web sites and no one thinks about where to download data before getting to analysis. To facilitate this, special support for GitHub and GitLab instances has been added, and scripts can be served from these, along with .vap files.

Autoplot does this by interacting with a FileSystem interface. Whether it's a local directory or a web directory, it all looks the same to Autoplot as it requests files for plotting. (Remote files are downloaded and stored in the autoplot_data/fscache area, transparently.) GitLab instances are appealing for this use because they are widely used, easily understood, and their files can be edited directly. (Note GitHub and GitLab have a similar interface and both are supported.) Two scientists can collaborate using the same script, making changes which are trivially seen by both. Autoplot supports GitLab instances, knowing for example that "raw" must be added to the URL to download the file.

Presently GitLab instances are not detected automatically, and within the code of Autoplot there is a list of the few supported instances. This will soon be resolved, but for now groups must request by email that a GitLab instance be added. Also, only public projects are supported. This too may change, as groups request the feature.

Examples of .jy scripts

Autoplot is able to load resources from GitHub, and one area has been set up to contain hundreds of example scripts. The GitHub site https://github.com/autoplot/dev/tree/master/bugs contains codes which demonstrate bugs (which have been fixed). https://github.com/autoplot/dev/tree/master/demos/ and https://github.com/autoplot/dev/tree/master/rfe/ contain small but complete scripts which demonstrate commands. Note these scripts typically have dates for names, but they are easily searched. (See the upper-left search box, as of January 2020.)

Miscellaneous notes

Note the completions use a trick to work: the script is rewritten to an equivalent script which can be executed immediately. This works by reducing the script to just imports, constructor calls, and calls to a number of routines which are known to be quick. This doesn't always work, but provides pretty good results. Because of this, some parts of the code cannot support completions, like callbacks from Java code (def boxSelected). It's also possible that a constructor call is very slow, which would hang completions, but in practice the assumption that constructors are quick has held up. Last, there may be some side effects as well, like GUIs being created. The completions are getting more attention than before as many more people are using the editor, and this code is maturing.

Ops

All scripts import many routines to provide an environment for processing. These are commands like "abs" which return the absolute value of each element in the argument, or "fftPower" which performs FFTs on waveform data to create spectrograms. Here is a list of all commands available, and note this will need periodic updating and will often be out-of-date: https://github.com/autoplot/documentation/blob/master/javadoc/index-all.md. This may seem like an incomplete and arbitrary list, and it is, to some degree. Autoplot is used extensively by a number of workgroups, and as new commands are needed they are added. You should feel free to make requests for new commands!

Importing Libraries

Often a Java library exists which contains code needed for analysis. For example, the Apache Math library was once like this, containing code for non-linear optimization and other math methods. (It is now built into Autoplot.) Another example would be importing a library that can write MP4 files. While this may eventually become part of Autoplot as well, it can also be imported at runtime. This is a unique feature of Java combined with Autoplot's downloading abilities. The following code can be run in Autoplot:

import sys
d= 'https://repo1.maven.org/maven2/org/jcodec/'
addToSearchPath( sys.path,d+'jcodec/0.2.5/jcodec-0.2.5.jar',monitor.getSubtaskMonitor('load jar'))
addToSearchPath( sys.path,d+'jcodec-javase/0.2.2/jcodec-javase-0.2.2.jar',monitor.getSubtaskMonitor('load jar'))
from org.jcodec.api.awt import AWTSequenceEncoder

and the script is now able to import the library to make MP4 files. See https://github.com/autoplot/dev/blob/master/demos/2019/20190302/encodePngwalkMpg.jy which takes each frame from a PNG Walk and encodes it into an MP4 movie file.

Note there are new security concerns with this, and this feature should be used carefully, using jars from a trusted source. A hostile script could delete files on your computer, but you can see the malicious code in the script. Loading in and using a remote jar file does not have this transparency. Also coming soon is a new "sandbox mode" which will prevent codes from damaging your computer.

Notes

There should be a paragraph on good practices. These should include verifying inputs and assumptions (does the directory contain files?). A script with no parameters should run. This allows one to verify that the script works as it did the last time it was used.

In 2020, we tried to identify "skill levels" for a script. This would separate scripts into beginner, intermediate, and expert scripts, and might provide a newbie some solace when a script has strange parts they haven't seen. For example, a beginner script would not have any imports. I'm not sure where this was documented, but this should be recovered or recreated.
