(C) Rob W. W. Hooft, European Molecular Biology Laboratory, 1989, 1995
Many scientific plotting programs exist, and some of them are freely available for UNIX. These can be divided into two main groups:
To be able to create plots in all the styles that people happen to like, graphing programs need to be extremely flexible. One way to achieve this with limited programmer time is to write the output in a meta file format that can later be edited with a dedicated drawing program. This can be done with "gnuplot", and it is the basic philosophy behind SCATTER. Another big advantage of this approach is that only one "device driver" is needed: all the devices supported by tools that read the meta file are indirectly supported by SCATTER.
Early versions of VMS SCATTER created POS files: an internal meta file
format used at the crystallography department of the University of
Utrecht. For the new UNIX version SCATTER-V2, I have chosen FIG,
the meta file storage format of the "xfig" program, version 3.1.3 or
newer. This makes it possible to use SCATTER with "xfig" and
"transfig", fitting the figures seamlessly into LaTeX documents.
SCATTER was written in FORTRAN. I'm not yet sure about portability: it
seems to work O.K. under DEC OSF/1 and Linux.
Using SCATTER
Scatter has only a command line interface. It is called as "scatter
<options>". All options are optional. If an option is not
recognised, it will be treated as the name of the input datafile.
Files
Scatter treats all unrecognised options as the name of the input
datafile it should use. Only the last such file is actually read. If
no name for an input datafile is given, the file "scatter.dat" is
read.
By default, the output filename is constructed by stripping the extension from the input filename and then appending ".fig". An explicit output filename can be given using the OUTFILE option. If desired (for example when putting multiple images together), the option APPEND can be used to append to an existing FIG file.
If an input file has a name that corresponds to a SCATTER option, its
use as a filename can be enforced with the FILE option.
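The file name rules above can be sketched in Python as follows. This is only an illustration, not SCATTER's actual FORTRAN code: the function name resolve_files and the small KNOWN_OPTIONS set are hypothetical, and it is assumed here that OUTFILE and FILE are each followed by the file name as the next word.

```python
import os

# Hypothetical subset of option names, for illustration only.
KNOWN_OPTIONS = {"OUTFILE", "FILE", "APPEND"}

def resolve_files(args):
    """Return (input_name, output_name) following the rules described above."""
    infile = "scatter.dat"           # default when no input file is named
    outfile = None
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "OUTFILE" and i + 1 < len(args):
            outfile = args[i + 1]    # explicit output file name (assumed syntax)
            i += 2
        elif arg == "FILE" and i + 1 < len(args):
            infile = args[i + 1]     # force the next word to be a file name
            i += 2
        elif arg in KNOWN_OPTIONS:
            i += 1                   # recognised option without an argument
        else:
            infile = arg             # unrecognised: input file, last one wins
            i += 1
    if outfile is None:
        # Strip the extension from the input name and append ".fig".
        outfile = os.path.splitext(infile)[0] + ".fig"
    return infile, outfile

print(resolve_files(["run1.dat", "run2.dat"]))
```

In this sketch, FILE is what allows a data file that happens to be called "APPEND" to be read as input.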
Input file format
Scatter reads one datafile consisting of lines of datapoints. Each
line should contain a number of numbers together defining one
datapoint. A datapoint can have 5 significant values for scatter:
Only NX and NY are required; all other data items are optional. You do not even have to specify NX and NY: they default to 1 and 2, respectively.
It is possible to have a line of column headers as the first line of
the file. See the TIC option.
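As an illustration of the column selection described above, the following Python sketch picks the X and Y values from each line using 1-based column numbers. It assumes whitespace-separated numbers and no header line; the function name read_points is hypothetical and this is not SCATTER's own reader.

```python
def read_points(lines, nx=1, ny=2):
    """Pick the X and Y values from each line using 1-based column numbers."""
    points = []
    for line in lines:
        fields = line.split()        # assumption: whitespace-separated columns
        if not fields:
            continue                 # skip blank lines
        x = float(fields[nx - 1])
        y = float(fields[ny - 1])
        points.append((x, y))
    return points

data = ["1.0 2.5 0.1", "2.0 3.5 0.2", "3.0 5.0 0.3"]
print(read_points(data))                 # default: columns 1 and 2
print(read_points(data, nx=3, ny=1))     # column 3 against column 1
```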
Data selection
Options NC and CC can be used to select lines in the file
that contain data points. The option CSL
can be used to write a new file that contains only the selected
lines. Points filtered out using NC and CC are completely ignored by
SCATTER; to specify explicit axis boundaries, use the
SX, EX, SY, EY, SR and ER options.
Data manipulation
A number of options exist to act on the data before plotting. If
combinations of these options are used, SCATTER will try to select a
sensible order in which to perform them. If two options can not be
combined, you will be informed. In principle, if one option requires
another, SCATTER will automatically activate the required option if
you don't specify it.
Options exist to Fourier transform the data (FFT), Fourier transform the data assuming that they really are periodic (PFFT), calculate an autocorrelation function (ACF), calculate a cumulative average (CUMAV, SCUMAV, LCUMAV), calculate a cumulative sum (CUMSUM), smooth the data (SMOOTH), average the data (AVER) and sort the data (SORTX). Peak searching is also available through the PEAK option.
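To make two of these operations concrete, here is a small Python sketch of a cumulative sum and a cumulative average. It assumes that the cumulative average at point k is the mean of the first k values; how CUMAV, SCUMAV and LCUMAV differ is not covered here, and the function names are hypothetical.

```python
from itertools import accumulate

def cumsum(values):
    """Running total: element k is the sum of the first k+1 values."""
    return list(accumulate(values))

def cumav(values):
    """Running mean: element k is the mean of the first k+1 values (assumed definition)."""
    return [total / (k + 1) for k, total in enumerate(accumulate(values))]

print(cumsum([1, 2, 3, 4]))
print(cumav([2, 4, 6]))
```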
XMOD and YMOD options exist to specify that the data
is circular (e.g. torsion angles). FOLD
can be used to specify that SCATTER should fold back all numbers such
that they are between -MOD/2 and MOD/2. Without FOLD, scatter will
change the data read such that any two subsequent points are less than
MOD/2 apart. So: if you are reading a sequence of torsion angles using
'XMOD 360', and the values in subsequent lines are '178 179 179.5 -179
179', scatter will plot the values as '178 179 179.5 181 179' to keep
the values as close together as possible.
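The two behaviours described above can be sketched in Python. This is an illustration under assumptions rather than SCATTER's actual code: the function names unwrap and fold are hypothetical, and the handling of values exactly MOD/2 apart is arbitrary.

```python
def unwrap(values, mod):
    """Shift each value by a multiple of mod so it stays within mod/2 of its predecessor."""
    out = []
    for v in values:
        if out:
            prev = out[-1]
            v = v - mod * round((v - prev) / mod)
        out.append(v)
    return out

def fold(values, mod):
    """Fold every value back into the range -mod/2 .. mod/2."""
    return [v - mod * round(v / mod) for v in values]

# The torsion angle example from the text, with XMOD 360 and no FOLD:
print(unwrap([178, 179, 179.5, -179, 179], 360))
# With FOLD, values are brought back into -180 .. 180 instead:
print(fold([190, -190, 30], 360))
```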
Before the data for X axis, Y axis and Radius are used, they can be
turned into logarithms using LOGX, LOGY and LOGR, effectively creating a logarithmic
scale.
Data can be made positive using ABSX, ABSY and ABSR.
Curve fitting and correlations
A linear least squares procedure can be activated with the LSQ and LSQL
options, and polynomial fits with POLY. ROBUST can be used to get a
linear fit that minimizes the absolute deviations instead of the squared
deviations. This gives much more stable results when the data contain
outliers.
All least squares fits take the SDY or ERY values into account to perform weighting. SDX or ERX values can only be taken into account by the linear least squares procedure, which then produces a completely scale-independent fit. If there is a correlation between the SDX and SDY values, this correlation must be given using the RHO option.
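As a sketch of what weighting by the Y standard deviations means, the following Python function fits a straight line y = a + b*x with each point weighted by 1/SDY^2. That weighting is the conventional choice and is an assumption here; the text does not say exactly how SCATTER weights the points, and the function name is hypothetical. Uncertainties in X and the RHO correlation are not handled.

```python
def weighted_line_fit(x, y, sdy):
    """Fit y = a + b*x, weighting each point by 1/sdy**2 (assumed weighting)."""
    w = [1.0 / s ** 2 for s in sdy]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = sw * sxx - sx * sx
    b = (sw * sxy - sx * sy) / delta     # slope
    a = (sxx * sy - sx * sxy) / delta    # intercept
    return a, b

a, b = weighted_line_fit([0, 1, 2, 3], [1, 3, 5, 7], [1, 2, 1, 2])
print(a, b)
```

Points with a large standard deviation get a small weight, so they pull the fitted line less than precisely measured points do.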
If no functional description for a relation exists, the RANK option can be used to calculate a
parameterless correlation. It will use different algorithms depending
on the size of the data set.
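The text does not say which correlation measure RANK computes. As one common example of a correlation that needs no functional form, the Python sketch below computes Spearman's rank correlation; this is an assumption for illustration and may differ from the algorithms SCATTER actually uses. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# A monotonic but non-linear relation still has rank correlation 1.
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))
```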
Appearance of plots
The size of the complete plot in centimeters can be given using the HI and WI
options. Space for multiple graphs can be reserved in this area using
NPLX and NPLY, using IPL to specify which area should be used for
the current graph.
The actual scatter area
Normally, SCATTER will draw an ordinary scatterplot for the X and Y
columns specified by the NX and NY options. If an NR option is also
given, the size of the 'dots' will depend on a third column in the
file, with the maximum size determined by RADMAX. If standard
deviations are given using NSDX and/or NSDY, or ERX and/or ERY, the
horizontal and vertical size of the 'dots' will represent these
standard deviations. Which sign to use can be changed with the DOTSTYLE
option. If the plot is too crowded, a selective display can be enabled
using the NINTERVAL option.
Whether a line is to be drawn between the data points is selected with
the LINESTYLE option. If SPLINE is given, an interpolated cubic
spline is drawn instead of straight line segments; the number of
interpolations can be specified using NSP. If a fit was performed, any line drawn
will be the fit-line.
As an alternative to changing the size of the sign using NR, the TORSR
option tells SCATTER that the column indicated with NR gives a torsion
angle. For each point a sign "-", "+" or "square" will then be used,
indicating the "- gauche", "+ gauche" and "trans" conformations of the
torsion angle, respectively. These three are defined as the areas
within 30 degrees of the ideal -60, +60, and 180 degrees. No sign will
be drawn at all if the torsion angle falls outside these areas.
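The classification described above can be written down as a short Python sketch. The function name torsion_sign is hypothetical, and treating the 30 degree limits as inclusive is an assumption; the text does not say what happens exactly on a boundary.

```python
def torsion_sign(angle):
    """Return '-', '+', 'square' or None for a torsion angle in degrees."""
    a = (angle + 180.0) % 360.0 - 180.0    # bring the angle into -180 .. 180
    if abs(a + 60.0) <= 30.0:
        return "-"         # - gauche: within 30 degrees of -60
    if abs(a - 60.0) <= 30.0:
        return "+"         # + gauche: within 30 degrees of +60
    if abs(abs(a) - 180.0) <= 30.0:
        return "square"    # trans: within 30 degrees of 180
    return None            # outside all three areas: no sign is drawn

for angle in (-65, 55, 175, -170, 0, 120):
    print(angle, torsion_sign(angle))
```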
The X=, Y=, X=0, Y=0, Y=X, and Y=-X options can be used to draw
selected horizontal, vertical and diagonal lines into the plot area.
The axes and text
Axes will normally be drawn based on the scaling of the plot, with a
number smaller than 30 appearing at the beginning and the end of the
axis, and a power of ten in the axis label. Use FULLNUM if the full number should be
printed at either end of the axes instead. Use TEX if the powers of ten should be given
using LaTeX controls. Use AXDIV to
specify an alternative maximum number.
Axis divisions will be made in 10's, 5's and 1's, with ticks of different lengths. Use SAMELENGTH to make all ticks the same length. 5-type and 1-type ticks will be left out if needed to prevent crowding. Options XSIXTY and YSIXTY can be used to change the spacing to 60's such that angles can be conveniently read. The axis ticks can be completely suppressed using the NOPUB option.
To prevent SCATTER from extending the plot to get round numbers at the ends of the axes, specify the NOAX option. The SPACE option can be used to specify by what percentage to extend the plot at all four boundaries before trying to make the nice round numbers. Explicit axis boundaries can be specified with the SX, EX, SY, EY, SR and ER options.
If the X and Y axes specify similar variables, the EXTEND_SQUARE and SHRINK_SQUARE options can be used to make the two axes equal.
If one of the axes (or both) should run from high to low instead of low to high, the REVX and REVY options can be used.
Normally, text along the Y axis is printed at an angle of 90 degrees. If this is not desired, use the THOR option.
The margin area around a plot that is used to put in the text can be changed by the MARGIN option. This also changes the size of the font.
The option NOTEXT suppresses all text from the figure, NOTITLE suppresses only the title. These two can be useful for overlaying different plots in one figure. The options TEXT, XTEXT and YTEXT can be used to change the title, X-axis label, and Y-axis label respectively. Additional text items can be put into the figure using the ZTEXT option.
Missing options
If you find an important option missing, please notify me. I might consider
adding it.