Vary line color by dictionary lookup in gnuplot

Let's say you have the following data file:

#id count        min  1st quart     median  3rd quart        max          sum    std-dev name
  1   172    0.00032    0.00033    0.00033    0.00033    0.00138      0.05811    0.00008 spec
  2   172    0.00039    0.00040    0.00041    0.00042    0.00142      0.07236    0.00008 schema
  3   172    0.00007    0.00008    0.00008    0.00009    0.00032      0.01539    0.00003 truss

You want to draw three box plots whose color depends on the name in column 10, and you'd rather not add a redundant extra column to an already wide table.

You currently have a graph (image not included here) produced by this script:

set terminal pdf enhanced size 8cm,8cm font "Verdana 10"
set output "charts/comparison-keyword-".ARG1.".pdf"
set boxwidth 0.2 absolute
set title "Validation comparison for key :".ARG1
set ylabel "milliseconds"
set xrange[0:4]
set yrange[0.00005:50]
set logscale y
set grid y

set tics scale 0
set xtics nomirror
set ytics nomirror
set border 2

set style fill solid 0.25 border -1
set style data boxplot

# Data columns: id count min 1st-quart median 3rd-quart max sum std-dev name

plot "data/comparison-keyword-".ARG1 using 1:4:3:7:6:(0.6):xticlabels(10) with candlesticks linecolor rgb 'orange' title 'Quartiles' whiskerbars, \
         ''         using 1:5:5:5:5:(0.6) with candlesticks lt -1 notitle

And you would like to change the linecolor through a dictionary lookup where:

spec   => blue
schema => orange
truss  => green

How would you go about it? Is it even possible to translate spec => blue in gnuplot?



Solution 1:[1]

A late answer, but there is no need for sed and no need to modify the data by adding an extra column. You can do it with gnuplot alone, which also keeps the solution platform-independent. The trick is a string lookup: find the row of the name in a small lookup table and read the color from that row. It is easier to give the colors in the 0xrrggbb scheme than as color names; with names you would need an extra conversion step, see "gnuplot: apply colornames from datafile".

Script:

### selecting colors by key from data column ("lookup table")
reset session

$Data <<EOD
#id count        min  1st quart     median  3rd quart        max          sum    std-dev name
  1   172    0.00009    0.00023    0.00033    0.00043    0.00138      0.05811    0.00008 spec
  2   172    0.00011    0.00020    0.00037    0.00042    0.00142      0.07236    0.00008 schema
  3   172    0.00002    0.00003    0.00008    0.00012    0.00032      0.01539    0.00003 truss
EOD

$Lookup <<EOD
spec        0x0000ff
schema      0xffa500
truss       0x00ff00
EOD
getIdx(s)    = int(sum [_i=1:|$Lookup|] (word($Lookup[_i],1) eq s ? _i : 0))
myColor(col) = int(word($Lookup[getIdx(strcol(col))],2))

set title "Validation comparison for key :"
set xrange[0:4]
set xtics scale 0
set ylabel "milliseconds"
set ytics nomirror
set logscale y
set grid y
set border 2
set style fill solid 0.25 border -1
set style data boxplot
set key noautotitles

# Data columns: id count min 1st-quart median 3rd-quart max sum std-dev name
plot $Data u 1:4:3:7:6:(0.6):(myColor(10)):xtic(10) w candle lc rgb var whiskerbars, \
        '' u 1:5:5:5:5:(0.6) w candle lc rgb "black" ti 'Quartiles' whiskerbars
### end of script
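
To see what the lookup does on its own, here is a minimal sketch that reuses the $Lookup datablock and getIdx() from the script above and prints intermediate values. It assumes gnuplot >= 5.2, and the test strings "schema" and "unknown" are only illustrative.

```gnuplot
### inspect the lookup helper (sketch, assumes gnuplot >= 5.2)
$Lookup <<EOD
spec        0x0000ff
schema      0xffa500
truss       0x00ff00
EOD

# Only the matching row contributes its index to the sum; all others add 0
getIdx(s) = int(sum [_i=1:|$Lookup|] (word($Lookup[_i],1) eq s ? _i : 0))

print getIdx("schema")                      # row number of "schema": 2
print word($Lookup[getIdx("schema")], 2)    # second field of that row: 0xffa500
print getIdx("unknown")                     # no match, so the sum stays 0
```

Note that a name missing from the table yields index 0, which is not a valid row, so every name in the data column needs an entry in $Lookup.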

The lookup table above works only with gnuplot >= 5.2.0 because it relies on indexing datablocks. For earlier versions, the lookup would look like this:

myNames      = "spec schema truss"
myColors     = "0x0000ff 0xffa500 0x00ff00"
getIdx(s)    = int(sum [_i=1:words(myNames)] (word(myNames,_i) eq s ? _i : 0))
myColor(col) = int(word(myColors,getIdx(strcol(col))))
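
If the data may contain names that are not in the lookup, getIdx() returns 0 and there is no color to fetch. One possible way to guard against that, sketched here with the string-based variant, is a fallback color. The names myFallback and myColorSafe are hypothetical and not part of the original answer.

```gnuplot
### lookup with a fallback color for unknown names (sketch)
myNames      = "spec schema truss"
myColors     = "0x0000ff 0xffa500 0x00ff00"
getIdx(s)    = int(sum [_i=1:words(myNames)] (word(myNames,_i) eq s ? _i : 0))

# Hypothetical fallback: grey for any name not listed in myNames
myFallback   = 0x808080
myColorSafe(col) = (getIdx(strcol(col)) > 0) ? \
                   int(word(myColors, getIdx(strcol(col)))) : myFallback
```

In the plot command, (myColorSafe(10)) would then take the place of (myColor(10)).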


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 theozh