Commit 610caf43acbd8b118cf3bccfd41e87074cda889e

  • Matthieu Weber <mweber @m…t.jyu.fi> (Committer)
  • Thu Jun 17 11:12:37 EEST 2010
  • Matthieu Weber <mweber @m…t.jyu.fi> (Author)
  • Thu Jun 17 11:12:37 EEST 2010
Final fixes
histograms.tex
(25 / 17)
  
@@ -47,7 +47,7 @@
 The performance of evolutionary algorithms is generally evaluated by
 repeatedly running them against a set of test functions; this process
 generates a set of values for each algorithm/function pair, leading to a large
-amount of data which then needs to be intepreted. The common practice (see for
+amount of data which then needs to be interpreted. The common practice (see for
 example~\cite{xxx,yyy,zzz}) is to present tables containing average and standard
 deviations values, sometimes along with minima and maxima.
 When reading those tables however, one is not so much interested in the
@@ -64,7 +64,7 @@
 
 \begin{table*}
 \caption{Stacked focused histograms convey more information %{{{2
-regarding an algorithm's peformance than a table filled with
+regarding an algorithm's performance than a table filled with
 numbers and are readable at first glance.}\label{t:eahist}
 \begin{minipage}{.5\linewidth}
 \centering
@@ -458,22 +458,30 @@
 To evaluate the work, the reader is instructed to first study the
 Table~\ref{tab:stdavg}. Casual study reveals, mostly due the bold font, that
 Method 4 is likely to be the best candidate. At this point we make a claim:
-there are two functions for which this does not hold. How long does it take to see
-which ones they are? This simple test clearly illustrates the fact that
-reading this table is difficult.
+there are four functions for which this might not be the case. How long does
+it take to see which ones they are? This simple test clearly illustrates the
+fact that reading this table is difficult.
 
 In contrast we observe Table~\ref{tab:histograms}. We instantly see that in
 many cases Method 4 has produced results closer to the optimum than other
 methods, with the closest competitor being Method 3. Method 2 seems to be in
 general not competitive compared to the other methods and Method 1 is in the
-competition but losing. In some cases, such as 8 and 10 we see significant
-overlap which is also indicated by the Mann-Whitney U test in
-Table~\ref{tab:utest}, so our visualisation seems to be effectively conveying
-the same information as the test. However, the visualisation also shows
-several other points of interest, that are not evident in either standard
-deviation table or statistical test. In some cases, some algorithms have their
-data entirely in the ``dump bin''. This is the author's way of visually claiming
-that those algorithms did not manage to produce any meaningful results.
+competition but losing. In four cases (functions 4, 6, 8 and 10), we see significant
+overlap, which confirms the result of the Mann-Whitney U test in
+Table~\ref{tab:utest}, indicating that for Functions 8 and 10, Method 4 is not
+performing significantly better than Method 1. The same test indicates however
+that on Function 4, Method 4 is outperforming Method 1 whereas the
+distributions are clearly overlapping. Since histogram for Method 4 is skewed
+to the left, the distribution is not symmetrical and the Mann-Whitney U test
+cannot therefore be applied, and its result on Function 4 cannot be trusted.
+These examples therefore illustrate the fact that our visualisation seems to
+be effectively conveying at least the same information as the Mann-Whitney U
+test, as well as the limits of its applicability. However, the visualisation
+also shows several other points of interest, that are not evident in either
+standard deviation table or statistical test. In some cases, some algorithms
+have their data entirely in the ``dump bin''. This is the author's way of
+visually claiming that those algorithms did not manage to produce any
+meaningful results.
 
 Method 1 seems to have a rather robust behaviour. Although it rarely competes
 in the best solution quality, it seems to reliably achieve a certain level of
@@ -515,17 +515,17 @@
 \section{\uppercase{Conclusions}} % half a page
 
 In this text we have presented a novel visualization for comparing evolutionary optimization methods. We claim that this visualisation
-can convey more information than average-standard deviation tables and statistical test tables while retaining nearly same usage of
+can convey more information than average/standard deviation tables and statistical test tables while retaining nearly the same usage of
 space and still improve on the readability of the paper. We also offer our opinion that reporting averages, standard deviations or any single
-statistical number in context of stochastic algorithms is not an useful practise and can be misleading.
+statistical number in context of stochastic algorithms is not a useful practice and can be misleading.
 In our view, sparkline histograms completely supersede the use of average and standard deviation table.
 
 We also present that histograms are a easier approach than statistical testing, which
 requires great care to do properly. We do not claim that statistical test are not a valid tool, but instead fear that, based on experience
-in other fields, one that can be easily misused. Sparkline histograms carry the same information in a form that is easily understood by
+in other fields, they that can be easily misused. Sparkline histograms carry the same information in a form that is easily understood by
 a layman and offers far fewer places for mistakes and misinterpretations.
 
-Lastly, we would like to conclude that scientific visualisation is more than "pretty graphics" and can lead on previously unknown, but
+Lastly, we would like to conclude that scientific visualisation is more than ``pretty graphics'' and can lead on previously unknown, but
 valid and useful conclusions about the data.
 
 % Remainder
title.tex
(2 / 2)
  
@@ -1,6 +1,6 @@
 % The Title is all in Uppercase
-\newcommand{\mytitle}{\uppercase{On Visualization and Comparison of
-Optimization Methods}%\\
+\newcommand{\mytitle}{\uppercase{sparkline histograms for comparing
+evolutionary optimization methods}%\\
 % The Subtitle should have initial letters capitalized
 %\fontsize{13}{15}\selectfont \textit{Preparation of Camera-Ready
 %Contributions to SciTePress Proceedings}