# Commit 610caf43acbd8b118cf3bccfd41e87074cda889e

histograms.tex

(25 / 17)

47 | 47 | The performance of evolutionary algorithms is generally evaluated by | |

48 | 48 | repeatedly running them against a set of test functions; this process | |

49 | 49 | generates a set of values for each algorithm/function pair, leading to a large | |

50 | amount of data which then needs to be intepreted. The common practice (see for | ||

50 | amount of data which then needs to be interpreted. The common practice (see for | ||

51 | 51 | example~\cite{xxx,yyy,zzz}) is to present tables containing average and standard | |

52 | 52 | deviations values, sometimes along with minima and maxima. | |

53 | 53 | When reading those tables however, one is not so much interested in the | |

… | … | ||

64 | 64 | ||

65 | 65 | \begin{table*} | |

66 | 66 | \caption{Stacked focused histograms convey more information %{{{2 | |

67 | regarding an algorithm's peformance than a table filled with | ||

67 | regarding an algorithm's performance than a table filled with | ||

68 | 68 | numbers and are readable at first glance.}\label{t:eahist} | |

69 | 69 | \begin{minipage}{.5\linewidth} | |

70 | 70 | \centering | |

… | … | ||

458 | 458 | To evaluate the work, the reader is instructed to first study the | |

459 | 459 | Table~\ref{tab:stdavg}. Casual study reveals, mostly due the bold font, that | |

460 | 460 | Method 4 is likely to be the best candidate. At this point we make a claim: | |

461 | there are two functions for which this does not hold. How long does it take to see | ||

462 | which ones they are? This simple test clearly illustrates the fact that | ||

463 | reading this table is difficult. | ||

461 | there are four functions for which this might not be the case. How long does | ||

462 | it take to see which ones they are? This simple test clearly illustrates the | ||

463 | fact that reading this table is difficult. | ||

464 | 464 | ||

465 | 465 | In contrast we observe Table~\ref{tab:histograms}. We instantly see that in | |

466 | 466 | many cases Method 4 has produced results closer to the optimum than other | |

467 | 467 | methods, with the closest competitor being Method 3. Method 2 seems to be in | |

468 | 468 | general not competitive compared to the other methods and Method 1 is in the | |

469 | competition but losing. In some cases, such as 8 and 10 we see significant | ||

470 | overlap which is also indicated by the Mann-Whitney U test in | ||

471 | Table~\ref{tab:utest}, so our visualisation seems to be effectively conveying | ||

472 | the same information as the test. However, the visualisation also shows | ||

473 | several other points of interest, that are not evident in either standard | ||

474 | deviation table or statistical test. In some cases, some algorithms have their | ||

475 | data entirely in the ``dump bin''. This is the author's way of visually claiming | ||

476 | that those algorithms did not manage to produce any meaningful results. | ||

469 | competition but losing. In four cases (functions 4, 6, 8 and 10), we see significant | ||

470 | overlap, which confirms the result of the Mann-Whitney U test in | ||

471 | Table~\ref{tab:utest}, indicating that for Functions 8 and 10, Method 4 is not | ||

472 | performing significantly better than Method 1. The same test indicates however | ||

473 | that on Function 4, Method 4 is outperforming Method 1 whereas the | ||

474 | distributions are clearly overlapping. Since histogram for Method 4 is skewed | ||

475 | to the left, the distribution is not symmetrical and the Mann-Whitney U test | ||

476 | cannot therefore be applied, and its result on Function 4 cannot be trusted. | ||

477 | These examples therefore illustrate the fact that our visualisation seems to | ||

478 | be effectively conveying at least the same information as the Mann-Whitney U | ||

479 | test, as well as the limits of its applicability. However, the visualisation | ||

480 | also shows several other points of interest, that are not evident in either | ||

481 | standard deviation table or statistical test. In some cases, some algorithms | ||

482 | have their data entirely in the ``dump bin''. This is the author's way of | ||

483 | visually claiming that those algorithms did not manage to produce any | ||

484 | meaningful results. | ||

477 | 485 | ||

478 | 486 | Method 1 seems to have a rather robust behaviour. Although it rarely competes | |

479 | 487 | in the best solution quality, it seems to reliably achieve a certain level of | |

… | … | ||

515 | 515 | \section{\uppercase{Conclusions}} % half a page | |

516 | 516 | ||

517 | 517 | In this text we have presented a novel visualization for comparing evolutionary optimization methods. We claim that this visualisation | |

518 | can convey more information than average-standard deviation tables and statistical test tables while retaining nearly same usage of | ||

518 | can convey more information than average/standard deviation tables and statistical test tables while retaining nearly the same usage of | ||

519 | 519 | space and still improve on the readability of the paper. We also offer our opinion that reporting averages, standard deviations or any single | |

520 | statistical number in context of stochastic algorithms is not an useful practise and can be misleading. | ||

520 | statistical number in context of stochastic algorithms is not a useful practice and can be misleading. | ||

521 | 521 | In our view, sparkline histograms completely supersede the use of average and standard deviation table. | |

522 | 522 | ||

523 | 523 | We also present that histograms are a easier approach than statistical testing, which | |

524 | 524 | requires great care to do properly. We do not claim that statistical test are not a valid tool, but instead fear that, based on experience | |

525 | in other fields, one that can be easily misused. Sparkline histograms carry the same information in a form that is easily understood by | ||

525 | in other fields, they that can be easily misused. Sparkline histograms carry the same information in a form that is easily understood by | ||

526 | 526 | a layman and offers far fewer places for mistakes and misinterpretations. | |

527 | 527 | ||

528 | Lastly, we would like to conclude that scientific visualisation is more than "pretty graphics" and can lead on previously unknown, but | ||

528 | Lastly, we would like to conclude that scientific visualisation is more than ``pretty graphics'' and can lead on previously unknown, but | ||

529 | 529 | valid and useful conclusions about the data. | |

530 | 530 | ||

531 | 531 | % Remainder |

title.tex

(2 / 2)

1 | 1 | % The Title is all in Uppercase | |

2 | \newcommand{\mytitle}{\uppercase{On Visualization and Comparison of | ||

3 | Optimization Methods}%\\ | ||

2 | \newcommand{\mytitle}{\uppercase{sparkline histograms for comparing | ||

3 | evolutionary optimization methods}%\\ | ||

4 | 4 | % The Subtitle should have initial letters capitalized | |

5 | 5 | %\fontsize{13}{15}\selectfont \textit{Preparation of Camera-Ready | |

6 | 6 | %Contributions to SciTePress Proceedings} |