Splitting strings

Have read Splitting Strings by Chris Zetter mini saga…

What this expression:


shall produce?

Python, JS, Sciter (now):

"".split(",") -> [""]

Ruby, Perl, AWK, Sciter (while ago):

"".split(",") -> []

That reminded me that initially in Sciter I’ve implemented the function in Ruby/Perl way as it looked more natural at that moment.

And only recently redesigned the String.split() to follow JavaScript way.

When linear search O(N) is faster than hash table O(1) lookup

Consider quite popular in programming task of string-to-enum value parsing.

You have enum declared in your code (C/C++ but can be any modern programming language)

enum FormatterType {

And text representation of these values that are coming as tokens – strings:


The task

To write a function that returns one of enum FormatterType constants when given a token – let it be std::string for brevity :

FormatterType t2e(std::string s) { ... }

In this article I will test three straightforward implementations of such string2enum function:

  1. linear lookup – that supposedly shall be slowest as it has O(N) performance.
  2. std::map lookup – binary search tree that is known of having O(log(N)) lookup performance.
  3. std::unordered_map lookup – hash table, should be fastest as we’ve been told it has O(1) complex lookup.

Linear search

The function is just a chain of if-elif-elif-else’es (I am using ternary operator here that is conceptually the if):

FormatterType if_t2e(std::string name)
  aux::chars name_chars = aux::chars_of(name);
    name_chars == CHARS("text") ? simple_formatter
     : name_chars == CHARS("integer") ? integer_formatter
     : name_chars == CHARS("decimal") ? decimal_formatter
     : name_chars == CHARS("currency") ? currency_formatter
     : name_chars == CHARS("date") ? date_formatter
     : name_chars == CHARS("date-local") ? date_local_formatter
     : name_chars == CHARS("time") ? time_formatter
     : name_chars == CHARS("time-local") ? time_local_formatter
     : name_chars == CHARS("enum") ? enum_formatter
     : name_chars == CHARS("duration") ? duration_formatter
     : simple_formatter;

Where aux::chars is simple slice structure { const char* start; unsigned int length; } that has operator== defined as:

      bool aux::chars::operator == ( const slice& r ) const
        if( length != r.length )
          return false;
        for( unsigned int i = 0; i < length; ++i )
          if( start[i] != r.start[i] )
            return false;
        return true;

As you see it is dead simple – nothing in it to write home about.

Binary search tree lookup

std::map<std::string, FormatterType> map_index = {
  { "text", simple_formatter },
  { "integer", integer_formatter },
  { "decimal", decimal_formatter },
  { "currency", currency_formatter },
  { "date", date_formatter },
  { "date-local", date_local_formatter },
  { "time", time_formatter },
  { "time-local", time_local_formatter },
  { "enum", enum_formatter },
  { "duration", duration_formatter }

FormatterType map_t2e(std::string name)
  auto it = map_index.find(name);
  return (it != map_index.end()) ? it->second : simple_formatter;

Simple and obvious. And so is…

Hash table lookup

std::unordered_map<std::string, FormatterType> unordered_map_index = {
  { "text", simple_formatter },
  { "integer", integer_formatter },
  { "decimal", decimal_formatter },
  { "currency", currency_formatter },
  { "date", date_formatter },
  { "date-local", date_local_formatter },
  { "time", time_formatter },
  { "time-local", time_local_formatter },
  { "enum", enum_formatter },
  { "duration", duration_formatter }

FormatterType unordered_map_t2e(std::string name)
  auto it = unordered_map_index.find(name);
  return (it != unordered_map_index.end()) ? it->second : simple_formatter;


We will test these functions on random set of input strings that contains our tokens:

std::vector input;

void generate_input() {
  for (int n = 0; n < TOTAL_RUNS; ++n)
    input.push_back(std::string(input_tokens[std::rand() % ITEMS_IN(input_tokens)]));

To show the tendency I will test these functions on different sizes of enum sets : 4, 5, 6, 7, 8, 9 and 10 items per enum:

The result

As smaller the number below as faster the function performs. Yellow cells mark best cases.

And here is the surprise – linear lookup, that is supposed to be the slowest, is actually the fastest on small enums:

enum items std::map (sec) std::u_map (sec) linear (sec)
4 3.44 2.88 2.13
5 3.6 2.93 2.36
6 3.52 3.13 2.57
7 3.58 3.18 2.64
8 3.74 3.23 2.78
9 3.76 2.76 2.81
10 3.73 2.76 2.83

And the chart to visualize the results:

As lowest the line – the better. That gray line above is our linear lookup results. As you see on small enums it is the champion.


  1. On small enum sets linear lookup is better – that is O(N) outperforms O(1) !
  2. On larger enum sets hash table will work better.
  3. std::map (binary search tree) is the worst possible tool for implementation of such things.
  4. The best one is so called perfect hash function like the one produced by the  gperf . It generates FSA (finite state automata) unique to the given set of items in enum. gperf based function takes 1.86 seconds to perform lookup on enum having 9 items.


So why is that?

The reason is that each lookup method has its own hidden cost of lookup preparation.

  1. My linear lookup implementation has simply no preparation code – comparisons starts immediately.
  2. For the hash map to start doing lookup it needs to compute hash value of given string – O(n) task, where n is the length of the string.
  3. In binary search tree each left/right comparison is a O(n) task in worst case.
  4. And perfect hash (FSA) does not require preparation code, yet input string is scanned strictly once.


You can take test application terrainformatica.com/experiments/t2e.zip

aux::chars primitive is a reduced version of aux-slice.h file aux-slice.h file from Sciter SDK. Sciter uses slices instead of strings pretty much everywhere. Strings are used too but only as a storage of characters.

In fact that slice construct is what makes D language of Walter Bright the Great so fast – arrays and strings there are slices – structures containing only {start,length} fields.

Epic discussion: VS Code Uses 13% CPU When Idle Due to Blinking Cursor Rendering

VS Code Uses 13% CPU When Idle Due to Blinking Cursor Rendering

Discussions: Reddit and Hacker News

Couple of citations:

Assuming 1 million users of VS Code This blinking cursor will waste $3 million per year in electricity costs, and output 32,000 tons of c02 per year.

2150: “Grandpa, why did the ocean’s dry up?”
“well sonny, long ago, Microsoft didn’t optimize their code for a code editor’s cursor and pushed earth’s climate over the edge”


All this is about Microsoft Visual Code editor that is Electron application — Chromium based.

They blink cursor by this CSS declaration:

@keyframes monaco-cursor-blink {
    50% {
        opacity: 0;
    100% {
        opacity: 1;

.cursor-blink {
    animation: monaco-cursor-blink 1s step-start 0s infinite;

Problems with this solution:

  • In chromium / webkit this animation runs with animation frame rate — each 16ms — 60 FPS.
  • As Chromium (WebKit with Skia backend) uses mostly CPU rasterization than simple task of filling 1px rectangle eats CPU (and so drains batteries). On each frame rate whole window gets updated.

In principle all this can be solved by something simple as:

setTimer(function() { caret.toggleClass("visible") }, 1000);

Maintainable CSS

I have found Maintainable CSS site exceptionally useful for designing maintainable CSS systems.

Modular and encapsulated: Styles don’t bleed or cascade without your permission.

Any design requirements: Completely flexible to your needs.

No tooling required: But you can use tooling if you want to.

Easy to learn: Read the guides and see.

Any size project: Whatever size project you have, MaintainableCSS will help.

Upgrade in your own time: You can start applying the approach today, bit by bit. You don’t need to upgrade the whole

UI usability solution I am especially proud of

25 or so years ago we were designing one accounting application that required intensive input of numbers by operators.
That was in Russia at the time when personal computers were just appearing in offices massively.

Amount of numeric data that operators needed to input was enormous and as in any financial institution the requirement for correct input was very high.

Operators were quite professional and were doing very fast blind input by hands at the time when their eyes were focused on paper sheets with data.

In order to help them to verify their input I invented and implemented “musical” verification. When focus lefts numeric field, the application is playing melody where each digit has its own tone. I was surprises how effective it happened to be. At the end of first day operators were able to determine at least number of digits filled and after some time some of girls were even able to tell exact number by only hearing the melody.

Cool, eh?

Pixel Art at its best

Andrey Lyapichev ( Андрей Ляпичев ) AKA Weilard have started series of articles about Pixel Art on habrahabr.ru.
That’s about graphics in computer games, but not only. Composition, perspective, perspective simulation by color and contrast, etc.

Pixel Art

Part one, part two
That’s in Russian. If you cannot read Russian just look through images there to get an idea somehow.