18

I was reading this article. The author talks about "The Blub Paradox". He says programming languages vary in power. That makes sense to me. For example, Python is more powerful than C/C++. But its performance is not as good as that of C/C++.

Is it always true that more powerful languages must necessarily have lower achievable performance than less powerful languages? Is there a law/theory for this?

auspicious99
Shashank V M

6 Answers

56

This is simply not true. And part of why it's false is that the premise isn't well formed.

There is no such thing as a fast or slow language. The expressive power of a language is purely a function of its semantics. It is independent of any particular implementation.

You can talk about the performance of code generated by GCC, or about the performance of the CPython interpreter. But these are specific implementations of the language. You could write a very slow C compiler, and you can write Python interpreters that are quite fast (like PyPy).

So the answer to the question "is more power necessarily slower?" is no, purely because you or I could go write a deliberately slow C compiler that accepts the same language as GCC but generates code slower than typical Python.

The real question is "why do more powerful languages tend to have slower implementations?" The reason is that, if you're comparing C and Python, the difference in power is abstraction. When you do something in Python, there is a lot more happening implicitly behind the scenes. More work to do means more time.
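To make the "more implicit work" point concrete, here is a rough C sketch of what a hypothetical dynamic-language interpreter has to do for a simple addition, compared with a native C add. The Value type and dyn_add function are made up for illustration; real interpreters such as CPython do considerably more (reference counting, dispatch through object methods, and so on).

```c
/* A made-up sketch of the work a dynamic-language interpreter does for
 * "a + b", compared with a native C add. This is NOT CPython's object
 * model; it only illustrates "more implicit work per operation". */
#include <stdio.h>

typedef enum { TAG_INT, TAG_FLOAT } Tag;

typedef struct {
    Tag tag;                          /* runtime type tag, checked every time */
    union { long i; double f; } as;
} Value;

/* Dynamic add: inspect both tags, dispatch, box the result. */
static Value dyn_add(Value a, Value b) {
    Value r;
    if (a.tag == TAG_INT && b.tag == TAG_INT) {
        r.tag = TAG_INT;
        r.as.i = a.as.i + b.as.i;
    } else {
        double x = (a.tag == TAG_INT) ? (double)a.as.i : a.as.f;
        double y = (b.tag == TAG_INT) ? (double)b.as.i : b.as.f;
        r.tag = TAG_FLOAT;
        r.as.f = x + y;
    }
    return r;
}

int main(void) {
    long x = 2, y = 3;
    long native_sum = x + y;           /* compiles to a single add */

    Value a = { TAG_INT, { .i = 2 } };
    Value b = { TAG_INT, { .i = 3 } };
    Value dynamic_sum = dyn_add(a, b); /* tag checks + dispatch + boxing */

    printf("native: %ld, dynamic: %ld\n", native_sum, dynamic_sum.as.i);
    return 0;
}
```

None of this is inherent to the language: a JIT that can prove both operands are integers can emit the single add. But a simple interpreter pays that overhead on every operation.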

But there are also lots of social elements at play. People who need high performance choose low-level languages so they have fine-grained control over what the machine is doing. This has led to the idea that low-level languages are faster. But for most people, writing in C vs Python will have pretty comparable performance, because most applications don't require that you eke out every last millisecond. This is particularly true once you account for the extra checks that have to be added by hand to program defensively in C. So just because lots of specialists have built fast things in C and C++ doesn't mean they're faster for everything.

Finally, some languages have zero-cost abstractions. Rust does this, using a type system to ensure memory safety without needing runtime garbage collection. And Go has garbage collection, but its collector is fast enough that you can get performance on par with C while still getting extra power.

The TL;DR is that more powerful languages often have slower implementations in practice, but this is not a firm rule, and there are exceptions and complications.

Kyle Jones
Joey Eremondi
20

Is it always true that more powerful languages must necessarily have lesser possible performance when compared to their less powerful counterparts? Is there a law/theory for this?

First off, we need to make one thing clear: languages don't have "performance".

A particular program written in a particular programming language executed on a particular machine in a particular environment under particular conditions using a particular version of a particular implementation of the programming language has a particular performance. This does not mean that all programs written in that language have a particular performance.

The performance that you can attain with a particular implementation is mostly a function of how many resources, how much money, how many engineers, etc. are invested to make that implementation fast. And the simple truth is that C compilers have more money and more resources invested in them than Python implementations. However, that does not mean that a Python implementation cannot be fast. A typical Python implementation has about as many full-time engineers as a typical C compiler vendor has full-time custodians who refill the developers' coffee machines.

Personally, I am more familiar with the Ruby community, so I will give some examples from there.

The Hash class (Ruby's equivalent to Python's dict) is written in 100% C in YARV. In Rubinius, however, it is written (mostly) in Ruby (relying only on a Tuple class that is partially implemented using VM primitives).

The performance of Hash-intensive benchmarks running on Rubinius is not significantly worse than running on YARV, which means that at least for those particular combinations of benchmark, language, operating system, CPU, environment, load, etc., Ruby is about as fast as C.

Another example is TruffleRuby. The TruffleRuby developers set up an interesting benchmark: they found two Ruby libraries that use lots of Ruby idioms that are thought to be notoriously hard to optimize, such as runtime reflection, dynamically calculating method names to call, and so on. Another criterion they used was that the Ruby library should have an API-compatible replacement written as a YARV C extension, indicating that the community (or at least one person in it) deemed the pure Ruby version too slow.

What they then did was create some benchmarks that rely heavily on those two APIs and run them with the C extensions on YARV and the pure Ruby versions on TruffleRuby. The result was that TruffleRuby could execute the benchmarks on average at 0.8x the performance of YARV with the C extensions, and at best at up to 21x that of YARV. In other words, TruffleRuby was able to optimize the Ruby code to the point where it was on average comparable to C and, in the best case, over 20x faster than C.

(I am simplifying here; you can read the whole story in the lead developer's blog post "Pushing Pixels with JRuby+Truffle".)

That does not mean, however, that we can simply say "Ruby is 20x faster than C". It does show that clever implementations of languages like Ruby (and Python, PHP, ECMAScript, etc. are not much different in that regard) can achieve comparable, and sometimes even better, performance than C.

There are more examples that demonstrate how throwing money at the problem increases performance. E.g. until companies like Google started to develop entire complex applications in ECMAScript (GMail, Google Docs, Google Wave [RIP], MS Office online, etc.), nobody really cared about ECMAScript performance. Sure, there were browser benchmarks, and browser vendors tried to improve them bit by bit, but there was no serious effort to build a fundamentally high-performance ECMAScript engine. Until Google built V8. Suddenly, all other vendors also invested heavily in performance, and within just a few years, ECMAScript performance had increased by a factor of 10 across all implementations. But the language had not changed at all in that time! So, the exact same language suddenly became "10 times faster", just by throwing money at it.

This should show that performance is not an inherent characteristic of the language.

One last example is Java. The original JVM by Sun was dog-slow. Along came a couple of Smalltalk guys who had developed a high-performance Smalltalk VM (the Animorphic Smalltalk VM) and noticed that Smalltalk and Java were very similar, and they could easily build a high-performance JVM using the same ideas. Sun bought the company (which is ironic, because the same developers had already built the high-performance Self VM based on the same ideas while employed at Sun, but Sun let them go just a couple of years earlier because they wanted to focus on Java and not Self as their new language), and the Animorphic Smalltalk VM became the Sun HotSpot JVM, still the most widely-used JVM to date.

(Interestingly, the team that built V8 includes key people of the team that built HotSpot, and the ideas behind V8 are – not surprisingly – also based on the Animorphic Smalltalk VM.)

Lastly, I would also like to point out that we have only talked about languages and language implementations (interpreters, compilers, VMs, …) here. But there is a whole environment around those. For example, modern CPUs contain quite a lot of features that are specifically designed to make C-like languages fast, e.g. branch prediction, memory prefetching, or memory protection. None of these features really helps languages like Java, ECMAScript, PHP, Python, or Ruby. Some (e.g. memory protection) even have the potential to slow them down. (Virtual memory can impact garbage collection performance, for example.) The thing is: these languages are memory-safe and pointer-safe, so they don't need memory protection, because they fundamentally do not allow the operations that memory protection protects against in the first place!

On a CPU and an OS that were designed for such languages, it would be much easier to achieve higher performance. If you really wanted to do a fair benchmark between, say, C and Python, you would have to run the Python code on a CPU that has received just as many optimizations for Python as our current mainstream CPUs have for C.


Jörg W Mittag
8

TL;DR: Performance is a function of Mechanical Sympathy and Doing Less. Less flexible languages generally do less work and are more mechanically sympathetic, hence they generally perform better out of the box.

Physics Matter

As Jörg mentioned, CPU designs today co-evolved with C. That is especially telling for the x86 instruction set, which features SSE instructions specifically tailored for NUL-terminated strings.

Other CPUs could be tailored for other languages, and that may give an edge to such other languages, but regardless of the instruction set there are some hard physics constraints:

  • The size of transistors. The latest CPUs are built on 7 nm processes, with 5 nm being experimental. Feature size immediately places an upper bound on density.
  • The speed of light, or rather the speed of electricity in the medium, places an upper bound on the speed of transmission of information.

Combining the two places an upper bound on the size of L1 caches, in the absence of 3D designs – which suffer from heat issues.

Mechanical Sympathy is the concept of designing software with hardware/platform constraints in mind, and essentially to play to the platform's strengths. Language Implementations with better Mechanical Sympathy will outperform those with lesser Mechanical Sympathy on a given platform.

A critical constraint today is being cache-friendly, notably keeping the working set in the L1 cache, and typically GCed languages use more memory (and more indirections) compared to languages where memory is manually managed.
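To illustrate the indirection point, here is a rough C sketch (not a rigorous benchmark) contrasting a contiguous array of structs — the layout manual memory management makes easy — with an array of pointers to individually allocated objects, which is closer to how many GCed runtimes end up laying out object graphs. The Item type and sizes are invented for illustration.

```c
/* A rough sketch (not a rigorous benchmark) of the indirection difference.
 * "flat" is one contiguous array of structs. "boxed" is an array of
 * pointers to separately allocated objects. Summing over "boxed" chases a
 * pointer per element and touches many more cache lines. */
#include <stdio.h>
#include <stdlib.h>

typedef struct { double price; int qty; } Item;

enum { N = 1000000 };

int main(void) {
    Item *flat = malloc(N * sizeof *flat);    /* one contiguous block */
    Item **boxed = malloc(N * sizeof *boxed); /* N pointers to N blocks */

    for (size_t i = 0; i < N; i++) {
        flat[i].price = (double)i;
        flat[i].qty = 1;
        boxed[i] = malloc(sizeof *boxed[i]);
        boxed[i]->price = (double)i;
        boxed[i]->qty = 1;
    }

    double sum_flat = 0.0, sum_boxed = 0.0;
    for (size_t i = 0; i < N; i++) sum_flat += flat[i].price;    /* streams memory */
    for (size_t i = 0; i < N; i++) sum_boxed += boxed[i]->price; /* pointer chasing */

    printf("flat: %.0f, boxed: %.0f\n", sum_flat, sum_boxed);

    for (size_t i = 0; i < N; i++) free(boxed[i]);
    free(boxed);
    free(flat);
    return 0;
}
```

A compacting, moving garbage collector can recover some of this locality, which is one reason the gap is a tendency rather than a law.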

Less (Work) is More (Performance)

There's no better optimization than removing work.

A typical example is accessing a property:

  • In C, value->name is a single instruction (a mov from a fixed offset).
  • In Python or Ruby, the same typically involves a hash table lookup.

That mov executes in about one CPU cycle (assuming an L1 cache hit); an optimized hash table lookup takes at least 10 cycles.
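As a toy C illustration of that gap: the struct field below compiles to a single load at a compile-time-known offset, while the lookup() function is a hypothetical, heavily simplified stand-in for what a dynamic runtime's property access has to do (hash the key, probe, compare strings).

```c
/* Static field access vs a toy hash-table "property lookup".
 * The Slot table is a deliberately simplified stand-in for a real
 * dynamic-language object representation. */
#include <stdio.h>
#include <string.h>

typedef struct { const char *name; int id; } Record;   /* fixed layout */

#define SLOTS 8
typedef struct { const char *key; const char *value; } Slot;

static unsigned hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h;
}

/* Open-addressing probe: many instructions, branches and memory accesses. */
static const char *lookup(const Slot *table, const char *key) {
    unsigned start = hash(key) % SLOTS;
    for (unsigned probes = 0; probes < SLOTS; probes++) {
        const Slot *s = &table[(start + probes) % SLOTS];
        if (!s->key) return NULL;
        if (strcmp(s->key, key) == 0) return s->value;
    }
    return NULL;
}

int main(void) {
    Record r = { "widget", 42 };
    printf("static:  %s\n", r.name);      /* one load from a fixed offset */

    Slot obj[SLOTS] = {0};                /* a "dynamic object" */
    obj[hash("name") % SLOTS] = (Slot){ "name", "widget" };
    printf("dynamic: %s\n", lookup(obj, "name"));
    return 0;
}
```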

Recovering performance

Optimizers, and JIT optimizers, attempt to recover the performance left on the table.

I'll take the example of two typical optimizations for JavaScript code:

  • NaN-tagging is used to store a double OR a pointer in 8 bytes. At run time, a check is performed to know which is which. This avoids boxing doubles, eliminating a separate memory allocation and an indirection, and thus is cache-friendly. (A minimal sketch follows this list.)
  • The V8 VM optimizes dynamic property lookups by creating a C-like struct for each combination of properties on an object, hence going from a hash table lookup to a type check plus a fixed-offset load – and possibly hoisting the type check much earlier.
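Here is the promised NaN-tagging sketch, in C. It is illustrative only: real engines (LuaJIT, JavaScriptCore, and others) use more careful schemes — for example they normalize actual NaNs so they cannot collide with the pointer tag — but the core trick is just bit manipulation on a 64-bit word, with no heap allocation for doubles.

```c
/* A minimal NaN-tagging sketch. Every value fits in one 64-bit word: either
 * the raw bits of a double, or a pointer carried in a bit pattern that no
 * ordinary double uses. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t BoxedValue;

/* Bits 63..51 all set: ordinary (non-NaN) doubles never look like this, so
 * we claim this range for pointers (assuming they fit in the low 48 bits,
 * as on common 64-bit platforms). */
#define POINTER_TAG 0xFFF8000000000000ull

static BoxedValue box_double(double d) {
    BoxedValue v;
    memcpy(&v, &d, sizeof v);             /* reinterpret bits, no allocation */
    return v;
}

static BoxedValue box_pointer(const void *p) {
    return POINTER_TAG | (uint64_t)(uintptr_t)p;
}

static int is_pointer(BoxedValue v) {
    return (v & POINTER_TAG) == POINTER_TAG;   /* the run-time check */
}

static double unbox_double(BoxedValue v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

static const void *unbox_pointer(BoxedValue v) {
    return (const void *)(uintptr_t)(v & ~POINTER_TAG);
}

int main(void) {
    const char *str = "hello";
    BoxedValue a = box_double(3.14);
    BoxedValue b = box_pointer(str);

    if (!is_pointer(a)) printf("a holds the double %f\n", unbox_double(a));
    if (is_pointer(b))  printf("b holds a pointer to \"%s\"\n",
                               (const char *)unbox_pointer(b));
    return 0;
}
```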

Thus, to some extent, even highly flexible languages can be executed efficiently... so long as the optimizer is smart enough, or the developer makes sure to massage the code to just hit the optimizer's sweet spot.

There is no faster language...

... there are just languages that are easier to write fast programs in.

I'll point to a series of three blog articles from 2018:

I think the last article is the key point. More flexible languages can be made to run efficiently, with expert knowledge and time. This is costly, and typically brittle.

The main advantage of less flexible languages – statically typed, with tighter control over memory – is that they make optimizing for performance more straightforward.

When the language's semantics already closely match the platform's sweet spot, good performance comes straight out of the box.

Bolpat
Matthieu M.
6

In general, it's about what the language and its implementors are trying to do.

C has a long culture of keeping things as close to the hardware as possible. It doesn't do anything that can't easily be translated into machine code at compile time. It was intended as a multi-platform kind of low-level language. As time went on (and it was a lot of time!), C became sort of a target language for compilers in turn - it was a relatively simple way to get your language to compile for all the platforms that C compiled for, which was a lot of platforms. And C ended up being the API system of choice for most desktop software - not because of any inherent qualities in the way C calls things or shares header files or whatever, but simply because the barrier to introducing a new way is very high. So again, the alternatives usually sacrifice performance for other benefits - just compare C-style APIs with COM.

That isn't to say that C wasn't used for development, of course. But it's also clear that people were well aware of its shortcomings, since even people doing "hard-core" stuff like OS development always tried to find better languages to work with - LISP, Pascal, Objective-C etc. But C (and later C++) remained at the heart of most system-level stuff, and the compilers were continuously tweaked to squeeze out extra performance (don't forget there's ~50 years of C by now). C wasn't significantly improved in capabilities over that time; that was never seen as particularly important, and would conflict with the other design pillars.

Why do you design a new language? To make something better. But you can't expect to get everything better; you need to focus. Are you looking for a good way to develop GUIs? Build templates for a web server? Resolve issues with reliability or concurrency? Make it easier to write correct programs? Now, out of some of those, you may get performance benefits. Abstraction usually has costs, but it can also mean you can spend more of your time performance tweaking small portions of code.

It's definitely not true that using a low-level language (like C) will net you better performance. What is true is that, if you really really want to, you can reach the highest performance with a low-level language - as long as you don't care about the cost, maintainability and all that. Which is where economies of scale come in - if you can have 100 programmers win performance for 100M programmers through a low-level tweak, that might be a great payoff. In the same way, a lot of smart people working on a good high-level language can greatly increase the output of a lot more people using that language.

There is a saying that a sufficiently smart compiler will be able to eliminate all the costs of high-level languages. In some sense, it's true - every program eventually needs to be translated to a language the CPU understands, after all. Higher-level abstractions mean you have fewer constraints to satisfy; a custom .NET runtime, for example, doesn't have to use a garbage collector. But of course, we do not have unlimited capacity to work on such compilers. So as with any optimisation problem, you solve the issues that are the most painful to you and bring you the most benefit. And you probably didn't start the development of a new, high-level language to try to rival C in "raw" power. You wanted to solve a more specific problem. For example, it's really hard to write high-performance concurrent code in C. Not impossible, of course. But the "everything is shared and mutable by default" model means you have to either be extremely careful, or use plenty of guards everywhere. In higher-level languages, the compiler or runtime can do that for you, and decide where those can be omitted.
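As a small illustration of "guards everywhere", here is a C sketch of two threads bumping a shared counter. The lock has to be written (and placed correctly) by hand; drop it and the program is a data race. A higher-level language's runtime or compiler can insert, coarsen, or elide this kind of synchronization for you. Plain POSIX threads, illustrative only.

```c
/* Shared mutable state in C forces manual synchronization.
 * Compile with: cc -pthread shared_counter.c */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                          /* shared, mutable by default */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);                /* the manual guard */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld (expected 2000000)\n", counter);
    return 0;
}
```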

More powerful programming languages tend to have slower implementations because fast implementations were never a priority, and may not be cost effective. Some of the higher level features or guarantees may be hard to optimise for performance. Most people don't think performance should trump everything - even the C and C++ people are using C or C++, after all. Languages often trade run-time, compile-time and write-time performance. And you don't even have to look at languages and their implementations to see that - for example, compare the original Doom engine with Duke Nukem 3D. Doom's levels need significant compile-time - Duke's can be edited in real-time. Doom had better runtime performance, but it didn't matter by the time Duke launched - it was fast enough, and that's all that matters when you're dealing with performance on a desktop.

What about performance on a server? You might expect a much stronger focus on performance in server software. And indeed, for things like database engines, that's true. But at the same time, servers are flooded with software like PHP or Node.js. Much of what's happening in server-space shifted from "squeeze every ounce of performance from this central server node" to "just throw a hundred servers at the problem". Web servers were always designed for high concurrency (and decentralisation) - that's one big reason why HTTP and the web were designed to be state-less. Of course, not everyone got the memo, and it's handy to have some state - but it still makes decoupling state from a particular server much easier. PHP is not a powerful language. It's not particularly nice to work with. But it provided something people needed - simple templating for their web sites. It took quite a while for performance to become an important goal, and it was further "delayed" by sharding, caching, proxying etc. - which were very simple to do thanks to the limitations of PHP and HTTP.

But surely, you'll always write an OS in C/C++? Well, for the foreseeable future on the desktop, sure. But not because of raw performance - the trump card is compatibility. Many research OSes have cropped up over time that provide greater safety, security, reliability and performance (particularly in highly concurrent scenarios). A fully memory-managed OS makes many of the costs of managed memory go away; better memory guarantees, type safety and runtime type information allow you to elide many runtime checks and costs with task switching etc. Immutability allows processes to share memory safely and easily, at very low cost (heck, many of Unix's strengths and weaknesses come from how fork works). Doing compilation on the target computer means you can't spend as much time optimising, but it also means you are targeting a very specific configuration - so you can always use the best available CPU extensions, for example, without having to do any runtime checks. And of course, safe dynamic code can bring its own performance benefits too (my software 3D renderer in C# uses that heavily for shader code; funnily enough, thanks to all the high-level language features, it's much simpler, faster and more powerful than e.g. the Build engine that powers Duke Nukem 3D - at the cost of extra memory etc.).

We're doing engineering here (poor as it may be). There are trade-offs to be had. Unless squeezing every tiny bit of performance out of your language gives you the greatest possible benefit, you shouldn't be doing it. C wasn't getting faster to please C programmers; it was getting faster because there were people who used it to work on stuff that made things faster for everyone else. That's a lot of history that can be hard to beat - would you really want to spend the next 50 years catching up on low-level performance tweaks and fixing tiny incompatibilities, when nobody would want to use your language in the first place because it doesn't provide them with any real benefit over C? :)

Luaan
1

I reject the premise of "More powerful programming languages tend to have slower implementations."

"Power" is subjective. Is it faster? More robust? More exact? More efficient? More capable?

  • A nuclear warhead is very powerful, but not very precise.
  • An acupuncture needle is very precise, and can be very powerful, but it is only leveraging the underlying neural system.
  • Lisp is very powerful, very precise, and yet, (some) people find it an awkward language.
  • APL is very very powerful, precise, and succinct. But it requires a special keyboard (or mapping), and is sometimes labelled as too difficult to teach (though it's probably fairer to say it's not for everyone).
  • Pascal isn't very powerful, but is fairly precise. It was designed as a teaching language, and also an experiment to prove that a one-pass compiler is practical. (Leave it to Microsoft to distribute a 3-pass compiler for a 1-pass language.)
  • Python, Perl, Java, etc. These are easier to write in for most people; there are loads of libraries, tutorials, and online projects for examination. Many of these languages don't have "pointers" as such, but do have "references", which are more consistent with the language -- you don't have to bother with pointer arithmetic, wrap-around, and other implementation-specific details. Indeed, these were meant to run on most, if not all, hardware. They are an abstraction up from C and C compilers, making their programs more widely applicable without recompiling. But they lose some performance for this flexibility.
  • Turing machines: the most powerful, and yet, when was the last time you wrote a program for one? Performance is awful because, in all but pathological cases, there are better implementations available.
  • GOL (Game Of Life): since it's Turing complete, it's just as powerful, yet the performance is worse than a direct Turing machine implementation in the same context.
1

The phenomenon you describe as one language being more "powerful" than another is what we call the distinction between "high-level" and "low-level" languages.

But what is the meaning of "level" in this context? In other words, a high or low level of what?

They refer to levels of abstraction. C/C++ are languages with a low level (of abstraction). Python has a higher level (of abstraction).

The fact that high-level (of abstraction) languages are slower than low-level (of abstraction) ones is called the abstraction penalty:

High-level languages intend to provide features which standardize common tasks, permit rich debugging, and maintain architectural agnosticism; while low-level languages often produce more efficient code through optimization for a specific system architecture. Abstraction penalty is the cost that high-level programming techniques pay for being unable to optimize performance or use certain hardware because they don't take advantage of certain low-level architectural resources. High-level programming exhibits features like more generic data structures and operations, run-time interpretation, and intermediate code files; which often result in execution of far more operations than necessary, higher memory consumption, and larger binary program size. For this reason, code which needs to run particularly quickly and efficiently may require the use of a lower-level language, even if a higher-level language would make the coding easier. In many cases, critical portions of a program mostly in a high-level language can be hand-coded in assembly language, leading to a much faster, more efficient, or simply reliably functioning optimised program.

References:

Pankaj Surana, Meta-compilation of language abstractions