I always wanted to say that since I saw Raymond Chen's
blog, and now, at last, I can! The title of the book is "Hands-On High Performance Programming with Qt 5" and you can get it from Amazon (here) or Packt Publishing (here). The book's cover looks like this:
I have to say that I rather like the look of it!
TL;DR
A new book about Qt and C++ performance. Uses Qt 5.9 - 5.12 on Windows. Guides the reader from an intermediate to the (lower) advanced level. 4 of 5 stars.
What this book is about.
This book is about many things. Judging from the title, it's about performance optimization of Qt 5 (Qt 5.9 - Qt 5.12) programs. But because Qt's underlying language is C++, it is also about C++ optimizations and performance. Because C++ is a close-to-the-metal language (some would say low-level), we also discuss hardware architectures. And because the Qt framework covers so many areas, we also discuss data structures and algorithms, multithreading, file I/O and parsing, GUI and graphics, networking, and even mobile and embedded platforms.
Quite a mouthful, you'd say? Yes, it's true. But it also makes it interesting, and shows how many facets there are to performance optimization.
Additionally, I decided to use Windows as the development platform for this book. Most Qt books use Linux, probably because the Qt Creator IDE offers much better tooling there. But because Windows is also a very popular platform, I wanted to take the road less traveled and find out how to compensate for the lack of the standard Linux performance tools on Windows.
We use only open-source or free tools, Qt Creator as the development environment, and the Qt 5.9 LTS version for code examples (as it was the latest LTS version at the time I started writing this book).
Why did I write it?
Well, because I have always found performance optimization very interesting. And because I have been working with Qt for quite a long time now, to say nothing of C++! So, when the publisher approached me with a proposed book title, I knew I could write a rather good book about it!
The second reason was that there wasn't a resource about performance optimization I would be content with. There are a couple of C++ performance books, but I wasn't entirely convinced by them. As for Qt and its specific problems, there was nothing! Well, there was a host of information scattered across blogs, Stack Overflow, books, articles and Twitter threads. So I thought that before I forget it all, I'd better put it to paper!
I also wanted to write a book I'd have liked to read when I first started to learn about performance and optimizations, because there wasn't such a thing either. Thus I settled on an intermediate-level, approachable, but not trivial book.
What material is covered
This book was planned as an intermediate-level read. If you can write some C++ and have learned to write a basic Qt application, then it is for you. No further knowledge is required, as every topic (e.g. data structures, graphics, networking) will be introduced in an understandable and (I hope) entertaining manner.
My initial plan for the book was:
- Part I - Basics
  - Intro - basic performance wisdom and techniques, hardware architecture and its impact.
  - Profiling - performance tools and how to use them. As I said, we don't want to make it easy for ourselves, so we look at the Windows platform!
  - C++ - how performant is C++ really? And which optimizations can the compiler (and linker) perform?
- Part II - General Techniques
  - Data structures and algorithms - what is the performance of Qt containers and strings? Do we have to use them?
  - Multithreading - Qt's take on threading and how to speed up your program with concurrency.
  - Fails - some of the more memorable performance problems I encountered in my career.
- Part III - Selected Qt Modules
  - File I/O etc - Qt and file system performance, JSON and XML parsing. Also memory mapped files and caching.
  - Graphics - GUI, widget and QML performance. Probably the most interesting part of the book.
  - Networking - network performance and network support in Qt. Because we live in the ubiquitous connectivity era!
However, the editors wished for two additional chapters, so I added them as well (unfortunately, this made the three-part structure obsolete):
- Mobile and embedded - how to use Qt on mobile and embedded (a kind of advanced chapter applying all we have learned before to two specific platforms)
- Testing - a goodie chapter about testing techniques, GUI testing and performance regression tests. In the end this chapter turned out pretty interesting!
I really think that this book can do a good job of guiding the reader from an intermediate to the (lower) advanced level!
Table of Contents
On the publisher's book page there is a TOC (here), but it is very high level and doesn't really show the real contents of the book. It goes rather monotonously like "Some intro for topic", "Qt classes for that", "Performance techniques for that", "Summary", "Questions" and "Further reading". I agree with you that one cannot judge the level and quality of the book from that. It could be anything - from horrible to superb.
For that reason I include here a complete TOC as it appears in the book, so you can get a better idea of the themes covered. Without much further ado, here it is:
OK, I tried, but it's way too long, so it follows at the end* of the post.
You can see from that more detailed TOC that there is a wealth of information and techniques in there! I really think that this book can do a good job of guiding the reader from an intermediate to the (lower) advanced level!
The "Questions" and "Further Reading" sections
As unseemly as they look in the TOC, these sections were my secret personal favorites. The "Questions" sections contain, as you probably guessed, some questions to test your understanding of the themes discussed in the given chapter, but they also try to deepen your understanding and sometimes even introduce new and interesting information! I enjoyed writing them because it was fun trying to find out how I could keep an intelligent reader interested after the real material had already been introduced.
As this is an intermediate-level book, and because I didn't want to write a 500-600 page tome, it couldn't go deep on every one of the introduced themes. Because of that I also included the "Further Reading" sections, where more advanced materials are referenced. Qt covers so many areas, and some themes like networking, graphics, embedded or mobile are so deep, that you will definitely need to consult more books and articles! In hindsight I should have included even more references, but you know, if I had more time etc...
Summary
If I had more time, this book could (of course) be much better. But even so I'm rather content with it, considering that I wrote it in only half a year, alongside my normal working hours. I hope you will enjoy it!
PS: Maybe I will start some kind of online addendum/errata for this book, as there are quite a few things I'm still learning after I finished writing it.
--
* Here comes the TOC in all its glory:
Preface
Chapter 1: Understanding Performant Programs
Why performance is important
The price of performance optimization
Traditional wisdom and basic guidelines
Avoiding repeated computation
Avoiding paying the high price
Avoiding copying data around
General performance optimization approach
Modern processor architectures
Caches
Pipelining
Speculative execution and branch prediction
Out-of-order execution
Multicore
Additional instruction sets
Impact on performance
Keeping your caches hot
Don't confuse your branch predictor
Parallelizing your application
Summary
Questions
Further reading
Chapter 2: Profiling to Find Bottlenecks
Types of profilers
Instrumenting profilers
Sampling profilers
External counters
Note on Read Time-Stamp Counter
Platform and tools
Development environment
Profiling tools
Just use gprof?
Windows system tools
Program profiling tools
Visualizing performance data
Memory tools
Profiling CPU usage
Poor man's sampling technique
Using Qt Creator's QML profiler
Using standalone CPU profilers
Reiterating on sampling profiling's limitations
Investigating memory usage
Poor man's memory profiling
Using Qt Creator's heob integration
Manual instrumentation and benchmarks
Debug outputs
Benchmarks
Benchmarks in regression testing
Manual instrumentation
Further advanced tools
Event Tracing for Windows (ETW) and xperf
Installation
Recording and visualizing traces
Conclusion
GammaRay
Building GammaRay
When can we use it?
Other tools
Graphic profilers
Commercial Intel tools
Visual Studio tools
Summary
Questions
Chapter 3: Deep Dive into C++ and Performance
C++ philosophy and design
Problems with exceptions
Run-time overheads
Non-determinism
RTTI
Conclusion
Virtual functions
Traditional C++ optimizations
Low-hanging fruit
Temporaries
Return values and RVO
Conversions
Memory management
Basic truths
Replacing the global memory manager
Custom memory allocators
Where they do make sense
Stack allocators
Conclusion
Custom STL allocators
Template trickery
Template computations
Expression templates
CRTP for static polymorphism
Removing branches
C++11/14/17 and performance
Move semantics
Passing by value fashionable again
Compile time computations
Other improvements
What your compiler can do for you
Examples of compiler tricks
More on compiler optimizations
Inlining of functions
Loop unrolling and vectorization
What compilers do not like
Aliasing
External functions
How can you help the compiler?
Profile Guided Optimization
When compilers get overzealous
Optimization tools beyond compiler
Link time optimization and link time code generation
Workaround – unity builds
Beyond linkers
Summary
Questions
Further reading
Chapter 4: Using Data Structures and Algorithms Efficiently
Algorithms, data structures, and performance
Algorithm classes
Algorithmic complexity warning
Types of data structures
Arrays
Lists
Trees
Hash tables
Using Qt containers
General design
Implicit sharing
Relocatability
Container classes overview
Basic Qt containers
QList
QVarLengthArray
QCache
C++11 features
Memory management
Should we use Qt containers?
Qt algorithms, iterators, and gotchas
Iterators and iterations
Gotcha - accidental deep copies
Working with strings
Qt string classes
QByteArray
QString
QStringBuilder
Substring classes
More string advice
Interning
Hashing
Searching substrings
Fixing the size
Optimizing with algorithms and data structures
Optimizing with algorithms
Reusing other people's work
Optimizing with data structures
Be cache-friendly
Flatten your data structures
Improve access patterns
Structure of arrays
Polymorphism avoidance
Hot-cold data separation
Use a custom allocator
Fixed size containers
Write your own
Summary
Questions
Further reading
Chapter 5: An In-Depth Guide to Concurrency and Multithreading
Concurrency, parallelism, and multithreading
Problems with threads
More problems – false sharing
Threading support classes in Qt
Threads
Mutexes
Condition variables
Atomic variables
Thread local storage
Q_GLOBAL_STATIC
Threads, events, and QObjects
Events and event loop
QThreads and object affinities
Getting rid of the QThread class
Thread safety of Qt objects
Higher level Qt concurrency mechanisms
QThreadPool
QFuture
QFutureInterface
Should we use it?
Map, filter, and reduce
Which concurrency class should I use?
Multithreading and performance
Costs of multithreading
Thread costs
Synchronization costs
QMutex implementation and performance
Atomic operation costs
Memory allocation costs
Qt's signals and slots performance
Speeding up programs with threads
Do not block the GUI thread
Use the correct number of threads
Avoid thread creation and switching cost
Avoid locking costs
Fine-grained locks
Lock coarsening
Duplicate or partition resources
Use concurrent data structures
Know your concurrent access patterns
Do not share any data
Double-checked locking and a note on static objects
Just switch to lock-free and be fine?
Lock-free performance
Progress guarantees
Messing with thread scheduling?
Use a share nothing architecture
Implementing a worker thread
Active object pattern
Command queue pattern
Beyond threading
User-space scheduling
Transactional memory
Continuations
Coroutines
Summary
Questions
Further reading
Chapter 6: Performance Failures and How to Overcome Them
Linear search storm
Context
Problem
Solution
Conclusion
Results dialog window opening very slowly
Context
Problem
Solution
Conclusion
Increasing HTTP file transfer times
Context
Problem
Solution
Conclusion
Loading SVGs
Context
Problem
Solution
Conclusion
Quadratic algorithm trap
Context
Problem
Solution
Conclusion
Stalls when displaying widget with QML contents
Context
Problem
Solution
Conclusion
Too many items in view
Context
Problem
Solution
Conclusion
Two program startup stories
Time system calls
Font cache
Conclusion
Hardware shutting down after an error message
Context
Problem
Solution
Conclusion
Overly generic design
Context
Problem
Solution
Conclusion
Other examples
Summary
Questions
Further reading
Chapter 7: Understanding I/O Performance and Overcoming Related Problems
Reading and writing files in Qt
Basics of file I/O performance
Buffering and flushing
Tied and synchronized streams
Reading and writing
Seeking
Caching files
Qt's I/O classes
QFile
QTextStream and QDataStream
Other helper I/O classes
QDebug and friends
Parsing XML and JSON at the speed of light
QtXml classes
QDomDocument
QXmlSimpleReader
New stream classes in QtCore
Quick parsing of XML
Reading JSON
QJsonDocument's performance
Connecting databases
Basic example using SQLite
Some performance considerations
More about operating system interactions
Paging, swapping, and the TLB
Reading from disk
Completion ports
Summary
Questions
Further reading
Chapter 8: Optimizing Graphical Performance
Introduction to graphics performance
Graphics hardware's inner workings
What is a GPU?
OpenGL pipeline model
Performance of the graphics pipeline
CPU problems
Data transfer optimization
Costly GPU operations
Newer graphics programming APIs
Qt graphics architecture and its history
The graphics API Zoo
Qt Widget
QGraphicalView
QOpenGLWidget
QVulkanWindow
Qt Quick
QtQuick Controls 1 and 2
Extending QML
Canvas 2D
QQuickPaintedItem
QQuickItem
QQuickFrameBufferObject
More APIs
Qt 3D
OpenGL drivers and Qt
Graphic drivers and performance
Setting the OpenGL implementation for QML
Qt Widget's performance
QPainter
Images
Optimized calls
OpenGL rendering with QOpenGLWidget
Images
Threading and context sharing
Usage of QPainter
QGraphicsView
Model/view framework
QML performance
Improvements in 5.9 and beyond
Measuring QML performance
Startup of a QML application
QML rendering
Scene graph optimizations
Scene graph and threading
Scene graph performance gotchas
Batching
Texture atlas
Occlusion, blending, and other costly operations
Antialiasing
Use caching
Which QML custom item should you choose?
JavaScript usage
Qt Quick Controls
Other modules
Qt 3D performance
Hybrid web applications
Summary
Questions
Further reading
Chapter 9: Optimizing Network Performance
Introduction to networking
Transport layer
User Datagram Protocol (UDP)
Transmission Control Protocol (TCP)
A better TCP?
Application layer
Domain Name Service (DNS)
HyperText Transfer Protocol (HTTP)
Secure data transfer
A better HTTP?
Qt networking classes
TCP and UDP networking classes
QTcpServer and QTcpSocket
QUdpSocket
QAbstractSocket
QSslSocket
Other socket types
HTTP networking using Qt classes
DNS queries
Basic HTTP
HTTPS and other extensions
Qt WebSocket classes
Miscellaneous classes
Other higher-level communication classes
Qt WebChannel
Qt WebGL streaming
Qt remote objects
Improving network performance
General network performance techniques
Receive buffers and copying
TCP performance
HTTP and HTTPS performance
Connection reuse
Resuming SSL connections
Preconnecting
Pipelining
Caching and compression
Using HTTP/2 and WebSocket
Advanced networking themes
Summary
Questions
Further reading
Chapter 10: Qt Performance on Embedded and Mobile Platforms
Challenges in embedded and mobile development
Basic performance themes
Run to idle
Some hardware data
Embedded hardware and performance
Qt usage in embedded and mobile worlds
Qt for embedded
Qt usage on embedded Linux
Qt's embedded tooling
Supported hardware
Example usage with Raspberry Pi
Qt for mobile
Android support in Qt Creator
Profiling Android applications
Mobile APIs in Qt
Embedded Linux and Qt performance
Executable size
Minimizing assets
Power consumption
Start-up time
Using the current Qt version
Using loaders
3D asset conditioning
Linux start-up optimizations
Hardware matters!
Graphical performance
Time series chart display
Qt Charts and OpenGL acceleration
Polyline simplifications
Floating-point considerations
Mobile-specific performance concerns
Executable size
Power usage
Mobile networking
Batch and piggyback
Consider a push model
Prefetch data
Reuse connections
Adapting to the current network connection type
Graphic hardware
Summary
Questions
Further reading
Chapter 11: Testing and Deploying Qt Applications
Testing of Qt code
Unit testing
Qt Test
Test support in Qt Creator
Automated GUI testing
Squish
Example Squish test
Performance regression testing
Adding a qmlbench benchmark
Using Squish
Deploying Qt applications
Flying parts
Static versus dynamic builds
Deploying on Windows
Windows deployment tool
Installation and paths
Summary and farewell
Questions
Further reading
Appendix A: Responses to questions
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11