
Hi there! I am a programmer and I'm very enthusiastic about OpenSource and OpenKnowledge.

Norman Köhring

The Magic 0xC2


I built a web application with file upload functionality: some Vue.js in the front and a CouchDB in the back. Everything should have been pretty simple and straightforward.


When I uploaded image files, they somehow got mangled. The uploaded file was bigger than the original and was no longer readable as an image. I got intrigued. What was happening to the files? The changes seemed random but were reproducible, so I created a few test files to see what exactly changed and when.

My first file looked like this:


To my surprise, the file stayed the same! My curiosity grew. In the meantime I had found a very intriguing pattern in the hexdump of the uploads: C3 BF C3. It was everywhere. In another file, I found similar patterns with C2. So I wrote my next test file, this time a binary one:

00 01 02 03 04 05 06 07  08 09 10 11 12 13 14 15 |................|
16 17 18 19 20 21 22 23  24 25 26 27 28 29 30 31 |.... !"#$%&'()01|
32 33 34 35 36 37 38 39  40 41 42 43 44 45 46 47 |23456789@ABCDEFG|
48 49 50 51 52 53 54 55  56 57 58 59 60 61 62 63 |HIPQRSTUVWXY`abc|
64 65 66 67 68 69 70 71  72 73 74 75 76 77 78 79 |defghipqrstuvwxy|
80 81 82 83 84 85 86 87  88 89 90 91 92 93 94 95 |................|
96 97 98 99 a0 a1 a2 a3  a4 a5 a6 a7 a8 a9 aa ab |................|
ac ad ae af b0 b1 b2 b3  b4 b5 b6 b7 b8 b9 ba bb |................|
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|

EDIT: As you have probably noticed, I counted up as if in base 10, although the values are base 16. So I skipped A-F until reaching A0. This might look weird but didn't affect the test.

The result after uploading was

00 01 02 03 04 05 06 07  08 09 10 11 12 13 14 15  |................|
16 17 18 19 20 21 22 23  24 25 26 27 28 29 30 31  |.... !"#$%&'()01|
32 33 34 35 36 37 38 39  40 41 42 43 44 45 46 47  |23456789@ABCDEFG|
48 49 50 51 52 53 54 55  56 57 58 59 60 61 62 63  |HIPQRSTUVWXY`abc|
64 65 66 67 68 69 70 71  72 73 74 75 76 77 78 79  |defghipqrstuvwxy|
c2 80 c2 81 c2 82 c2 83  c2 84 c2 85 c2 86 c2 87  |................|
c2 88 c2 89 c2 90 c2 91  c2 92 c2 93 c2 94 c2 95  |................|
c2 96 c2 97 c2 98 c2 99  c2 a0 c2 a1 c2 a2 c2 a3  |................|
c2 a4 c2 a5 c2 a6 c2 a7  c2 a8 c2 a9 c2 aa c2 ab  |................|
c2 ac c2 ad c2 ae c2 af  c2 b0 c2 b1 c2 b2 c2 b3  |................|
c2 b4 c2 b5 c2 b6 c2 b7  c2 b8 c2 b9 c2 ba c2 bb  |................|
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

There it was again: The magic 0xC2!

At first I thought that all bytes with a value higher than 0x79 (the ASCII code for y) got followed by a 0xC2. It is actually the other way around: all bytes with a value of 0x80 or higher got prefixed by a 0xC2! Then the scales fell from my eyes: UTF-8 encoding!

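In UTF-8, all characters above U+007F are at least two bytes long. Code points 0x80 to 0xBF are encoded as 0xC2 followed by the original byte value, up to 0xC2BF (the inverted question mark ¿). The next code point, 0xC0, becomes 0xC380, and so on up to 0xC3BF for 0xFF, which explains the C3 BF pattern from the image uploads. So what happened is that, on the way to the server, each byte of the file got treated as a character and encoded as UTF-8 ¯\_(ツ)_/¯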
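The effect is easy to reproduce. The following is a minimal sketch, assuming that somewhere in the upload path every byte was read as one Latin-1 character and the resulting text was then written out as UTF-8. The function name `mangle` is made up for illustration; which layer of the application actually did this is not known from the dumps alone.

```python
def mangle(data: bytes) -> bytes:
    """Treat each byte as one Latin-1 character, then re-encode as UTF-8.

    Assumption: this mimics what happened to the upload; the real
    culprit in the application stack is not identified here.
    """
    return data.decode("latin-1").encode("utf-8")


# A few bytes around the interesting boundaries 0x7F/0x80 and 0xBF/0xC0.
original = bytes([0x41, 0x79, 0x7F, 0x80, 0xBB, 0xBF, 0xC0, 0xFF])
mangled = mangle(original)

print(original.hex(" "))  # 41 79 7f 80 bb bf c0 ff
print(mangled.hex(" "))   # 41 79 7f c2 80 c2 bb c2 bf c3 80 c3 bf
```

Bytes below 0x80 pass through unchanged, which matches the first test file surviving the upload, while every byte from 0x80 upwards grows to two bytes, which is why the uploaded files were bigger.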

EDIT: Corrected some mistakes following comments on Hacker News.

Norman Köhring

the price to crack your password


Nearly six years ago, I wrote about password complexity and showed how long it takes to crack passwords of a given length. You can find that article on GitHub (in German).

Times have changed, so I wanted to revisit the topic, this time focussing on the amount of money you need to crack a password using Amazon's biggest GPU computing instance, the p2.16xlarge, which at the time of writing costs 14.4 USD per hour. I will also compare this with the much faster Sagitta Brutalis (nice name, eh?), an 18,500 USD computer optimised for GPU calculation.


The numbers in this article always assume brute-force attacks, meaning the attacker uses a program that tries all possible combinations until it finds the password. The times given are for computing all possible combinations. If the program simply counts up, for example, from 000000 to 999999 and your password is 000001, it will of course be found much faster.
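The counting example can be sketched in a few lines. This is only an illustration, assuming an unsalted MD5 hash and a six-digit numeric password; the function name `brute_force_md5` is made up, and real cracking tools run on GPUs and are many orders of magnitude faster.

```python
import hashlib
from itertools import product


def brute_force_md5(target_hex: str, alphabet: str, length: int):
    """Try every combination in counting order until the MD5 hash matches.

    Returns the found password (or None) and the number of attempts made.
    """
    attempts = 0
    for combo in product(alphabet, repeat=length):
        attempts += 1
        candidate = "".join(combo)
        if hashlib.md5(candidate.encode("ascii")).hexdigest() == target_hex:
            return candidate, attempts
    return None, attempts


# The attacker only knows the hash of the password "000001".
target = hashlib.md5(b"000001").hexdigest()
found, attempts = brute_force_md5(target, "0123456789", 6)
print(found, attempts)  # 000001 2
```

Because the search counts upwards from 000000, this password falls on the second attempt, while 999999 would need all one million.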

How long a single calculation takes also depends on the hashing algorithm used. I will compare some of the typically used algorithms. In case you have to implement a password security system, please use BCrypt, which is in most cases the best choice, but NEVER try to implement something on your own! It is never a good idea to create your own password hashing scheme, even if it is just assembled out of existing building blocks. Use the battle-tested standard solutions. They are peer-reviewed and the safest and most robust you can get.

Password complexity basics

Password complexity is calculated from the number of possible combinations. So a 10-character password that only contains numbers is far less complex than a mix of letters and numbers of the same length. Usually an attacker has no idea whether a specific password only contains numbers or letters, but a brute-force attack will try simpler combinations first.

To calculate the complexity of a password, first find the number of possible characters per position:

  • Numbers: 10
  • ASCII Lowercase letters: 26
  • ASCII Uppercase letters: 26
  • ASCII Punctuation: 33
  • Other ASCII Characters: 128
  • Unicode: millions

To get the number of possibilities per character of your password, simply add up the numbers of the character classes it uses. A typical password contains numbers, lowercase and uppercase letters, which results in 62 possible values per character. Add some punctuation to raise that number to 95.

Other ASCII Characters are the less typical ones from 8-bit extended character sets, like ÿ and Ø, which add to the complexity but might be hard to type on foreign keyboards. Unicode is super hard (if not impossible) to type on some computers but would theoretically add millions of possible characters. Fancy some ਪੰਜਾਬੀ ਦੇ in your password?

Another very important factor in password complexity is, of course, the length. And because random passwords with crazy combinations of numbers, letters and punctuation are hard to remember, some people suggest using long combinations of normal words instead.

The password ke1r$u@U is considered a very secure password at the time of writing this article. Its complexity is calculated like this:

8 characters with 95 possibilities each:

95^8 = 6634204312890625 = ~6.6×10^15

log2(x) calculates the complexity in bits:

log2(6634204312890625) = ~52.56 bits
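The same calculation as a small script, for trying other lengths and character classes. This is a minimal sketch: the class sizes are the ones from the list above, while the dictionary and function names are made up for illustration.

```python
import math

# Character class sizes as listed above.
CHARSET_SIZES = {
    "digits": 10,
    "lowercase": 26,
    "uppercase": 26,
    "punctuation": 33,
}


def combinations(length: int, *classes: str) -> int:
    """Number of possible passwords of the given length."""
    alphabet_size = sum(CHARSET_SIZES[name] for name in classes)
    return alphabet_size ** length


def complexity_bits(length: int, *classes: str) -> float:
    """Complexity in bits: log2 of the number of combinations."""
    return math.log2(combinations(length, *classes))


all_classes = ("digits", "lowercase", "uppercase", "punctuation")
print(combinations(8, *all_classes))               # 6634204312890625
print(round(complexity_bits(8, *all_classes), 2))  # 52.56
```

Note that this only measures the search space for a blind brute-force attack; it says nothing about passwords that can be guessed from a word list.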

Data sources

I didn't try the password cracking myself, and neither did I ask a friend (insert trollface here). Instead I used publicly available benchmark results:

The results

I will compare some widely used password hashing methods, programs and protocols for four different password complexity categories:

  • eight numeric digits (might be your birthday)
  • eight alphanumeric characters (e.g. 'pa55W0Rd')
  • eight alphanumeric characters mixed with special characters (e.g. 'pa$$W0Rd')
  • a long memorisable pass sentence ('correct horse battery staple')

eight numeric digits (might be your birthday)

| hash | Amazon | Brutalis | price to crack in less than a month |
|---|---|---|---|
| MD5 | 0.0s | 0.0s | $0.01 (1 EC2 instance) |
| Skype | 0.0s | 0.0s | $0.01 (1 EC2 instance) |
| WPA2 | 1.27m | 31.47s | $0.30 (1 EC2 instance) |
| SHA256 | 0.01s | 0.0s | $0.01 (1 EC2 instance) |
| BCrypt | 49.1m | 15.77m | $11.78 (1 EC2 instance) |
| AndroidPIN | 4.65s | 2.3s | $0.02 (1 EC2 instance) |
| MyWallet | 0.34s | 0.25s | $0.01 (1 EC2 instance) |
| BitcoinWallet | 1.98h | 46.26m | $28.53 (1 EC2 instance) |
| LastPass | 11.07s | 5.4s | $0.04 (1 EC2 instance) |
| TrueCrypt | 9.06m | 5.69m | $2.18 (1 EC2 instance) |
| VeraCrypt | 4d | 2d | $1120.45 (1 EC2 instance) |

Conclusion: Don't do this. Never ever do this.

eight alphanumeric characters (e.g. 'pa55W0Rd')

| hash | Amazon | Brutalis | price to crack in less than a month |
|---|---|---|---|
| MD5 | 49.65m | 18.17m | $11.92 (1 EC2 instance) |
| Skype | 1.3h | 34.92m | $18.67 (1 EC2 instance) |
| WPA2 | 6y | 3y | $499500 (27 Brutalis) |
| SHA256 | 4.94h | 2.64h | $71.15 (1 EC2 instance) |
| BCrypt | 204y | 66y | $14.7M (797 Brutalis) |
| AndroidPIN | 118d | 59d | $37000 (2 Brutalis) |
| MyWallet | 9d | 7d | $3003.3 (1 EC2 instance) |
| BitcoinWallet | 494y | 193y | $43.25M (2338 Brutalis) |
| LastPass | 280d | 137d | $92,500 (5 Brutalis) |
| TrueCrypt | 38y | 24y | $5.3M (288 Brutalis) |
| VeraCrypt | 19381y | 11629y | $2.62B (141574 Brutalis) |
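
The price column can be roughly reproduced from the time columns. The following is a sketch under assumptions, not the exact calculation behind the tables: it assumes that an EC2 price is simply the Amazon time multiplied by the hourly rate, that the number of Brutalis machines is the single-machine time divided by a 30-day month and rounded up, and that the work splits perfectly across machines. The function names are made up for illustration.

```python
import math

EC2_USD_PER_HOUR = 14.4   # p2.16xlarge price quoted above
BRUTALIS_USD = 18500      # purchase price of one Sagitta Brutalis
DAYS_PER_MONTH = 30       # assumption: "a month" means 30 days


def ec2_cost(hours: float) -> float:
    """Rental cost of one EC2 instance for the given time."""
    return round(hours * EC2_USD_PER_HOUR, 2)


def brutalis_cost(days_on_one_machine: float):
    """Machines needed to finish within a month, and their total price."""
    machines = math.ceil(days_on_one_machine / DAYS_PER_MONTH)
    return machines, machines * BRUTALIS_USD


print(ec2_cost(49.1 / 60))  # BCrypt, eight digits, 49.1m on Amazon: 11.78
print(brutalis_cost(59))    # AndroidPIN, 59d on one Brutalis: (2, 37000)
print(brutalis_cost(137))   # LastPass, 137d on one Brutalis: (5, 92500)
```

Rows whose times are given in years or rounded hours will not match to the dollar, because the table times are themselves rounded.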

eight alphanumeric characters mixed with special characters (e.g. 'pa$$W0Rd')

| hash | Amazon | Brutalis | price to crack in less than a month |
|---|---|---|---|
| MD5 | 2d | 9.2h | ~$362 (1 EC2 instance) |
| Skype | 2d | 17.7h | ~$567 (1 EC2 instance) |
| WPA2 | 160y | 67y | ~$14.9M (806 Brutalis) |
| SHA256 | 7d | 4d | ~$2162 (1 EC2 instance) |
| BCrypt | 6194y | 1989y | ~$448M (24,215 Brutalis) |
| AndroidPIN | 10y | 5y | ~$1.09M (59 Brutalis) |
| MyWallet³ | 265d | 191d | ~$129500 (7 Brutalis) |
| BitcoinWallet | 14996y | 5835y | ~$1.3B (71,038 Brutalis) |
| LastPass | 24y | 12y | ~$2.6M (139 Brutalis) |
| TrueCrypt² | 1144y | 718y | ~$162M (8,742 Brutalis) |
| VeraCrypt¹ | 588867y | 353320y | ~$79.6B (4,301,668 Brutalis) |

  1. VeraCrypt PBKDF2-HMAC-Whirlpool + XTS 512bit (super duper paranoid settings)
  2. TrueCrypt PBKDF2-HMAC-Whirlpool + XTS 512bit
  3. Blockchain MyWallet:

a long memorisable pass sentence ('correct horse battery staple')

Okay, this doesn't need a table. It takes millions of billions of years to even crack this in MD5.

As an illustration: the solar system needs around 225 million years to orbit the core of the Milky Way. This is the so-called galactic year. The sun has existed for around 20 galactic years. Cracking such a password, even when hashed with MD5, takes 3 trillion (million million) galactic years.

Of course nobody would ever attempt to do this. There are many ways to crack a password faster. Explaining some of them would easily fill another article, so I will leave you here. Sorry.


To find your way into the topic, you might visit some of the following links:

Norman Köhring

Blog's coming!


The central part of this page should be my blog. But for now, all my articles are still waiting to be imported :(