SkillAgentSearch skills...

NEMO

Modeling Password Guessability Using Markov Models

Install / Use

/learn @RUB-SysSec/NEMO
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

NEMO: Modeling Password Guessability Using Markov Models

tl;dr

This is our ongoing effort of using Markov models to build probabilistic password models. Common use cases include:

  • Strength estimation
  • Guessing
  • (Adaptive) Natural Language Encoders
  • ...

WARNING

  • This is research-quality code that should only be used for a proof of concept (PoC).
  • We share this code in the hope that the research community can benefit from it. Please share your code, too! :heart_eyes:
  • We recommended running this software using PyPy (see performance stats below).

About NEMO

The scope of this project is not limited to passwords, this software has also been used in the context of other human-chosen secrets like Emoji, PINs, and Android unlock patterns.

The architecture of the software is inspired by OMEN. More background information about OMEN can be found here and here. An excellent Python implementation of OMEN, called py_omen, by Matthew Weir (@lakiw) can be found here.

Difference to OMEN

OMEN makes use of so-called levels (a form of binning). This implementation does not. Thus, efficient enumeration of password candidates (guessing passwords as OMEN does), is not (out of the box) possible, if the key space becomes too big. However, because of the non-binned output, this software has other advantages, for example it can produce more accurate strength estimates.

Overview: Markov Model-Based Password Guessing

Publications

In the past, we used different versions of this code in the following publications: :bowtie:

A simpler version of this code has been used for other user-chosen secrets such as Emoji and Android unlock patterns.

Design Decisions

Warning: Markov models are memory-eating monsters! :smiling_imp:

We use three copies of a data structure (in the past: Python OrderedDicts(), today: plain Python lists) to store the frequencies of the n-grams in the training corpus. We use:

  • IP: Initial probabilities (ngram_size - 1)
  • CP: Conditional probabilities
  • EP: End probabilities (ngram_size - 1)

Here is an example for 3-grams:

password PW

pa       IP  (some literature uses this annotation: ^pa)

pas      CP1
 ass     CP2
  ssw    CP3
   swo   CP4
    wor  CP5
     ord CP6

      rd EP  (some literature uses this annotation: rd$)

How Big Are They?

IP: alphabet_length ^ (ngram_size - 1)
CP: alphabet_length ^  ngram_size
EP: alphabet_length ^ (ngram_size - 1)

Some Details For the Ones Interested:

:nerd_face:

n-gram size: Currently, we support 2,3,4,5-grams. The higher the order of the Markov chains, the more accurate the model becomes. Unfortunately, this also introduces the risk of overfitting and sparsity. If one does not have enough training data, e.g., when using the model with Android unlock patterns, computing the transition probabilities from too small count numbers will become too noisy. While we only support fixed-order Markov chains, we recommend using Dell’Amico and Filippone backoff model for variable-order Markov chains.

Smoothing: Currently, we only support Additive smoothing (add '1' to the counts), also known as Laplace smoothing.

Alphabet: We tested this software with ASCII passwords only. Using non-ASCII passwords, likely requires to drop the support for Python 2 first. Hint: You can use the info.py script in the utils folder to determine the alphabet.

Development

In early versions of this code, we made heavy use of Python's (Ordered)-Dictionary class. Fun fact: As of Python 3.7 dictionaries are always ordered :)

cp_dict_full:
    key: "aaaa", value: 0.0071192...
    key: "aaab", value: 0.0034128...
    ...

A few months later, we optimized the memory consumption by only storing n-grams that really occur in the training corpus. If a rare n-gram like the 4-gram o9py does not occur in the training file, we used to return a very small default probability instead. This helped quite a lot to reduce the required memory, still, like Google Chrome, our solution easy occupied more than 20GB of RAM. :poop:

cp_dict_sparse:
    key: "assw", value: 0.0838103...
    key: "sswo", value: 0.0954193...
    ...

Thus, we decided to refactor the code again to further limit the amount of required memory to nowadays approx. 16GB of RAM. Today, we use simple Python lists to store the n-gram probabilities in memory. However, this force

View on GitHub
GitHub Stars58
CategoryDevelopment
Updated1mo ago
Forks7

Languages

Python

Security Score

85/100

Audited on Feb 1, 2026

No findings