Encoding kinship relations: The Fcode

The method outlined below allows for the straightforward encoding of any type of kinship relationship. I have designated the resulting code as the "Fcode".

Utilizing Fcodes for encoding kinship relationships offers several advantages:

- Efficiency: The encoding algorithm is straightforward, easy to grasp, and quick to implement. Additionally, the generated codes are easily comprehensible to humans, maintaining the utility of the encoded data.
- Adaptability: Kinship relationships are deduced from a set of codes, eliminating the necessity for adhering to a rigid structure. Fcodes operate independently, allowing them to be inserted out of sequence without disrupting the fundamental relationships. This affords a high degree of flexibility in data collection.
- Accessibility: Fcode encoding can be carried out with minimal tools, such as a pen and paper. In digital environments, these codes can be stored in any text file, thereby freeing the codification process from dependence on a specific operating system.

[!TIP] For kinship encoding with the Fcode algorithm, consider using the F-Tree graphical interface. It provides a user-friendly way to apply the algorithm.

The Encoding Algorithm
The Fcode CLI
- random
- read
- record
- report
- search
- tree
Classes
- FcodeManager
- FBook
- FamilyTree
- Fgenerator
- Freader
Modules
- html_report
Publications

The Encoding Algorithm

This section explains how to encode kinship relationships, describes the data file used to work with fcodes, and introduces the key terms needed to understand them.

Encoding legend

| | Fcode legend |
|:-:|-------------------------------------|
| * | origin of coordinates (usually oneself) |
| C | spouse |
| O | brother |
| A | sister |
| H | sibling (sex unknown) |
| P | father |
| M | mother |
| o | son |
| a | daughter |
| h | offspring (sex unknown) |

Along with the encoding legend, it is essential to address the following considerations (especially consideration 1) in order to write a valid fcode:

Consideration 1: Numbering of siblings and offspring

Siblings (O, A, H) and offspring (o, a, h) must be numbered from the oldest to the youngest (1, 2, 3…). The numbering does not distinguish between sexes. When the exact number is unknown, use:

“-” to indicate that it is the youngest.

“:” to indicate that it is not the oldest nor the youngest.

“.” to indicate that its number is unknown.

Consideration 1 is essential as it is necessary to distinguish between individuals of the same gender and generation.
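
As an illustration of consideration 1, the following is a minimal sketch (not part of the Fcode package) that reports which sibling or offspring layers of an fcode lack a number or order symbol. The names `LAYER`, `NEEDS_ORDER` and `missing_order` are hypothetical and chosen only for this example.

```python
import re

# A legend letter, optionally followed by a number or by one of the
# order symbols "-", ":" or "." (assumed layer syntax for this sketch).
LAYER = re.compile(r"([CPMOAHoah])(\d+|[-:.])?")
NEEDS_ORDER = set("OAHoah")  # siblings and offspring


def missing_order(fcode):
    """Return the 1-based layers that lack the order required by consideration 1."""
    if not fcode.startswith("*"):
        raise ValueError("an fcode starts at the origin of coordinates '*'")
    body = fcode[1:]
    layers = LAYER.findall(body)
    # findall silently skips unknown characters, so compare lengths to catch them.
    if sum(len(letter) + len(mark) for letter, mark in layers) != len(body):
        raise ValueError(f"unexpected characters in {fcode!r}")
    return [index for index, (letter, mark) in enumerate(layers, start=1)
            if letter in NEEDS_ORDER and not mark]
```

For instance, `missing_order("*MO2CO-a1")` returns an empty list, while `missing_order("*PAa")` flags layers 2 and 3 because the sister and the daughter carry no order.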

Consideration 2: Numbering of parents and partners

Parents (P, M) and partners (C) should also be numbered following the rules of consideration 1, but only when they are the last layer (see section Layers and depth). For example: *M2, *MO4, *C3, *CMP1, *CMPO3...

Encoding rules

An origin of coordinates (OC) is taken, usually oneself, and from there the kinships are referenced. From the OC to the destination, each letter that is added refers to the previous letter. In this way, the kinship can be built from left to right like so:

[!NOTE]
Numbers have been removed from the following example for simplicity.

My father = *P
- me --> *
- my father --> *P
My paternal aunt = *PA
- me --> *
- my father --> *P
- the sister of my father --> *PA
My paternal cousin = *PAa
- me --> *
- my father --> *P
- the sister of my father --> *PA
- the daughter of the sister of my father --> *PAa

And so on:

*MO2CO-a1:
- my mother --> *M
- her second brother (my uncle) --> *MO2
- the spouse of my uncle (my aunt-in-law) --> *MO2C
- the youngest brother of my aunt-in-law (my uncle-in-law) --> *MO2CO-
- the first daughter of my uncle-in-law --> *MO2CO-a1
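
The left-to-right construction above can be sketched in a few lines of Python. This is only an illustrative helper, not the project's implementation: `LEGEND`, `LAYER` and `walk` are hypothetical names, and the generated phrases ignore numbers and order symbols.

```python
import re

# Relationship names taken from the encoding legend.
LEGEND = {
    "C": "spouse", "O": "brother", "A": "sister", "H": "sibling",
    "P": "father", "M": "mother", "o": "son", "a": "daughter",
    "h": "offspring",
}
# One layer: a legend letter plus an optional number or order symbol.
LAYER = re.compile(r"[CPMOAHoah](?:\d+|[-:.])?")


def walk(fcode):
    """Return (partial fcode, description) pairs from the origin to the destination."""
    partial, phrase = "*", "me"
    steps = [(partial, phrase)]
    for layer in LAYER.findall(fcode[1:]):
        partial += layer
        relation = LEGEND[layer[0]]
        # Each new letter refers to the person described so far.
        phrase = f"my {relation}" if phrase == "me" else f"the {relation} of {phrase}"
        steps.append((partial, phrase))
    return steps


for code, text in walk("*PAa"):
    print(f"{text} --> {code}")
```

Calling `walk("*PAa")` reproduces the "my paternal cousin" breakdown step by step, ending with "the daughter of the sister of my father".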

Preparing the data file

To encode an entire family, it is necessary to represent the desired kinship relationships in a tab-separated values (TSV) file (see sections Encoding legend and Encoding rules). The TSV file should consist of two columns: the first one with the fcode, and the second one with the individual's name. Empty lines and lines starting with a number sign (#) are ignored, which gives more flexibility to the data collection process.

The aforementioned TSV file will be referred to as the data file, or just "the data". An illustrative example of a valid data file is provided below.

# Parents
*   Homer Simpson
*C  Marge Simpson

# Offspring
*o1	Bart Simpson
*a2	Lisa Simpson
*a3	Maggie Simpson

# Grandpas
*P  Abe Simpson
*M  Mona Simpson
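
A data file in this format can be read with a few lines of Python. The sketch below is an assumption about how such a reader might look, not the package's own reader (the `Freader` class is not shown here); `parse_data` and `SAMPLE` are hypothetical names.

```python
SAMPLE = (
    "# Parents\n"
    "*\tHomer Simpson\n"
    "*C\tMarge Simpson\n"
    "\n"
    "# Offspring\n"
    "*o1\tBart Simpson\n"
    "*a2\tLisa Simpson\n"
)


def parse_data(text):
    """Map each fcode to a name, skipping empty lines and '#' comment lines."""
    people = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # First column: fcode. Second column: the individual's name.
        fcode, separator, name = line.partition("\t")
        if not separator:
            raise ValueError(f"expected a tab-separated line, got {line!r}")
        people[fcode.strip()] = name.strip()
    return people


print(parse_data(SAMPLE))
```

Because every fcode is self-contained, the resulting dictionary does not depend on the order of the lines in the file.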

The Fcode nomenclature

Layers and depth

Each letter of the Fcode is a layer; in other words, a layer is each of the kinship relationships of a given Fcode. A layer can also carry a number or a symbol, as described in the "Encoding legend" section and its considerations. For example, the code *MPA1o- can be decomposed into four layers: M, P, A1 and o-.

The total number of layers of an Fcode is its depth.
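
A minimal sketch of splitting an fcode into layers and computing its depth follows. It assumes that the origin of coordinates `*` is not counted as a layer, which matches the four-layer example above; `split_layers` and `depth` are hypothetical helper names.

```python
import re

# One layer: a legend letter plus an optional number or order symbol.
LAYER = re.compile(r"[CPMOAHoah](?:\d+|[-:.])?")


def split_layers(fcode):
    """Return the layers of an fcode, from the origin outwards."""
    return LAYER.findall(fcode.lstrip("*"))


def depth(fcode):
    """Return the number of layers of an fcode."""
    return len(split_layers(fcode))


print(split_layers("*MPA1o-"))  # ['M', 'P', 'A1', 'o-']
print(depth("*MPA1o-"))         # 4
```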

Directionality

Fcodes can be read left to right (direction up), or right to left (direction down). Thus, for a given layer, the upward layer is the one on its right, whereas the downward layer is the one on its left.


Number

The number of an fcode is the number of its last layer, that is, the birth-order position of the individual among their siblings (see consideration 1).


Position

When talking about a full Fcode, the term position is used to refer to a specific layer, and all the downward layers until the OC. For example, given the Fcode "*MO2CO-a1", position 2 is "*MO2" and position 4 is "*MO2CO-".
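
The position of an fcode can be sketched as a prefix of its layers. The helper below is illustrative only and assumes positions are counted from 1, as in the example above; `position` is a hypothetical name.

```python
import re

# One layer: a legend letter plus an optional number or order symbol.
LAYER = re.compile(r"[CPMOAHoah](?:\d+|[-:.])?")


def position(fcode, n):
    """Return layer n together with all its downward layers and the origin."""
    layers = LAYER.findall(fcode.lstrip("*"))
    if not 1 <= n <= len(layers):
        raise IndexError(f"{fcode!r} has no position {n}")
    return "*" + "".join(layers[:n])


print(position("*MO2CO-a1", 2))  # *MO2
print(position("*MO2CO-a1", 4))  # *MO2CO-
```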


Booleaning

Booleaning removes symbols ('-', ':', '.', '') and numbers from a layer, a position, or an fcode, to obtain an l_bool, a p_bool or an f_bool respectively. The length of an f_bool is equal to the depth of its fcode.

| Fcode | F_bool |
| --- | --- |
| *MPO1a1 | *MPOa |
| *MPO2a3 | *MPOa |
| *MPA3a2 | *MPAa |

Three important concepts stem from booleaning: boolcodes, parbools and sexed types. Boolcodes are fcodes with the upper layer booleanized, parbools are fcodes with just the parents (P, M) booleanized, and the sexed type is the upper layer of a booleanized fcode. For example:

| Fcode | Boolcode | parbool | Sexed type |
| --- | --- | --- | :---: |
| *MP3O1 | *MPO | *MPO1 | O |
| *MP3O1a1 | *MPO1a | *MPO1a1 | a |
| *MP3A3a2 | *MPA3a | *MPA3a2 | a |
| *MP3A3o3 | *MPA3o | *MPA3o3 | o |

Boolcodes are used to predict parent-offspring relationships, sexed types to predict potential partners, and parbools allow parents and partners to be numbered in a single instance.
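
The booleaning operations can be sketched with regular expressions. This is a minimal illustration rather than the project's code: the function names are hypothetical, only the symbols '-', ':' and '.' are handled, and `boolcode` follows the example table, where the parents are booleanized as well as the upper layer.

```python
import re

# A number or an order symbol attached to a layer.
MARK = r"(?:\d+|[-:.])"


def f_bool(fcode):
    """Remove every number and order symbol from the fcode."""
    return re.sub(MARK, "", fcode)


def parbool(fcode):
    """Booleanize only the parent layers (P, M)."""
    return re.sub(rf"([PM]){MARK}", r"\1", fcode)


def boolcode(fcode):
    """Booleanize the upper layer of the parbool, as in the example table."""
    return re.sub(rf"{MARK}$", "", parbool(fcode))


def sexed_type(fcode):
    """Return the upper layer of the booleanized fcode."""
    return f_bool(fcode)[-1]


for code in ("*MP3O1", "*MP3O1a1", "*MP3A3a2", "*MP3A3o3"):
    print(code, boolcode(code), parbool(code), sexed_type(code))
```

Running the loop prints one line per row of the table above, in the same column order.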

Lineage, type and sexed lineages

Lineages are similar to booleaning, but without considering sexes. Therefore, P's and M's are replaced with X's, O's and A's with H's, and o's and a's with h's, like so:

| Fcode encoding | Lineage encoding |
| --- | :---: |
| P / M | X |
| O / A | H |
| o / a | h |

Lineages can also be applied to a layer (l_lineage), position (p_lineage) or fcode (f_lineage). The lineage of the upper layer is the type of the fcode, for example:

| Fcode | f_lineage | type |
| --- | --- | :---: |
| *MPO1a1 | *XXHh | h |
| *MPO2a3 | *XXHh | h |
| *MPA3a2 | *XXHh | h |
| *MPA3C | *XXHC | C |
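
A lineage can be sketched as a booleaning step followed by a letter substitution. The snippet below is an illustrative assumption, with hypothetical names `TO_LINEAGE`, `f_lineage` and `fcode_type`; it reproduces the rows of the table above.

```python
import re

# P / M -> X, O / A -> H, o / a -> h; every other character is kept.
TO_LINEAGE = str.maketrans("PMOAoa", "XXHHhh")


def f_lineage(fcode):
    """Drop numbers and order symbols, then replace sexed letters."""
    return re.sub(r"\d+|[-:.]", "", fcode).translate(TO_LINEAGE)


def fcode_type(fcode):
    """Return the lineage of the upper layer."""
    return f_lineage(fcode)[-1]


print(f_lineage("*MPO1a1"), fcode_type("*MPO1a1"))  # *XXHh h
print(f_lineage("*MPA3C"), fcode_type("*MPA3C"))    # *XXHC C
```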

A special type of lineage is the sexed lineage, which is a lineage with its upper layer booleanized. Sexed lineages are very useful for establishing sex-aware kinship relationships (father, grandmother, uncle-in-law...).

Another type of lineage is the lineagecode, an fcode with a lineage in the upper layer. Lineagecodes are useful for searching for the siblings and offspring of a given fcode, for example:

| Fcode | Lineagecode |
| --- | --- |
| *MPO1a1 | *MPO1h |
| *MPO2a3 | *MPO2h |
| *MPA3a2 | *MPA3h |
| *MPA3o1 | *MPA3h |
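
To show why lineagecodes help when searching for siblings and offspring, the sketch below groups fcodes that share a lineagecode. It is a hedged illustration, not the package's search logic; `lineagecode` and `group_by_lineagecode` are hypothetical names.

```python
import re
from collections import defaultdict

# P / M -> X, O / A -> H, o / a -> h; every other character is kept.
TO_LINEAGE = str.maketrans("PMOAoa", "XXHHhh")
# The upper layer: the last letter plus its optional number or symbol.
UPPER_LAYER = re.compile(r"([CPMOAHoah])(?:\d+|[-:.])?$")


def lineagecode(fcode):
    """Replace the upper layer with its lineage letter."""
    return UPPER_LAYER.sub(lambda m: m.group(1).translate(TO_LINEAGE), fcode)


def group_by_lineagecode(fcodes):
    """Group fcodes whose lineagecode is the same."""
    groups = defaultdict(list)
    for fcode in fcodes:
        groups[lineagecode(fcode)].append(fcode)
    return dict(groups)


print(group_by_lineagecode(["*MPA3a2", "*MPA3o1", "*MPO1a1"]))
```

In this example *MPA3a2 and *MPA3o1 fall into the same group, *MPA3h, because both are children of *MPA3.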

Fcode patterns

As fcodes should be written using the minimum number of symbols possible, some combinations of layers are not recommended. The best strategy to detect such combinations is through lineages. In particular, there are eight patterns that should be avoided (refer to the table below).

| Wrong pattern | Right pattern | Fix | Example (wrong) | Example (right) |
|-------|------------------|-------------------------------------------------------------------------------------|-----------------|-----------------|
| CC | (loop) | Remove pattern. | *CC | * |
| hX | X | Remove offspring. | *a1M3 | *M3 |
| HX | X
