Fcodes
Encoding kinship relations: The Fcode
The method outlined below allows for the straightforward encoding of any type of kinship relationship. I have designated the resulting code as the "Fcode".
Utilizing Fcodes for encoding kinship relationships offers several advantages:
- Efficiency: The encoding algorithm is straightforward, easy to grasp, and quick to implement. Additionally, the generated codes are easily comprehensible to humans, maintaining the utility of the encoded data.
- Adaptability: Kinship relationships are deduced from a set of codes, eliminating the necessity for adhering to a rigid structure. Fcodes operate independently, allowing them to be inserted out of sequence without disrupting the fundamental relationships. This affords a high degree of flexibility in data collection.
- Accessibility: Fcode encoding can be carried out with minimal tools, such as a pen and paper. In digital environments, these codes can be stored in any text file, thereby freeing the codification process from dependence on a specific operating system.
[!TIP] For kinship encoding with the Fcode algorithm, consider using the F-Tree graphical interface. It provides a user-friendly way to apply the algorithm.
The Encoding Algorithm
This section explains how to encode kinship relationships, describes how to build a valid data file for working with fcodes, and introduces the key terms needed to understand both.
Encoding legend
| | Fcode legend |
|:-:|---|
| * | origin of coordinates (usually oneself) |
| C | spouse |
| O | brother |
| A | sister |
| H | sibling (sex unknown) |
| P | father |
| M | mother |
| o | son |
| a | daughter |
| h | offspring (sex unknown) |
Along with the encoding legend, it is essential to address the following considerations (especially consideration 1) in order to write a valid fcode:
Consideration 1: Numbering of siblings and offspring
Siblings (O, A, H) and offspring (o, a, h) must be numbered from the oldest to the youngest (1, 2, 3…). The numbering does not distinguish between sexes. If the number is unknown, use:
- “-” to indicate that it is the youngest.
- “:” to indicate that it is not the oldest nor the youngest.
- “.” to indicate that its number is unknown.
Consideration 1 is essential as it is necessary to distinguish between individuals of the same gender and generation.
Consideration 2: Numbering of parents and partners
Parents (P, M) and partners (C) should also be numbered following the rules of consideration 1, but only when they are the last layer (see section Layers and depth). For example: *M2, *MO4, *C3, *CMP1, *CMPO3...
Encoding rules
An origin of coordinates (OC) is taken, usually oneself, and from there the kinships are referenced. From the OC to the destination, each letter that is added refers to the previous letter. In this way, the kinship can be built from left to right like so:

[!NOTE]
Numbers have been removed from the following example for simplicity.
- My father = *P
  - me --> *
  - my father --> *P
- My paternal aunt = *PA
  - me --> *
  - my father --> *P
  - the sister of my father --> *PA
- My paternal cousin = *PAa
  - me --> *
  - my father --> *P
  - the sister of my father --> *PA
  - the daughter of the sister of my father --> *PAa
And so on:
- *MO2CO-a1:
  - my mother --> *M
  - her second brother (my uncle) --> *MO2
  - the spouse of my uncle (my aunt-in-law) --> *MO2C
  - the youngest brother of my aunt-in-law (my uncle-in-law) --> *MO2CO-
  - the first daughter of my uncle-in-law --> *MO2CO-a1
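The left-to-right reading described above can be sketched in a few lines of Python. This is only an illustration, not part of the Fcode tooling: the function name `describe`, the regular expressions, and the wording chosen for the ordering symbols are assumptions made for this example.

```python
import re

# Relationship names copied from the encoding legend.
LEGEND = {
    "C": "spouse", "O": "brother", "A": "sister", "H": "sibling",
    "P": "father", "M": "mother", "o": "son", "a": "daughter",
    "h": "offspring",
}
# Wording for the ordering symbols of consideration 1 (phrasing is an assumption).
ORDER = {"-": "youngest", ":": "middle", ".": "unnumbered"}

# One layer: a legend letter, optionally followed by a number or a symbol.
LAYER = re.compile(r"([COAHPMoah])(\d+|[-:.])?")
# A whole Fcode: the origin '*' followed by zero or more layers.
FCODE = re.compile(r"\*(?:[COAHPMoah](?:\d+|[-:.])?)*")


def describe(fcode):
    """Return one readable step per layer, read from the origin outwards."""
    if FCODE.fullmatch(fcode) is None:
        raise ValueError(f"not a valid Fcode: {fcode!r}")
    steps = []
    for letter, mark in LAYER.findall(fcode):
        name = LEGEND[letter]
        if mark.isdigit():
            steps.append(f"{name} #{mark}")
        elif mark:
            steps.append(f"{ORDER[mark]} {name}")
        else:
            steps.append(name)
    return steps


print(" -> ".join(describe("*MO2CO-a1")))
# mother -> brother #2 -> spouse -> youngest brother -> daughter #1
```

Each step refers to the person reached by the previous step, which mirrors how the code is built letter by letter.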
Preparing the data file
To encode an entire family, represent the desired kinship relationships in a tab-separated values (TSV) file (see the sections Encoding legend and Encoding rules). The TSV file should have two columns: the first with the fcode and the second with the individual's name. Empty lines and lines starting with a number sign (#) are ignored, which leaves room for spacing and comments while collecting data.
The aforementioned TSV file will be referred to as the data file, or just "the data". An illustrative example of a valid data file is provided below.
# Parents
* Homer Simpson
*C Marge Simpson
# Offspring
*o1 Bart Simpson
*a2 Lisa Simpson
*a3 Maggie Simpson
# Grandpas
*P Abe Simpson
*M Mona Simpson
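A data file in this format can be read with a few lines of Python. The sketch below is a minimal reader written for illustration; the function name `parse_data` and the choice of returning a dictionary are assumptions, not the project's own loader.

```python
def parse_data(text):
    """Map each fcode to a name, skipping empty lines and '#' comment lines."""
    people = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        # Split on the first tab only, so names may contain spaces.
        fcode, name = line.split("\t", 1)
        people[fcode.strip()] = name.strip()
    return people


sample = (
    "# Parents\n"
    "*\tHomer Simpson\n"
    "*C\tMarge Simpson\n"
    "\n"
    "# Offspring\n"
    "*o1\tBart Simpson\n"
)
print(parse_data(sample))
# {'*': 'Homer Simpson', '*C': 'Marge Simpson', '*o1': 'Bart Simpson'}
```

A line without a tab raises a `ValueError` in this sketch, which is one simple way to surface malformed rows early.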
The Fcode nomenclature
Layers and depth
Each letter of the Fcode is a layer; in other words, a layer is each of the kinship relationships in a given Fcode. A layer can also carry a number or a symbol, as specified in the section Encoding legend. For example, the code *MPA1o- can be decomposed into four layers: M, P, A1 and o-.

The total number of layers of an Fcode is its depth.
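As a rough illustration, layers and depth can be computed with one regular expression. The names `layers` and `depth` are hypothetical helpers for this example, and the pattern assumes that a layer is a legend letter optionally followed by a number or one of the symbols `-`, `:`, `.`.

```python
import re

# A layer: one legend letter plus an optional number or ordering symbol.
LAYER = re.compile(r"[COAHPMoah](?:\d+|[-:.])?")


def layers(fcode):
    """Split an Fcode into its layers (the origin '*' is not a layer)."""
    return LAYER.findall(fcode)


def depth(fcode):
    """The depth is the total number of layers."""
    return len(layers(fcode))


print(layers("*MPA1o-"))  # ['M', 'P', 'A1', 'o-']
print(depth("*MPA1o-"))   # 4
```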
Directionality
Fcodes can be read left to right (direction up), or right to left (direction down). Thus, for a given layer, the upward layer is the one on its right, whereas the downward layer is the one on its left.

Number
The number of an fcode is the number of its last layer, that is, the individual's position in the birth order of their siblings (see consideration 1).

Position
When talking about a full Fcode, the term position is used to refer to a specific layer, and all the downward layers until the OC. For example, given the Fcode "*MO2CO-a1", position 2 is "*MO2" and position 4 is "*MO2CO-".
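A position can be obtained by keeping the first n layers and putting the origin back in front. The helper name `position` below is an assumption made for this sketch.

```python
import re

# A layer: one legend letter plus an optional number or ordering symbol.
LAYER = re.compile(r"[COAHPMoah](?:\d+|[-:.])?")


def position(fcode, n):
    """Return layer n together with all its downward layers and the OC."""
    parts = LAYER.findall(fcode)
    if not 1 <= n <= len(parts):
        raise ValueError(f"position must be between 1 and {len(parts)}")
    return "*" + "".join(parts[:n])


print(position("*MO2CO-a1", 2))  # *MO2
print(position("*MO2CO-a1", 4))  # *MO2CO-
```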

Booleaning
Booleaning is removing symbols ('-', ':', '.', '') and numbers from a layer, a position, or an fcode, to obtain an l_bool, a p_bool or an f_bool respectively. The length of an f_bool is equal to the depth of its fcode.
| Fcode | F_bool |
| --- | --- |
| *MPO1a1 | *MPOa |
| *MPO2a3 | *MPOa |
| *MPA3a2 | *MPAa |
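Booleaning a whole fcode amounts to deleting every digit and ordering symbol. The function name `f_bool` mirrors the term used above, but the implementation is only a sketch written for this example.

```python
import re


def f_bool(fcode):
    """Strip numbers and the symbols '-', ':' and '.' from an fcode."""
    return re.sub(r"[\d\-:.]", "", fcode)


for code in ("*MPO1a1", "*MPO2a3", "*MPA3a2"):
    print(code, "->", f_bool(code))
# *MPO1a1 -> *MPOa
# *MPO2a3 -> *MPOa
# *MPA3a2 -> *MPAa
```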
Three important concepts stem from booleaning: boolcodes, parbools and sexed types. Boolcodes are fcodes with the upper layer booleanized; parbools are fcodes with only the parents (P, M) booleanized; and the sexed type is the upper layer of a booleanized fcode. For example:
| Fcode | Boolcode | parbool | Sexed type |
| --- | --- | --- | :---: |
| *MP3O1 | *MPO | *MPO1 | O |
| *MP3O1a1 | *MPO1a | *MPO1a1 | a |
| *MP3A3a2 | *MPA3a | *MPA3a2 | a |
| *MP3A3o3 | *MPA3o | *MPA3o3 | o |
Boolcodes are used to predict parent-offspring relationships, sexed types to predict potential partners, and parbools to number parents and partners in a single instance.
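The three derived forms can be sketched as follows. The function names are hypothetical, and the behaviour is inferred from the example table rather than from the project's code: in particular, the table's boolcodes also drop the parents' numbers, so this sketch booleanizes the upper layer of the parbool.

```python
import re

# A layer split into its letter and its optional number or symbol.
LAYER = re.compile(r"([COAHPMoah])(\d+|[-:.])?")


def parbool(fcode):
    """Booleanize only the parent layers (P, M)."""
    return "*" + "".join(
        letter if letter in "PM" else letter + mark
        for letter, mark in LAYER.findall(fcode)
    )


def boolcode(fcode):
    """Booleanize the upper layer (assumption: starting from the parbool)."""
    parts = LAYER.findall(parbool(fcode))
    if not parts:
        return "*"
    downward = "".join(letter + mark for letter, mark in parts[:-1])
    return "*" + downward + parts[-1][0]


def sexed_type(fcode):
    """Letter of the upper layer; '*' for the bare origin (assumption)."""
    parts = LAYER.findall(fcode)
    return parts[-1][0] if parts else "*"


print(boolcode("*MP3O1a1"), parbool("*MP3O1a1"), sexed_type("*MP3O1a1"))
# *MPO1a *MPO1a1 a
```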
Lineage, type and sexed lineages
Lineages are similar to booleaning, but without considering sexes. Therefore, P's, M's, O's and A's are replaced with X's, H's and h's like so:
| Fcode encoding | Lineage encoding |
| --- | :---: |
| P / M | X |
| O / A | H |
| o / a | h |
Lineages can also be applied to a layer (l_lineage), position (p_lineage) or fcode (f_lineage). The lineage of the upper layer is the type of the fcode, for example:
| Fcode | f_lineage | type |
| --- | --- | :---: |
| *MPO1a1 | *XXHh | h |
| *MPO2a3 | *XXHh | h |
| *MPA3a2 | *XXHh | h |
| *MPA3C | *XXHC | C |
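A lineage can be derived by booleaning the fcode and then replacing the sexed letters according to the mapping above. The names `f_lineage` and `fcode_type` are illustrative; the sketch assumes, as the table suggests, that numbers and symbols are dropped.

```python
import re

# Sexed letters mapped to their lineage letters; C, H and h stay as they are.
TO_LINEAGE = str.maketrans({
    "P": "X", "M": "X",
    "O": "H", "A": "H",
    "o": "h", "a": "h",
})


def f_lineage(fcode):
    """Drop numbers and symbols, then replace sexed letters."""
    return re.sub(r"[\d\-:.]", "", fcode).translate(TO_LINEAGE)


def fcode_type(fcode):
    """The type is the lineage of the upper layer."""
    return f_lineage(fcode)[-1]


print(f_lineage("*MPO1a1"), fcode_type("*MPO1a1"))  # *XXHh h
print(f_lineage("*MPA3C"), fcode_type("*MPA3C"))    # *XXHC C
```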
A special type of lineage is the sexed lineage, which is a lineage with its upper layer booleanized. Sexed lineages are very useful for establishing sex-aware kinship relationships (father, grandmother, uncle-in-law...).
Another type of lineage is the lineagecode, an fcode with a lineage in its upper layer. Lineagecodes are useful for finding the siblings and offspring of a given fcode, for example:
| Fcode | Lineagecode |
| --- | --- |
| *MPO1a1 | *MPO1h |
| *MPO2a3 | *MPO2h |
| *MPA3a2 | *MPA3h |
| *MPA3o1 | *MPA3h |
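One way to use lineagecodes is to group fcodes that share the same lineagecode, since those individuals are offspring of the same person. The sketch below is an assumption about how this could be done; `lineagecode` and `group_by_lineagecode` are hypothetical names.

```python
import re
from collections import defaultdict

# A layer split into its letter and its optional number or symbol.
LAYER = re.compile(r"([COAHPMoah])(\d+|[-:.])?")
TO_LINEAGE = {"P": "X", "M": "X", "O": "H", "A": "H", "o": "h", "a": "h"}


def lineagecode(fcode):
    """Keep the downward layers intact and turn the upper layer into its lineage."""
    parts = LAYER.findall(fcode)
    if not parts:
        return fcode  # the bare origin has no upper layer to convert
    downward = "".join(letter + mark for letter, mark in parts[:-1])
    upper = parts[-1][0]
    return "*" + downward + TO_LINEAGE.get(upper, upper)


def group_by_lineagecode(fcodes):
    """Collect fcodes under their shared lineagecode."""
    groups = defaultdict(list)
    for code in fcodes:
        groups[lineagecode(code)].append(code)
    return dict(groups)


print(group_by_lineagecode(["*MPA3a2", "*MPA3o1", "*MPO1a1"]))
# {'*MPA3h': ['*MPA3a2', '*MPA3o1'], '*MPO1h': ['*MPO1a1']}
```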
Fcode patterns
Because fcodes should be written with the minimum number of symbols possible, some combinations of layers are not recommended. The best way to detect such combinations is through lineages. In particular, eight patterns should be avoided (see the table below).
| Wrong pattern | Right pattern | Fix | Example (wrong) | Example (right) |
|---|---|---|---|---|
| CC | (loop) | Remove pattern. | *CC | * |
| hX | X | Remove offspring. | *a1M3 | *M3 |
| HX | X
