SkillAgentSearch skills...

XBRLFiles

Explore XBRL with R

Install / Use

/learn @bergant/XBRLFiles
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Exploring XBRL files with R

Darko Bergant
Saturday, January 17, 2015

#What is XBRL? Extensible Business Reporting Language (XBRL) is the open international standard for digital business reporting, managed by a global not for profit consortium, XBRL International.

#XBRL Parser for R Parsing XBRL is not something you could do with your eyes closed. Fortunately the XBRL package by Roberto Bertolusso and Marek Kimel takes all the pain away.

To parse complete XBRL, use xbrlDoAll function. It extracts xbrl instance and related schema files to a list of data frames.

library(XBRL)

inst <- "http://edgar.sec.gov/Archives/edgar/data/21344/000002134414000008/ko-20131231.xml"
options(stringsAsFactors = FALSE)
xbrl.vars <- xbrlDoAll(inst, cache.dir = "XBRLcache", prefix.out = NULL)

str(xbrl.vars, max.level = 1)
## List of 10
##  $ element     :'data.frame':	21453 obs. of  8 variables:
##  $ role        :'data.frame':	96 obs. of  5 variables:
##  $ calculation :'data.frame':	196 obs. of  11 variables:
##  $ context     :'data.frame':	740 obs. of  13 variables:
##  $ unit        :'data.frame':	4 obs. of  4 variables:
##  $ fact        :'data.frame':	2745 obs. of  9 variables:
##  $ footnote    :'data.frame':	6 obs. of  5 variables:
##  $ definition  :'data.frame':	1398 obs. of  11 variables:
##  $ label       :'data.frame':	2798 obs. of  5 variables:
##  $ presentation:'data.frame':	1582 obs. of  11 variables:

XBRL Data Frames Structure

The data structure of the data frames is shown in the image below

XBRL tables

All values are kept in the fact table (in the fact field, precisely). The element table defines what are these values (the XBRL concepts, e.g. “assets”, “liabilities”, “net income” etc.). The context table defines the periods and other dimensions for which the values are reported.

With dplyr's join and filter it is quite easy to explore the data in interrelated tables. For example, to extract revenue from the sale of goods we have to join the facts (the numbers) with the context (periods, dimensions):

library(dplyr)

xbrl.vars$fact %>%
  filter(elementId == "us-gaap_SalesRevenueGoodsNet") %>%
  left_join(xbrl.vars$context, by = "contextId") %>%
  filter(is.na(dimension1)) %>%
  select(startDate, endDate, fact, unitId, elementId) %>% 
  (knitr::kable)(format = "markdown")

|startDate |endDate |fact |unitId |elementId | |:----------|:----------|:-----------|:------|:----------------------------| |2011-01-01 |2011-12-31 |46542000000 |usd |us-gaap_SalesRevenueGoodsNet | |2012-01-01 |2012-12-31 |48017000000 |usd |us-gaap_SalesRevenueGoodsNet | |2013-01-01 |2013-12-31 |46854000000 |usd |us-gaap_SalesRevenueGoodsNet |

Balance Sheet Example

Select Statement

XBRL encapsulates several reports of different types:

table(xbrl.vars$role$type)
## 
## Disclosure   Document  Statement 
##         86          1          9

To find all statements, filter roles by type:

htmlTable::htmlTable(data.frame(Statements=
  with(
    xbrl.vars$role[xbrl.vars$role$type=="Statement", ],
    paste(roleId, "\n<br/>", definition, "\n<p/>")
  )),
  align = "l",
  rnames = FALSE
)
<table class='gmisc_table' style='border-collapse: collapse;' > <thead> <tr> <th style='border-bottom: 1px solid grey; border-top: 2px solid grey; text-align: center;'>Statements</th> </tr> </thead> <tbody> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedBalanceSheetParentheticals <br/> 1003500 - Statement - CONSOLIDATED BALANCE SHEET (Parentheticals) <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedBalanceSheets <br/> 1003000 - Statement - CONSOLIDATED BALANCE SHEETS <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfCashFlows <br/> 1004000 - Statement - CONSOLIDATED STATEMENTS OF CASH FLOWS <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfComprehensiveIncome <br/> 1002000 - Statement - CONSOLIDATED STATEMENTS OF COMPREHENSIVE INCOME <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfComprehensiveIncomeCalc2 <br/> 1002000 - Statement - CONSOLIDATED STATEMENTS OF COMPREHENSIVE INCOME <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfIncome <br/> 1001000 - Statement - CONSOLIDATED STATEMENTS OF INCOME <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfIncomeCalc2 <br/> 1001000 - Statement - CONSOLIDATED STATEMENTS OF INCOME <p/></td> </tr> <tr> <td style='text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfShareownersEquity <br/> 1005000 - Statement - CONSOLIDATED STATEMENTS OF SHAREOWNERS' EQUITY <p/></td> </tr> <tr> <td style='border-bottom: 2px solid grey; text-align: left;'>http://www.thecocacolacompany.com/role/ConsolidatedStatementsOfShareownersEquityParentheticals <br/> 1005500 - Statement - CONSOLIDATED STATEMENTS OF SHAREOWNERS' EQUITY (Parentheticals) <p/></td> </tr> </tbody> </table>

Presentation hierarchy

To find out which concepts are reported on specific financial statement component, we have to search the presentation tree from the top element.

library(tidyr)
library(dplyr)

# let's get the balace sheet
role_id <- "http://www.thecocacolacompany.com/role/ConsolidatedBalanceSheets"

# prepare presentation linkbase : 
# filter by role_id an convert order to numeric
pres <- 
  xbrl.vars$presentation %>%
  filter(roleId %in% role_id) %>%
  mutate(order = as.numeric(order))

# start with top element of the presentation tree
pres_df <- 
  pres %>%
  anti_join(pres, by = c("fromElementId" = "toElementId")) %>%
  select(elementId = fromElementId)

# breadth-first search
while({
  df1 <- pres_df %>%
    na.omit() %>%
    left_join( pres, by = c("elementId" = "fromElementId")) %>%
    arrange(elementId, order) %>%
    select(elementId, child = toElementId);
  nrow(df1) > 0
}) 
{
  # add each new level to data frame
  pres_df <- pres_df %>% left_join(df1, by = "elementId")
  names(pres_df) <-  c(sprintf("level%d", 1:(ncol(pres_df)-1)), "elementId")
}
# add last level as special column (the hierarchy may not be uniformly deep)
pres_df["elementId"] <- 
  apply( t(pres_df), 2, function(x){tail( x[!is.na(x)], 1)})
pres_df["elOrder"] <- 1:nrow(pres_df) 

# the final data frame structure is
str(pres_df, vec.len = 1 )
## 'data.frame':	37 obs. of  8 variables:
##  $ level1   : chr  "us-gaap_StatementOfFinancialPositionAbstract" ...
##  $ level2   : chr  "us-gaap_StatementTable" ...
##  $ level3   : chr  "us-gaap_StatementScenarioAxis" ...
##  $ level4   : chr  "us-gaap_ScenarioUnspecifiedDomain" ...
##  $ level5   : chr  NA ...
##  $ level6   : chr  NA ...
##  $ elementId: chr  "us-gaap_ScenarioUnspecifiedDomain" ...
##  $ elOrder  : int  1 2 ...

Amounts and Contexts

Elements (or concepts in XBRL terminology) of the balance sheet are now gathered in data frame with presentation hierarchy levels. To see the numbers we have to join the elements with numbers from fact table and periods from context table:

# join concepts with context, facts
pres_df_num <-
  pres_df %>%
  left_join(xbrl.vars$fact, by = "elementId") %>%
  left_join(xbrl.vars$context, by = "contextId") %>%
  filter(is.na(dimension1)) %>%
  filter(!is.na(endDate)) %>%
  select(elOrder, contains("level"), elementId, fact, decimals, endDate) %>%
  mutate( fact = as.numeric(fact) * 10^as.numeric(decimals)) %>%
  spread(endDate, fact ) %>%
  arrange(elOrder)

library(pander)
pres_df_num %>% 
  select(elementId, contains("2013"), contains("2012")) %>%
  pandoc.table(
    style = "rmarkdown",
    split.table = 200,
    justify = c("left", "right", "right")
  )

| elementId | 2013-12-31 | 2012-12-31 | |:-------------------------------------------------------------------------------|-------------:|-------------:| | us-gaap_CashAndCashEquivalentsAtCarryingValue | 10414 | 8442 | | us-gaap_OtherShortTermInvestments | 6707 | 5017 | | us-gaap_CashCashEquivalentsAndShortTermInvestments | 17121 | 13459 | | us-gaap_MarketableSecuritiesCurrent | 3147 | 3092 | | us-gaap_AccountsReceivableNetCurrent | 4873 | 4759 | | us-gaap_InventoryNet | 3277 | 3264 | | us-gaap_PrepaidExpenseAndOtherAssetsCurrent | 2886 | 2781 | | us-gaap_AssetsHeldForSaleCurrent | 0 | 2973 | | us-gaap_AssetsCurrent | 31304 | 30328 | | us-gaap_EquityMethodInvestments | 10393 | 9216 | | ko_AvailableForSaleSecuritiesAndCostMethodInvestments | 1119 | 1232 | | us-gaap_OtherAssetsNoncurrent | 4661 | 3585 | | us-gaap_PropertyPlantAndEquipmentNet |

Related Skills

View on GitHub
GitHub Stars83
CategoryDevelopment
Updated2mo ago
Forks28

Security Score

80/100

Audited on Jan 9, 2026

No findings