Themis

Introduction

Themis provides cross-row/cross-table transactions on HBase, based on Google's Percolator.

Themis guarantees the ACID properties of cross-row transactions through two-phase commit and conflict resolution, both of which build on HBase's single-row transactions. Themis depends on Chronos to provide globally strictly increasing timestamps, which define a global order for transactions and allow Themis to read a snapshot of the database as of a given timestamp. Themis is implemented on the HBase coprocessor framework, so it can be applied without changing the HBase source code. We validated the correctness of Themis over a few months and optimized the algorithm to achieve better performance.
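To make the snapshot-read idea concrete, here is a minimal in-memory sketch. It is not Themis code: the class `VersionedCell` and its methods are hypothetical names chosen for illustration, and the sketch assumes that a read at a given timestamp should see the newest value committed strictly before that timestamp.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy multi-version cell (hypothetical, not part of Themis): commit timestamp -> value.
final class VersionedCell {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    // Record a value that became visible at commitTs.
    void write(long commitTs, String value) {
        versions.put(commitTs, value);
    }

    // Return the newest value committed strictly before startTs, or null if there is none.
    String readSnapshot(long startTs) {
        Map.Entry<Long, String> entry = versions.lowerEntry(startTs);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedCell cash = new VersionedCell();
        cash.write(10L, "20");
        cash.write(30L, "17");
        // A reader that started at timestamp 25 sees only the version written at 10.
        System.out.println(cash.readSnapshot(25L));
        // A reader that started at timestamp 31 sees the version written at 30.
        System.out.println(cash.readSnapshot(31L));
    }
}
```

Because every transaction reads at its own start timestamp, later writes by other transactions do not change what it sees.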

Implementation

Themis contains three components: a timestamp server, a client library, and the Themis coprocessor.

[Figure: Themis architecture]

Timestamp Server

Themis uses the timestamp of HBase's KeyValue internally, and this timestamp must be globally strictly increasing. Themis depends on Chronos to provide this timestamp service.
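The "strictly increasing" requirement can be illustrated with a small single-process sketch. This is an assumption-laden illustration, not the Chronos or Themis implementation: `MonotonicTimestamps` is a hypothetical class, and it only guarantees ordering inside one JVM, whereas a global order across processes needs a shared service such as Chronos.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-process timestamp source; not the Chronos or Themis implementation.
final class MonotonicTimestamps {
    private final AtomicLong last = new AtomicLong(0L);

    // Returns a value strictly greater than any value this instance returned before,
    // even if the wall clock stalls or moves backwards.
    long next() {
        return last.updateAndGet(prev -> Math.max(prev + 1, System.currentTimeMillis()));
    }

    public static void main(String[] args) {
        MonotonicTimestamps oracle = new MonotonicTimestamps();
        long first = oracle.next();
        long second = oracle.next();
        System.out.println(second > first); // prints "true"
    }
}
```

Using wall-clock time alone would not be enough, since two calls in the same millisecond would return equal values; the `prev + 1` term breaks such ties.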

Client Library

  1. Provide transaction APIs.
  2. Fetch timestamps from Chronos.
  3. Issue requests to the Themis coprocessor on the server side.
  4. Resolve conflicts with concurrent mutations from other clients.

Themis Coprocessor

  1. Provide RPC methods for two-phase commit and read.
  2. Automatically create the auxiliary families and set the family attributes that the algorithm requires.
  3. Periodically clean up the data of aborted and expired transactions.
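The division of labor above follows the Percolator pattern of a prewrite phase followed by a commit phase. The sketch below is a toy, single-threaded, in-memory model of that pattern, written only to show the control flow. It is not Themis code: `ToyPercolator` and all of its members are hypothetical, and it leaves out primary-lock handling, lock cleanup, lock checks on reads, and crash recovery.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy single-threaded model of Percolator-style two-phase commit (hypothetical, not Themis code).
final class ToyPercolator {
    // Per-cell state; the three fields loosely mirror the data, commit and lock columns.
    private static final class Cell {
        final TreeMap<Long, String> data = new TreeMap<>();  // startTs -> staged value
        final TreeMap<Long, Long> commits = new TreeMap<>(); // commitTs -> startTs
        Long lockStartTs;                                     // null when unlocked
    }

    private final Map<String, Cell> cells = new HashMap<>();
    private long clock = 0; // stand-in for the timestamp server

    long nextTimestamp() { return ++clock; }

    private Cell cell(String key) { return cells.computeIfAbsent(key, k -> new Cell()); }

    // Phase 1: check every cell for conflicts, then lock the cells and stage the values.
    boolean prewrite(long startTs, Map<String, String> writes) {
        for (String key : writes.keySet()) {
            Cell c = cell(key);
            boolean locked = c.lockStartTs != null;
            boolean newerCommit = !c.commits.isEmpty() && c.commits.lastKey() >= startTs;
            if (locked || newerCommit) return false; // conflict: nothing has been written
        }
        for (Map.Entry<String, String> w : writes.entrySet()) {
            Cell c = cell(w.getKey());
            c.lockStartTs = startTs;
            c.data.put(startTs, w.getValue());
        }
        return true;
    }

    // Phase 2: publish the staged values at commitTs and release the locks.
    void commit(long startTs, long commitTs, Iterable<String> keys) {
        for (String key : keys) {
            Cell c = cell(key);
            if (c.lockStartTs == null || c.lockStartTs != startTs) {
                throw new IllegalStateException("lock lost on " + key);
            }
            c.commits.put(commitTs, startTs);
            c.lockStartTs = null;
        }
    }

    // Snapshot read: newest value whose commit timestamp is strictly before readTs.
    String read(long readTs, String key) {
        Map.Entry<Long, Long> e = cell(key).commits.lowerEntry(readTs);
        return e == null ? null : cell(key).data.get(e.getValue());
    }

    public static void main(String[] args) {
        ToyPercolator db = new ToyPercolator();
        Map<String, String> transfer = new HashMap<>();
        transfer.put("Joe", "17");
        transfer.put("Bob", "12");
        long startTs = db.nextTimestamp();
        if (db.prewrite(startTs, transfer)) {
            long commitTs = db.nextTimestamp();
            db.commit(startTs, commitTs, transfer.keySet());
        }
        System.out.println(db.read(db.nextTimestamp(), "Joe"));
    }
}
```

In this model a second transaction that tries to prewrite a locked cell simply fails; a real client would instead try to resolve the conflict, as described in the client library list above.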

Usage

Build

  1. Get the latest source code of Themis:

    git clone https://github.com/XiaoMi/themis.git 
    
  2. The master branch of Themis depends on HBase 0.94.21 with hadoop.version=2.0.0-alpha. Download the source code of HBase 0.94.21 and install it in the local Maven repository with:

    (in the directory of hbase 0.94.21)
    mvn clean install -DskipTests -Dhadoop.profile=2.0
    
  3. Build Themis and install it in the local Maven repository:

    cd themis
    mvn clean install -DskipTests
    

Load the Themis coprocessor in HBase:

  1. Add the themis-coprocessor dependency to the pom of HBase:

    <dependency>
      <groupId>com.xiaomi.infra</groupId>
      <artifactId>themis-coprocessor</artifactId>
      <version>1.0-SNAPSHOT</version>
    </dependency>
    
  2. Add the configuration for the Themis coprocessor to hbase-site.xml:

    <property>
      <name>hbase.coprocessor.user.region.classes</name>
      <value>org.apache.hadoop.hbase.themis.cp.ThemisProtocolImpl,org.apache.hadoop.hbase.themis.cp.ThemisScanObserver,org.apache.hadoop.hbase.regionserver.ThemisRegionObserver</value>
    </property>
    <property>
      <name>hbase.coprocessor.master.classes</name>
      <value>org.apache.hadoop.hbase.master.ThemisMasterObserver</value>
    </property>
    
    

Depend on themis-client:

Add the themis-client dependency to the pom of any project that needs cross-row transactions.

 <dependency>
  <groupId>com.xiaomi.infra</groupId>
  <artifactId>themis-client</artifactId>
  <version>1.0-SNAPSHOT</version>
 </dependency>

Run the example code

  1. Start a standalone HBase cluster (0.94.21 with hadoop.version=2.0.0-alpha) and make sure themis-coprocessor is loaded as described in the steps above.

  2. After building Themis, run the example code with:

    cd themis-client
    mvn exec:java -Dexec.mainClass="org.apache.hadoop.hbase.themis.example.Example"
    

The results of the read and write transactions are printed to the console.

Example of Themis API

The Themis APIs are defined in TransactionInterface.java. They include put/delete/get/getScanner, which are similar to HBase's APIs, plus commit:

 public void put(byte[] tableName, ThemisPut put) throws IOException;
 public void delete(byte[] tableName, ThemisDelete delete) throws IOException;
 public void commit() throws IOException;
 public Result get(byte[] tableName, ThemisGet get) throws IOException;
 public ThemisScanner getScanner(byte[] tableName, ThemisScan scan) throws IOException;

The following code shows how to use Themis APIs:

 // This class shows an example of transferring $3 from Joe to Bob in the cash table, where the rows of
 // Joe and Bob are located in different regions. The example uses the 'put' and 'get' APIs of Themis
 // to run the transaction.
 public class Example {
   private static final byte[] CASHTABLE = Bytes.toBytes("CashTable"); // cash table
   private static final byte[] JOE = Bytes.toBytes("Joe"); // row for Joe
   private static final byte[] BOB = Bytes.toBytes("Bob"); // row for Bob
   private static final byte[] FAMILY = Bytes.toBytes("Account");
   private static final byte[] CASH = Bytes.toBytes("cash");

   public static void main(String[] args) throws IOException {
     Configuration conf = HBaseConfiguration.create();
     HConnection connection = HConnectionManager.createConnection(conf);
     // create table and set THEMIS_ENABLE in family 'Account' 
     createTable(connection);

     // transfer $3 from Joe to Bob
     Transaction transaction = new Transaction(conf, connection);
     // firstly, read out the current cash for Joe and Bob
     ThemisGet get = new ThemisGet(JOE).addColumn(FAMILY, CASH);
     int cashOfJoe = Bytes.toInt(transaction.get(CASHTABLE, get).getValue(FAMILY, CASH));
     get = new ThemisGet(BOB).addColumn(FAMILY, CASH);
     int cashOfBob = Bytes.toInt(transaction.get(CASHTABLE, get).getValue(FAMILY, CASH));

     // then, transfer $3 from Joe to Bob, the mutations will be cached in client-side
     int transfer = 3;
     ThemisPut put = new ThemisPut(JOE).add(FAMILY, CASH, Bytes.toBytes(cashOfJoe - transfer));
     transaction.put(CASHTABLE, put);
     put = new ThemisPut(BOB).add(FAMILY, CASH, Bytes.toBytes(cashOfBob + transfer));
     transaction.put(CASHTABLE, put);
     // commit the mutations to server-side
     transaction.commit();

     connection.close();
     Transaction.destroy();
   }
 }

For the full example, see org.apache.hadoop.hbase.themis.example.Example.java.

Schema Support

  1. Themis uses the timestamp of KeyValue internally, so the timestamp and version attributes of HBase's KeyValue can't be used by the application.
  2. For families that need Themis, set THEMIS_ENABLE to 'true' by adding "CONFIG => {'THEMIS_ENABLE', 'true'}" to the family descriptor when creating the table.
  3. For each column, Themis introduces two auxiliary columns: a lock column and a commit column. Themis stores the auxiliary columns in specific families: the lock column in family 'L', and the commit column in family '#p' (or in family '#d' if it is a Delete). The character '#' is reserved by Themis, and applications should not include it in the name of any family that needs Themis. Themis creates the auxiliary families automatically when a table is created if 'THEMIS_ENABLE' is set on some family.
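The naming rules above can be captured in a small helper. This is a hypothetical sketch, not part of the Themis API: the class `ThemisFamilyNames` and its methods are invented here, and only the family names 'L', '#p' and '#d' and the reserved '#' character come from the rules above.

```java
// Hypothetical helper that encodes the family naming rules; not part of the Themis API.
final class ThemisFamilyNames {
    static final String LOCK_FAMILY = "L";           // holds lock columns
    static final String PUT_COMMIT_FAMILY = "#p";    // holds commit columns for puts
    static final String DELETE_COMMIT_FAMILY = "#d"; // holds commit columns for deletes

    private ThemisFamilyNames() {}

    // True if an application family name avoids the reserved '#' character.
    static boolean isValidUserFamily(String family) {
        return family != null && !family.isEmpty() && family.indexOf('#') < 0;
    }

    // Family that holds the commit column, depending on whether the mutation is a delete.
    static String commitFamily(boolean isDelete) {
        return isDelete ? DELETE_COMMIT_FAMILY : PUT_COMMIT_FAMILY;
    }

    public static void main(String[] args) {
        System.out.println(isValidUserFamily("Account")); // prints "true"
        System.out.println(isValidUserFamily("Acc#unt")); // prints "false"
        System.out.println(commitFamily(true));           // prints "#d"
    }
}
```

A check like `isValidUserFamily` could run before table creation so that a reserved character is caught early rather than at write time.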

Themis Configuration

Client Side

Timestamp server

If users want strong consistency across client processes, 'themis.timestamp.oracle.class' should be set to 'RemoteTimestampOracleProxy'. Themis will then fetch globally increasing timestamps from Chronos. The Chronos entry is registered in ZooKeeper, and the quorum address and entry node can be configured.

The default value of 'themis.timestamp.oracle.class' is 'LocalTimestampOracle', which provides increasing timestamps locally within one process. If users only need strong consistency within one client process, the default value can be used.

| Key                                        | Description                                        | Default Value        |
|--------------------------------------------|----------------------------------------------------|----------------------|
| themis.timestamp.oracle.class              | timestamp server type                              | LocalTimestampOracle |
| themis.remote.timestamp.server.zk.quorum   | ZK quorum where remote timestamp server registered | 127.0.0.1:2181       |
| themis.remote.timestamp.server.clustername | cluster name of remote timestamp server            | default-cluster      |

Lock clean

The client needs to clean up locks when it encounters a conflict. Users can configure the lock TTL on the client side with 'themis.client.lock.clean.ttl'. The default value is 0, which means the lock TTL is determined by the server-side configuration.

Users can set 'themis.worker.register.class' to 'ZookeeperWorkerRegister' to help resolve conflicts faster. For details of conflict resolution, see the Percolator paper.

| Key                          | Description                        | Default Value      |
|------------------------------|------------------------------------|--------------------|
| themis.client.lock.clean.ttl | lock ttl configured in client-side | 0                  |
| themis.worker.register.class | worker register class              | NullWorkerRegister |
| themis.retry.count           | retry count when clean lock        | 10                 |
| themis.pause                 | sleep time between retries         | 100                |
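The interplay of these settings can be sketched as follows. This is a hypothetical illustration rather than the Themis client's actual logic: `LockCleanRetry` is an invented class, the sketch assumes that 'themis.retry.count' is the total number of attempts and that 'themis.pause' is in milliseconds, and the server-side TTL value in the example is made up.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of how the lock-clean settings might interact; not the Themis client code.
final class LockCleanRetry {
    private LockCleanRetry() {}

    // A client-side TTL of 0 defers to the server-side TTL.
    static long effectiveTtl(long clientTtl, long serverTtl) {
        return clientTtl == 0 ? serverTtl : clientTtl;
    }

    // Runs the attempt up to retryCount times, sleeping pauseMillis between attempts.
    // Returns true as soon as one attempt succeeds, false if all fail or the thread is interrupted.
    static boolean retry(BooleanSupplier attempt, int retryCount, long pauseMillis) {
        for (int i = 0; i < retryCount; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (i < retryCount - 1) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Defaults from the table above: TTL 0, 10 attempts, pause of 100.
        // The server-side TTL of 60000 is an invented value for this example.
        System.out.println(effectiveTtl(0L, 60_000L));
        int[] calls = {0};
        System.out.println(retry(() -> ++calls[0] == 3, 10, 100L));
    }
}
```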

Server Side

Data Clean Options

Both read and write transactions should not last too long. Users can set 'themis.transaction.ttl.enabl
