By A Web Design

HBase Querying via JDO

A few challenges arise when querying HBase via JDO (DataNucleus) with indexes.

  • Index Selection – We consider a field for index selection only when the query joins its conditions exclusively with AND operators, and we allow tree traversal of a multi-key index when a key is constrained by an equals operator.  We added special handling for startsWith because it helps us with front-end search and sorting.
  • Key Creation – The keys need to be multi-column, byte[] sortable, and searchable.
  • Filter Creation – The filters need to be specialized to understand the index and implement a comparison of values.  In essence, the index definition provides the typing for comparison.
  • Paging – Based on the sorting, the index, and the page size, HBase needs to determine whether it can page the results itself or whether paging must be passed to JDO.
  • JDOQLEvaluator override – Does the JDOQLEvaluator need to filter the data, sort the data, or page the data?  It depends on how the index is used by the query.
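
The key creation requirement above can be sketched in plain Java. This is only an illustration under stated assumptions, not the implementation described in this post: the class CompositeKey and its methods are hypothetical names, the index is assumed to have one string column followed by one long column, and string values are assumed never to contain the NUL character. It relies on the fact that HBase orders row keys as unsigned, lexicographically compared bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of HBase or DataNucleus.
final class CompositeKey {

    // Flip the sign bit and write big-endian so that unsigned byte order
    // matches signed numeric order.
    static byte[] encodeLong(long value) {
        long flipped = value ^ Long.MIN_VALUE;
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) flipped;
            flipped >>>= 8;
        }
        return out;
    }

    // UTF-8 bytes followed by a 0x00 terminator, so a shorter string sorts
    // before any longer string that extends it. Assumes no NUL in the value.
    static byte[] encodeString(String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[utf8.length + 1];
        System.arraycopy(utf8, 0, out, 0, utf8.length);
        return out; // the last byte is already 0x00
    }

    // Two-column key: string column first, then the long column.
    static byte[] of(String name, long id) {
        byte[] n = encodeString(name);
        byte[] i = encodeLong(id);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(n, 0, n.length);
        buf.write(i, 0, i.length);
        return buf.toByteArray();
    }

    // Unsigned lexicographic comparison, the order HBase uses for row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // startsWith on the first column: compare against the prefix bytes
    // without the terminator.
    static boolean hasPrefix(byte[] key, String prefix) {
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        if (key.length < p.length) {
            return false;
        }
        for (int i = 0; i < p.length; i++) {
            if (key[i] != p[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this encoding, CompositeKey.of("smith", -5) sorts before CompositeKey.of("smith", 3), and both sort before any key whose first column is "smithe". A startsWith search then becomes a scan over the contiguous range of keys sharing the prefix bytes.
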
This implementation is remarkably fast when the processing is split correctly between HBase and JDO.
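
One way to picture the split between HBase and JDO is as a small decision function. The sketch below is an assumption about how such rules could look, not the logic actually used here: the class QuerySplit, its fields, and the three boolean inputs are hypothetical. The idea it illustrates is that the store can page only when the rows it returns are already fully filtered and already in the requested order; otherwise the in-memory evaluator has to do the remaining work.

```java
// Hypothetical decision record for illustration; not a DataNucleus class.
final class QuerySplit {
    final boolean jdoFilters; // evaluator must re-apply the filter in memory
    final boolean jdoSorts;   // evaluator must sort in memory
    final boolean jdoPages;   // evaluator must apply the page range in memory

    private QuerySplit(boolean jdoFilters, boolean jdoSorts, boolean jdoPages) {
        this.jdoFilters = jdoFilters;
        this.jdoSorts = jdoSorts;
        this.jdoPages = jdoPages;
    }

    // indexCoversFilter:  every condition in the query is handled by the index
    // indexMatchesSort:   the index key order equals the requested ordering
    // pageRequested:      the query specifies a page size or range
    static QuerySplit decide(boolean indexCoversFilter,
                             boolean indexMatchesSort,
                             boolean pageRequested) {
        boolean jdoFilters = !indexCoversFilter;
        boolean jdoSorts = !indexMatchesSort;
        // Paging in the store would cut rows too early if filtering or
        // sorting still has to happen afterwards.
        boolean jdoPages = pageRequested && (jdoFilters || jdoSorts);
        return new QuerySplit(jdoFilters, jdoSorts, jdoPages);
    }

    // True when the store itself can apply the page size.
    boolean storePages(boolean pageRequested) {
        return pageRequested && !jdoPages;
    }
}
```

Under these assumed rules, QuerySplit.decide(true, true, true) leaves all three flags false, so the store does everything, while QuerySplit.decide(true, false, true) hands both sorting and paging to the evaluator.
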