About Me

My photo
Software Engineer at Starburst. Maintainer at Trino. Previously at LINE, Teradata, HPE.


Try HDP 3.0.1

Started using HDP 3.0.1. The previous version I used was 2.6.5, so I noticed many changes: a new Ambari design, the hive command now starts Beeline, and so on.

One major change for me is that Hive View has been removed since HDP 3.0.

I tried Data Analytics Studio instead.

You can open the page via
Ambari → Services → Data Analytics Studio → Data Analytics Studio UI

I guessed it was a BI tool from the name, but it seems to be a DBA tool. It has 4 components.

1. Queries
You can see the query history, and the Recommendations tab suggests changes to your queries.

Query Details

Visual Explain



DAG Info

2. Compose
This is almost the same as the classic Hive View. Auto-suggest is available.

3. Database
The Partitions tab shows the column name, column type, and a comment if the table is partitioned. I'd also like to see the number of partitions, their sizes, and so on.
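Until DAS shows those, partition details can still be pulled from Beeline. A rough sketch in HiveQL (the table name `sales` and partition key `dt` are placeholders, not from DAS):

```sql
-- list all partitions of the table
SHOW PARTITIONS sales;

-- detailed information (location, statistics) for one partition
DESCRIBE FORMATTED sales PARTITION (dt='2018-01-01');
```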


Storage Information

Detailed Information


Data Preview

4. Reports
This page visualizes lineage from the query history. It looks really useful. 

Read and Write Report
Join Report


Trino Storage Connector

When I played with Apache Drill, I found it useful for analyzing local CSV and Parquet files. So I developed a Trino connector that supports accessing local files. This is the GitHub repository.

As you may already know, Trino specifies a table name like `catalog.schema.table`. The connector identifies the file type from the schema. Currently supported types are csv, tsv, txt, raw and excel. All types except raw return multiple rows if the file contains EOLs. The raw type returns 1 column and 1 record. I assume the raw type can be used for converting a file to JSON.

The `table` name will look like below. You can access both local and remote files.

Here is the entire query example. The `csv` schema and the `.csv` extension of `file:///tmp/numbers.csv` are unrelated; the schema alone determines how the file is parsed. Therefore, if you change the schema to `tsv`, the result is split by tabs instead.
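As a sketch of what such a query could look like (the catalog name `storage` is an assumption from my setup, not fixed by the connector):

```sql
-- the schema (csv) chooses the parser; the quoted table name is the file URI
SELECT * FROM storage.csv."file:///tmp/numbers.csv";
```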


Current Trino doesn't have an importer like PostgreSQL's COPY. This connector may be useful in such cases🍭


AES CBC ISO7816 and SHA256 on Java

This is a Java snippet using AES/CBC/ISO7816-4Padding and SHA-256. javax.crypto doesn't support ISO7816 padding, so I use org.bouncycastle instead.

The group ID and artifact ID are below; my sample code uses version 1.52.
groupId: org.bouncycastle
artifactId: bcprov-jdk15on
version: 1.52
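In pom.xml form, the same coordinates are:

```xml
<dependency>
    <groupId>org.bouncycastle</groupId>
    <artifactId>bcprov-jdk15on</artifactId>
    <version>1.52</version>
</dependency>
```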

import java.security.Security;

import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

import org.apache.commons.codec.digest.DigestUtils;
import org.bouncycastle.jce.provider.BouncyCastleProvider;

public final class Encryptor {

    private static final byte[] KEY = "0123456789abcdef".getBytes();
    private static final byte[] INIT_VECTOR = "abcdef0123456789".getBytes();

    static {
        // Register Bouncy Castle so the ISO7816-4Padding transformation is available
        Security.insertProviderAt(new BouncyCastleProvider(), 1);
    }

    private static String encrypt(String value)
            throws Exception
    {
        SecretKeySpec keySpec = new SecretKeySpec(KEY, "AES");
        Cipher cipher = Cipher.getInstance("AES/CBC/ISO7816-4Padding");
        cipher.init(Cipher.ENCRYPT_MODE, keySpec, new IvParameterSpec(INIT_VECTOR));
        byte[] encrypted = cipher.doFinal(value.getBytes());
        // Return the SHA-256 hex digest of the ciphertext
        return DigestUtils.sha256Hex(encrypted);
    }
}
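If you want to avoid the commons-codec dependency, the SHA-256 hex step can be done with the JDK's own MessageDigest. A minimal sketch (class and method names are mine, not from the snippet above):

```java
import java.security.MessageDigest;

public class Sha256Hex {
    // Compute the SHA-256 digest of the input and render it as lowercase hex,
    // matching the output format of commons-codec's DigestUtils.sha256Hex
    static String sha256Hex(byte[] input) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(input);
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sha256Hex("hello".getBytes()));
    }
}
```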

Try HBase Java Client

This is an HBase Java test example. My repository is here (https://github.com/ebyhr/hbase-embed). I've changed the log level to INFO because the default log level is a little noisy.

import org.apache.hadoop.hbase.HBaseTestingUtility;
import org.apache.hadoop.hbase.TableNotDisabledException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;

import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

class HBaseTest {
    private static final HBaseTestingUtility HBASE = new HBaseTestingUtility();
    private static final byte[] TABLE_NAME = Bytes.toBytes("EMPLOYEE");
    private static final byte[] CF = Bytes.toBytes("cf1");
    private static final List<People> PEOPLES = Arrays.asList(
            new People("山田", 20),
            new People("영수", 33),
            new People("John", 18));

    @BeforeAll
    static void setUp()
            throws Exception
    {
        // Start the embedded mini cluster, then load one row per person
        HBASE.startMiniCluster();
        HTable table = HBASE.createTable(TABLE_NAME, CF);
        for (People people : PEOPLES) {
            int i = PEOPLES.indexOf(people);
            String rowKey = "rowkey" + i;
            Put put = new Put(Bytes.toBytes(rowKey));
            put.add(CF, Bytes.toBytes("name"), Bytes.toBytes(people.name));
            put.add(CF, Bytes.toBytes("age"), Bytes.toBytes(people.age));
            table.put(put);
        }
    }

    @AfterAll
    static void tearDown()
            throws Exception
    {
        HBASE.shutdownMiniCluster();
    }

    @Test
    void testAccessTable()
            throws IOException
    {
        HTable table = new HTable(HBASE.getConfiguration(), TABLE_NAME);
        String rowKey = "rowkey1";
        Result getResult = table.get(new Get(Bytes.toBytes(rowKey)));
        assertEquals("영수", Bytes.toString(getResult.getValue(CF, Bytes.toBytes("name"))));
        assertEquals(33, Bytes.toInt(getResult.getValue(CF, Bytes.toBytes("age"))));
    }

    @Test
    void testDrop()
            throws Exception
    {
        HTable table = HBASE.createTable(Bytes.toBytes("TMP-TABLE1"), CF);
        HBaseAdmin admin = HBASE.getHBaseAdmin();
        // deleteTable fails unless the table is disabled first
        assertThrows(TableNotDisabledException.class, () -> admin.deleteTable(table.getTableName()));
    }

    private static class People {
        private final String name;
        private final int age;

        private People(String name, int age)
        {
            this.name = name;
            this.age = age;
        }
    }
}