Ch08 - Kylin: Selected Details

July 10, 2021

Selected Kylin Details

1. KylinConfig

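Besides holding every configuration item from kylin.properties and kylin-defaults.properties, KylinConfig also keeps the field transient Map<Class, Object> managersCache = new ConcurrentHashMap<>();. This ConcurrentHashMap is populated lazily with Kylin's managers, so any logic that needs a particular manager can fetch it directly from the map. If the manager is not in the map yet, it is constructed, put into the map, and returned.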

Kylin 4.0.0 currently has 23 kinds of manager: BadQueryHistoryManager, Broadcaster, CubeDescManager, CubeManager, CuboidManager, DataModelManager, DictionaryManager, DistributedScheduler, DraftManager, ExecutableDao, ExecutableManager, HybridManager, KafkaConfigManager, KylinUserManager, ProjectManager, RealizationRegistry, SnapshotManager, SourceManager, StreamingManager, StreamingSourceConfigManager, TableACLManager, TableMetadataManager, TempStatementManager. All of them are initialized and obtained through the getManager method of KylinConfig:

    public <T> T getManager(Class<T> clz) {
        KylinConfig base = base();
        if (base != this)
            return base.getManager(clz);

        if (managersCache == null) {
            managersCache = new ConcurrentHashMap<>();
        }

        Object mgr = managersCache.get(clz);
        if (mgr != null)
            return (T) mgr;

        synchronized (clz) {
            mgr = managersCache.get(clz);
            if (mgr != null)
                return (T) mgr;

            try {
                logger.info("Creating new manager instance of {}", clz);

                // new manager via static Manager.newInstance()
                Method method = clz.getDeclaredMethod("newInstance", KylinConfig.class);
                method.setAccessible(true); // override accessibility
                mgr = method.invoke(null, this);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            managersCache.put(clz, mgr);
        }
        return (T) mgr;
    }
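
To make the lazy lookup concrete, here is a minimal, self-contained sketch of the same pattern outside Kylin. Config and DemoManager are hypothetical stand-ins (they are not Kylin classes); the only convention borrowed from the code above is that each manager exposes a static newInstance factory that takes the config object.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ManagerCacheSketch {

    /** Hypothetical stand-in for KylinConfig; only the manager cache is modeled. */
    public static class Config {
        private final Map<Class<?>, Object> managersCache = new ConcurrentHashMap<>();

        @SuppressWarnings("unchecked")
        public <T> T getManager(Class<T> clz) {
            // Fast path: the manager has already been built.
            Object mgr = managersCache.get(clz);
            if (mgr != null)
                return (T) mgr;

            // Slow path: lock on the class so each manager type is built once per Config.
            synchronized (clz) {
                mgr = managersCache.get(clz);
                if (mgr != null)
                    return (T) mgr;
                try {
                    // Look up the static factory by name and call it reflectively.
                    Method method = clz.getDeclaredMethod("newInstance", Config.class);
                    method.setAccessible(true); // the factory may be private
                    mgr = method.invoke(null, this);
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
                managersCache.put(clz, mgr);
            }
            return (T) mgr;
        }
    }

    /** Hypothetical manager that follows the static newInstance(config) convention. */
    public static class DemoManager {
        private final Config config;

        private DemoManager(Config config) {
            this.config = config;
        }

        // Private on purpose: callers are expected to go through Config.getManager.
        private static DemoManager newInstance(Config config) {
            return new DemoManager(config);
        }
    }

    public static void main(String[] args) {
        Config config = new Config();
        DemoManager first = config.getManager(DemoManager.class);
        DemoManager second = config.getManager(DemoManager.class);
        System.out.println(first == second); // prints "true": the second call hits the cache
    }
}
```

Because the cache lives in the config object, a second Config instance would build its own DemoManager; in this sketch managers are singletons per config, not per JVM.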

2. Is olap_model.json generated on every query?

This JSON file comes into play on every query: it is how the schema information is passed in when Calcite is invoked.

public class QueryConnection {

    private static Boolean isRegister = false;

    public static Connection getConnection(String project) throws SQLException {
        if (!isRegister) {
            try {
                Class<?> aClass = Thread.currentThread().getContextClassLoader()
                        .loadClass("org.apache.calcite.jdbc.Driver");
                Driver o = (Driver) aClass.getDeclaredConstructor().newInstance();
                DriverManager.registerDriver(o);
            } catch (ClassNotFoundException | InstantiationException | IllegalAccessException | NoSuchMethodException | InvocationTargetException e) {
                e.printStackTrace();
            }
            isRegister = true;
        }
        File olapTmp = OLAPSchemaFactory.createTempOLAPJson(project, KylinConfig.getInstanceFromEnv());
        Properties info = new Properties();
        info.putAll(KylinConfig.getInstanceFromEnv().getCalciteExtrasProperties());
        // Import calcite props from jdbc client(override the kylin.properties)
        info.putAll(BackdoorToggles.getJdbcDriverClientCalciteProps());
        info.put("model", olapTmp.getAbsolutePath());
        info.put("typeSystem", "org.apache.kylin.query.calcite.KylinRelDataTypeSystem");
        return DriverManager.getConnection("jdbc:calcite:", info);
    }
}
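
The order of the putAll and put calls in getConnection determines which value wins when a key is set in more than one place. The following standalone sketch reproduces only that layering with java.util.Properties. The class name, the buildInfo helper, and the sample values are assumptions made for illustration; only the "model" and "typeSystem" keys and the type system class name come from the code above.

```java
import java.util.Properties;

public class CalcitePropsSketch {

    /** Builds the connection properties in the same order as getConnection. */
    public static Properties buildInfo(Properties serverProps, Properties clientProps, String modelPath) {
        Properties info = new Properties();
        info.putAll(serverProps);   // 1. server-side Calcite settings
        info.putAll(clientProps);   // 2. client-supplied settings override the server-side ones
        info.put("model", modelPath); // 3. set last, so neither layer can replace them
        info.put("typeSystem", "org.apache.kylin.query.calcite.KylinRelDataTypeSystem");
        return info;
    }

    public static void main(String[] args) {
        Properties server = new Properties();
        server.put("caseSensitive", "true");
        server.put("conformance", "LENIENT");

        Properties client = new Properties();
        client.put("caseSensitive", "false"); // same key as on the server side

        // The model path is a made-up example value.
        Properties info = buildInfo(server, client, "/tmp/olap_model_example.json");
        System.out.println(info.getProperty("caseSensitive")); // prints "false": the client value wins
        System.out.println(info.getProperty("conformance"));   // prints "LENIENT": untouched by the client
    }
}
```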

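QueryConnection is not a singleton, so every query issued from the Kylin web UI opens a new connection through Calcite here. As a result, the content of model.json referenced in info is generated on every query, but the file on disk is written only the first time a given content is seen. OLAPSchemaFactory keeps an internal cache, Map<String, File> cachedJsons = Maps.newConcurrentMap(), whose key is the JSON content and whose value is the path of the JSON file. As long as this map still holds the entry, the file is not generated again. (Note that there is no check that the file still exists, so if the file is deleted after Kylin has started, queries will simply fail.)

Below are sample contents of info, followed by the generated olap_model JSON file.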

"caseSensitive" -> "true"
"unquotedCasing" -> "TO_UPPER"
"model" -> "/home/li/Software/apache-kylin-4.0.0-beta-bin/bin/../tomcat/temp/olap_model_277471611618725080.json"
"typeSystem" -> "org.apache.kylin.query.calcite.KylinRelDataTypeSystem"
"conformance" -> "LENIENT"
"quoting" -> "DOUBLE_QUOTE"
{
    "version": "1.0",
    "defaultSchema": "DEMO",
    "schemas": [
        {
            "type": "custom",
            "name": "DEMO",
            "factory": "org.apache.kylin.query.schema.OLAPSchemaFactory",
            "operand": {
                "project": "project_demo"
            },
            "functions": [
               {
                   name: 'MASSIN',
                   className: 'org.apache.kylin.query.udf.MassInUDF'
               },
               {
                   name: 'INTERSECT_VALUE',
                   className: 'org.apache.kylin.measure.bitmap.BitmapIntersectValueAggFunc'
               },
               {
                   name: 'PERCENTILE',
                   className: 'org.apache.kylin.measure.percentile.PercentileAggFunc'
               },
               {
                   name: 'CONCAT',
                   className: 'org.apache.kylin.query.udf.ConcatUDF'
               },
               {
                   name: 'INTERSECT_COUNT',
                   className: 'org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc'
               },
               {
                   name: 'VERSION',
                   className: 'org.apache.kylin.query.udf.VersionUDF'
               },
               {
                   name: 'PERCENTILE_APPROX',
                   className: 'org.apache.kylin.measure.percentile.PercentileAggFunc'
               }
            ]
        }
    ]
}
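
The caching behaviour described in this section can be reproduced with a small standalone sketch. This is not the code of OLAPSchemaFactory: the class and method names are hypothetical, and only the idea of a content-keyed Map<String, File> without an existence check is taken from the description above. The second method is an assumed hardening, shown only for contrast.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JsonFileCacheSketch {

    // Key: the JSON content itself; value: the temp file that holds it.
    private static final Map<String, File> cachedJsons = new ConcurrentHashMap<>();

    /** Mirrors the described behaviour: a cache hit is returned without checking the disk. */
    public static File createTempJson(String jsonContent) {
        File cached = cachedJsons.get(jsonContent);
        if (cached != null)
            return cached; // no exists() check here
        return writeAndCache(jsonContent);
    }

    /** Hypothetical hardening: rewrite the file when the cached one has disappeared. */
    public static File createTempJsonChecked(String jsonContent) {
        File cached = cachedJsons.get(jsonContent);
        if (cached != null && cached.exists())
            return cached;
        return writeAndCache(jsonContent);
    }

    private static File writeAndCache(String jsonContent) {
        try {
            File tmp = File.createTempFile("olap_model_", ".json");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), jsonContent.getBytes(StandardCharsets.UTF_8));
            cachedJsons.put(jsonContent, tmp);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = "{\"version\": \"1.0\", \"defaultSchema\": \"DEMO\"}";

        File first = createTempJson(json);
        File second = createTempJson(json);
        System.out.println(first == second); // prints "true": same content, same cached file

        first.delete(); // simulate someone removing the temp file while the server runs
        System.out.println(createTempJson(json).exists());        // prints "false": stale path returned
        System.out.println(createTempJsonChecked(json).exists()); // prints "true": file was rewritten
    }
}
```

The stale path in the second-to-last line is the situation the note above warns about: the cache still answers, but the file it points to is gone.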
