Ch08 - Kylin: Selected Details
July 10, 2021
Kylin: Selected Details
1. KylinConfig
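Besides carrying every configuration item from kylin.properties and kylin-defaults.properties, KylinConfig also holds the field transient Map<Class, Object> managersCache = new ConcurrentHashMap<>();. This ConcurrentHashMap is filled lazily with Kylin's managers: when some piece of logic needs a particular manager, it fetches it directly from this map. If the manager is not in the map yet, it is built, put into the map, and returned.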
Kylin 4.0.0 currently has 23 managers in total: BadQueryHistoryManager, Broadcaster, CubeDescManager, CubeManager, CuboidManager, DataModelManager, DictionaryManager, DistributedScheduler, DraftManager, ExecutableDao, ExecutableManager, HybridManager, KafkaConfigManager, KylinUserManager, ProjectManager, RealizationRegistry, SnapshotManager, SourceManager, StreamingManager, StreamingSourceConfigManager, TableACLManager, TableMetadataManager, and TempStatementManager. All of them are initialized and obtained through the getManager method of KylinConfig.
public <T> T getManager(Class<T> clz) {
    KylinConfig base = base();
    if (base != this)
        return base.getManager(clz);

    if (managersCache == null) {
        managersCache = new ConcurrentHashMap<>();
    }

    Object mgr = managersCache.get(clz);
    if (mgr != null)
        return (T) mgr;

    synchronized (clz) {
        mgr = managersCache.get(clz);
        if (mgr != null)
            return (T) mgr;

        try {
            logger.info("Creating new manager instance of {}", clz);

            // new manager via static Manager.newInstance()
            Method method = clz.getDeclaredMethod("newInstance", KylinConfig.class);
            method.setAccessible(true); // override accessibility
            mgr = method.invoke(null, this);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        managersCache.put(clz, mgr);
    }
    return (T) mgr;
}
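To make the lazy lookup easier to try in isolation, here is a minimal, self-contained sketch of the same pattern: a per-class cache, a double check under a per-class lock, and a static newInstance factory found by reflection. MiniConfig and DemoManager are hypothetical stand-ins invented for this example; they are not Kylin classes, and the sketch leaves out the base() delegation shown above.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ManagerCacheSketch {

    /** Hypothetical stand-in for KylinConfig; only the manager cache is modeled. */
    static class MiniConfig {
        private final Map<Class<?>, Object> managersCache = new ConcurrentHashMap<>();

        @SuppressWarnings("unchecked")
        <T> T getManager(Class<T> clz) {
            Object mgr = managersCache.get(clz);      // fast path, no lock
            if (mgr != null)
                return (T) mgr;

            synchronized (clz) {                      // one lock per manager class
                mgr = managersCache.get(clz);         // re-check under the lock
                if (mgr != null)
                    return (T) mgr;
                try {
                    // Convention assumed here: static newInstance(MiniConfig)
                    Method factory = clz.getDeclaredMethod("newInstance", MiniConfig.class);
                    factory.setAccessible(true);
                    mgr = factory.invoke(null, this);
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
                managersCache.put(clz, mgr);
            }
            return (T) mgr;
        }
    }

    /** Hypothetical manager that follows the static-factory convention. */
    static class DemoManager {
        static final AtomicInteger CREATED = new AtomicInteger();
        final MiniConfig config;

        private DemoManager(MiniConfig config) {
            this.config = config;
            CREATED.incrementAndGet();                // count constructions
        }

        // Looked up reflectively by MiniConfig.getManager
        static DemoManager newInstance(MiniConfig config) {
            return new DemoManager(config);
        }
    }

    public static void main(String[] args) {
        MiniConfig config = new MiniConfig();
        DemoManager first = config.getManager(DemoManager.class);
        DemoManager second = config.getManager(DemoManager.class);
        System.out.println("same instance: " + (first == second));
        System.out.println("constructions: " + DemoManager.CREATED.get());
    }
}
```

ConcurrentHashMap.computeIfAbsent would be a more compact way to get once-per-key construction, but its mapping function must not modify other entries of the same map. If a manager asks for another manager while it is being constructed, that restriction would be violated, which may be one reason to prefer the explicit lock; this is a guess about the design, not something the source states.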
2. Is olap_model.json generated on every query?
This JSON file comes into play on every query: it is how the schema information is passed in when Calcite is invoked.
public class QueryConnection {

    private static Boolean isRegister = false;

    public static Connection getConnection(String project) throws SQLException {
        if (!isRegister) {
            try {
                Class<?> aClass = Thread.currentThread().getContextClassLoader()
                        .loadClass("org.apache.calcite.jdbc.Driver");
                Driver o = (Driver) aClass.getDeclaredConstructor().newInstance();
                DriverManager.registerDriver(o);
            } catch (ClassNotFoundException | InstantiationException | IllegalAccessException | NoSuchMethodException | InvocationTargetException e) {
                e.printStackTrace();
            }
            isRegister = true;
        }
        File olapTmp = OLAPSchemaFactory.createTempOLAPJson(project, KylinConfig.getInstanceFromEnv());
        Properties info = new Properties();
        info.putAll(KylinConfig.getInstanceFromEnv().getCalciteExtrasProperties());
        // Import calcite props from jdbc client(override the kylin.properties)
        info.putAll(BackdoorToggles.getJdbcDriverClientCalciteProps());
        info.put("model", olapTmp.getAbsolutePath());
        info.put("typeSystem", "org.apache.kylin.query.calcite.KylinRelDataTypeSystem");
        return DriverManager.getConnection("jdbc:calcite:", info);
    }
}
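As the code shows, QueryConnection is not a singleton and does not cache connections, so every query issued from the Kylin web UI opens a new connection through Calcite here. The content of the model JSON referenced from info is therefore built on every query, but the file that ends up on disk is only written on the first query. OLAPSchemaFactory keeps an internal cache, Map<String, File> cachedJsons = Maps.newConcurrentMap(), whose key is the JSON content and whose value is the JSON file. As long as this map holds an entry for that content, the file is not generated again. Note that nothing checks whether the file still exists, so if you delete it after Kylin has started, queries simply fail.

An example of the entries in info, followed by the generated JSON file: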
"caseSensitive" -> "true"
"unquotedCasing" -> "TO_UPPER"
"model" -> "/home/li/Software/apache-kylin-4.0.0-beta-bin/bin/../tomcat/temp/olap_model_277471611618725080.json"
"typeSystem" -> "org.apache.kylin.query.calcite.KylinRelDataTypeSystem"
"conformance" -> "LENIENT"
"quoting" -> "DOUBLE_QUOTE"
{
  "version": "1.0",
  "defaultSchema": "DEMO",
  "schemas": [
    {
      "type": "custom",
      "name": "DEMO",
      "factory": "org.apache.kylin.query.schema.OLAPSchemaFactory",
      "operand": {
        "project": "project_demo"
      },
      "functions": [
        {
          name: 'MASSIN',
          className: 'org.apache.kylin.query.udf.MassInUDF'
        },
        {
          name: 'INTERSECT_VALUE',
          className: 'org.apache.kylin.measure.bitmap.BitmapIntersectValueAggFunc'
        },
        {
          name: 'PERCENTILE',
          className: 'org.apache.kylin.measure.percentile.PercentileAggFunc'
        },
        {
          name: 'CONCAT',
          className: 'org.apache.kylin.query.udf.ConcatUDF'
        },
        {
          name: 'INTERSECT_COUNT',
          className: 'org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc'
        },
        {
          name: 'VERSION',
          className: 'org.apache.kylin.query.udf.VersionUDF'
        },
        {
          name: 'PERCENTILE_APPROX',
          className: 'org.apache.kylin.measure.percentile.PercentileAggFunc'
        }
      ]
    }
  ]
}
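The content-keyed file cache described in this section, and the pitfall of deleting the file while the process is running, can be reproduced with a small standalone sketch. It is written under assumptions and is not the code of OLAPSchemaFactory: the class JsonFileCacheSketch and the methods createTempJson and createTempJsonChecked are hypothetical names, and the second method is only one possible way to add the missing existence check.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JsonFileCacheSketch {

    // Key: JSON content; value: the temp file that holds this content.
    private static final Map<String, File> cachedJsons = new ConcurrentHashMap<>();

    /** Mirrors the behaviour described in the text: no check that the cached file still exists. */
    static File createTempJson(String content) throws IOException {
        File cached = cachedJsons.get(content);
        if (cached != null)
            return cached;                            // may point at a deleted file

        File tmp = File.createTempFile("olap_model_", ".json");
        tmp.deleteOnExit();                           // cleanup choice made for this sketch
        Files.write(tmp.toPath(), content.getBytes(StandardCharsets.UTF_8));
        cachedJsons.put(content, tmp);
        return tmp;
    }

    /** Hypothetical variant: drops a stale entry and rewrites the file if it was deleted. */
    static File createTempJsonChecked(String content) throws IOException {
        File cached = cachedJsons.get(content);
        if (cached != null && cached.exists())
            return cached;
        cachedJsons.remove(content);
        return createTempJson(content);
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"version\": \"1.0\", \"defaultSchema\": \"DEMO\"}";

        File first = createTempJson(json);
        File second = createTempJson(json);
        System.out.println("file reused: " + first.equals(second));

        Files.delete(first.toPath());                 // simulate deleting the file by hand
        System.out.println("unchecked lookup, file exists: " + createTempJson(json).exists());
        System.out.println("checked lookup, file exists: " + createTempJsonChecked(json).exists());
    }
}
```

The checked variant is not atomic: two threads could both see the file as missing and each write a new one. For a sketch that is acceptable, since either file would hold the same content.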