Data Lake Analysis Leader / Principal / Expert [China]


 


Responsibilities

1. Build an industry-leading, PB-scale federated data lake analysis engine that supports ETL and ad-hoc queries for almost all ByteDance product lines (such as Douyin and Toutiao).
2. Build a unified, company-wide distributed metadata view, and use DAG/MPP computing engines to support direct queries against heterogeneous data sources and joint analysis across data sources.
3. Design and develop data lake storage solutions that support incremental updates, cache acceleration, etc.
4. Advocate for the team's technologies and products, building influence inside and outside the company.


Qualifications

1. Familiarity with the principles and source code of mainstream big data systems such as Spark, Presto, Druid, Kylin, Hive, Impala, and Flink (familiarity with every technology stack is not required).
2. Mastery of the optimization principles of mainstream OLAP engines, including but not limited to vectorized execution, columnar storage, late materialization, dynamic filtering, and code generation.
3. Understanding of the principles of open-source data lake storage solutions (such as Delta Lake, Hudi, and Iceberg).
4. Extensive experience in big data computing and storage; experience with large-scale production deployments is preferred.
5. Experience in the Spark, Presto, Flink, or Calcite communities is preferred.



 
