LLAP - a one-page architecture overview   

Sergey Shelukhin created · Dec 01, 2017 at 10:01 PM · edited · Dec 01, 2017 at 10:20 PM

Introduction: how does LLAP fit into Hive

LLAP is a set of persistent daemons that execute fragments of Hive queries. Query execution on LLAP is very similar to Hive without LLAP, except that worker tasks run inside LLAP daemons rather than in containers.

High-level lifetime of a JDBC query:

| Without LLAP | With LLAP |
|---|---|
| Query arrives at HS2; it is parsed and compiled into “tasks” | Query arrives at HS2; it is parsed and compiled into “tasks” |
| Tasks are handed over to Tez AM (query coordinator) | Tasks are handed over to Tez AM (query coordinator) |
| Coordinator (AM) asks YARN for containers | Coordinator (AM) locates LLAP instances via ZK |
| Coordinator (AM) pushes task attempts into containers | Coordinator (AM) pushes task attempts as fragments into LLAP |
| RecordReader used to read data | LLAP IO/cache used to read data, or RecordReader used to read data |
| Hive operators are used to process data | Hive operators are used to process data* |
| Final tasks write out results into HDFS | Final tasks write out results into HDFS |
| HS2 forwards rows to JDBC | HS2 forwards rows to JDBC |

* Sometimes, minor LLAP-specific optimizations are possible, e.g. sharing a hash table for a map join.
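
The lifetime comparison above can also be read as two step lists that differ in only a few places. The following is a minimal Python sketch that encodes the steps and prints the ones that LLAP changes. The names `WITHOUT_LLAP`, `WITH_LLAP` and `differing_steps` are made up for this illustration and are not part of Hive; the footnoted operator optimizations are ignored here.

```python
# Steps of a JDBC query, paraphrased from the comparison above.
WITHOUT_LLAP = [
    "HS2 parses and compiles the query into tasks",
    "Tasks are handed over to the Tez AM (query coordinator)",
    "AM asks YARN for containers",
    "AM pushes task attempts into containers",
    "RecordReader reads data",
    "Hive operators process data",
    "Final tasks write results into HDFS",
    "HS2 forwards rows to JDBC",
]

WITH_LLAP = [
    "HS2 parses and compiles the query into tasks",
    "Tasks are handed over to the Tez AM (query coordinator)",
    "AM locates LLAP instances via ZK",
    "AM pushes task attempts as fragments into LLAP",
    "LLAP IO/cache or RecordReader reads data",
    "Hive operators process data",
    "Final tasks write results into HDFS",
    "HS2 forwards rows to JDBC",
]


def differing_steps(without, with_llap):
    """Return (step_number, without, with) for every step that differs."""
    return [
        (number, a, b)
        for number, (a, b) in enumerate(zip(without, with_llap), start=1)
        if a != b
    ]


for number, a, b in differing_steps(WITHOUT_LLAP, WITH_LLAP):
    print(f"step {number}: {a!r} -> {b!r}")
```

Only the resource acquisition, task placement and data reading steps change; compilation, coordination by the Tez AM, and result delivery stay the same.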

Theoretically, a hybrid (LLAP + containers) mode is possible, but it has no advantages in most cases, so it is rarely used (for example, Ambari doesn’t expose any knobs to enable this mode).

In both cases, the query uses a Tez session (a YARN app with a Tez AM serving as the query coordinator). In the container case, the AM starts more containers in the same YARN app; in the LLAP case, LLAP itself runs as an external, shared YARN app, so the Tez session has only one container (the query coordinator).
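
As a rough illustration of the difference in YARN footprint, here is a toy Python model that counts the containers owned by a single Tez session in each mode. The function name, the mode labels and the `task_containers` parameter are assumptions made for this sketch; none of them correspond to a real Hive, Tez or YARN API.

```python
def containers_in_tez_session(mode: str, task_containers: int = 0) -> int:
    """Count YARN containers owned by one Tez session in this toy model.

    mode            -- "container" or "llap" (labels made up for this sketch)
    task_containers -- containers the AM obtained from YARN for task attempts;
                       ignored in LLAP mode, where tasks run inside the daemons
    """
    if mode not in ("container", "llap"):
        raise ValueError(f"unknown mode: {mode!r}")
    am_container = 1  # the Tez AM (query coordinator) is present in both modes
    if mode == "llap":
        # LLAP daemons belong to a separate, shared YARN app, so they are
        # not counted against this session.
        return am_container
    return am_container + task_containers


print(containers_in_tez_session("container", task_containers=20))  # 21
print(containers_in_tez_session("llap", task_containers=20))       # 1
```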

Inside LLAP daemon: execution

An LLAP daemon runs work fragments using executors. Each daemon has a number of executors, so it can run several fragments in parallel, and a local work queue. For the Hive case, fragments are similar to task attempts – mappers and reducers. Executors essentially “replace” containers – each is used by one task at a time; the sizing should be very similar for both.
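
The executor model can be sketched with a fixed-size thread pool: a bounded number of workers, each running one fragment at a time, fed from a local queue. This is only a toy in Python, not LLAP's actual implementation. The class `ToyLlapDaemon`, the fragment names and the simple first-in-first-out queue are assumptions of the sketch.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ToyLlapDaemon:
    """Toy model: a fixed number of executors draining a local work queue."""

    def __init__(self, num_executors: int):
        # ThreadPoolExecutor queues submitted work internally and runs at most
        # `max_workers` items at once, one per worker thread.
        self._pool = ThreadPoolExecutor(max_workers=num_executors)
        self._lock = threading.Lock()
        self._running = 0
        self.max_running = 0  # highest number of fragments seen running at once

    def _run(self, fragment_id: str) -> str:
        with self._lock:
            self._running += 1
            self.max_running = max(self.max_running, self._running)
        time.sleep(0.01)  # stand-in for real mapper/reducer work
        with self._lock:
            self._running -= 1
        return f"{fragment_id} done"

    def submit(self, fragment_id: str):
        """Queue a fragment; it starts when an executor becomes free."""
        return self._pool.submit(self._run, fragment_id)

    def shutdown(self):
        self._pool.shutdown(wait=True)


daemon = ToyLlapDaemon(num_executors=4)
futures = [daemon.submit(f"map-{i}") for i in range(10)]
results = [f.result() for f in futures]
daemon.shutdown()
print(len(results), "fragments finished; at most", daemon.max_running, "ran in parallel")
```

With 4 executors and 10 fragments, the extra fragments wait in the queue until an executor frees up, which is why the number of fragments running in parallel never exceeds the executor count.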

Inside LLAP daemon: IO

Optionally, fragments may make use of the LLAP cache and IO elevator (background IO threads). In HDP 2.6, this is only supported for the ORC format and isn’t supported for most ACID tables. In 3.0, support is added for text, Parquet, and ACID tables. In HDInsight, text format support is also added in 2.6.

Note that queries can still run in LLAP even if they cannot use the IO layer. Each fragment uses only one IO thread at a time. The cache stores metadata (on heap in 2.6, off heap in 3.0) and encoded data (off heap); an SSD cache option is also added in 3.0 (2.6 on HDInsight).
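
The format support described above can be summarized as a small lookup. The Python sketch below encodes only what this article states; the function name `can_use_llap_io` and its parameters are hypothetical, and "most ACID tables" in 2.6 is simplified to "no ACID tables". A `False` result means the fragment still runs in LLAP but reads through a regular RecordReader instead of the IO layer.

```python
# File formats the LLAP IO layer supports, per version, as stated in the text.
SUPPORTED_FORMATS = {
    "2.6": {"orc"},
    "3.0": {"orc", "text", "parquet"},
}


def can_use_llap_io(version: str, file_format: str,
                    acid: bool = False, hdinsight: bool = False) -> bool:
    """Return True if a table read can go through the LLAP cache/IO elevator."""
    if version not in SUPPORTED_FORMATS:
        raise ValueError(f"version not covered by this sketch: {version!r}")
    formats = set(SUPPORTED_FORMATS[version])
    if version == "2.6" and hdinsight:
        formats.add("text")  # HDInsight adds text format support in 2.6
    if acid and version == "2.6":
        return False  # simplification: most ACID tables are unsupported in 2.6
    return file_format.lower() in formats


print(can_use_llap_io("2.6", "orc"))                   # True
print(can_use_llap_io("2.6", "parquet"))               # False
print(can_use_llap_io("2.6", "text", hdinsight=True))  # True
print(can_use_llap_io("3.0", "orc", acid=True))        # True
```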

Hortonworks - Develops, Distributes and Supports Open Enterprise Hadoop.

© 2011-2017 Hortonworks Inc. All Rights Reserved.
Hadoop, Falcon, Atlas, Sqoop, Flume, Kafka, Pig, Hive, HBase, Accumulo, Storm, Solr, Spark, Ranger, Knox, Ambari, ZooKeeper, Oozie and the Hadoop elephant logo are trademarks of the Apache Software Foundation.