When choosing a big-data framework, it can be a dilemma whether to go for Hadoop or Hive, as both are effective options.
Hadoop is a software framework devised to handle large volumes of data, or big data. It can be used for storing and processing huge datasets distributed across a cluster of commodity servers.
Hadoop stores the data using the Hadoop Distributed File System and queries or processes it using the MapReduce programming model.
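The MapReduce model described above can be sketched in miniature: a map phase emits key/value pairs, a shuffle phase groups them by key, and a reduce phase aggregates each group. The sketch below is a plain in-memory word count; the function names are illustrative, not actual Hadoop APIs, and a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores big data", "hive queries big data"]
result = reduce_phase(shuffle_phase(map_phase(lines)))
print(result)  # {'hadoop': 1, 'stores': 1, 'big': 2, 'data': 2, 'hive': 1, 'queries': 1}
```

Each function mirrors one stage of a MapReduce job; in Hadoop, the framework itself handles the shuffle between the map and reduce tasks.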
Hadoop’s Main Components
Hadoop Common (base): Hadoop Common offers the shared libraries and utilities on which all of the other Hadoop components run.
Hadoop Distributed File System (HDFS): HDFS is the core of the Hadoop framework and manages all of the data in a Hadoop cluster. It operates on a master/slave architecture and stores data using replication.
Master/slave architecture and replication:
Name node (master node): The name node stores the metadata of every file and block stored in HDFS. HDFS has only one active master node; for high availability (HA), a second master node works as a standby master.
Data node (slave node): Data nodes hold the actual data files, split into blocks. HDFS typically has many data nodes.
Replication: HDFS stores data by splitting it into blocks; here the block size is 64 MB. Because of replication, each block is stored on 3 different data nodes (the default replication factor can be increased as needed), so there is very little chance of losing data if a node fails.
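The effect of block splitting and replication can be shown with simple arithmetic. The block size and replication factor below follow the defaults mentioned above; the file size is an arbitrary example.

```python
import math

BLOCK_SIZE_MB = 64       # HDFS block size mentioned above
REPLICATION_FACTOR = 3   # default HDFS replication factor

file_size_mb = 200       # hypothetical file used for illustration

# Number of blocks the file is split into (the last block may be partial).
blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)

# Raw cluster storage consumed once every block is replicated 3 times.
raw_storage_mb = file_size_mb * REPLICATION_FACTOR

print(blocks)          # 200 MB / 64 MB -> 4 blocks
print(raw_storage_mb)  # 3 copies -> 600 MB of raw storage
```

The trade-off is clear from the numbers: replication triples the raw storage cost in exchange for tolerating the loss of up to two copies of any block.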
Hive is an application that runs on the Hadoop framework and offers a SQL-like interface for querying and processing data. Hive was designed and developed by Facebook before becoming part of the Apache Hadoop project.
Hive runs queries using the Hive Query Language (HQL). Hive has a structure similar to an RDBMS, and largely similar commands can be used in Hive.
Hive can store data in external tables, so it is not compulsory to use HDFS; it also supports file formats such as text files, sequence files, Avro files, ORC, and others.
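An external table is declared with HQL DDL that points at data already sitting at a given path; dropping the table removes only the metadata, not the underlying files. A sketch of such a statement is built as a string below so its shape is visible; the table name, columns, and location are all hypothetical.

```python
# Hypothetical table name and data directory, for illustration only.
table = "page_views"
location = "/data/logs/page_views"

ddl = f"""
CREATE EXTERNAL TABLE {table} (
    user_id   STRING,
    url       STRING,
    view_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '{location}';
""".strip()

print(ddl)
```

The `EXTERNAL` keyword and the `LOCATION` clause are what distinguish this from a managed Hive table, whose data Hive itself owns and deletes on drop.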
Below are the main Hive components that a candidate needs to know when Hive interview questions come up during the hiring process; knowing them can make the recruitment process easier.
Hive’s Main Components:
Hive clients: Beyond SQL, Hive also supports programming languages such as Python, C, and Java through different drivers like Thrift, JDBC, and ODBC. You can write a Hive client application in any of these languages and run it against Hive using these clients.
Hive services: The execution of queries and commands takes place within Hive services, which include several sub-components.
CLI: The default command-line interface provided by Hive for executing Hive commands and queries.
Hive Web Interface: A simple graphical user interface that serves as an alternative to the Hive command line and is used to run commands and queries in the Hive application.