CCA Data Analyst (CCA159) Exam

Cloudera Distribution Hadoop (CDH) is one of the world’s most widely used distributions of Hadoop. It includes all the leading components of the Hadoop ecosystem to store, process, discover, model, and serve very large volumes of data, and it is designed to meet enterprise standards of stability and reliability.

Cloudera offers several certification exams that help you demonstrate your skills and validate your Hadoop abilities, particularly for Hadoop jobs that list a Cloudera Hadoop certification as a requirement. Professionals who hold a Cloudera Hadoop certification are often better placed for salary raises and promotions.

Cloudera offers the following exams:

  • CCP Data Engineer
  • CCA Spark and Hadoop Developer
  • CCA Data Analyst
  • CCA Administrator
  • CCA HDP Administrator Exam

This post discusses the “CCA Data Analyst” exam. Its syllabus covers loading, transforming, and modeling data in order to define relationships and extract meaningful results from raw input.

The required skills for the exam are listed below.

Provide Structure to the Data

  • Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.
    • You must be familiar with creating databases, altering database properties, and creating and modifying table structures.
  • Create tables using a variety of data types, delimiters, and file formats
    • Practice with all available primitive data types as well as the complex data types: arrays, maps, and structs.
    • You should be familiar with the file formats and compression options available in Hive, such as text file, Avro, Parquet, RCFile, and SequenceFile.
  • Create new tables using existing tables to define the schema
    • Practice CTAS (CREATE TABLE ... AS SELECT) and CREATE TABLE ... LIKE to create tables from existing tables. CTAS copies the query result into the new table, while LIKE copies only the table definition.
  • Improve query performance by creating partitioned tables in the metastore
    • You must be familiar with static and dynamic partitioning in Hive.
  • Alter tables to modify the existing schema
    • Practice changing a table’s location, altering its structure, and modifying and replacing its columns in Hive.
  • Create views in order to simplify queries
    • SQL users will already know this concept. If you are not one of them, make sure you practice creating and querying views.
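
A few of the DDL skills above (CTAS, ALTER TABLE, and views) can be tried out with a minimal sketch like the one below. It is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in so that it runs without a cluster, and the table and column names are hypothetical. Hive and Impala syntax differs in places, and Hive-specific features such as partitioning, complex types, file formats, and CREATE TABLE ... LIKE are not covered here and should be practiced on a real Hive or Impala environment.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the Hive metastore so the sketch
# is runnable anywhere. Table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create a table, then load a few rows to work with.
cur.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER, customer TEXT, amount REAL, order_year INTEGER)"
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "alice", 120.0, 2019), (2, "bob", 80.0, 2019), (3, "alice", 45.5, 2020)],
)

# CTAS: the new table takes both its columns and its rows from a query.
cur.execute(
    "CREATE TABLE orders_2019 AS "
    "SELECT order_id, customer, amount FROM orders WHERE order_year = 2019"
)

# ALTER TABLE: add a column to the existing schema.
cur.execute("ALTER TABLE orders_2019 ADD COLUMN status TEXT")

# View: hide an aggregation behind a simple name.
cur.execute(
    "CREATE VIEW customer_totals AS "
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)

# Inspect the results: column names of the CTAS table, its row count,
# and the rows exposed by the view.
ctas_columns = [row[1] for row in cur.execute("PRAGMA table_info(orders_2019)")]
ctas_row_count = cur.execute("SELECT COUNT(*) FROM orders_2019").fetchone()[0]
view_rows = cur.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()

print(ctas_columns)
print(ctas_row_count)
print(view_rows)
```

On a real cluster, the same ideas apply, but the statements would be run through Hive or Impala and would usually also specify a file format and, where useful, partition columns.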

Data Analysis

  • Use Query Language (QL) statements in Hive and Impala to analyze data on the cluster.
    • You should be familiar with SELECT statements, including operators, clauses, conditional expressions, and built-in functions such as string, arithmetic, and date functions.
  • Prepare reports using SELECT commands including unions and subqueries
    • Be prepared for subqueries, common table expressions (CTEs), and the UNION and UNION ALL operators.
  • Calculate aggregate statistics, such as sums and averages, during a query
    • Practice aggregate and statistical functions such as sums, averages, percentiles, and standard deviation, along with ranking functions such as RANK and DENSE_RANK.
  • Create queries against multiple data sources by using join commands
    • In short, practice all types of joins.
  • Transform the output format of queries by using built-in functions
    • Practice CAST and other type-conversion functions, grouping functions, etc.
  • Perform queries across a group of rows using windowing functions
    • Practice windowing and analytic functions.
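
The analysis skills above can be rehearsed with small queries like the ones in this sketch. It again assumes Python's built-in sqlite3 module as a stand-in for a Hive or Impala cluster, with hypothetical tables and rows, so treat it as practice for the SQL concepts (joins, aggregates, CTEs, subqueries, UNION, CAST, and window functions) rather than as exact Hive or Impala syntax.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a Hive/Impala cluster so the
# sketch is runnable; window functions need SQLite 3.25 or newer.
# All table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
cur.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 120.0), (11, 1, 45.5), (12, 2, 80.0)],
)

# Join + aggregates: LEFT JOIN keeps customers that have no orders.
totals = cur.execute("""
    SELECT c.name, COUNT(o.order_id), COALESCE(SUM(o.amount), 0)
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# CTE + scalar subquery: orders whose amount is above the overall average.
above_avg = cur.execute("""
    WITH stats AS (SELECT AVG(amount) AS avg_amount FROM orders)
    SELECT order_id FROM orders
    WHERE amount > (SELECT avg_amount FROM stats)
    ORDER BY order_id
""").fetchall()

# UNION ALL keeps duplicates; UNION removes them.
union_all_count = cur.execute("""
    SELECT COUNT(*) FROM (
        SELECT customer_id FROM orders
        UNION ALL
        SELECT customer_id FROM customers
    ) AS u
""").fetchone()[0]
union_count = cur.execute("""
    SELECT COUNT(*) FROM (
        SELECT customer_id FROM orders
        UNION
        SELECT customer_id FROM customers
    ) AS u
""").fetchone()[0]

# CAST: convert a floating-point amount to an integer in the output.
cast_amount = cur.execute(
    "SELECT CAST(amount AS INTEGER) FROM orders WHERE order_id = 11"
).fetchone()[0]

# Window functions: rank all orders by amount, and keep a running total
# per customer in order_id order.
ranked = cur.execute("""
    SELECT order_id,
           RANK() OVER (ORDER BY amount DESC) AS amount_rank,
           SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_id) AS running_total
    FROM orders
    ORDER BY amount_rank
""").fetchall()

for result in (totals, above_avg, union_all_count, union_count, cast_amount, ranked):
    print(result)
```

Changing the sample rows and predicting each result before running the script is a reasonable way to check your understanding of each construct.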

The exam measures your in-depth knowledge of Apache Hive and Impala. Apache Sqoop has recently been removed from the exam specification. Compared to other technical exams, this one is a little expensive. Visit Cloudera’s website to learn more about the exam, buy the voucher, and read the do’s and don’ts. Some key points are listed below.

  • To pass the test, you must have advanced skills.
  • You will be given 8–12 hands-on problems to solve on a live cluster provided by Cloudera.
  • Exam duration is 2 hours.
  • The exam voucher costs $295 and is valid for one year from the date of purchase.
  • Unlike exams from Oracle, Microsoft, etc., there is no on-premises (test-center) option. It is an online test, so you need a desktop or laptop with a high-speed broadband connection.
  • Cloudera partners with examslocal to deliver its exams. Once you purchase the voucher, Cloudera will let you know how to proceed. Sign up on examslocal with the same email ID you used with Cloudera, then select your exam and schedule it.
  • The passing score is 70%, and the CCA certificate is valid for two years.
  • If you are a registered Cloudera user preparing for the exam, Cloudera provides a sample question and the exam pattern on its portal.

The upcoming posts on this website will walk you through the material you need to practice, which will help not only in passing the exam but also in solving real-world problems. Please note that this website will not provide an exam simulator or questions and answers for passing the test. It will only cover the topics needed for the exam.

As stated above, Cloudera changed the requirements for some of its exams, and CCA159 is one of them. Cloudera wanted certificate-seekers to be more in line with updates to the courses it teaches, so it removed some topics from previous versions of the exam (e.g., Sqoop used to be on the exam and is now gone).

Study material for Hadoop and its ecosystem components is under preparation and will soon be available on this site free of charge.

Please follow the blog via email to receive notifications.

Keep in touch!
