Category Archives: Drools

java.io.IOException: can not read class parquet.format.PageHeader: null – Hive

While evaluating Cloudera Kite Morphlines, I came across this exception while reading data from the table:

java.io.IOException: can not read class parquet.format.PageHeader: null

Before going ahead, let me give you the background on what I am trying to do here.
I am building an application where external clients will upload input XML files and their corresponding XSDs. Once these files are uploaded, a job will run that unmarshals the XML files into Java objects. Later, these Java objects will be passed to the Drools framework, where validation and minor transformations will be performed on the data. During this ...

Anatomy of a Configuration File with Example- Cloudera Kite Morphlines

At the heart of Cloudera Kite Morphlines is the configuration file that contains all of the commands you want to execute as part of your ETL process. In the last post we saw the structure of a configuration file and how commands are specified in it.
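As a rough reminder of that structure, here is a minimal sketch of a morphline configuration file. It is not the file from the last post: the id morphline1 and the choice of the readLine and logInfo commands are assumptions made purely for illustration.

```
# Minimal illustrative morphline; the id and the commands are assumptions.
morphlines : [
  {
    # Name used to select this morphline when the file defines several
    id : morphline1

    # Packages to scan for command implementations
    importCommands : ["org.kitesdk.**"]

    # Commands run in order; each one passes its output record to the next
    commands : [
      # Emit one record per line of the input stream
      { readLine { charset : UTF-8 } }

      # Log every record that reaches this point
      { logInfo { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
```

Each entry in commands is one step of the ETL pipeline, so reading the list from top to bottom gives the flow of execution.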

In Cloudera Kite Morphlines every configuration file ends with the .conf extension, which is a little new and more specific to Morphlines. In this post we are going to dissect the configuration file that we saw in the last post; we will see the flow of execution ...

Cloudera Kite Morphlines Getting Started Example

Kite Morphlines development was initiated as part of the Cloudera Search project and was later moved to the Kite SDK to make it available to a wider range of users and to invite contributions from the active CDK community. The idea behind Kite Morphlines is to streamline ETL processing, so that the time and effort involved in the extraction, transformation, and loading of huge volumes of data into Apache Solr, HBase, HDFS, and enterprise data warehouses can be reduced.
