Categories of Data in Big Data Technology

In Big Data, we deal with three different categories of data.

Structured Data: 

Any data that has a definite, strict schema associated with it is called Structured Data. For example: data in tabular form, such as a CSV file or a relational table.
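To make "strict schema" concrete, here is a minimal Python sketch that reads a small CSV and checks it against a fixed set of columns. The sample data, the SCHEMA mapping and the load_structured helper are assumptions made up for illustration; they are not part of any Big Data tool.

```python
import csv
import io

# Hypothetical sample data: every row follows the same fixed set of columns.
CSV_TEXT = "id,name,age\n1,Asha,31\n2,Ravi,27\n"

# Assumed schema for this sketch: column name -> type used to parse the value.
SCHEMA = {"id": int, "name": str, "age": int}


def load_structured(text):
    """Parse CSV text and cast each column according to SCHEMA."""
    reader = csv.DictReader(io.StringIO(text))
    # A strict schema means the header must match exactly.
    if reader.fieldnames != list(SCHEMA):
        raise ValueError("header does not match the expected schema")
    return [{col: cast(row[col]) for col, cast in SCHEMA.items()} for row in reader]


rows = load_structured(CSV_TEXT)
print(rows)
```

Because every row has the same columns and types, the data can be loaded straight into a table without inspecting each record first.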

Semi-Structured Data: 

Any data that does not have a strict schema associated with it, but still carries enough structure (such as tags or keys) for us to make sense of it, is called Semi-Structured Data. For example: XML and JSON files.
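A short Python sketch may help show what "no strict schema" looks like in practice. The JSON records below are invented for illustration: both are valid, yet they do not share the same set of fields, so the reader has to discover the keys instead of relying on a fixed layout.

```python
import json

# Hypothetical records: both are valid JSON, but their fields differ.
JSON_TEXT = """
[
  {"id": 1, "name": "Asha", "email": "asha@example.com"},
  {"id": 2, "name": "Ravi", "phones": ["555-0100", "555-0101"]}
]
"""

records = json.loads(JSON_TEXT)

# The keys describe the data, so we can discover the fields at read time.
all_keys = sorted({key for record in records for key in record})
print(all_keys)

# Fields may be missing from some records, so access them defensively.
for record in records:
    print(record["id"], record.get("email", "<no email>"))
```

The keys act as the structure here: they are carried inside the data itself instead of being declared up front.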

Unstructured Data:

Any data that does not have any schema associated with it is called Unstructured Data. For example: the text of a tweet, audio, or video.
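With unstructured data, any structure has to be imposed by the program that reads it. The following Python sketch uses a made-up tweet and simple regular expressions, chosen only for illustration, to pull a few fields out of free text.

```python
import re

# Hypothetical tweet text: there are no columns, tags or keys to rely on.
TWEET = "Loving the new #Hadoop cluster at work! Thanks @dataeng team #BigData"

# We impose our own structure by extracting patterns from the raw text.
hashtags = re.findall(r"#(\w+)", TWEET)
mentions = re.findall(r"@(\w+)", TWEET)
word_count = len(TWEET.split())

print({"hashtags": hashtags, "mentions": mentions, "words": word_count})
```

Nothing in the text itself says which parts are hashtags or mentions; that interpretation comes entirely from the patterns we chose.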


Tools grouped by the categories of data they handle:

Structured Data: Hive, Pig and Sqoop

Semi-Structured Data: Hive, Pig

Unstructured Data: Pig
