Hive External Table

Hive: External Tables


External Tables:-
             As the name implies, these tables are external to the Hive warehouse location. i.e. In external tables, data will not be stored generally on the Hive warehouse location instead external tables will store the data in a location which we are specifying in schema creation (definition) time by using "EXTERNAL" keyword.
When we drop an external table, Hive will leave the data untouched and Hive will only delete the metadata.



    Syntax:-
                           create external table <tablename>(col1 datatype,col2 datatype.........)
                           row format delimited
                           fields terminated by 'delimiter character'
                           stored as <fileformat>
                           location 'HDFS path';

Creation of External table and loading the data in it:-
hive (hivedb)> create external table external_tab(year int,cause string,sex string,race_ethnicity string,death string,deathrate double,ageadjusted_deathrate double)
             > row format delimited
             > fields terminated by ','
             > lines terminated by '\n'
             > stored as textfile
             > location '/externaltab';
OK
Time taken: 2.033 seconds
hive (hivedb)> desc formatted external_tab;
OK
# col_name             data_type            comment             
    
year                 int                                      
cause                string                                   
sex                  string                                   
race_ethnicity       string                                   
death                string                                   
deathrate            double                                   
ageadjusted_deathrate double                                   
    
# Detailed Table Information    
Database:            hivedb                
Owner:               hduser                
CreateTime:          Tue Jan 24 01:02:02 PST 2017  
LastAccessTime:      UNKNOWN               
Retention:           0                     
Location:            hdfs://localhost:9000/externaltab  
Table Type:          EXTERNAL_TABLE        
Table Parameters:    
 EXTERNAL             TRUE                
 numFiles             1                   
 totalSize            92389               
 transient_lastDdlTime 1485248583          
    
# Storage Information    
SerDe Library:       org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
InputFormat:         org.apache.hadoop.mapred.TextInputFormat  
OutputFormat:        org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat  
Compressed:          No                    
Num Buckets:         -1                    
Bucket Columns:      []                    
Sort Columns:        []                    
Storage Desc Params:    
 field.delim          ,                   
 line.delim           \n                  
 serialization.format ,                   
Time taken: 3.05 seconds, Fetched: 36 row(s)
hive (hivedb)> load data local inpath '/home/hduser/HIVE/NYLeadingCausesOfDeath.csv' into table external_tab;
Loading data to table hivedb.external_tab
OK
Time taken: 4.756 seconds
hive (hivedb)> select year,cause,sex from external_tab limit 10;
OK
NULL        Leading Cause        Sex
2014        Diabetes Mellitus (E10-E14)        F
2011        Cerebrovascular Disease (Stroke: I60-I69)        M
2008        Malignant Neoplasms (Cancer: C00-C97)        M
2010        Malignant Neoplasms (Cancer: C00-C97)        F
2012        Cerebrovascular Disease (Stroke: I60-I69)        M
2007        Mental and Behavioral Disorders due to Use of Alcohol (F10)        M
2011        All Other Causes        F
2007        Chronic Lower Respiratory Diseases (J40-J47)        F
2012        "Essential Hypertension and Renal Diseases (I10         I12)"
Time taken: 4.607 seconds, Fetched: 10 row(s)



Loading the data in the External table is same as loading the data in Managed table.

Dropping the External table:-
hive (hivedb)> drop table external_tab;
OK
Time taken: 10.196 seconds
hive (hivedb)> select * from external_tab;
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'external_tab'
hive (hivedb)> show tables;
OK
ny_cause_of_death
ny_cause_of_death_new
Time taken: 0.114 seconds, Fetched: 2 row(s)
After dropping the external table, we can see the actual data "NYLeadingCausesOfDeath.csv" is still present in "/externaltab" directory.

We can reuse the same data which is stored in '/externaltab' directory. But in this case, no need to load the data again.
hive (hivedb)> create external table reuse_extrnl_tab(year int,cause string,sex string,race_ethnicity string,death string,deathrate double,ageadjusted_deathrate double) 
             > row format delimited
             > fields terminated by ','
             > stored as textfile
             > location '/externaltab'
             > TBLPROPERTIES ("skip.header.line.count"="1");
OK
Time taken: 0.573 seconds
hive (hivedb)> select * from reuse_extrnl_tab limit 10;
OK
2014 Diabetes Mellitus (E10-E14) F Other Race/ Ethnicity 11 NULL NULL
2011 Cerebrovascular Disease (Stroke: I60-I69) M White Non-Hispanic 290 21.7 18.2
2008 Malignant Neoplasms (Cancer: C00-C97) M Not Stated/Unknown 60 NULL NULL
2010 Malignant Neoplasms (Cancer: C00-C97) F Hispanic 1045 85.9 98.5
2012 Cerebrovascular Disease (Stroke: I60-I69) M Black Non-Hispanic 170 19.9 23.3
2007 Mental and Behavioral Disorders due to Use of Alcohol (F10) M Not Stated/Unknown . NULL NULL
2011 All Other Causes F Not Stated/Unknown 14 NULL NULL
2007 Chronic Lower Respiratory Diseases (J40-J47) F Black Non-Hispanic 163 15.5 14.8
2012 "Essential Hypertension and Renal Diseases (I10  I12)" F Hispanic 101.0 8.2
2009 "Diseases of Heart (I00-I09  I11  I13  I20-I51)" NULL NULL
Time taken: 0.543 seconds, Fetched: 10 row(s)

Difference between Managed tables and External tables:-



Comments

  1. thakyou it vry nice blog for beginners
    https://www.emexotechnologies.com/courses/big-data-analytics-training/big-data-hadoop-training/

    ReplyDelete
    Replies
    1. Hive External Table >>>>> Download Now

      >>>>> Download Full

      Hive External Table >>>>> Download LINK

      >>>>> Download Now

      Hive External Table >>>>> Download Full

      >>>>> Download LINK ym

      Delete
  2. Hive External Table >>>>> Download Now

    >>>>> Download Full

    Hive External Table >>>>> Download LINK

    >>>>> Download Now

    Hive External Table >>>>> Download Full

    >>>>> Download LINK gg

    ReplyDelete

Post a Comment