Sunday, July 17, 2016

Hadoop cluster prerequisite setup in Linux server part2

Supported Databases

  1. Cloudera's recommendations are:
    • For Red Hat and similar systems:
      • Use MySQL server version 5.0 (or higher) 
      • Use MySQL server version 5.1 (or higher) 

    • For SLES systems, use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries.
    • For Ubuntu systems:
      • Use MySQL server version 5.5 (or higher) and version 5.0 client shared libraries on Precise (12.04).
  2. For connectivity purposes only, Sqoop 1 supports MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, Teradata 13.1, and Netezza TwinFin 5.0. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  3. Sqoop 2 can transport data to and from MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, and Microsoft SQL Server 2012. The Sqoop 2 repository is supported only on Derby.
  4. Derby is supported as shown in the table, but not always recommended. 
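
The version rules above are easy to get wrong by eye, so here is a minimal Python sketch of how they could be checked. The function names are made up for illustration and the version strings are sample values, not output from a real server.

```python
import re


def parse_version(text):
    """Turn a version string such as '5.1.73-log' into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError("no version number in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum` (both dotted strings)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so '5.1' compares equal to '5.1.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def hsqldb_ok_for_metastore(installed):
    """Sqoop metastore rule: HSQLDB 1.8.0 or a higher 1.x release, never 2.x."""
    version = parse_version(installed)
    return version[0] == 1 and version[:2] >= (1, 8)


if __name__ == "__main__":
    print(meets_minimum("5.1.73-log", "5.1"))   # sample MySQL version string
    print(hsqldb_ok_for_metastore("2.3.2"))     # a 2.x release is rejected
```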

Supported Transport Layer Security Versions

The following components support Transport Layer Security (TLS) at the versions listed:

Components Supported by TLS

Role                 Port     Version
CM Server            7182     TLS 1.2
CM Server            7183     TLS 1.2
Flume                9099     TLS 1.2
HBase Master         60010    TLS 1.2
NameNode             50470    TLS 1.2
Secondary NN         50495    TLS 1.2
Hive HiveServer2     10000    TLS 1.2
Hue Hue Server       8888     TLS 1.2
Impala Daemon        21000    TLS 1.2
Impala Daemon        21050    TLS 1.2
Impala Daemon        22000    TLS 1.2
Impala Daemon        25000    TLS 1.2
Impala StateStore    24000    TLS 1.2
Impala StateStore    25010    TLS 1.2
Catalog Server       25020    TLS 1.2
Catalog Server       26000    TLS 1.2
Oozie Oozie Server   11443    TLS 1.1
Solr                 8983     TLS 1.1
Solr                 8985     TLS 1.1
YARN RM              8090     TLS 1.2
JobHistory Server    19890    TLS 1.2
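
To see which protocol a role actually negotiates on one of these ports, something like the following Python sketch could be used. It is only a probe under assumptions: the host name in the usage line is hypothetical, certificate checks are switched off on purpose, and a modern client may refuse old protocol versions or negotiate a newer one than the table lists.

```python
import socket
import ssl

# A few rows copied from the table above: port -> listed protocol.
EXPECTED_TLS = {
    7183: "TLS 1.2",   # CM server
    10000: "TLS 1.2",  # HiveServer2
    11443: "TLS 1.1",  # Oozie Server
}


def normalize(protocol_name):
    """Map Python's 'TLSv1.2' style names to the table's 'TLS 1.2' style."""
    return protocol_name.replace("TLSv", "TLS ")


def negotiated_tls_version(host, port, timeout=5.0):
    """Connect to host:port and return the negotiated protocol, e.g. 'TLS 1.2'.

    Certificate checks are disabled because this is only a protocol probe;
    do not reuse this context for real traffic.
    """
    context = ssl.create_default_context()
    context.check_hostname = False        # must be cleared before verify_mode
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return normalize(tls.version())
```

A call would look like `negotiated_tls_version("cm-host.example.com", 7183)`, and the result can be compared with `EXPECTED_TLS[7183]`.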

Resource Requirements

Cloudera Manager requires the following resources:

Disk Space - Cloudera Manager Server
5 GB on the partition hosting /var.
500 MB on the partition hosting /usr.
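
A quick way to check these two minimums on a host is sketched below in Python. It assumes binary units (1 GB = 1024^3 bytes), which the text does not specify, and the helper names are invented for this example.

```python
import shutil

GB = 1024 ** 3
MB = 1024 ** 2

# Minimum free space from the text, keyed by a directory on that partition.
REQUIRED_FREE = {"/var": 5 * GB, "/usr": 500 * MB}


def free_bytes(path):
    """Free bytes on the partition that holds `path`."""
    return shutil.disk_usage(path).free


def shortfalls(required=REQUIRED_FREE, probe=free_bytes):
    """Return {path: missing_bytes} for every partition below its minimum."""
    missing = {}
    for path, needed in required.items():
        available = probe(path)
        if available < needed:
            missing[path] = needed - available
    return missing
```

Calling `shortfalls()` on the Cloudera Manager Server host returns an empty dict when both partitions have enough room. The `probe` argument exists only so the check can be exercised without touching the real filesystem.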

For parcels, the space required depends on the number of parcels you download to the Cloudera Manager Server and distribute to Agent hosts. You can download multiple parcels of the same product, of different versions and builds. If you are managing multiple clusters, only one parcel of a product/version/build/distribution is downloaded on the Cloudera Manager Server—not one per cluster. In the local parcel repository on the Cloudera Manager Server, the approximate sizes of the various parcels are as follows:

Cloudera Impala - 200 MB per parcel
Cloudera Search - 400 MB per parcel
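
As a rough illustration of the arithmetic, the Python sketch below estimates the local repository size from the two figures above. The function name and the counts in the usage line are hypothetical.

```python
# Approximate per-parcel sizes from the list above, in MB.
PARCEL_SIZE_MB = {"Cloudera Impala": 200, "Cloudera Search": 400}


def repository_size_mb(parcel_counts):
    """Estimate the local parcel repository size in MB.

    `parcel_counts` maps a product name to the number of distinct
    version/build parcels downloaded. The number of clusters is not a
    factor, because one downloaded copy serves every cluster.
    """
    return sum(PARCEL_SIZE_MB[name] * count
               for name, count in parcel_counts.items())


if __name__ == "__main__":
    # Two Impala builds and one Search build: 2 * 200 + 1 * 400 MB.
    print(repository_size_mb({"Cloudera Impala": 2, "Cloudera Search": 1}))
```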

Cloudera Management Service - The Host Monitor and Service Monitor databases are stored on the partition hosting /var. Ensure that you have at least 20 GB available on this partition. By default, unpacked parcels are located in /opt/cloudera/parcels.

RAM - 4 GB is recommended for most cases and is required when using Oracle databases. 2 GB may be sufficient for non-Oracle deployments with fewer than 100 hosts. However, to run the Cloudera Manager Server on a machine with 2 GB of RAM, you must tune down its maximum heap size (by modifying -Xmx in /etc/default/cloudera-scm-server). Otherwise the kernel may kill the Server for consuming too much RAM.
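
The -Xmx change can be scripted. The Python sketch below only rewrites an options string; it assumes the JVM options in /etc/default/cloudera-scm-server contain a plain -Xmx token, and both the sample string and the 1G value are illustrative, not recommended settings.

```python
import re

# Matches a heap flag such as -Xmx2G or -Xmx512m.
XMX_PATTERN = re.compile(r"-Xmx\d+[kKmMgG]?")


def current_max_heap(java_opts):
    """Return the -Xmx token found in a JVM options string, or None."""
    match = XMX_PATTERN.search(java_opts)
    return match.group(0) if match else None


def with_max_heap(java_opts, size):
    """Return java_opts with -Xmx set to `size` (e.g. '1G'), adding it if absent."""
    flag = "-Xmx" + size
    if XMX_PATTERN.search(java_opts):
        return XMX_PATTERN.sub(flag, java_opts, count=1)
    return (java_opts + " " + flag).strip()


if __name__ == "__main__":
    sample = "-Xmx2G -XX:+HeapDumpOnOutOfMemoryError"  # hypothetical contents
    print(current_max_heap(sample))
    print(with_max_heap(sample, "1G"))
```

Reading the file, applying `with_max_heap` to the options line, and restarting the Server are left out here because the exact file layout may differ between releases.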

Python - Cloudera Manager and CDH 4 require Python 2.4 or higher, but Hue in CDH 5 and package installs of CDH 5 require Python 2.6 or 2.7. All supported operating systems include Python version 2.4 or higher.
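
The two Python rules can be expressed as a small check, sketched below. One assumption is labeled in the code: "2.4 or higher" is read as Python 2.x only, which the text does not state explicitly. The function and parameter names are invented for this example.

```python
import sys


def python_ok(version, need_hue_or_cdh5_packages=False):
    """Check a (major, minor, ...) version tuple against the rules in the text."""
    major_minor = tuple(version[:2])
    if need_hue_or_cdh5_packages:
        # Hue in CDH 5 and package installs of CDH 5: 2.6 or 2.7 only.
        return major_minor in ((2, 6), (2, 7))
    # Cloudera Manager and CDH 4: 2.4 or higher.
    # Assumption: interpreted here as the 2.x line only.
    return major_minor[0] == 2 and major_minor >= (2, 4)


if __name__ == "__main__":
    # Checks the interpreter running this script.
    print(python_ok(sys.version_info, need_hue_or_cdh5_packages=True))
```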

Perl - Cloudera Manager requires Perl.

Back to part1
