[Apache Hadoop] Apache Hadoop / Installing Hadoop 3 on a single server
This post describes how to install Hadoop on a single node.
Before installing Hadoop, a few OS settings need to be changed.
The OS used here is CentOS 7.x.
When operating infrastructure, the root account is not used.
Instead, create a separate account, grant it sudo privileges, and carry out the installation with that account as far as possible.
Before installing Hadoop
Stop the firewall, set SELinux to disabled, and add a user account so that the installation can proceed without the root account wherever possible.
For Java, install OpenJDK 1.8 and add the environment variables.
The details of these steps will be added later.
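Until those details are written up, the preparation steps listed above could look roughly like the sketch below on CentOS 7. The account name `hadoop`, the `DRY_RUN` switch, and the `run` and `prepare_os` helpers are assumptions made for illustration and do not come from the original setup. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the OS preparation on CentOS 7.
# Assumption: the non-root account is called "hadoop" (the post does not name it).
# Commands are only printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
HADOOP_USER="${HADOOP_USER:-hadoop}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"      # dry run: show the command instead of executing it
  else
    "$@"
  fi
}

prepare_os() {
  # Stop the firewall now and keep it off after a reboot.
  run sudo systemctl stop firewalld
  run sudo systemctl disable firewalld
  # Switch SELinux to permissive immediately, and to disabled from the next boot.
  run sudo setenforce 0
  run sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
  # Create the non-root account and grant sudo through the wheel group.
  run sudo useradd -m "$HADOOP_USER"
  run sudo usermod -aG wheel "$HADOOP_USER"
  # Install OpenJDK 1.8; the -devel package also provides JDK tools such as jps.
  run sudo yum install -y java-1.8.0-openjdk-devel
}

prepare_os
```

Run it once as is to review the printed commands, then again with `DRY_RUN=0` to apply them. A password for the new account still has to be set separately with `passwd`.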
Installing Hadoop
Installed version: Apache Hadoop 3.1.1
Download the tar.gz distribution.
The configuration follows the page below.
When installing in distributed mode, refer to the same page for the environment variable settings.
Apache Hadoop 3.1.1 – Hadoop Cluster Setup
    $ sudo yum install openssh*
    $ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.1/hadoop-3.1.1.tar.gz
    $ sudo tar xvzf hadoop-3.1.1.tar.gz -C /home/hadoop
    $ cd /home/hadoop
    $ vi .bash_profile

    #JAVA
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
    CLASSPATH=$JAVA_HOME/lib/*:$CLASSPATH

    #HADOOP
    # HADOOP_HOME must match the directory Hadoop was extracted into;
    # with the tar command above that would be /home/hadoop/hadoop-3.1.1.
    export HADOOP_HOME=/data/platform/hadoop-3.1.1

    PATH=$PATH:$HOME/.local/bin:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    export PATH CLASSPATH

    $ source .bash_profile
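After sourcing `.bash_profile`, it can help to confirm that both variables point at real directories before editing any Hadoop config. The following is a minimal sketch. The helper names `check_dir` and `check_hadoop_env` are hypothetical; `hadoop version` is the standard Hadoop CLI subcommand and is only called if `hadoop` is found on `PATH`.

```shell
#!/usr/bin/env bash
# Report whether a variable holds the path of an existing directory.
# $1 = variable name (used in the message), $2 = its value
check_dir() {
  if [ -z "$2" ]; then
    echo "$1 is not set"
    return 1
  fi
  if [ ! -d "$2" ]; then
    echo "$1 points to a missing directory: $2"
    return 1
  fi
  echo "$1=$2"
}

# Check both variables; return non-zero if either one is wrong.
check_hadoop_env() {
  status=0
  check_dir JAVA_HOME "${JAVA_HOME:-}" || status=1
  check_dir HADOOP_HOME "${HADOOP_HOME:-}" || status=1
  return "$status"
}

check_hadoop_env || echo "Fix .bash_profile and run 'source .bash_profile' again."

# Only meaningful once $HADOOP_HOME/bin is on PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop version
fi
```

If `HADOOP_HOME` is reported as missing, the most likely cause is that it does not match the directory the tarball was extracted into.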
After installing Hadoop, edit the Hadoop config files [single-node settings; for a cluster, the ports, hostnames, and so on must be set to match the environment]
hadoop-env.sh
    $ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh

    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk

    # To prevent accidents, shell commands be (superficially) locked
    # to only allow certain users to execute certain subcommands.
    # It uses the format of (command)_(subcommand)_USER.
    #
    # For example, to limit who can execute the namenode command,
    # export HDFS_NAMENODE_USER=hdfs

    # export HADOOP_CLASSPATH=
    export HDFS_NAMENODE_USER="user_name"
    export HDFS_DATANODE_USER="user_name"
    export HDFS_SECONDARYNAMENODE_USER="user_name"
yarn-env.sh
    $ vi $HADOOP_HOME/etc/hadoop/yarn-env.sh

    #YARN USER SETTING
    export YARN_RESOURCEMANAGER_USER="user_name"
    export YARN_NODEMANAGER_USER="user_name"
core-site.xml
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>
hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>dfs.namenode.servicerpc-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>dfs.namenode.http-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>dfs.namenode.https-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>dfs.client.datanode-restart.timeout</name>
        <value>30</value>
      </property>
    </configuration>
mapred-site.xml
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.map.memory.mb</name>
        <value>1024</value>
      </property>
      <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>2560</value>
      </property>
      <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hostname:10020</value>
      </property>
    </configuration>
yarn-site.xml
    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>hostname:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hostname:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hostname:8031</value>
      </property>
    </configuration>
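The `hostname` in values such as `hostname:8032` and `hostname:10020` reads as a placeholder for the machine's actual host name. One way to fill it in is sketched below. The `fill_hostname` helper is hypothetical, and it prints the rewritten file to standard output rather than editing it in place, so the result can be reviewed first.

```shell
#!/usr/bin/env bash
# Replace the literal "hostname" placeholder in <value>hostname:PORT</value>
# entries with a real host name. The input file is not modified.
# $1 = config file, $2 = host name to substitute
fill_hostname() {
  sed "s#<value>hostname:\([0-9][0-9]*\)</value>#<value>$2:\1</value>#" "$1"
}

# Preview the result for yarn-site.xml if it exists on this machine.
conf="${HADOOP_HOME:-}/etc/hadoop/yarn-site.xml"
if [ -f "$conf" ]; then
  fill_hostname "$conf" "$(hostname)"
fi
```

Only values of the exact form `hostname:PORT` are touched, so entries such as `mapreduce_shuffle` pass through unchanged. The same call works for `mapred-site.xml`.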
Note that the DataNode and NameNode port numbers set here have to be configured identically later, when installing other Hadoop ecosystem components.
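To look up the NameNode address that other components need to match, the configured value can be read back rather than copied by hand. The sketch below uses `hdfs getconf -confKey`, which is part of the standard HDFS CLI. The `namenode_port` helper is hypothetical and assumes the URI ends in an explicit `:PORT`. When `hdfs` is not on `PATH`, it falls back to the value from core-site.xml above.

```shell
#!/usr/bin/env bash
# Print the port of an fs.defaultFS URI such as hdfs://localhost:9000.
# Assumption: the URI ends with ":PORT" (no trailing slash or path).
namenode_port() {
  echo "${1##*:}"    # strip everything up to and including the last colon
}

if command -v hdfs >/dev/null 2>&1; then
  uri="$(hdfs getconf -confKey fs.defaultFS)"
else
  uri="hdfs://localhost:9000"   # value configured in core-site.xml in this post
fi

echo "NameNode URI: $uri"
echo "NameNode RPC port: $(namenode_port "$uri")"
```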
* 1 GiB = 1024 MiB
* 1 GB = 1000 MB
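As a quick arithmetic check on the memory settings in mapred-site.xml, the sketch below converts them to bytes. It assumes the `*.memory.mb` values are counted in binary units (1 MB taken as 1024 * 1024 bytes), which is how container limits are commonly interpreted. The `mib_to_bytes` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Convert a size in MiB (1024 * 1024 bytes) to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 1024   # mapreduce.map.memory.mb    -> 1073741824 bytes (1 GiB)
mib_to_bytes 2560   # mapreduce.reduce.memory.mb -> 2684354560 bytes (2.5 GiB)
```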