Hadoop Environment Setup (4)

Fully Distributed Deployment

1. Cluster Plan

        100                     101                             102
HDFS    NameNode, DataNode      DataNode                        SecondaryNameNode (2NN), DataNode
YARN    NodeManager             ResourceManager, NodeManager    NodeManager

2. Edit the Configuration Files

Change to the configuration directory: cd /opt/module/hadoop-3.1.3/etc/hadoop/

(1) Configure core-site.xml (in /opt/module/hadoop-3.1.3/etc/hadoop/):

vi core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1000:9820</value>
    </property>

    <!-- hadoop.data.dir is a custom property; the configuration files below reference it -->
    <property>
        <name>hadoop.data.dir</name>
        <value>/opt/module/hadoop-3.1.3/data</value>
    </property>
</configuration>
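As a quick sketch (not part of the original guide), the fs.defaultFS value set above can be split into the NameNode host and RPC port, just to make the endpoint explicit:

```shell
# Split the fs.defaultFS URI from core-site.xml into host and port,
# using plain shell parameter expansion.
fs_default="hdfs://hadoop1000:9820"
hostport="${fs_default#hdfs://}"   # strip the scheme -> hadoop1000:9820
nn_host="${hostport%:*}"           # -> hadoop1000
nn_port="${hostport#*:}"           # -> 9820
echo "NameNode RPC endpoint: host=$nn_host port=$nn_port"
```

Clients resolve this host:port for all HDFS RPC traffic, so the hostname must be resolvable on every node.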

(2) Configure hdfs-site.xml (in the same directory):

vi hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <!-- Where the NameNode stores its metadata -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.data.dir}/name</value>
  </property>

  <!-- Where DataNodes store block data -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.data.dir}/data</value>
  </property>

  <!-- Where the SecondaryNameNode stores its checkpoints -->
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file://${hadoop.data.dir}/namesecondary</value>
  </property>

  <!-- DataNode restart timeout (30 s); works around a compatibility issue -->
  <property>
    <name>dfs.client.datanode-restart.timeout</name>
    <value>30</value>
  </property>

  <!-- Web UI address of the NameNode -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop1000:9870</value>
  </property>

  <!-- Web UI address of the SecondaryNameNode -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop1002:9868</value>
  </property>
</configuration>
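A sketch of how Hadoop expands the custom ${hadoop.data.dir} property (defined in core-site.xml) inside the hdfs-site.xml values above, illustrated here with a plain shell variable:

```shell
# Illustrate the property substitution: ${hadoop.data.dir} resolves to the
# value set in core-site.xml, so all three storage dirs share one root.
hadoop_data_dir=/opt/module/hadoop-3.1.3/data
name_dir="file://${hadoop_data_dir}/name"                # dfs.namenode.name.dir
data_dir="file://${hadoop_data_dir}/data"                # dfs.datanode.data.dir
checkpoint_dir="file://${hadoop_data_dir}/namesecondary" # dfs.namenode.checkpoint.dir
echo "$name_dir"
echo "$data_dir"
echo "$checkpoint_dir"
```

Changing hadoop.data.dir in one place therefore moves all three directories at once.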

(3) Configure yarn-site.xml (in the same directory):

vi yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop1001</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
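A small sketch listing the environment variables that yarn.nodemanager.env-whitelist above allows containers to inherit from the NodeManager (the list is the one in the config; the loop is only for readability):

```shell
# Split the comma-separated env-whitelist value and print each variable
# that YARN containers may inherit from the NodeManager's environment.
whitelist="JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME"
count=0
for var in ${whitelist//,/ }; do   # replace commas with spaces to split
  echo "containers inherit: $var"
  count=$((count + 1))
done
```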

3. Passwordless SSH Login

Before passwordless login is set up, every remote access to a host prompts for a password.

1. Generate a key pair on each node and copy the public key to the others.

hadoop1000: generate the key pair:

[root@hadoop1000] ssh-keygen -t rsa

then press Enter three times (accept the defaults).

Copy the public key to every machine that should allow passwordless login:

[root@hadoop1000] ssh-copy-id hadoop1000

[root@hadoop1000] ssh-copy-id hadoop1001

[root@hadoop1000] ssh-copy-id hadoop1002

hadoop1001: generate the key pair:

[root@hadoop1001] ssh-keygen -t rsa

then press Enter three times (accept the defaults).

Copy the public key to every machine that should allow passwordless login:

[root@hadoop1001] ssh-copy-id hadoop1000

[root@hadoop1001] ssh-copy-id hadoop1001

[root@hadoop1001] ssh-copy-id hadoop1002

hadoop1002: generate the key pair:

[root@hadoop1002] ssh-keygen -t rsa

then press Enter three times (accept the defaults).

Copy the public key to every machine that should allow passwordless login:

[root@hadoop1002] ssh-copy-id hadoop1000

[root@hadoop1002] ssh-copy-id hadoop1001

[root@hadoop1002] ssh-copy-id hadoop1002

Remote access command: ssh hadoop1001 (with passwordless login set up, you enter hadoop1001 without typing a password).

Log out: exit. Change back to the etc directory: cd ..
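The per-node ssh-copy-id sequence above can be sketched as a loop. This is a checklist generator only: it prints the commands rather than running them, since each run prompts for the target's password interactively.

```shell
# Print the ssh-copy-id commands to run on one node (after ssh-keygen -t rsa).
# Hostnames are the three cluster nodes from this guide.
hosts="hadoop1000 hadoop1001 hadoop1002"
copied=0
for target in $hosts; do
  echo "ssh-copy-id $target"
  copied=$((copied + 1))
done
```

Repeat the printed commands on each of the three nodes so every node can reach every other one without a password.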

4. Distribute the Configuration

Copy the files to hadoop1001:

scp -r hadoop/ root@hadoop1001:/opt/module/hadoop-3.1.3/etc/

Copy the files to hadoop1002:

scp -r hadoop/ root@hadoop1002:/opt/module/hadoop-3.1.3/etc/
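The two scp commands above can be generated in a loop; this sketch prints them instead of executing them (the paths and hostnames are the ones used in this guide):

```shell
# Build the scp commands that push the hadoop/ config dir to the other nodes.
src=hadoop/
dest=/opt/module/hadoop-3.1.3/etc/
cmds=""
for host in hadoop1001 hadoop1002; do
  cmds="$cmds scp -r $src root@${host}:${dest} ;"
done
echo "$cmds"
```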

5. Format the NameNode (if the NameNode fails to start, delete the data and logs directories first, then format again)

Command: hdfs namenode -format
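A sketch of the reset procedure mentioned above: when a NameNode refuses to start after a re-format, the stale data and logs directories must go first. The command is printed only; run it manually on the NameNode host (hadoop1000 here).

```shell
# Print the reset-and-reformat command for the NameNode host.
# HADOOP_HOME matches the install path used throughout this guide.
HADOOP_HOME=/opt/module/hadoop-3.1.3
reset_cmd="rm -rf ${HADOOP_HOME}/data ${HADOOP_HOME}/logs && hdfs namenode -format"
echo "$reset_cmd"
```

Note that re-formatting assigns a new cluster ID, which is exactly why the old data directories (holding the previous ID) have to be removed on every node.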

6. Start the Cluster Daemons Individually

hadoop1000:

hdfs --daemon start namenode

hdfs --daemon start datanode

yarn --daemon start nodemanager

hadoop1001:

yarn --daemon start resourcemanager

hdfs --daemon start datanode

yarn --daemon start nodemanager

hadoop1002:

hdfs --daemon start secondarynamenode

hdfs --daemon start datanode

yarn --daemon start nodemanager
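After the daemons above are started, running jps on each node should list the processes from the cluster plan at the top. This sketch just prints that expected mapping for reference:

```shell
# Expected Java daemons per host after single-daemon startup,
# as jps would report them (process names only).
declare -A daemons=(
  [hadoop1000]="NameNode DataNode NodeManager"
  [hadoop1001]="ResourceManager DataNode NodeManager"
  [hadoop1002]="SecondaryNameNode DataNode NodeManager"
)
for h in hadoop1000 hadoop1001 hadoop1002; do
  echo "$h -> ${daemons[$h]}"
done
```

If a daemon is missing from jps output, check its log under $HADOOP_HOME/logs on that host.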
