HDFS NameNode, FsImage, and Edit Logs Backup and Restore
How to perform an HDFS metadata backup:
Backing up HDFS metadata primarily involves creating the latest fsimage and fetching and copying it to another DR location. This can be done in a few basic steps (a consolidated script follows the list).
Note: These steps involve putting HDFS into safe mode (read-only mode), so Hadoop admins need to plan for that.
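Before the window, you can confirm the current safe mode state with a standard dfsadmin check (it typically prints "Safe mode is OFF" during normal operation):
# hdfs dfsadmin -safemode get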
1. Become the HDFS superuser:
# su - hdfs
2. (Optional) If Kerberos authentication is enabled, do a kinit as well:
# kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM
3. Put HDFS into safe mode, so no write operations will be allowed:
# hdfs dfsadmin -safemode enter
4. Create a new fsimage by merging any outstanding edit logs with the latest fsimage, saving the full state to a new fsimage file, and rolling edits:
# hdfs dfsadmin -saveNamespace
5. Copy the latest fsimage from HDFS to a directory on the local file system; this file can be stored for backup purposes:
# hdfs dfsadmin -fetchImage <local_dir>
6. Take the NameNode out of safe mode to allow writes and normal operations:
# hdfs dfsadmin -safemode leave
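For repeatable runs, the six steps above can be wrapped in a small shell script. This is a minimal sketch, not a hardened tool: the keytab path is the one from step 2, the /backup/hdfs-meta target directory is hypothetical, and the trap ensures the NameNode leaves safe mode even if a later step fails.

#!/bin/bash
# Minimal HDFS metadata backup sketch -- run as the hdfs superuser.
set -e
BACKUP_DIR=/backup/hdfs-meta/$(date +%Y%m%d)   # hypothetical backup target
mkdir -p "$BACKUP_DIR"

# (Optional) Kerberos login; remove this line if Kerberos is not enabled.
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM

# Always leave safe mode on exit, even if saveNamespace or fetchImage fails.
trap 'hdfs dfsadmin -safemode leave' EXIT

hdfs dfsadmin -safemode enter             # block new writes
hdfs dfsadmin -saveNamespace              # merge edits into a fresh fsimage
hdfs dfsadmin -fetchImage "$BACKUP_DIR"   # copy the fsimage to local disk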
The above is a very basic level of HDFS metadata backup. Beyond this, one can also back up and maintain more elaborate HDFS artifacts such as the fsck output, a directory listing, the dfsadmin report, and all of the fsimage, edit log, and checkpoint files.
Back up the following critical data.
- On the node that hosts the NameNode, open the Hadoop Command Line shortcut (or open a command window in the Hadoop directory). As the hadoop user, go to the HDFS home directory:
runas /user:hadoop "cmd /K cd %HDFS_DATA_DIR%"
- Run the fsck command to check for file system errors:
hdfs fsck / -files -blocks -locations > dfs-old-fsck-1.log
The console output is written to the dfs-old-fsck-1.log file.
- Capture the complete namespace directory tree of the file system:
hdfs dfs -ls -R / > dfs-old-lsr-1.log
- Create a list of DataNodes in the cluster:
hdfs dfsadmin -report > dfs-old-report-1.log
- Capture output from the fsck command:
hdfs fsck / -blocks -locations -files > fsck-old-report-1.log
Verify that there are no missing or corrupted files/replicas in the fsck command output (one way to check is sketched below).
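A quick way to do that verification is to scan the log for the markers fsck uses; this sketch greps the file generated above, and on a clean file system the summary line should read Status: HEALTHY:
grep -E 'CORRUPT|MISSING' fsck-old-report-1.log
grep 'Status:' fsck-old-report-1.log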
- Save the HDFS namespace:
- Place the NameNode in safe mode, to keep HDFS from accepting any new writes:
hdfs dfsadmin -safemode enter
- Save the namespace:
hdfs dfsadmin -saveNamespace
Warning: From this point on, HDFS should not accept any new writes. Stay in safe mode!
- Finalize the namespace:
hdfs namenode -finalize
- On the machine that hosts the NameNode, copy the following checkpoint directories into a backup directory (see the copy sketch below):
%HDFS_DATA_DIR%\hdfs\nn\edits\current
%HDFS_DATA_DIR%\hdfs\nn\edits\image
%HDFS_DATA_DIR%\hdfs\nn\edits\previous.checkpoint
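A sketch of that copy using robocopy, assuming a %BACKUP_DIR% variable pointing at a directory you created beforehand (any recursive copy tool works just as well; /E copies subdirectories, including empty ones):
robocopy %HDFS_DATA_DIR%\hdfs\nn\edits\current %BACKUP_DIR%\nn\edits\current /E
robocopy %HDFS_DATA_DIR%\hdfs\nn\edits\image %BACKUP_DIR%\nn\edits\image /E
robocopy %HDFS_DATA_DIR%\hdfs\nn\edits\previous.checkpoint %BACKUP_DIR%\nn\edits\previous.checkpoint /E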
- Take the NameNode out of safe mode to allow writes and normal operations:
hdfs dfsadmin -safemode leave
Restoring NameNode Metadata
This section describes how to restore NameNode metadata. If both the NameNode and the Secondary NameNode were to suddenly go offline, you can restore the NameNode by doing the following:
- Add a new host to your Hadoop cluster.
- Add the NameNode role to the host. Make sure it has the same host name as the original NameNode.
- Create a directory path for the NameNode name.dir (for example, /dfs/nn/current), ensuring that the permissions are set correctly.
- Copy the VERSION and latest fsimage file to the /dfs/nn/current directory.
- Run the following command to create the md5 file for the fsimage:
$ md5sum fsimage > fsimage.md5
- Start the NameNode process (a consolidated sketch of these steps follows below).
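Put together, a minimal restore sketch, run as the hdfs user on the new host. The /dfs/nn/current path follows the example above; the fsimage file name used here (fsimage_0000000000000000042) is hypothetical and should match whatever your latest backed-up image is called:
mkdir -p /dfs/nn/current
cp VERSION fsimage_0000000000000000042 /dfs/nn/current/   # hypothetical fsimage name
cd /dfs/nn/current
md5sum fsimage_0000000000000000042 > fsimage_0000000000000000042.md5
chown -R hdfs:hdfs /dfs/nn
# now start the NameNode process (for example, through your cluster manager)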
Here is another way --- The Shahed Way
Put HDFS into safe mode, so no write operations will be allowed:
# hdfs dfsadmin -safemode enter
Create a new fsimage by merging any outstanding edit logs with the latest fsimage, saving the full state to a new fsimage file, and rolling edits:
# hdfs dfsadmin -saveNamespace
Copy the latest fsimage from HDFS to a directory on the local file system; this file can be stored for backup purposes:
# hdfs dfsadmin -fetchImage <local_dir>
Take the NameNode out of safe mode to allow writes and normal operations:
# hdfs dfsadmin -safemode leave
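To confirm writes are flowing again after leaving safe mode, a quick smoke test helps; the /tmp path here is only an example:
# hdfs dfs -touchz /tmp/backup-smoke-test
# hdfs dfs -rm /tmp/backup-smoke-test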
NOTES:
- Safe mode will impact any HDFS clients that are trying to write to HDFS.
- The active NameNode is the source of truth for any HDFS operation.
- A good practice is to perform a backup once per month, but more often never hurts (a sample schedule follows below).
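A sketch of that monthly schedule as a cron entry (in /etc/cron.d style, which takes a user field), assuming the backup script above was saved as /usr/local/bin/hdfs-meta-backup.sh, a hypothetical path:
# run at 02:00 on the first day of every month, as the hdfs user
0 2 1 * * hdfs /usr/local/bin/hdfs-meta-backup.sh >> /var/log/hdfs-meta-backup.log 2>&1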