HDFS Namenode, FsImage, Editlogs Backup and Restore
How to perform an HDFS metadata backup:
Backing up HDFS metadata primarily involves creating the latest fsimage and fetching and copying it to another DR location. This can be done in a few basic steps.
Note: These steps involve putting HDFS into safe mode (read-only mode), so Hadoop admins need to plan for that.
1. Become the HDFS superuser:
# su - hdfs
2. (Optional) If Kerberos authentication is enabled, do a kinit as well:
# kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM
3. Put HDFS in safe mode so no write operations are allowed:
# hdfs dfsadmin -safemode enter
4. Create a new fsimage by merging any outstanding edit logs with the latest fsimage, saving the full state to a new fsimage file, and rolling edits:
# hdfs dfsadmin -saveNamespace
5. Copy the latest fsimage from the NameNode to a directory on the local file system; this file can be stored for backup purposes:
# hdfs dfsadmin -fetchImage <local_dir>
6. Take the NameNode out of safe mode to allow writes and normal operations:
# hdfs dfsadmin -safemode leave
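The steps above can be sketched as a single script. This is a minimal sketch: the backup_fsimage function name and the error handling are my own, not a standard Hadoop tool, and it assumes you are already the HDFS superuser (and kinit'd if needed).

```shell
# Sketch: run the safemode/saveNamespace/fetchImage sequence as one unit,
# making sure safe mode is always left again even if a step fails.
backup_fsimage() {
  local local_dir=$1
  hdfs dfsadmin -safemode enter || return 1
  # merge outstanding edits into a fresh fsimage and roll the edit log
  hdfs dfsadmin -saveNamespace || { hdfs dfsadmin -safemode leave; return 1; }
  # pull the newest fsimage from the NameNode to local disk
  hdfs dfsadmin -fetchImage "$local_dir"
  local rc=$?
  # always leave safe mode so clients can write again
  hdfs dfsadmin -safemode leave
  return $rc
}
```

Leaving safe mode in every exit path matters: a script that dies between `enter` and `leave` would leave the whole cluster read-only.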
The above is a very basic level of HDFS metadata backup.
Beyond this, you can also back up and maintain additional HDFS artifacts such as the fsck output, a full directory listing, the dfsadmin report, and all fsimage, edit log, and checkpoint files.
Back up the following critical data.
- On the node that hosts the NameNode, open the Hadoop Command Line shortcut (or open a command window in the Hadoop directory). As the hadoop user, go to the HDFS home directory:
runas /user:hadoop "cmd /K cd %HDFS_DATA_DIR%"
- Run the fsck command to check for any file system errors:
hdfs fsck / -files -blocks -locations > dfs-old-fsck-1.log
The console output is written to the dfs-old-fsck-1.log file.
- Capture the complete namespace directory tree of the file system:
hdfs dfs -ls -R / > dfs-old-lsr-1.log
- Create a list of DataNodes in the cluster:
hdfs dfsadmin -report > dfs-old-report-1.log
- Capture output from the fsck command:
hdfs fsck / -blocks -locations -files > fsck-old-report-1.log
Verify that there are no missing or corrupt files or replicas in the fsck command output.
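One way to check a saved fsck report for problems is to scan it for non-zero missing/corrupt counts. This helper is a sketch: fsck_is_clean is not a real HDFS tool, and the exact wording of fsck output can vary between Hadoop versions, so treat the patterns as assumptions to adapt.

```shell
# Sketch: succeed only when a saved fsck report shows no corrupt or
# missing blocks (patterns based on typical hdfs fsck summary output).
fsck_is_clean() {
  # fail fast on an overall CORRUPT/MISSING status line
  ! grep -qE 'CORRUPT|MISSING' "$1" &&
  # fail if any "Missing blocks"/"Corrupt blocks" count is non-zero
  ! grep -E 'Missing blocks|Corrupt blocks' "$1" | grep -qv ': 0'
}

# usage: fsck_is_clean dfs-old-fsck-1.log || echo "investigate before backup"
```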
- Save the HDFS namespace:
- Place the NameNode in safe mode, to keep HDFS from accepting any new writes:
hdfs dfsadmin -safemode enter
- Save the namespace:
hdfs dfsadmin -saveNamespace
Warning: from this point on, HDFS should not accept any new writes. Stay in safe mode!
- Finalize the namespace:
hdfs namenode -finalize
- On the machine that hosts the NameNode, copy the following checkpoint directories into a backup directory:
%HDFS_DATA_DIR%\hdfs\nn\edits\current
%HDFS_DATA_DIR%\hdfs\nn\edits\image
%HDFS_DATA_DIR%\hdfs\nn\edits\previous.checkpoint
- Take the NameNode out of safe mode to allow writes and normal operations:
hdfs dfsadmin -safemode leave
Restoring NameNode Metadata
This section describes how to restore NameNode metadata. If both the NameNode and the secondary NameNode were to suddenly go offline, you can restore the NameNode by doing the following:
- Add a new host to your Hadoop cluster.
- Add the NameNode role to the host. Make sure it has the same hostname as the original NameNode.
- Create a directory path for the NameNode name.dir (for example, /dfs/nn/current), ensuring that the permissions are set correctly.
- Copy the VERSION and latest fsimage file to the /dfs/nn/current directory.
- Run the following command to create the md5 file for the fsimage:
$ md5sum fsimage > fsimage.md5
- Start the NameNode process.
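The copy steps above can be sketched as a small function. This is illustrative only: restore_namenode_meta and its path handling are my own assumptions, and after running it you would still need to fix ownership (e.g. to the hdfs user) before starting the NameNode.

```shell
# Sketch: copy VERSION and the newest backed-up fsimage into the
# NameNode name.dir, and regenerate the md5 companion file.
restore_namenode_meta() {
  local backup_dir=$1 name_dir=$2
  mkdir -p "$name_dir/current"
  # VERSION carries the namespace/cluster IDs the NameNode checks at startup
  cp "$backup_dir/VERSION" "$name_dir/current/"
  # pick the newest fsimage in the backup (fsimage file names sort by txid)
  local latest
  latest=$(ls "$backup_dir"/fsimage* | sort | tail -n 1)
  cp "$latest" "$name_dir/current/"
  # recreate the md5 file next to the image, as in the step above
  ( cd "$name_dir/current" && md5sum "$(basename "$latest")" > "$(basename "$latest").md5" )
}
```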
Here is another way: The Shahed Way
Put HDFS in safe mode so no write operations are allowed:
# hdfs dfsadmin -safemode enter
Create a new fsimage by merging any outstanding edit logs with the latest fsimage, saving the full state to a new fsimage file, and rolling edits:
# hdfs dfsadmin -saveNamespace
Copy the latest fsimage from the NameNode to a directory on the local file system; this file can be stored for backup purposes:
# hdfs dfsadmin -fetchImage <local_dir>
Take the NameNode out of safe mode to allow writes and normal operations:
# hdfs dfsadmin -safemode leave
NOTES:
- Safe mode will impact any HDFS clients that are trying to write to HDFS.
- The active NameNode is the source of truth for any HDFS operation.
- A good practice is to perform a backup at least once per month; more frequent backups never hurt.
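To make the monthly cadence stick, the backup steps can be wrapped in a script and scheduled. The script name and paths below are hypothetical; adjust the user and target directory for your cluster.

```
# crontab entry (run as the hdfs superuser): back up HDFS metadata
# at 02:00 on the first day of every month
# m h dom mon dow  command
0 2 1 * * /usr/local/bin/hdfs-meta-backup.sh /backup/hdfs
```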